Evaluating creative work with artificial intelligence : evidence from constrained innovation tasks

Addis, Valerio Fedele ORCID: https://orcid.org/0009-0008-4705-319X, Attanasi, Giuseppe ORCID: https://orcid.org/0000-0003-0848-5770, Di Bartolomeo, Giovanni ORCID: https://orcid.org/0000-0001-5016-8483, Mariella, Michele ORCID: https://orcid.org/0009-0000-1941-6354 and Peruzzi, Valentina ORCID: https://orcid.org/0000-0002-6846-0578 (2026) Evaluating creative work with artificial intelligence : evidence from constrained innovation tasks. Technovation, 155. DOI 10.1016/j.technovation.2026.103571

Official URL: https://doi.org/10.1016/j.technovation.2026.103571

Abstract

We study whether a large language model can reliably evaluate human creativity in constrained, innovation-like tasks. Using expert-generated creative outputs from a validated experiment with workers in cultural and creative industries, we embed ChatGPT as an evaluator and benchmark its assessments against expert human judgments obtained through the Consensual Assessment Technique. Study 1 supports AI reliability by showing that AI-based creativity evaluations exhibit internal consistency comparable to that of expert judges across repeated and independent runs, even under conservative scenarios. Replacing a human judge with an AI evaluator does not reduce inter-rater reliability across drawing, mathematical, and verbal tasks. Beyond reliability, AI evaluations display three additional features that are difficult to achieve with human-only panels: lower evaluative variability, systematically higher scores consistent with a potentially more inclusive evaluative stance, and task-independence of evaluative standards. Study 2 further supports task-independence by showing that AI evaluations are structured along fluency, flexibility, originality, and elaboration, with dimension weights that adapt to task-specific constraints.
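
Illustrative sketch (not taken from the article): the abstract compares inter-rater reliability between an all-human panel and a panel in which one human judge is replaced by an AI evaluator. It does not name the reliability statistic used, so the example below assumes Cronbach's alpha, a coefficient often reported for Consensual Assessment Technique panels. The function name `cronbach_alpha` and every score are hypothetical toy values chosen only to show the mechanics of the comparison.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a panel of judges.

    ratings: one list of scores per judge, all covering the same
    creative works in the same order.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two judges")
    n_works = len(ratings[0])
    if any(len(r) != n_works for r in ratings):
        raise ValueError("every judge must rate the same works")

    # Sum of each judge's sample variance across works.
    sum_judge_var = sum(variance(r) for r in ratings)
    # Sample variance of the per-work total score across judges.
    totals = [sum(scores) for scores in zip(*ratings)]
    total_var = variance(totals)

    return k / (k - 1) * (1 - sum_judge_var / total_var)


# Hypothetical toy scores (1-5 scale) for five works; not data from the study.
human_panel = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
ai_scores = [4, 5, 3, 5, 4]  # hypothetical AI evaluator

# Replace the third human judge with the AI evaluator.
mixed_panel = human_panel[:2] + [ai_scores]

print("human-only alpha:", round(cronbach_alpha(human_panel), 3))
print("human + AI alpha:", round(cronbach_alpha(mixed_panel), 3))
```

Under this assumption, "replacing a human judge does not reduce reliability" would correspond to the second coefficient being no lower than the first, checked separately for each task type.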

Item Type:

Article

Uncontrolled Keywords:

Consensual assessment technique; Creativity evaluations; Artificial intelligence; Constrained creativity tasks; Cultural and creative industry professionals; Innovation-like tasks

JEL classification:

C91 - Design of Experiments: Laboratory, Individual
D83 - Information, Knowledge, and Uncertainty: Search - Learning - Information and Knowledge - Communication - Belief
M14 - Business Administration: Corporate Culture; Diversity; Social Responsibility
O31 - Innovation and Invention: Processes and Incentives

Divisions:

Corvinus Doctoral Schools
Institute of Entrepreneurship and Innovation

Subjects:

Automation, mechanization
Computer science

DOI:

10.1016/j.technovation.2026.103571

ID Code:

12860

Deposited By:

MTMT SWORD

Deposited On:

27 May 2026 08:37