AI Outshines Most Humans in Creativity: Study

Large language models like ChatGPT-4 score higher in creativity than the average person but trail highly creative individuals by a significant margin, according to a new study co-authored by Jay Olson, a postdoctoral fellow in the department of psychological and brain sciences at the University of Toronto Mississauga.

Researchers compared LLMs and people on their ability to generate creative ideas using the Divergent Association Task - a test, developed by Olson, that measures verbal creativity and divergent thinking.

Jay Olson (photo by Gabriel Halfant)

The task is simple: name 10 words that are very different from each other. Highly creative individuals choose words that are very different - like galaxy, velvet, hurricane - while those with average creativity might pick more closely linked words like cat, dog and hamster.

The study, published in Scientific Reports, found that LLMs' creativity exceeds that of people with average creativity, but that highly creative people surpass LLMs by a clear margin, with the gap widening among the top 25 per cent of participants and widening further among the top 10 per cent.

These results suggest LLMs may be particularly helpful for less creative people, but raise questions about whether they benefit or hinder highly creative people who work in creative fields, Olson says.

"If people who are highly creative use these kinds of models, are they going to be generating less creative ideas? These models seem creative when you work with them, but there's a big chunk of people that can outperform them on this task," says Olson, who developed the Divergent Association Task while carrying out postdoctoral work at Harvard University.

"Maybe our creative thinking isn't something we should be offloading onto these models."

The study, led by researchers at Université de Montréal, was the largest to date comparing the creativity of humans and LLMs.

Chart comparing mean Divergent Association Task performance of humans and various large language models (Bellemare-Pepin et al., "Divergent creativity in humans and large language models")

The Divergent Association Task was chosen as the foundation of the study because previous research found that performance on the exercise correlates with performance on standard creativity tasks, like writing and problem solving.

"These companies all make claims about how this new model is more creative than the last one, or we have the most creative model, but there's no robust metric for assessing that," says Olson. "We thought this task might be one that could be used [to measure LLMs' creativity]."

To do so, researchers repeatedly asked each of the LLMs, including ChatGPT-4 and Gemini Pro, to complete the task - and then compared the results with samples from 100,000 participants.

The researchers quantified the "semantic distance" between the words to determine the LLMs' and participants' creativity level.

"Words like cat and dog are very close to each other, so the distance would be smaller - whereas cat and thimble would be further apart," says Olson. "All the task is doing is taking the average semantic distance of the named words."
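For readers who want to see the arithmetic Olson describes, here is a minimal sketch of averaging the semantic distance across every pair of named words. It rests on assumptions the article does not confirm: the three-dimensional word vectors are invented for illustration, cosine distance stands in for "semantic distance," and the names `TOY_VECTORS`, `cosine_distance` and `mean_semantic_distance` are hypothetical. The study's actual scoring may differ in its word representations and scaling.

```python
from itertools import combinations
from math import sqrt

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A real scorer would rely on vectors learned from large text collections.
TOY_VECTORS = {
    "cat": (0.9, 0.1, 0.0),
    "dog": (0.8, 0.2, 0.0),
    "thimble": (0.0, 0.2, 0.9),
}


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_semantic_distance(words, vectors=TOY_VECTORS):
    """Average the cosine distance over every unordered pair of words."""
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError("need at least two words to compare")
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Closely related words should score lower than unrelated ones.
    print(mean_semantic_distance(["cat", "dog"]))
    print(mean_semantic_distance(["cat", "thimble"]))
```

With these toy vectors, the pair cat and dog yields a much smaller average distance than cat and thimble, mirroring the example in Olson's quote.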

The human participants and the AI models were given the same instructions, and the research team computed their scores in exactly the same way.

"There's been quite a few studies now that have tested this with different models. This one is much more diverse with a much larger human sample," Olson notes.

He adds that the study reveals the rapid pace of AI development, with new models outperforming their earlier versions. Although it is difficult to predict, Olson says LLMs could continue to become more creative as new models are developed - but that growth might ultimately level off.

"There is speculation that the models have already reached either a plateau or slowing of growth, so I guess we will see what happens," he says. "It's a field where things change very rapidly."
