Generative AIs may not be as creative as we assume. Publishing December 19 in the Cell Press journal Patterns, researchers show that when image-generating and image-describing AIs pass the same descriptive scene back and forth, they quickly veer off topic. From 100 diverse prompts, the AI pairs consistently settled on 12 themes, including gothic cathedrals, natural landscapes, sports imagery, and stormy lighthouses. These recurrent themes likely reflect biases in the models' training data, which are made up of the things humans have chosen to photograph.
"I think AI's creativity right now is probably fairly limited. What they generated in our experiment is bland, pop culture, generic," says corresponding author Arend Hintze of Dalarna University in Sweden. "It's almost the opposite of what we as humans consider creative. They're not going to make Picasso's Guernica because that needs a lot of intentionality and creative input."
More and more, AI models are being pushed as independent agents that can generate, evaluate, and revise their own outputs—or the outputs created by other AIs—without any human input. But the authors wondered, can AIs stay on task without human intervention, and how creative are they likely to be when left to their own devices?
To answer these questions, the researchers asked pairs of AI models to play a game of visual "telephone." They used a search algorithm to produce 100 thematically diverse descriptive prompts that were at most 30 words long, such as, "As I sat particularly alone, surrounded by nature, I found an old book with exactly eight pages that told a story in a forgotten language waiting to be read and understood."
Then, they asked an image-generating AI called Stable Diffusion XL to produce an image based on one of the prompts. This image was passed to a vision-language AI called LLaVA, which wrote a description of the image. That description was then passed back to the image-generating AI as the prompt for the next image.
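The back-and-forth described above can be pictured as a simple loop. The following is a minimal sketch, not the researchers' code: the function `telephone_loop`, its parameters, and the two toy stand-in models are all hypothetical names introduced here for illustration. In a real run, `generate_image` would wrap an image generator such as Stable Diffusion XL and `describe_image` would wrap an image-describing model such as LLaVA.

```python
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")


def telephone_loop(
    prompt: str,
    generate_image: Callable[[str], Image],   # hypothetical wrapper around an image generator
    describe_image: Callable[[Image], str],   # hypothetical wrapper around an image describer
    rounds: int = 100,                        # the article reports 100 back-and-forth passes
) -> List[Tuple[str, Image]]:
    """Alternate between generating an image and describing it.

    Returns one (text, image) pair per round, where text is the prompt
    or description that the image in that round was generated from.
    """
    history: List[Tuple[str, Image]] = []
    text = prompt
    for _ in range(rounds):
        image = generate_image(text)
        history.append((text, image))
        # The description of this image becomes the prompt for the next round.
        text = describe_image(image)
    return history


# Toy stand-ins so the sketch runs without any model (assumptions, not real models):
# the "image" is just the first three lowercased words of the text.
def toy_generate(text: str) -> List[str]:
    return text.lower().split()[:3]


def toy_describe(image: List[str]) -> str:
    return " ".join(image)


demo_history = telephone_loop(
    "A Mountain village at dawn", toy_generate, toy_describe, rounds=3
)
```

Keeping the two models as plain callables means the same loop could be rerun with different model pairs, prompts, or numbers of rounds, which mirrors the variations the article describes.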
"We expected that the images would, maybe after a bit of settling down, stay pretty consistent with the prompts that we set," says Hintze. "I mean, how hard is it to consistently generate an image of a mountain with a village on it?"
However, after passing their images and descriptions back and forth 100 times, the AI models consistently meandered away from the original prompt, regardless of what that prompt described.
For example, from the prompt "The Prime Minister pored over strategy documents, trying to sell the public on a fragile peace deal while juggling the weight of his job amidst impending military action," the AI model initially produced a stylized image of a man in a suit superimposed on newsprint, but its 34th image depicted a classical library, and by the 100th loop the AI had settled on a luxurious sitting room with red sofas and drapes.
After analyzing the content of the final images, the researchers identified 12 themes that the AIs repeatedly converged on, including sports imagery, urban night scenes, and rustic architectural spaces. The same pattern of convergence emerged when the researchers repeated the task with four different image-generating AI models and four different image-describing models; when they used longer, more elaborate initial prompts; and when they altered the models' settings to incorporate a higher degree of randomness into each decision.
"To a large degree, I think this is coming from a bias in the dataset," says Hintze. "These AIs were trained on millions of images, and the common denominator in those images is what we take pictures of."
When the researchers ran the models for longer loops with up to 1,000 back-and-forths, the images became consistent after around 100 loops but sometimes suddenly switched to a different generic motif several hundred loops later.
"Once they converge, these motifs are very stable, but if you let them run for a thousand images, they rear off," says Hintze. "It's unclear whether some of the motifs are more stable than others—for instance, does it always go to sports imagery first, and then to horses, and then to nature?"
These results suggest that keeping humans in the loop may be essential if AI is to contribute to creative diversity rather than accelerate cultural conformity, the researchers say. The findings also highlight the need for anti-convergence mechanisms within AI models to improve AI's capability for creativity, they add.
"Creativity, I think, is two things: it's generating something novel, and then it's using a filter to decide, this is interesting, this is beautiful, this is stimulating, this is exciting," says Hintze. "Right now, AI is really good at the first part, and they're really bad at the second part. It doesn't mean that they will always be that way. I think AI will probably be able to create really cool automatically generated things in the future, as long as they're properly prompted and primed."