A new study shows that ChatGPT, an application built on the GPT-4 AI engine, performed better than most college students on a standard test of creativity. The study suggests that AI is developing the ability to generate ideas as good as or better than those of humans.
Researchers from the University of Montana and their collaborators used the Torrance Tests of Creative Thinking (TTCT) to measure the creativity of ChatGPT and human participants. The TTCT is a widely used instrument that has measured human creativity for decades.
The research team, led by Dr. Erik Guzik, an assistant clinical professor at UM’s College of Business, submitted eight responses generated by ChatGPT along with 24 responses from UM students taking Guzik’s entrepreneurship and personal finance courses. Scholastic Testing Service scored the responses without knowing that AI was involved.
The results showed that ChatGPT placed among the most creative test-takers. The AI application ranked in the top percentile for fluency – the ability to generate a large volume of ideas – and for originality – the ability to come up with new ideas. It scored slightly lower – in the 97th percentile – for flexibility, the ability to generate different types and categories of ideas.
“This was the first time that we demonstrated that ChatGPT and GPT-4 performed in the top 1% for originality,” Guzik said. “That was new.”
He also noted that some of his UM students performed in the top 1%. ChatGPT, however, outperformed most college students nationally.
Guzik conducted the experiment during the spring semester. He was joined by Christian Gilde from UM Western and Christian Byrge from Vilnius University. The researchers presented their work in May at the Southern Oregon University Creativity Conference.
“We didn’t interpret the data much at the conference,” Guzik said. “We just presented the results. But we shared strong evidence that AI seems to be developing creative ability on par with or even exceeding human ability.”
Guzik said he asked ChatGPT what it would mean if it did well on the TTCT. The AI gave a strong answer, which they shared at the conference:
“ChatGPT told us we may not fully understand human creativity, which I believe is correct,” he said. “It also suggested we may need more sophisticated assessment tools that can differentiate between human and AI-generated ideas.”
He said the TTCT is protected proprietary material, so ChatGPT could not have drawn on information about the test from the internet or a public database.
Guzik has a long-standing interest in creativity. As a seventh grader in a talented-and-gifted program in Palmer, Massachusetts, he was introduced to the Future Problem Solving process developed by Ellis Paul Torrance, the pioneering psychologist who also created the TTCT. He fell in love with brainstorming and the way it taps into human imagination, and he remains active with the Future Problem Solving organization – he even met his wife at one of its conferences.
Guzik and his team decided to test ChatGPT’s creativity after experimenting with it during the past year.
“We had all been exploring with ChatGPT, and we noticed it had been doing some interesting things that we didn’t expect,” he said. “Some of the responses were novel and surprising.”