Auto-GPT and BabyAGI run GPT-based AI agents in a loop to complete complex tasks iteratively.
Following the release of OpenAI’s GPT-4 API to beta testers last month, developers have been building agent-like (“agentic”) implementations of the AI model. These custom scripts are designed to execute multistep tasks with minimal human intervention by looping, iterating, and spawning new instances of AI models as needed.
Two experimental open-source projects, Auto-GPT by Toran Bruce Richards and BabyAGI by Yohei Nakajima, have generated significant buzz on social media, particularly among AI enthusiasts.
These projects are not yet fully autonomous, requiring substantial human input and guidance. However, they mark initial steps toward developing more complex AI models that could potentially surpass the capabilities of a single AI model working alone.
“Autonomously achieve any goal you set”
Richards’ script is “an experimental open-source application showcasing GPT-4 language model capabilities.” It “chains together LLM ‘thoughts’ to autonomously achieve any goal you set.”
Auto-GPT essentially feeds GPT-4 output back into itself, incorporating improvised external memory to iterate tasks, correct errors, or suggest enhancements. Ideally, such a script could function as an AI assistant capable of performing any digital task independently.
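The feedback loop described above can be illustrated with a minimal sketch. This is not Auto-GPT's actual code: the function names (`run_agent_loop`, `toy_model`), the prompt layout, and the "DONE" stop signal are all assumptions made for illustration, and a canned stand-in replaces the real GPT-4 call so the example runs offline.

```python
from typing import Callable, List


def run_agent_loop(goal: str, model: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Feed each model output back into the next prompt via a simple memory list."""
    memory: List[str] = []  # improvised external memory: every earlier output
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            "Previous outputs:\n" + "\n".join(memory) + "\nNext thought:"
        )
        output = model(prompt)
        memory.append(output)
        # Hypothetical stop signal: the model says DONE when it considers the goal met.
        if output.strip().upper().startswith("DONE"):
            break
    return memory


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): counts its earlier steps in the prompt."""
    seen = prompt.count("Step ")
    if seen >= 2:
        return "DONE: plan complete"
    return f"Step {seen + 1}: refine the plan"


if __name__ == "__main__":
    for line in run_agent_loop("find vintage sneakers", toy_model, max_steps=10):
        print(line)
```

The `max_steps` cap matters in a sketch like this: without it, a model that never emits the stop signal would loop indefinitely, which is one reason such scripts still need human supervision.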
In a recent test of Auto-GPT, the script was given the goal “Purchase a vintage pair of Air Jordans.” Auto-GPT formulated a multistep plan and attempted to execute it, stopping short of making an actual purchase. If the script were connected to a suitable purchasing API, however, completing the purchase could be possible.
AgentGPT, a web-based version of Auto-GPT, allows users to experience the script’s functionalities themselves.
Richards is transparent about his goal for Auto-GPT: to develop AGI (artificial general intelligence). AGI refers to an AI system’s hypothetical ability to perform various tasks and solve problems without specific programming or training.
BabyAGI, named after its aspiration to develop AGI, operates similarly to Auto-GPT but with a distinct task-oriented focus. Users can try it at a website called “God Mode.”
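A task-oriented loop of this kind can be sketched as a queue of to-do items, where finishing one task may add new ones. The sketch below is an illustration under stated assumptions, not BabyAGI's implementation: `run_task_queue`, `toy_execute`, and `toy_create_tasks` are hypothetical names, the two callbacks stand in for what would be LLM calls, and any reordering or prioritizing of tasks is left out.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_task_queue(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    create_tasks: Callable[[str, str, str], List[str]],
    max_tasks: int = 10,
) -> List[Tuple[str, str]]:
    """Work through a to-do list; each finished task may add follow-up tasks."""
    queue: Deque[str] = deque([first_task])
    completed: List[Tuple[str, str]] = []
    while queue and len(completed) < max_tasks:
        task = queue.popleft()
        result = execute(objective, task)
        completed.append((task, result))
        for new_task in create_tasks(objective, task, result):
            already_done = any(new_task == done for done, _ in completed)
            if new_task not in queue and not already_done:
                queue.append(new_task)  # skip duplicates to avoid cycling
    return completed


def toy_execute(objective: str, task: str) -> str:
    """Stand-in for an LLM call that carries out one task (assumption)."""
    return f"did: {task}"


def toy_create_tasks(objective: str, task: str, result: str) -> List[str]:
    """Stand-in for an LLM call that proposes follow-up tasks (assumption)."""
    follow_ups = {"make a list": ["research item 1", "research item 2"]}
    return follow_ups.get(task, [])


if __name__ == "__main__":
    for task, result in run_task_queue("plan", "make a list", toy_execute, toy_create_tasks):
        print(task, "->", result)
```

As in the previous sketch, `max_tasks` bounds the run so that a task generator that keeps proposing new work cannot run forever.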
Nakajima created BabyAGI after being inspired by the “HustleGPT” movement in March, which sought to use GPT-4 to automatically build businesses as a kind of AI co-founder. “It made me curious if I could build a fully AI founder,” Nakajima says.
The shortcomings of Auto-GPT and BabyAGI in achieving AGI are due to GPT-4’s limitations. Despite claims of AGI-like behaviors, GPT-4 is limited in its interpretive intelligence. The current limitations of tools like Auto-GPT may, in fact, provide the strongest evidence of large language models’ constraints. However, these limitations may eventually be overcome.
Confabulations—when LLMs fabricate information—pose another challenge to the usefulness of these agent-like assistants. For instance, GPT-4 could potentially “hallucinate” reviews, products, or even entire companies during its analysis.
Regarding BabyAGI’s practical applications, Nakajima mentioned “Do Anything Machine,” a project by Garrett Scott that aims to create a self-executing to-do list. Although BabyAGI is only a week old, Nakajima is excited about the potential for people to build on this idea.
The “hustle” mindset prevalent in both projects has led to a small industry of social media influencers promoting generative AI. These “hustle bros” typically make exaggerated claims, such as using ChatGPT to automatically generate income.
Some question whether autonomous AI agents like Auto-GPT and BabyAGI are dangerous. While OpenAI has conducted safety testing for GPT-4, including checks for autonomous goal development and execution, there is always some level of risk involved with advanced AI systems. However, OpenAI has made efforts to condition the GPT-4 model using human feedback to minimize harmful outcomes.
LessWrong, an internet forum known for its community focused on apocalyptic AI scenarios, doesn’t seem particularly worried about Auto-GPT at present. However, an autonomous AI could pose a risk if a powerful AI model were to “escape” onto the open internet and cause chaos. If GPT-4 were as capable as it is often hyped to be, concerns might be greater.
When asked if projects like BabyAGI could be dangerous, Nakajima downplayed the concerns. “All technologies can be dangerous if not implemented thoughtfully and with care for potential risks,” he says. “BabyAGI is an introduction to a framework. Its capabilities are limited to generating text, so it poses no threat.”