AutoGPT and the Rise of Autonomous AI Agents

When humans talk to themselves (raising my own hand here!), other humans cast a side-eye. When AI does it, it’s heralded as the Next Big Thing.

The past two weeks saw the rise of AutoGPT and similar apps, called “AI agents”. AutoGPT has so far garnered the most attention, so we’ll focus on that one. (Another name you may come across often is BabyAGI.) AutoGPT was created by Toran Bruce Richards, a game designer. And now, like mirrors within mirrors in the dazzling game of illusion that is AI, people are already building apps on top of the agents, too.

But what even is an agent (besides, a bit worryingly, the guardians of the virtual world in The Matrix movies)?

What is an AI agent?

In computer science, an "agent" is a program that functions independently and continuously to perform various tasks. These could include archiving computer files or retrieving electronic messages on a scheduled basis.

Now, what is an AI agent? This video is a good introduction: Is AGI here? ChatGPT + GPT-4 + Voice = AutoGPT.

In contrast with ChatGPT, where a human (a prompt engineer, if you will) has to steer the bot to provide the answers or content we need, an agent steers itself. Besides a few initial prompts from the user, the bot is left to its own devices, prompting itself and automating multi-step projects and tasks. It can even self-generate a second AI to help complete a project. And these projects can include anything from personalized recommendations to financial analysis, content creation, product development, and more.

While running a project, AutoGPT breaks down the AI’s actions into “thoughts”, “reasoning”, and “criticism”, so that the user can follow its thought process. You can authorize or interrupt each step and, if you wish, provide feedback to improve the recommendations. AutoGPT can also access the internet and create files on your machine.
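
As a rough illustration of that step-by-step review, here is a minimal Python sketch. The `AgentStep` class, its field names, and the `review` function are hypothetical and not taken from AutoGPT’s code; they only mirror the thoughts, reasoning, and criticism breakdown described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentStep:
    """One proposed step, split the way the agent reports it to the user."""
    thoughts: str
    reasoning: str
    criticism: str
    command: str  # the action the agent would like to run next (hypothetical field)


def review(step: AgentStep, ask: Callable[[str], str] = input) -> bool:
    """Print the step and return True only if the user authorizes it."""
    print(f"THOUGHTS:  {step.thoughts}")
    print(f"REASONING: {step.reasoning}")
    print(f"CRITICISM: {step.criticism}")
    print(f"NEXT:      {step.command}")
    answer = ask("Authorize this step? (y/n) ")
    return answer.strip().lower() == "y"
```

Passing a different `ask` function makes the approval prompt easy to replace, for example with an automatic "yes" for trusted steps.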

Technical requirements and limitations

Trying it out, for now, can be a bit tricky for the less technically minded. The application has to be downloaded from the AutoGPT GitHub repository. To run it, you will need Python 3.8 or later, an OpenAI API key, and a Pinecone API key. If you want to enable the optional text-to-speech feature, you will need an ElevenLabs API key as well. After adding the API keys to the relevant file, you can run the program by typing “python scripts/main.py” in the command prompt.
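
Before launching, it can help to check that the prerequisites listed above are in place. The following is a minimal sketch of such a check, not part of AutoGPT itself. The variable names `OPENAI_API_KEY`, `PINECONE_API_KEY`, and `ELEVENLABS_API_KEY` are assumptions; check the project’s own configuration template for the names it actually expects.

```python
import sys
from typing import List, Mapping

# Assumed key names; verify them against the project's configuration template.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")
OPTIONAL_KEYS = ("ELEVENLABS_API_KEY",)  # only needed for text-to-speech


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def python_ok(version=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.8 or later."""
    return tuple(version[:2]) >= (3, 8)
```

Calling `missing_keys(os.environ)` after loading your keys into the environment returns an empty list when both required keys are set.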

At this point, let’s quickly explain Pinecone. Pinecone is a managed vector database: a cloud service for storing and searching “vector data”, which is a way of representing information as lists of numbers. It allows for quick and efficient storage, retrieval, and similarity search of those vectors. In the context of AutoGPT, Pinecone is used to store and retrieve task results. When a task is completed, its result is stored in Pinecone. Later, when a new task is generated, relevant information from previous tasks can be retrieved from Pinecone to inform it. This lets the program maintain context across multiple tasks and generate new tasks that are informed by previous results.
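
To make the idea of storing and searching vectors more concrete, here is a toy in-memory stand-in written in plain Python. It does not use the Pinecone client or its API; `TinyVectorStore` and its methods are hypothetical names, and the two-number vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions).

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Keeps (vector, text) pairs in memory and returns the closest matches."""

    def __init__(self) -> None:
        self._items: List[Tuple[List[float], str]] = []

    def add(self, vector: Sequence[float], text: str) -> None:
        """Store a task result together with its vector representation."""
        self._items.append((list(vector), text))

    def query(self, vector: Sequence[float], top_k: int = 1) -> List[str]:
        """Return the texts whose vectors are most similar to `vector`."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:top_k]]
```

A hosted vector database does the same job at a much larger scale, with indexing so that a query does not have to compare against every stored vector.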

The program works by repeating the following steps: it takes the first task from a list, uses OpenAI’s API to complete the task based on the context, and returns the result. Pinecone stores and retrieves the task results, while the task list is used to prioritize and generate new tasks. Together, these components make up a task management system that can create, execute, and prioritize tasks based on the objective and on previous results.
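
The loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the project’s actual code: `run_agent`, `complete`, and `propose` are hypothetical names, the language model call is passed in as a plain function, previous results are kept in a list instead of a vector database, re-prioritization is left out, and `max_steps` is an added safety cap so the loop cannot run (and bill) forever.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_agent(
    objective: str,
    first_task: str,
    complete: Callable[[str, str, List[str]], str],
    propose: Callable[[str, str, str], List[str]],
    max_steps: int = 5,
) -> List[Tuple[str, str]]:
    """Run tasks one by one and return the (task, result) pairs in order."""
    tasks: Deque[str] = deque([first_task])
    results: List[Tuple[str, str]] = []
    while tasks and len(results) < max_steps:
        task = tasks.popleft()                       # take the first task from the list
        context = [result for _, result in results]  # earlier results serve as context
        result = complete(objective, task, context)  # e.g. a call to a language model
        results.append((task, result))               # store the result for later steps
        tasks.extend(propose(objective, task, result))  # queue any follow-up tasks
    return results
```

In a real agent, `complete` and `propose` would both send prompts to the model, which is why every pass through the loop costs API usage.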

Beyond these technical hurdles, users also need to keep an eye on their OpenAI API usage limits, since running the application can get expensive. The tool itself, however, is free and open-source.

Autonomous AI agents such as AutoGPT are an exciting new development in the AI trajectory. They showcase the potential of the GPT-4 language model and have, yet again, the potential to revolutionize various industries. It is remarkable how far we have come in AI and automation in such a short time.

AutoGPT and Artificial General Intelligence (AGI)

On social media, some of the discussion on agents has been around Artificial General Intelligence (AGI). This is a hypothetical level of AI capable of performing any intellectual task that a human can. But AutoGPT itself is still a form of narrow AI, not AGI. While it is impressive that AutoGPT can generate coherent and diverse texts and plans on various topics, these are still specific tasks. Problem-solving ability alone does not constitute AGI.

Nonetheless, the development of autonomous AI agents such as AutoGPT can be seen as a step towards achieving AGI in the future.

The Machine Mindset
