What are AI Agents? Why are they considered to be a step closer to AGI?
We have all used ChatGPT. Some of you may have used other Large Language Models (LLMs) as well.
On their own, LLMs work in isolation: they do not interact with other software, and they generate answers based only on the knowledge they were trained on.
- What if they can start interacting with other software applications?
- What if they can simplify a complex task (that involves multiple steps)?
- What if they can self-analyze the answer before providing it to the user?
Well, AI Agents can do all of the above, so it is no wonder they are being referred to as baby AGI, or a step towards Artificial General Intelligence (AGI).
In this article, let us learn about AI agents in detail, but in simple language.
What are AI Agents?
We all know that most LLMs are pre-trained, which means they have a knowledge cut-off.
So, if you ask an LLM a current affairs question, it may not be able to generate the right answer. For example, if you ask ChatGPT “Who is the prime minister of the UK?”, it would give you an answer based on the data it had as of October 2021.
How can we make ChatGPT answer current affairs questions?
What if we can connect ChatGPT (or any LLM) with Google search?
So, when we ask current affairs questions to ChatGPT, it can access Google using an API and get the right answer.
Cool, isn’t it? Well, that’s what AI Agents can do.
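Here is a minimal sketch of that flow. It assumes a hypothetical `web_search` function in place of a real search API and a hypothetical `fake_llm` function in place of a real model call; neither belongs to any real library. The keyword check is also only a stand-in, since in a real agent the LLM itself usually decides whether a search is needed.

```python
# Hypothetical stand-in for a search API call; a real agent would make an HTTP request here.
def web_search(query: str) -> str:
    return f"[search results for: {query}]"

# Hypothetical stand-in for an LLM call; it simply echoes the prompt it receives.
def fake_llm(prompt: str) -> str:
    return f"LLM answer based on -> {prompt}"

# Crude heuristic (an assumption for illustration): phrases that hint at a need for fresh data.
FRESHNESS_HINTS = ("current", "today", "latest", "who is")

def needs_fresh_info(question: str) -> bool:
    q = question.lower()
    return any(hint in q for hint in FRESHNESS_HINTS)

def answer(question: str) -> str:
    if needs_fresh_info(question):
        # Fetch up-to-date context first, then let the model answer with it.
        context = web_search(question)
        return fake_llm(f"Context: {context}\nQuestion: {question}")
    # Otherwise answer from the model's own pre-trained knowledge.
    return fake_llm(f"Question: {question}")

print(answer("Who is the prime minister of the UK?"))
```

The key idea is the branch: the question is first routed to the search tool, and only then is the model asked to answer using what the tool returned.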
The above example is a very basic AI agent. Let’s take another example.
An AI agent can think through a complex task and divide it into multiple smaller, simpler tasks.
Suppose you are planning to visit Paris for a week.
You can obviously go to ChatGPT or any other LLM, and it can help you with a 7-day itinerary.
However, is a 7-day itinerary enough to plan a trip?
What if you also want some more information, like:
- What is the weather like when you want to visit?
- Hotel options that are within a certain budget?
- What are the food options within a certain distance from these hotels?
So, the problem statement you gave was –
I am planning to visit Paris for 7 days from September 3 to September 9. Please plan my travel if the weather is worth visiting. I have a budget of USD 150 per day for my hotel stay and I need a vegetarian food option within 500 meters distance of my hotel.
To answer these questions, an LLM must access multiple other tools like weather apps, hotel booking sites, Google Maps, etc.
At the same time, the LLM needs reasoning capabilities to break this complex problem down into multiple steps and then perform the steps one by one, using the output of each step as input to the next.
Now, that’s what LLM Agents can do.
They first reason through the problem and decide what to do. In this case, the problem could be divided into these steps:
- Access a weather app and check the weather for the mentioned dates. If the weather is good, prepare the itinerary.
- After preparing the itinerary, access a hotel booking website and search for hotels within the mentioned budget.
- Shortlist a few hotels and keep them in memory.
- Now, for every shortlisted hotel, look for vegetarian food options within 500 meters.
- Once found, verify all the details and provide the information to the user.
So, this is a proper travel agent’s job.
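The steps above can be sketched in code. This is only an illustration under stated assumptions: every tool function (`check_weather`, `draft_itinerary`, `search_hotels`, `find_vegetarian_food`) is a hypothetical stub returning made-up data, and the plan is hard-coded, whereas a real agent would let the LLM produce the plan and call real APIs.

```python
from dataclasses import dataclass, field

# All tool functions below are hypothetical stubs with made-up data.
def check_weather(city: str, start: str, end: str) -> str:
    return "sunny"

def draft_itinerary(city: str, days: int) -> list[str]:
    return [f"Day {d}: explore {city}" for d in range(1, days + 1)]

def search_hotels(city: str, max_price_usd: int) -> list[dict]:
    hotels = [
        {"name": "Hotel A", "price_usd": 140},
        {"name": "Hotel B", "price_usd": 180},
        {"name": "Hotel C", "price_usd": 120},
    ]
    return [h for h in hotels if h["price_usd"] <= max_price_usd]

def find_vegetarian_food(hotel_name: str, radius_m: int) -> list[str]:
    nearby = {"Hotel A": ["Veg Bistro"], "Hotel C": []}
    return nearby.get(hotel_name, [])

@dataclass
class TripMemory:
    # Intermediate results kept between steps.
    shortlisted_hotels: list[dict] = field(default_factory=list)

def plan_trip(city: str, start: str, end: str, days: int, budget_usd: int, radius_m: int) -> dict:
    memory = TripMemory()
    # Step 1: check the weather; stop early if it is not worth visiting.
    if check_weather(city, start, end) not in ("sunny", "mild"):
        return {"go": False, "itinerary": [], "options": []}
    # Step 2: prepare the itinerary.
    itinerary = draft_itinerary(city, days)
    # Steps 3 and 4: search hotels within budget and keep the shortlist in memory.
    memory.shortlisted_hotels = search_hotels(city, budget_usd)
    # Step 5: for every shortlisted hotel, look for vegetarian food nearby.
    options = []
    for hotel in memory.shortlisted_hotels:
        food = find_vegetarian_food(hotel["name"], radius_m)
        if food:  # keep only hotels that satisfy the food requirement
            options.append({"hotel": hotel["name"], "food": food})
    # Step 6: return the options that passed every check.
    return {"go": True, "itinerary": itinerary, "options": options}

result = plan_trip("Paris", "September 3", "September 9", 7, 150, 500)
print(result["options"])
```

Notice that each step uses the output of the previous one: the weather decides whether planning continues, and the hotel shortlist drives the food search.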
What are AI Agents’ tools?
Tools are external applications that agents use to retrieve the information they need to perform their tasks.
In the above example, the weather app, the hotel booking site and Google Maps were the tools.
There could be many other tools like Google Search, Wikipedia, Python REPL, etc. We will discuss tools again later in the article.
So basically, AI Agent = LLM, plus:
- Tools to access external applications
- Planning to simplify the complex task
- Self-reflection before providing the final answer
So, what is an AI agent’s self-reflection, you ask?
As we saw in the previous examples, an AI agent can access external applications. It can also reason through and simplify complex tasks.
The other attribute of an AI agent is that it can self-reflect before providing the final answer to the user.
What this means is that, once an agent gets an answer, instead of directly showing it to the user, the agent can self-reflect to validate the answer.
Only if the agent thinks it has the correct answer will it show it to the user. Otherwise, it will try again to gather the correct response.
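A self-reflection loop can be sketched as "generate, validate, retry". In this illustrative example, `generate_answer` and `looks_correct` are hypothetical stand-ins; a real agent would call an LLM for both the draft and the critique. The retry limit is also an assumption, added so the loop cannot run forever.

```python
# Hypothetical stand-in for the LLM producing a draft answer.
def generate_answer(question: str, attempt: int) -> str:
    # Pretend the first draft is weak and later drafts improve.
    drafts = ["I am not sure.", "Paris is the capital of France."]
    return drafts[min(attempt, len(drafts) - 1)]

# Hypothetical stand-in for the LLM critiquing its own draft.
def looks_correct(question: str, answer: str) -> bool:
    # Toy validation rule (an assumption): reject answers that express uncertainty.
    return "not sure" not in answer.lower()

def answer_with_reflection(question: str, max_attempts: int = 3) -> str:
    answer = ""
    for attempt in range(max_attempts):
        answer = generate_answer(question, attempt)
        if looks_correct(question, answer):
            return answer  # validated, so it can be shown to the user
    # Out of attempts: return the last draft, clearly marked as unverified.
    return f"(unverified) {answer}"

print(answer_with_reflection("What is the capital of France?"))
```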
How do LLM Agents do what they do?
Image Source: https://developer.nvidia.com/blog/introduction-to-llm-agents/
The different components of an agent are:
- Planning
- Tasks
- Tools
- Memory
Agent’s Planning Module
To solve a complex problem, an agent needs to divide it into smaller, manageable tasks and design a workflow for carrying them out. LLM agents do this with the help of their planning module.
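One way to picture the output of a planning module is a workflow in which each sub-task lists the sub-tasks it depends on. The sketch below is an assumption for illustration: the task names are borrowed from the Paris trip example and the plan is written by hand, whereas a real agent would ask the LLM to produce it. The ordering itself uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each sub-task maps to the set of sub-tasks that must finish before it can start.
workflow = {
    "check_weather": set(),
    "prepare_itinerary": {"check_weather"},
    "search_hotels": {"prepare_itinerary"},
    "find_vegetarian_food": {"search_hotels"},
    "verify_and_report": {"find_vegetarian_food"},
}

# Resolve the dependencies into an order in which the sub-tasks can be executed.
execution_order = list(TopologicalSorter(workflow).static_order())
print(execution_order)
```

Representing the plan as data, rather than as fixed code, is what lets an agent build a different workflow for each new problem.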
Agent’s Tasks
As the name suggests, tasks are the jobs that an agent needs to perform.
How do you write a well-defined task? Well, think like a manager:
- Who would I need to hire to get this done?
- What process should that person follow, and what output do we want?
This is one example of how to define an Agent’s task
Defining Agentic Prompt
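As a rough sketch of the "think like a manager" questions, a task definition can capture who does the work, the process to follow and the output expected. The `AgentProfile` and `TaskSpec` classes below are hypothetical and written in plain Python for illustration; they are not the API of any specific agent framework.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    role: str  # answers "Who would I need to hire to get this done?"
    goal: str

@dataclass
class TaskSpec:
    description: str      # the process the agent should follow
    expected_output: str  # the output we want
    agent: AgentProfile

    def to_prompt(self) -> str:
        # Turn the structured definition into the text that would be sent to the LLM.
        return (
            f"You are a {self.agent.role}. Your goal: {self.agent.goal}\n"
            f"Task: {self.description}\n"
            f"Expected output: {self.expected_output}"
        )

planner = AgentProfile(role="travel planner", goal="plan a 7-day trip to Paris")
task = TaskSpec(
    description="Check the weather, then shortlist hotels under USD 150 per day.",
    expected_output="A list of hotels with a nearby vegetarian food option.",
    agent=planner,
)
prompt = task.to_prompt()
print(prompt)
```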
Agent’s tools
Agents use tools to access external applications or software.
A tool could be
- Google Search
- Database Lookup
- Python REPL
- LLM Math
- Any health website/ blog
- Wolfram Alpha
- And many more
AI agents use their decision-making abilities to decide:
- Whether to use a tool or not
- If yes, which tool to use (from the list of tools at its disposal)
How to define an AI agent’s tool
Adding a tool to the task?
How to pass the tool while defining the agent?
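The three questions above can be illustrated with a small plain-Python sketch: define a tool, attach a tool to a task, and pass tools when defining the agent. All class and function names here (`SimpleTool`, `SimpleTask`, `SimpleAgent`, `add_numbers`, `lookup`) are hypothetical, and the keyword matching is only a stand-in for the LLM's own decision about whether to use a tool and which one.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class SimpleTool:
    name: str
    keywords: tuple[str, ...]  # toy routing hints (an assumption for illustration)
    func: Callable[[str], str]

def add_numbers(text: str) -> str:
    # Toy math tool: sums whitespace-separated numbers.
    return str(sum(float(x) for x in text.split()))

def lookup(text: str) -> str:
    # Stub database lookup with made-up data.
    records = {"paris": "Capital of France"}
    return records.get(text.lower(), "no record found")

@dataclass
class SimpleTask:
    description: str
    tools: list[SimpleTool] = field(default_factory=list)  # tools attached to this task

@dataclass
class SimpleAgent:
    tools: list[SimpleTool] = field(default_factory=list)  # tools passed when defining the agent

    def choose_tool(self, task: SimpleTask) -> Optional[SimpleTool]:
        # Stand-in for the LLM's decision. Task-level tools are checked first, then the agent's.
        for tool in task.tools + self.tools:
            if any(k in task.description.lower() for k in tool.keywords):
                return tool
        return None  # decide not to use any tool

    def run(self, task: SimpleTask, tool_input: str) -> str:
        tool = self.choose_tool(task)
        return tool.func(tool_input) if tool else "answered without a tool"

math_tool = SimpleTool("math", ("sum", "add"), add_numbers)
db_tool = SimpleTool("database", ("look up", "record"), lookup)

agent = SimpleAgent(tools=[db_tool])
task = SimpleTask("Add these numbers", tools=[math_tool])
print(agent.run(task, "2 3 4"))
```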
Agent’s Memory
Conversational agents need memory. To hold a conversation, an agent needs to know the previous context.
That previous context is provided through “Memory”.
How to add a conversation memory to an AI agent?
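A minimal sketch of conversation memory, assuming the simplest possible design: keep the last few turns and prepend them to every new prompt. The `ConversationMemory` class and `build_prompt` function are hypothetical names used for illustration, not part of any specific framework.

```python
from collections import deque

class ConversationMemory:
    def __init__(self, max_turns: int = 5):
        # A bounded deque drops the oldest turn automatically once it is full.
        self.turns = deque(maxlen=max_turns)

    def add(self, user: str, agent: str) -> None:
        self.turns.append((user, agent))

    def as_context(self) -> str:
        return "\n".join(f"User: {u}\nAgent: {a}" for u, a in self.turns)

def build_prompt(memory: ConversationMemory, new_message: str) -> str:
    # Prepend the remembered turns so the model sees the previous context.
    history = memory.as_context()
    current = f"User: {new_message}\nAgent:"
    return f"{history}\n{current}" if history else current

memory = ConversationMemory(max_turns=2)
memory.add("I am visiting Paris.", "Great, when?")
memory.add("From September 3.", "Noted.")
prompt = build_prompt(memory, "What should I pack?")
print(prompt)
```

Limiting the number of remembered turns keeps the prompt short, at the cost of forgetting older parts of the conversation.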
Conclusion
AI Agents are considered to be a step towards achieving AGI. Whether we will achieve AGI is still debatable, but AI Agents are indeed useful because of their reasoning ability, memory and interaction with other applications.
In case you are looking to learn AI + LLM in a very simple language in a live online class from an instructor, check out the details here
Tailored AI + LLM Coaching for Senior IT Professionals
Pricing for AI courses for senior IT professionals – https://www.aimletc.com/ai-ml-etc-course-offerings-pricing/
My name is Nikhilesh. If you have any feedback or suggestions on this article, please feel free to connect with me – https://www.linkedin.com/in/nikhileshtayal/
Disclaimer – The images are taken from Deep Learning AI’s course. We are using them for educational purposes only. No copyright infringement is intended. In case any part of the content belongs to you or someone you know, please contact us and we will give you credit or remove your content.