AI Agents: A comprehensive guide in simple language

We have all used ChatGPT. Some of you may have used other Large Language Models (LLMs) as well.

We know that LLMs work in isolation, which means they do not interact with other software. They generate answers based only on the knowledge they already have.

What if they could start interacting with other software applications?
What if they could simplify a complex task (one that involves multiple steps)?
What if they could self-analyze the answer before providing it to the user?

Well, AI Agents can do all of the above. No wonder they are being referred to as baby AGI, or a step towards Artificial General Intelligence (AGI).

In this article, let us learn about AI agents in detail but in simple language.

What are AI Agents

We all know that most LLMs are pre-trained, which means they have a knowledge cut-off.

So, if you ask an LLM a current affairs question, it cannot generate the right answer. For example, if you ask ChatGPT, “Who is the prime minister of the UK?”, it would give you an answer based on the data it had as of October 2021.

How can we make ChatGPT answer current affairs questions?

What if we can connect ChatGPT (or any LLM) with Google search?

So, when we ask current affairs questions to ChatGPT, it can access Google using an API and get the right answer.

Cool, isn’t it? Well, that’s what AI Agents can do.
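Here is a minimal sketch of that idea in Python. The functions fake_llm, fake_search and needs_fresh_data are hypothetical stand-ins invented for illustration. A real agent would call an actual LLM and a real search API, and would usually let the LLM itself decide when a search is needed.

```python
def fake_llm(question: str) -> str:
    """Stand-in for an LLM that can only answer from its training data."""
    return "I can only answer from what I knew at my training cut-off."


def fake_search(query: str) -> str:
    """Stand-in for a call to a web search API; returns a canned snippet."""
    return f"[search result for: {query}]"


def needs_fresh_data(question: str) -> bool:
    """Toy rule (an assumption): certain phrases signal a current affairs question."""
    triggers = ("current", "today", "latest", "who is")
    return any(trigger in question.lower() for trigger in triggers)


def basic_agent(question: str) -> str:
    """Route the question to search when it looks time-sensitive, else to the LLM."""
    if needs_fresh_data(question):
        return fake_search(question)
    return fake_llm(question)


print(basic_agent("Who is the prime minister of the UK?"))
print(basic_agent("Explain what an API is"))
```

The first question is routed to the search stand-in, while the second is answered by the LLM stand-in alone.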

The above example is a very basic AI agent. Let’s take another example.

An AI agent can think through and divide complex tasks into multiple smaller simpler tasks

Suppose you are planning to visit Paris for a week.

You can obviously go to ChatGPT or any other LLM, and it can help you with a 7-day itinerary.

However, is a 7-day itinerary enough to plan a trip?

What if you also want some more information, like:

What is the weather like when you want to visit?
Hotel options that are within a certain budget?
What are the food options within a certain distance from these hotels?

So, the problem statement you give is:

I am planning to visit Paris for 7 days from September 3 to September 9. Please plan my travel if the weather is good enough for a visit. I have a budget of USD 150 per day for my hotel stay and I need a vegetarian food option within 500 meters of my hotel.

To answer these questions, an LLM must access multiple other tools like weather apps, hotel booking sites, Google Maps, etc.

At the same time, it needs reasoning capabilities to break this complex problem down into multiple steps and then perform the steps one by one, feeding the output of each step into the next.

Now that’s what LLM Agents can do.

They first reason through the problem and decide what to do. In this case, the problem should be divided into these steps:

Access a weather app and check the weather for the mentioned dates. If the weather is good, prepare the itinerary.
After preparing the itinerary, access a hotel booking website and search for hotels within the mentioned budget.
Shortlist a few hotels and keep them in memory.
For every shortlisted hotel, look for vegetarian food options within 500 meters.
Once found, verify all the details and provide the information to the user.

So, this is a proper Travel agent’s job.
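The travel-planning steps can be sketched as a small Python pipeline. Everything here is an assumption made for illustration: check_weather, search_hotels and vegetarian_places_near are hypothetical stubs that return canned data, standing in for a weather app, a hotel booking site and a maps service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hotel:
    name: str
    price_per_night: float  # in USD


def check_weather(city: str, start: str, end: str) -> str:
    """Stub for a weather tool; always reports good weather."""
    return "sunny"


def search_hotels(city: str, max_price: float) -> List[Hotel]:
    """Stub for a hotel booking tool; filters canned hotels by budget."""
    hotels = [Hotel("Hotel A", 120.0), Hotel("Hotel B", 180.0), Hotel("Hotel C", 145.0)]
    return [h for h in hotels if h.price_per_night <= max_price]


def vegetarian_places_near(hotel: Hotel, radius_m: int) -> List[str]:
    """Stub for a maps tool; returns canned vegetarian places near a hotel."""
    nearby: Dict[str, List[str]] = {"Hotel A": ["Green Bistro"], "Hotel C": []}
    return nearby.get(hotel.name, [])


def plan_trip(city: str, start: str, end: str, budget: float, radius_m: int) -> dict:
    # Step 1: only continue if the weather is good.
    if check_weather(city, start, end) not in ("sunny", "mild"):
        return {"go": False, "options": []}
    # Steps 2 and 3: search hotels within budget and keep the shortlist in memory.
    shortlist = search_hotels(city, budget)
    # Step 4: for every shortlisted hotel, look for vegetarian food nearby.
    options = []
    for hotel in shortlist:
        places = vegetarian_places_near(hotel, radius_m)
        if places:
            options.append({"hotel": hotel.name, "food": places})
    # Step 5: return the collected details to the user.
    return {"go": True, "options": options}


print(plan_trip("Paris", "September 3", "September 9", 150, 500))
```

Each step uses the output of the previous one: the hotel search runs only if the weather check passes, and the food search runs only on the shortlisted hotels.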

What are AI Agents’ tools?

Tools are the external applications that agents use to retrieve the information they need to perform their tasks.

In the above example, the weather app, the hotel booking site and Google Maps were the tools.

There could be many other tools like Google Search, Wikipedia, Python REPL, etc. We will also discuss tools in the later part of the article.

So basically, AI Agent =

LLM
+ Tools to access external applications
+ Plan to simplify the complex task
+ Self-reflection before providing the final answer

So, what is an AI Agent’s self-reflection, you ask?

As we saw in the previous examples, an AI agent can access external applications. It can also reason through and simplify complex tasks.

The other attribute of an AI agent is that it can self-reflect before providing the final answer to the user.

What this means is that, once an agent gets an answer, instead of directly showing it to the user, the agent can self-reflect to validate the answer.

Only if the agent thinks it has the correct answer will it show it to the user. Otherwise, it will try again to gather the correct response.
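A rough sketch of this validate-then-retry loop in Python follows. The generate and within_budget functions are hypothetical and use canned drafts. In a real agent, both generating and validating the answer would typically involve calls to the LLM.

```python
from typing import Callable, Optional


def reflect_and_answer(
    generate: Callable[[int], str],
    validate: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first draft that passes validation, or None if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        draft = generate(attempt)
        if validate(draft):
            return draft  # the agent is satisfied, so show this answer
    return None  # no draft passed the self-check


# Canned drafts standing in for successive LLM outputs (an assumption).
DRAFTS = {
    1: "The hotel costs USD 200 per night.",
    2: "The hotel costs USD 140 per night.",
}


def generate(attempt: int) -> str:
    return DRAFTS.get(attempt, DRAFTS[2])


def within_budget(draft: str) -> bool:
    """Self-check: the quoted price must not exceed USD 150."""
    price = int(draft.split("USD ")[1].split()[0])
    return price <= 150


answer = reflect_and_answer(generate, within_budget)
print(answer)
```

The first draft fails the budget check, so the loop asks for another draft and returns the second one.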

How do LLM Agents do what they do?

Image Source: https://developer.nvidia.com/blog/introduction-to-llm-agents/

The different components of an agent are:

Planning
Tasks
Tools
Memory

Agent’s Planning Module

To solve a complex problem, an agent needs to divide it into smaller, manageable tasks and design a workflow for carrying them out. LLM agents do this with the help of their planning module.
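One way to picture a workflow is as a set of tasks with dependencies between them. The sketch below is only an illustration: the plan is written by hand (a real planning module would have the LLM produce it), the task names are made up, and it uses TopologicalSorter from Python's standard graphlib module (Python 3.9 or later) to put the tasks in a valid order.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
plan = {
    "check_weather": set(),
    "prepare_itinerary": {"check_weather"},
    "search_hotels": {"prepare_itinerary"},
    "find_vegetarian_food": {"search_hotels"},
    "verify_and_report": {"find_vegetarian_food"},
}

# static_order() yields the tasks so that every dependency comes first.
workflow = list(TopologicalSorter(plan).static_order())

for step_number, task in enumerate(workflow, start=1):
    print(step_number, task)
```

Because every task here depends on exactly one earlier task, there is only one valid order, and it matches the Paris trip steps.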

Agent’s Tasks

As the name suggests, tasks are the jobs that an agent needs to perform.

How do you write a well-defined task? Well, think like a manager:

Who would I need to hire to get this done?
What process should that person follow, and what is the output we want?

This is one example of how to define an Agent’s task

Defining Agentic Prompt
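The manager's questions map naturally onto a small task record. The sketch below uses plain Python and is not the API of any particular agent framework. The TaskDefinition class, its field names and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskDefinition:
    role: str             # Who would I need to hire to get this done?
    process: str          # What process should that person follow?
    expected_output: str  # What is the output we want?

    def to_prompt(self) -> str:
        """Turn the task definition into a prompt for the LLM."""
        return (
            f"You are a {self.role}.\n"
            f"Process: {self.process}\n"
            f"Expected output: {self.expected_output}"
        )


hotel_task = TaskDefinition(
    role="travel researcher",
    process="Search for Paris hotels under USD 150 per night and shortlist three.",
    expected_output="A list of three hotels with name and price per night.",
)

print(hotel_task.to_prompt())
```

Writing the role, the process and the expected output as separate fields makes it harder to leave one of them vague.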

Agent’s tools

Agents use tools to access external applications or software.

A tool could be

Google Search
Database Lookup
Python REPL
LLM Math
Any health website/ blog
Wolfram Alpha
And many more

AI agents use their decision-making powers to decide:

Whether to use a tool or not
If yes, which tool to use (from the list of tools they have at their disposal)
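The two decisions above can be illustrated with a toy selector. In a real agent, the LLM makes this choice by reading each tool's description. The keyword matching below is a deliberately simple stand-in, and the tool names and trigger words are assumptions.

```python
from typing import Optional

# Hypothetical tools, each with trigger words that suggest it is needed.
TOOL_TRIGGERS = {
    "google_search": {"news", "latest", "current"},
    "python_repl": {"run", "code", "script"},
    "llm_math": {"calculate", "sum", "multiply"},
}


def choose_tool(request: str) -> Optional[str]:
    """Return the name of the first matching tool, or None if no tool is needed."""
    words = set(request.lower().split())
    for tool_name, triggers in TOOL_TRIGGERS.items():
        if words & triggers:  # at least one trigger word appears in the request
            return tool_name
    return None  # decision: answer without using a tool


print(choose_tool("calculate 12 times 7"))
print(choose_tool("latest news about Paris"))
print(choose_tool("write a poem"))
```

Returning None covers the first decision (no tool needed), and returning a tool name covers the second (which tool to use).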

How to define an AI agent’s tool

Adding a tool to the task?

How to pass the tool while defining the agent?
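Without tying ourselves to a specific agent framework, the three ideas above (defining a tool, adding it to a task, and passing it to an agent) can be sketched in plain Python. The Tool, Task and Agent classes here are hypothetical and exist only for this illustration. Real frameworks have their own classes and parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str            # tells the agent what the tool is for
    func: Callable[[str], str]  # the code that runs when the tool is used


@dataclass
class Task:
    description: str
    tools: List[Tool] = field(default_factory=list)  # tools added to this task


@dataclass
class Agent:
    role: str
    tools: List[Tool] = field(default_factory=list)  # tools passed to the agent

    def available_tools(self, task: Task) -> Dict[str, Tool]:
        """Tools usable for this task: the agent's own plus the task's."""
        return {tool.name: tool for tool in [*self.tools, *task.tools]}

    def use_tool(self, task: Task, name: str, tool_input: str) -> str:
        return self.available_tools(task)[name].func(tool_input)


def word_count(text: str) -> str:
    return str(len(text.split()))


def shout(text: str) -> str:
    return text.upper()


counter = Tool("word_count", "Counts the words in a text", word_count)
shouter = Tool("shout", "Converts a text to upper case", shout)

agent = Agent(role="editor", tools=[counter])
task = Task(description="Review the article", tools=[shouter])

print(agent.use_tool(task, "word_count", "AI agents use tools"))
print(agent.use_tool(task, "shout", "hi"))
```

In this sketch, a tool attached to the agent is available for all its tasks, while a tool attached to a task is available only for that task.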

Agent’s Memory

We need memory for conversational agents. To hold a conversation, an agent needs to know the previous context.

The previous context is provided through “Memory”.

How to add a conversation memory to an AI agent?
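A simple form of conversation memory is a buffer of recent turns that gets prepended to every new prompt. The ConversationMemory class and build_prompt function below are a hypothetical sketch under that assumption, not the memory API of any specific framework.

```python
class ConversationMemory:
    """Keeps only the most recent turns of a conversation."""

    def __init__(self, max_turns: int = 6) -> None:
        self.max_turns = max_turns
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))
        # Drop the oldest turns once the buffer is full.
        self.turns = self.turns[-self.max_turns:]

    def as_context(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory: ConversationMemory, user_message: str) -> str:
    """Combine the remembered context with the new message."""
    parts = []
    if memory.turns:
        parts.append(memory.as_context())
    parts.append(f"user: {user_message}")
    parts.append("assistant:")
    return "\n".join(parts)


memory = ConversationMemory(max_turns=2)
memory.add("user", "I am visiting Paris in September.")
memory.add("assistant", "Great, how many days will you stay?")
memory.add("user", "Seven days.")

print(build_prompt(memory, "Which hotel should I book?"))
```

With max_turns set to 2, the oldest message is dropped, which shows the trade-off: a smaller buffer keeps prompts short but forgets earlier context.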

Conclusion

AI Agents are considered to be a step towards achieving AGI. Whether we will achieve AGI is still debatable, but AI Agents are indeed useful because of their reasoning ability, memory and interaction with other applications.

In case you are looking to learn AI + LLM in a very simple language in a live online class from an instructor, check out the details here

Tailored AI + LLM Coaching for Senior IT Professionals

In case you are looking to learn AI + Gen AI in an instructor-led live class environment, check out these dedicated courses for senior IT professionals here

Pricing for AI courses for senior IT professionals – https://www.aimletc.com/ai-ml-etc-course-offerings-pricing/

My name is Nikhilesh and if you have any feedback/suggestions on this article, please feel free to connect with me – https://www.linkedin.com/in/nikhileshtayal/

Disclaimer – The images are taken from Deep Learning AI’s course. We are using them for educational purposes only. No copyright infringement is intended. In case any part of the content belongs to you or someone you know, please contact us and we will give you credit or remove your content.
