Database AI Agent: Let us Learn how to design it

Reading Time: 5 minutes

Can you access a database using a Large language model?

Before you say “Yes”, read on.

A part of the question is – can LLM write SQL queries? The answer is obviously yes.

However, there are 2 problems with accessing a database using LLMs.

Does LLM know the exact table and column names
How do we know whether the query LLM has generated is right or wrong

The first challenge is easy to understand. The LLM does not have information about the exact table name and column names.

And regarding the second challenge, when we ask an LLM to write a code, it can generate that and then we have an option to check whether it’s right or wrong.

However, when we ask an LLM to generate a query to access a database, writing a query is an intermediary step. We are not getting an SQL query as an output but the result is execution of that query.

And there could be a case that even if the query is wrong, we could get a result. So, how will we ensure that the query is right?

With these 2 challenges, I ask the question again – Can we access a database using an LLM?

Well, the answer is yes. However, the solution is not as straightforward as asking an LLM to write a query and access the database.

But we need to make an AI agent to do that.

In this article, we will talk about how a Database AI agent can write SQL queries and access the database.

Database AI Agent Solution – Think about how a “data analyst”

If you ask a Data analyst a the same job of accessing an unknown database then what would be his/her approach?

Data analysts would normally follow these 3 major steps

They will first understand what the database looks like, what are the tables, their column names, and their data type. etc.
They make exploratory queries to the ‘top k’ rows, or look at summary stats
If they hit an error, they would analyze the error and write a new query.

So, we need to design our Agent in such a way that can imitate the above behaviour and steps. Let’s dive in.

Building a Database AI agent

Step 1 – We will use SQLDatabaseToolkit from Langchain.

What is the purpose you ask?

SQLDatabaseToolkit is used to interact with a database using natural language.

Step 2 – Writing the instructions and role for our Agent

We need to write instructions for our agents. You can write your own prompt. The below is for reference purposes. However, we would like you to note a few instructions:

You are an agent designed to interact with an SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
You have access to tools for interacting with the database.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.
DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.
DO NOT MAKE UP AN ANSWER OR USE PRIOR KNOWLEDGE, ONLY USE THE RESULTS OF THE CALCULATIONS YOU HAVE DONE.
ALWAYS, as part of your final answer, explain how you got to the answer on a section that starts with: “Explanation:”. Include the SQL query as part of the explanation section.

Step 3 – Writing the format instructions for our Agent

The common way to write this is

Question
Thought
Action
Action Input
Observation
Final Answer

Question is what Agent needs to answer. This is what we want to achieve through the agent.

Thought is what the Agent needs to think to answer the question. It needs to decide whether to use a tool or not.

Action is when the Agent accesses the tool. It could be accessing Google search or a database or a Python Interpreter or any other website.

Action Input is what the Agent access the tool with.

Observation is when the Agent gets the information after accessing the tool

Thought is where the Agent self-analyses the answer. If it thinks the answer is not correct, it could again access the tool to get the correct answer.

Final Answer – Once the agent is sure that it has got the correct answer then it goes ahead and provides the final answer

You could also give an example or two for the agent to understand this complete flow and follow this.

Step 4 – Defining the Agent

Now, let us see how this Agent will go about it’s work

How Database AI agent would work

Agent’s Step 1 – Check the Table

As we discussed earlier, the first thing a data analyst would do is understand the database, i.e. tables, column names, data types, etc.

So, that is what the Agent would also do

Agent’s Step 2 – Check the first few rows

After that agent would go ahead and check the first few rows of data. In the below example, you can see, that it has access first 3 rows from the relevant table

Agent’s Step 3 – Verify the query

As we can see, the agent has first written an SQL query and then it got it verified from the query checker.

Agent’s Step 4 – Provide the output

Once the query is verified, as a final step, the agent will provide the final answer by executing the query.

Conclusion

Writing code is pretty straightforward using an LLM while writing an SQL query or accessing a database is not. LLM needs to think and work like a data analyst.

AI + Gen AI Course for Senior IT Professionals

In case you are looking to learn AI + Gen AI in an instructor-led live class environment, check out these dedicated courses for senior IT professionals here

Pricing for AI courses for senior IT professionals – https://www.aimletc.com/ai-ml-etc-course-offerings-pricing/

My Name is Nikhilesh and if you have any feedback/ suggestion on this article, please feel free to connect with me – https://www.linkedin.com/in/nikhileshtayal/

Disclaimer – The code is taken from Deep Learning AI’s Building Your Own Database Agent course. We are just using it for educational purposes. No copyright infringement is intended. In case any part of content belongs to you or someone you know, please contact us and we will give you credit or remove your content.

Post Views: 251

Can we access a database using an LLM? Learn how to design a Database AI Agent.

However, there are 2 problems with accessing a database using LLMs.

Database AI Agent Solution – Think about how a “data analyst”

Building a Database AI agent

Step 1 – We will use SQLDatabaseToolkit from Langchain.

Step 2 – Writing the instructions and role for our Agent