DEV Community

Cover image for Chatting with your database: What did I learn by letting AI generate SQL?
Srikanth
Srikanth

Posted on

5 1 1 1 1

Chatting with your database: What did I learn by letting AI generate SQL?

This is one of the intriguing use cases of ChatGPT and other LLMs.

You pass the information about your database and a question to a large language model. In return you get an SQL query, which you can execute to get the answer.

As a cricket fan, I had an sqlite database with match statistics so I made AI answer some questions. Here's what I learned:

  • Prompt Engineering is a serious thing. When it comes to utilising LLMs, packaging the right amount of context into the prompts will influence your results. Before using LangChain I wrote custom logic to create context for db schema, and I found myself iterating over minute details to make the queries as accurate as possible.

  • LangChain is not just a wrapper. It is a very smart tool that glues together various moving parts. I found my results to be more consistent using it.

  • Denormalised schema will fetch more consistent results. As you can see in the video, one of the queries returned player uuid instead of name. This was because my data model was designed for a different use case where as it will make it easier for GPT-4 if I had the column as player_name instead of id.

  • There is a risk of data leak if you want to use this in production. Exposing such an interface publicly is risky. We can mitigate this risk my restricting the number of tables shared with AI model and pre-processing the SQL queries generated before executing them. Using a restricted read-only user will be another wise guard rail.

I will continue exploring this use case and share my learning here in future.

Thanks for reading.

API Trace View

How I Cut 22.3 Seconds Off an API Call with Sentry

Struggling with slow API calls? Dan Mindru walks through how he used Sentry's new Trace View feature to shave off 22.3 seconds from an API call.

Get a practical walkthrough of how to identify bottlenecks, split tasks into multiple parallel tasks, identify slow AI model calls, and more.

Read more →

Top comments (1)

Collapse
 
androaddict profile image
androaddict

Is there any git repo available?

Billboard image

Try REST API Generation for Snowflake

DevOps for Private APIs. Automate the building, securing, and documenting of internal/private REST APIs with built-in enterprise security on bare-metal, VMs, or containers.

  • Auto-generated live APIs mapped from Snowflake database schema
  • Interactive Swagger API documentation
  • Scripting engine to customize your API
  • Built-in role-based access control

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay