Automating Report Generation with LLMs: A Generic Solution

Introduction

Building dynamic reports often requires writing SQL queries, which can be inefficient and difficult to scale in a production environment. To streamline this process, I built a solution using Large Language Models (LLMs) that generates reports from natural language queries.

I implemented this in Ruby on Rails 💎, my preferred language, but the approach can be applied in any language, such as Python 🐍, Java ☕, or JavaScript 📜.


The Problem

Users need to retrieve insights from the database without writing SQL by hand. Such a system should:

  1. Understand user intent 🧠 – Identify the relevant database table.
  2. Generate the appropriate SQL query 📝 – Convert natural language queries into executable SQL.
  3. Fetch and return results in a structured format 📊 – Output the data in an easy-to-consume format.

This approach ensures a seamless experience for non-technical users while leveraging the power of LLMs to dynamically generate reports.


The Solution

I built a modular system that consists of three primary components:

1. Llm::Chat - The OpenAI Chat API Wrapper

This class handles communication with OpenAI's Chat API.

require 'net/http'
require 'json'

class Llm::Chat
  OPENAI_API_URL = "https://api.openai.com/v1/chat/completions"
  API_KEY = ENV['OPENAI_API_KEY']

  def initialize(payload:)
    @payload = payload
  end

  def call
    response = request_openai
    parse_response(response)
  end

  private

  def request_openai
    uri = URI(OPENAI_API_URL)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    request = Net::HTTP::Post.new(uri.path, headers)
    request.body = @payload.to_json
    http.request(request)
  end

  def headers
    {
      "Content-Type" => "application/json",
      "Authorization" => "Bearer #{API_KEY}"
    }
  end

  def parse_response(response)
    JSON.parse(response.body).dig("choices", 0, "message", "content")&.strip
  rescue JSON::ParserError
    nil
  end
end
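
The class simply takes a raw Chat Completions payload and returns the content of the first choice as a string. A minimal usage sketch, assuming OPENAI_API_KEY is set in the environment (the output shown is illustrative):

payload = {
  model: "gpt-4",
  messages: [{ role: "user", content: "Say hello in one word" }],
  max_tokens: 10
}

Llm::Chat.new(payload: payload).call
# => "Hello"   (illustrative output)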

2. Identifying the Relevant Table

The first step is to determine the database table that best matches the user's query. This is handled by TableIdentifier.

class TableIdentifier
  def initialize(query:)
    @query = query
  end

  def call
    chat
  end

  private

  def chat
    Llm::Chat.new(payload: chat_payload).call
  end

  def chat_payload
    {
      model: "gpt-4",
      messages: [
        { role: "system", content: "Given a user query, determine the most relevant table or tables from [users, departments, tickets]. If the query involves multiple tables (e.g., grouping users by department), return a comma-separated list of table names. Only return the table name(s) with no extra text." },
        { role: "user", content: @query }
      ],
      max_tokens: 100
    }
  end
end
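
Assuming the model follows the prompt, the class returns either a single table name or a comma-separated list, for example:

TableIdentifier.new(query: "count of inactive users").call
# => "users"

TableIdentifier.new(query: "number of users per department").call
# => "users, departments"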

Example Table Structures

For reference, here are example table structures:

Users Table 👥

CREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(255),
  email VARCHAR(255),
  status VARCHAR(50),
  department_id INT,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);

Departments Table 🏢

CREATE TABLE departments (
  id INT PRIMARY KEY,
  name VARCHAR(255),
  manager_id INT,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);

Tickets Table 🎫

CREATE TABLE tickets (
  id INT PRIMARY KEY,
  user_id INT,
  subject VARCHAR(255),
  status VARCHAR(50),
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);

3. Generating and Executing the SQL Query

Once the table is identified, the next step is generating a valid SQL query. This is done using ReportGenerator.

class ReportGenerator
  require "#{Rails.root}/lib/rts/sender"
  require "#{Rails.root}/lib/llm"

  def initialize(query:)
    @query = query
  end

  def call
    report
  end

  private

  def report
    [
      { type: "text", data: "Here is your report" },
      { type: "table", data: ActiveRecord::Base.connection.select_all(query).to_a }
    ]
  end

  def table_structure
    # TableIdentifier may return a single name or a comma-separated list of names
    table_names.map do |name|
      # The second column of SHOW CREATE TABLE is the CREATE statement itself
      ActiveRecord::Base.connection.execute("SHOW CREATE TABLE #{name}").first[1]
    end.join("\n\n")
  end

  def table_names
    TableIdentifier.new(query: @query).call.to_s.split(",").map(&:strip)
  end

  def query
    Llm::Chat.new(payload: query_payload).call
  end

  def query_payload
    {
      model: "gpt-4",
      messages: [
        { role: "system", content: "Generate a MySQL query based on the table structure: #{table_structure}. Support queries involving multiple tables where applicable (e.g., grouping users by department). Only return the SQL query as plain text with no formatting, explanations, or markdown." },
        { role: "user", content: @query }
      ],
      max_tokens: 1000
    }
  end
end

Example Usage 🛠️

With this setup, generating a report is as simple as making a method call:

ReportGenerator.new(query: "count of inactive users").call
ReportGenerator.new(query: "list of active users").call
ReportGenerator.new(query: "number of users per department").call

For a query like "count of inactive users", the system will generate:

SELECT COUNT(*) FROM users WHERE status = 'inactive';

For a query like "number of users per department", the system will generate:

SELECT d.name, COUNT(u.id) 
FROM users u 
JOIN departments d ON u.department_id = d.id 
GROUP BY d.name;
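
The call itself returns the structured array built in ReportGenerator#report, so the consuming code receives something roughly like this (row values are illustrative):

ReportGenerator.new(query: "count of inactive users").call
# => [
#   { type: "text",  data: "Here is your report" },
#   { type: "table", data: [{ "COUNT(*)" => 17 }] }
# ]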

Disclaimer ⚠️

The prompts may require tweaking through trial and error to achieve reliable results. You may need to adjust them to fit your database schema and specific reporting requirements.


Benefits of This Approach 🚀

  1. No manual SQL writing ✅ – Users can generate reports using natural language.
  2. Adaptability 🔄 – The prompts can be extended to support new tables or more complex queries.
  3. Security 🔒 – The table-identification step keeps queries scoped to known tables; in production, pair it with query validation and a read-only database user (see the sketch after this list).
  4. Scalability 📈 – Works across multiple datasets without custom development for each request.
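
On the security point, the code above executes whatever SQL comes back from the model, so in practice you would put a guard in front of select_all. A minimal sketch of one possible safeguard (not part of the code above, and deliberately conservative):

class SqlGuard
  FORBIDDEN = /\b(insert|update|delete|drop|alter|truncate|grant|create)\b/i

  def self.safe?(sql)
    statement = sql.to_s.strip
    statement.match?(/\Aselect\b/i) &&   # read-only statements only
      !statement.match?(FORBIDDEN) &&    # no data-modifying keywords
      statement.count(";") <= 1          # no stacked statements
  end
end

# In ReportGenerator#report, for example:
# raise "Unsafe query rejected" unless SqlGuard.safe?(query)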

Conclusion 🎯

By leveraging LLMs, we can automate the translation of user intent into SQL queries, making data retrieval seamless and efficient. This approach eliminates the need for manual query writing and adapts easily to new tables, though generated queries should still be validated before running against production data.

Would you implement a similar solution in your application? Let me know your thoughts!
