Did you know that you can build an AI receipt reader in under 10 minutes? In this tutorial, we will extract structured data from a PDF receipt using the Docuglean SDK. I will guide you step by step through the setup. It's easy and only requires a little JavaScript knowledge!
Visit and star Docuglean SDK
1- Prerequisites:
To use the Docuglean AI SDK you need a little JavaScript knowledge (you can review some JavaScript here). You'll also need Node.js and npm installed on your machine. To check whether you already have them on Windows, press the Windows key + R, type cmd, press Enter, and then run:
node -v
and:
npm -v
If both commands print a version number, you're ready! If not, go here, download the Node.js Windows installer (.msi), and run it. When the setup is complete, check the versions again in cmd to confirm the installation.
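If you prefer to do the same check from inside Node.js, here is a minimal sketch. The minimum version of 18 is an assumption made for illustration and is not stated by Docuglean; parseMajor is a helper name invented for this example.

```javascript
// Assumption: 18 is used as the minimum major version purely for illustration.
// Check the Docuglean repository for the versions it actually supports.
const MIN_NODE_MAJOR = 18;

// Extract the major version number from a string such as "20.11.1" or "v20.11.1".
function parseMajor(version) {
  return Number.parseInt(version.replace(/^v/, '').split('.')[0], 10);
}

// process.versions.node holds the running Node.js version without a leading "v".
const major = parseMajor(process.versions.node);
if (major >= MIN_NODE_MAJOR) {
  console.log(`Node.js ${process.versions.node} detected, good to go.`);
} else {
  console.log(`Node.js ${process.versions.node} detected, consider upgrading to ${MIN_NODE_MAJOR} or newer.`);
}
```

Save it under any file name you like and run it with node followed by that file name.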
2- Installing the Docuglean SDK
To start, create a directory (a folder) for your receipt reader project. Open the command prompt again (Windows + R, then type cmd) and use the mkdir command followed by the project name, for example:
mkdir my-receipt-extractor
Now navigate to that directory using the command:
cd my-receipt-extractor
Once you're there, simply run:
npm i docuglean
And hooray! Your project now has the Docuglean SDK installed. The extraction script in section 4 also imports zod and dotenv directly, so install them as well:
npm i zod dotenv
You will also need an API key. Docuglean currently supports these providers and models:
OpenAI: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4o, o1-mini, o1, o3, o4-mini
Mistral: mistral-ocr-latest
Once you get your API key from one of the providers, don't share it. It is your own private key, so save it in a safe place!
Want to use a model from another provider? Support for more providers is coming soon! Check the Docuglean repository to learn more, and star it to stay up to date!
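One way to keep the key out of your source code and out of your logs is sketched below. It reads the key from an environment-like object and prints only a masked version. The helper names requireApiKey and maskKey are made up for this example, and the key shown is a placeholder, not a real credential.

```javascript
// Read a required key from an environment-like object (process.env by default).
// Throws instead of continuing with a missing or empty key.
function requireApiKey(name, env = process.env) {
  const value = env[name];
  if (!value || value.trim() === '') {
    throw new Error(`Missing ${name}. Add it to your .env file or shell environment.`);
  }
  return value.trim();
}

// Return a log-safe version of the key: first 3 and last 4 characters only.
function maskKey(key) {
  if (key.length <= 8) return '***';
  return `${key.slice(0, 3)}...${key.slice(-4)}`;
}

// Usage with a made-up key (not a real credential):
const demoEnv = { OPENAI_API_KEY: 'sk-example-not-a-real-key-1234' };
console.log(maskKey(requireApiKey('OPENAI_API_KEY', demoEnv))); // prints "sk-...1234"
```

In a real project you would call requireApiKey('OPENAI_API_KEY') without the second argument, so that it reads from process.env.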
3- Creating a Zod Schema for Receipts
The Zod schema is your blueprint for the data you want to extract from a receipt. It tells Docuglean AI the exact structure and types of information to look for, so you get consistent, predictable data back.
What it is: A defined structure (like a template) for the data Docuglean should extract.
Why it's important:
- Predictable Output: Guarantees data comes back in the format you expect.
- Type Safety: Ensures fields are the correct type (date as a string, total as a number, and so on).
- Guides the AI: Helps the AI understand what specific pieces of information to pull out.
Here is an example Zod Schema for a Receipt:
import { z } from 'zod';

// Define the structure for a single item on the receipt
const ReceiptItemSchema = z.object({
  name: z.string().describe('The name of the item purchased.'),
  price: z.number().describe('The price of this specific item.')
});

// Define the overall structure for the entire receipt
const ReceiptSchema = z.object({
  date: z.string().describe('The date of the receipt in YYYY-MM-DD format.'),
  total: z.number().describe('The grand total amount shown on the receipt.'),
  currency: z.string().optional().describe('The currency symbol or code (e.g., "$", "EUR").'), // Optional field
  vendorName: z.string().optional().describe('The name of the store or business.'),
  items: z.array(ReceiptItemSchema).describe('A list of all individual items purchased, with their names and prices.')
});
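To make concrete what this schema checks, here is a rough, dependency-free mirror of ReceiptSchema written by hand. It is only an illustration: in the real project Zod performs these checks for you, and checkReceiptShape is a name invented for this sketch. Note that the YYYY-MM-DD wording lives only in the description text, so z.string() on its own accepts any string; the sketch mirrors that behavior.

```javascript
// Hand-written mirror of ReceiptSchema, for illustration only.
// Returns a list of human-readable problems; an empty list means the shape is valid.
function checkReceiptShape(receipt) {
  if (receipt === null || typeof receipt !== 'object') {
    return ['receipt must be an object'];
  }
  const problems = [];
  if (typeof receipt.date !== 'string') problems.push('date must be a string');
  if (typeof receipt.total !== 'number') problems.push('total must be a number');

  // Optional fields may be absent, but must be strings when present.
  for (const key of ['currency', 'vendorName']) {
    if (receipt[key] !== undefined && typeof receipt[key] !== 'string') {
      problems.push(`${key} must be a string when present`);
    }
  }

  if (!Array.isArray(receipt.items)) {
    problems.push('items must be an array');
  } else {
    receipt.items.forEach((item, i) => {
      if (item === null || typeof item !== 'object') {
        problems.push(`items[${i}] must be an object`);
        return;
      }
      if (typeof item.name !== 'string') problems.push(`items[${i}].name must be a string`);
      if (typeof item.price !== 'number') problems.push(`items[${i}].price must be a number`);
    });
  }
  return problems;
}

console.log(checkReceiptShape({ date: '2024-07-09', total: 55.75, items: [] }));
// prints []
console.log(checkReceiptShape({ date: '2024-07-09', total: '55.75', items: [] }));
// prints [ 'total must be a number' ]
```

The second call shows the kind of mistake the schema is there to catch: a total that arrives as text instead of a number.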
4- Writing the Extraction Script
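This script is the core of your application. It brings together your receipt, your API key, and the Zod schema, then calls Docuglean's extract function to process the receipt and return structured data. Save it in your project folder as an ES module, for example extract.mjs (a file name chosen here for illustration), put your key in a .env file as OPENAI_API_KEY=your-key, and run it with node extract.mjs.
Here is a simple extraction script example: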
import { extract } from 'docuglean';
import { z } from 'zod';
import * as dotenv from 'dotenv'; // Tool to load API keys securely

dotenv.config(); // Loads variables from a .env file

// Define your Zod Schema (from Part 3)
const ReceiptItemSchema = z.object({
  name: z.string().describe('The name of the item purchased.'),
  price: z.number().describe('The price of this specific item.')
});

const ReceiptSchema = z.object({
  date: z.string().describe('The date of the receipt in YYYY-MM-DD format.'),
  total: z.number().describe('The grand total amount shown on the receipt.'),
  currency: z.string().optional().describe('The currency symbol or code (e.g., "$", "EUR").'),
  vendorName: z.string().optional().describe('The name of the store or business.'),
  items: z.array(ReceiptItemSchema).describe('A list of all individual items purchased, with their names and prices.')
});

async function runReceiptExtraction() {
  const apiKey = process.env.OPENAI_API_KEY; // Ensure you set this in a .env file!
  const receiptFilePath = './receipt_example.pdf'; // Ensure this file exists!

  if (!apiKey) {
    console.error("Error: API key not found. Please set OPENAI_API_KEY in your .env file.");
    return;
  }

  try {
    console.log("Starting receipt data extraction...");
    const extractedData = await extract({
      filePath: receiptFilePath,
      apiKey: apiKey,
      provider: 'openai', // Or 'mistral'
      responseFormat: ReceiptSchema, // Our blueprint for the output
      prompt: 'Extract the date, total, currency, vendor name, and a list of items with their names and prices from this receipt.'
    });
    console.log("Extraction successful!");
    // Print the result in a nicely formatted way
    console.log(JSON.stringify(extractedData, null, 2));
  } catch (error) {
    console.error("An error occurred during extraction:", error);
  }
}

runReceiptExtraction(); // Run the function
5- Understanding the Output
Once your script runs successfully, Docuglean returns a JavaScript object containing the extracted information, structured according to your ReceiptSchema. Fields marked as optional, such as currency and vendorName, may be missing if the receipt does not show them.
Example Output (based on our schema):
{
  "date": "2024-07-09",
  "total": 55.75,
  "currency": "USD",
  "vendorName": "SuperMart",
  "items": [
    {
      "name": "Organic Bananas",
      "price": 3.49
    },
    {
      "name": "Milk (1 Gallon)",
      "price": 4.99
    },
    {
      "name": "Avocado (Each)",
      "price": 2.50
    }
  ]
}
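A quick sanity check you might run on output like this is to compare the sum of the item prices with the total. The sketch below is not part of Docuglean; toCents and reconcile are helper names invented here, and the sample object simply copies the example above. Amounts are converted to integer cents so that floating-point rounding does not distort the comparison.

```javascript
// Convert a decimal amount to integer cents to avoid floating-point drift.
function toCents(amount) {
  return Math.round(amount * 100);
}

// Compare the sum of item prices with the receipt total.
function reconcile(receipt) {
  const itemsCents = receipt.items.reduce((sum, item) => sum + toCents(item.price), 0);
  const totalCents = toCents(receipt.total);
  return { itemsCents, totalCents, differenceCents: totalCents - itemsCents };
}

// Same values as the example output above.
const sample = {
  total: 55.75,
  items: [
    { name: 'Organic Bananas', price: 3.49 },
    { name: 'Milk (1 Gallon)', price: 4.99 },
    { name: 'Avocado (Each)', price: 2.5 }
  ]
};

console.log(reconcile(sample));
// prints { itemsCents: 1098, totalCents: 5575, differenceCents: 4477 }
```

Here the items add up to 10.98 while the total is 55.75, so the example lists only some of the purchased items. On a real receipt, a large difference like this can mean that items were missed, or that tax and fees are included in the total.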
6- Let's wrap up!
In this article, we covered the essentials of using Docuglean for receipt extraction: installing the SDK, defining a Zod schema, and running an extraction script. This is a powerful skill that automates tedious data entry and opens up new possibilities for document processing!
What's Coming Soon to Docuglean:
summarize(): Get quick summaries (TLDRs) of long documents.
translate(): Built-in support for processing multilingual documents.
classify(): Automatically detect the type of document (receipt, invoice, ID, etc.).
search(query): Search across your documents using powerful AI.
More AI Models: Integrations with other providers like Meta's Llama, Together AI, and OpenRouter for more choice and flexibility.
Keep an eye on Docuglean's updates and star the GitHub repository to get notified when these exciting new features become available!
⭐ Want More?
Check out the full Docuglean repository on GitHub and star the project to support future updates!
Have questions or requests? Drop them in the comments!