Hackceleration

Posted on Apr 14 • Originally published at hackceleration.com

Building an AI Receipt Scanner with n8n, Gemini Vision API, and Google Sheets

#n8n #api #automation #tutorial

Building an AI Receipt Scanner with n8n, Gemini Vision API, and Google Sheets

You're building an expense tracking system and need to extract structured data from receipt images automatically. Every developer who's tried parsing receipts manually knows the pain: faded thermal paper, inconsistent formats, manual transcription errors that corrupt your financial data.

This n8n workflow integrates Google Drive, Google Gemini 2.5 Flash vision API, and Google Sheets to create a daily receipt processing pipeline. The architecture monitors a Drive folder, sends images to Gemini for OCR and extraction, then appends structured JSON to a spreadsheet. Here's how to architect this integration in n8n.

Architecture Overview

The workflow implements a scheduled batch processing pattern:

1. Schedule Trigger (midnight daily)
   ↓
2. Google Drive Search API (list unprocessed receipts)
   ↓
3. Loop iterator (process one receipt at a time)
   ↓
4. Google Drive Download API (fetch image binary)
   ↓
5. Gemini Vision API (extract structured data from image)
   ↓
6. JSON parser (clean AI response)
   ↓
7. Google Sheets Append API (write to spreadsheet)
   ↓
8. Google Drive Move API (archive processed file)

Why this architecture? Processing receipts individually prevents API rate limiting issues with Gemini, ensures clean data separation in Sheets, and provides atomic archiving so failed extractions don't corrupt your workflow state. The loop + move pattern creates idempotency — rerunning the workflow won't duplicate data because processed files are relocated.

Google Drive API Integration

The workflow uses three Google Drive API operations. Authentication uses OAuth2 with Drive API scope.

Searching for Receipt Files

// Google Drive Search node configuration
{
  "resource": "File/Folder",
  "operation": "Search",
  "searchMethod": "Search File/Folder Name",
  "returnAll": true,
  "filters": {
    "folderId": "1WpaNqUwm93jEntzmlMyFIrjdo-x5vRu4" // your inbox folder ID
  }
}

The API returns an array of file objects:

[
  {
    "id": "1ABC...xyz",
    "name": "receipt_2024_03_15.jpg",
    "mimeType": "image/jpeg"
  }
]

Downloading Binary Data

Downloading files requires the file ID from the search results. The binary data is stored in n8n's internal format for passing to Gemini:

// Dynamic file ID from loop context
{
  "fileId": "{{ $json.id }}", // references current item in loop
  "operation": "Download"
}

Moving Processed Files

After successful extraction and sheet append, move the file to prevent reprocessing:

{
  "operation": "Move",
  "fileId": "{{ $('Loop Over Items').item.json.id }}",
  "parentFolderId": "1XYZ...processed" // your archive folder ID
}

Gotcha: If the Sheets append fails, the file won't move. This fail-safe prevents data loss but means you'll need error handling or manual intervention for extraction failures.

Gemini Vision API Integration

Gemini 2.5 Flash handles document OCR and structured extraction. The API accepts binary image data and a text prompt defining the extraction schema.

Authentication

Get an API key from Google AI Studio. In n8n, configure a "Google Gemini API" credential with your key.

Extraction Request

// Gemini Analyze Image node configuration
{
  "model": "models/gemini-2.5-flash",
  "operation": "Analyze Image",
  "inputType": "Binary File(s)",
  "inputDataFieldName": "data", // n8n's binary field name
  "textInput": "Extract all data from this receipt and return as JSON..."
}

The prompt is critical. You're instructing the model exactly what schema to return:

Extract all data from this receipt and return as JSON with these fields:
{
  "company_name": "string",
  "company_address": "string",
  "transaction_date": "YYYY-MM-DD",
  "transaction_time": "HH:MM",
  "items": [
    {
      "name": "string",
      "quantity": number,
      "unit_price": number,
      "category": "food|hygiene|household|electronics|other"
    }
  ],
  "subtotal": number,
  "tax_rate": number,
  "tax_amount": number,
  "total": number,
  "currency": "string",
  "payment_method": "string"
}
Return ONLY valid JSON, no markdown formatting.

Response Handling

Gemini returns text, sometimes wrapped in markdown code fences:

// Response structure
{
  "content": {
    "parts": [{
      "text": "```

json\n{\"company_name\":\"Store Name\"}\n

```"
    }]
  }
}

Parse and clean this in a Code node:

const raw = $input.item.json.content.parts[0].text;
const cleaned = raw.replace(/```
{% endraw %}
json|
{% raw %}
```/g, '').trim();
const parsed = JSON.parse(cleaned);
return { json: parsed };

Rate limits: Free tier is 15 requests/minute. The one-at-a-time loop prevents hitting limits for typical daily receipt volumes.

Error codes:

400: Malformed request or unsupported image format
429: Rate limit exceeded
500: Model processing error (retry with exponential backoff)

Google Sheets API Integration

The Sheets node appends extracted data as new rows. Authentication uses OAuth2 with Sheets API scope.

Sheet Setup

Create a spreadsheet with column headers matching your Gemini extraction schema:

| company_name | transaction_date | total | payment_method | currency |

Append Configuration

{
  "operation": "Append Row",
  "documentId": "1ABC...spreadsheet_id",
  "sheetName": "Sheet1",
  "mappingMode": "Map Automatically" // matches JSON keys to column headers
}

The automatic mapping requires exact field name matches. If your Gemini extraction returns company_name but your column is Company Name, the mapping fails silently.

Handling nested data: The items array needs flattening. Options:

Stringify the array as JSON in a single cell
Use a Code node to create one row per item (better for analysis)
Summarize items into a single string: "3 items: coffee, muffin, water"

Implementation Gotchas

Missing data handling: Not all receipts contain all fields. Gemini returns null or omits fields when data isn't visible. Set default values in the Code node:

const parsed = JSON.parse(cleaned);
return {
  json: {
    company_name: parsed.company_name || "Unknown",
    total: parsed.total || 0,
    currency: parsed.currency || "USD",
    ...parsed
  }
};

Image format compatibility: Gemini handles JPEG, PNG, HEIC. If you're processing PDFs, add a conversion step using ImageMagick or similar.

Authentication token refresh: OAuth2 tokens expire. n8n handles refresh automatically, but if your workflow hasn't run in 6 months, you may need to re-authenticate manually.

Cost optimization: Gemini 2.5 Flash is free up to quota limits. For production at scale, consider:

Caching: Store file hashes and skip re-processing duplicates
Batching: Process receipts weekly instead of daily to reduce API calls
Monitoring: Track API usage in Google Cloud Console

Data validation: Add a Code node after parsing to validate critical fields:

if (!parsed.total || parsed.total <= 0) {
  throw new Error(`Invalid total: ${parsed.total}`);
}
if (!parsed.transaction_date) {
  throw new Error('Missing transaction date');
}
return { json: parsed };

This prevents garbage data from reaching your spreadsheet.

Prerequisites

Required accounts:

n8n instance (self-hosted or cloud)
Google Cloud Platform account (for Gemini API key)
Google account (for Drive and Sheets)

API credentials:

Google Drive OAuth2 credential in n8n
Google Sheets OAuth2 credential in n8n
Google Gemini API key from AI Studio

Folder setup:

Create "Receipts - Inbox" folder in Google Drive (note the folder ID from URL)
Create "Receipts - Processed" folder (note the folder ID)

Spreadsheet setup:

Create a new Google Sheet
Add column headers matching extraction schema
Note the spreadsheet ID from URL

Documentation:

Get the Complete n8n Workflow

This tutorial covers the API integration architecture and key implementation patterns. For the complete n8n workflow JSON with all node configurations, detailed field mappings, and a video walkthrough showing the entire setup process, check out the full implementation guide.

DEV Community

Building an AI Receipt Scanner with n8n, Gemini Vision API, and Google Sheets

Building an AI Receipt Scanner with n8n, Gemini Vision API, and Google Sheets

Architecture Overview

Google Drive API Integration

Searching for Receipt Files

Downloading Binary Data

Moving Processed Files

Gemini Vision API Integration

Authentication

Extraction Request

Response Handling

Google Sheets API Integration

Sheet Setup

Append Configuration

Implementation Gotchas

Prerequisites

Get the Complete n8n Workflow

Top comments (0)