DEV Community

Hackceleration
Hackceleration

Posted on • Originally published at hackceleration.com

Building an AI Business Card Scanner with n8n, Gemini Vision, and Google Sheets

Building an AI Business Card Scanner with n8n, Gemini Vision, and Google Sheets

You just returned from a conference with 30 business cards. You photograph them all with good intentions. But then what? You're staring at manual data entry—typing names, emails, phone numbers one by one into your CRM. Most of those cards never make it into your database.

This integration eliminates that entire process. Using n8n as the orchestration layer, Google Drive for storage, Gemini AI's vision capabilities for OCR, and Google Sheets as the structured database, you can automate business card digitization end-to-end. Here's the technical architecture behind building this workflow.

Architecture Overview

The workflow follows this data flow:

1. Schedule Trigger (daily at midnight)
   ↓
2. Google Drive API: Search folder for new images
   ↓
3. Loop: Iterate through each file
   ↓
4. Google Drive API: Download image binary
   ↓
5. Gemini Vision API: Extract structured data
   ↓
6. JavaScript: Parse JSON response
   ↓
7. Google Sheets API: Append row
   ↓
8. Google Drive API: Move to processed folder
Enter fullscreen mode Exit fullscreen mode

This architecture uses API-first tools with clear separation of concerns: Google Drive handles file operations, Gemini handles vision processing, n8n handles orchestration logic, and Google Sheets provides the queryable database layer.

Why this stack? n8n provides visual workflow building without vendor lock-in (open-source, self-hostable). Gemini 2.5 Flash offers reliable OCR with complex layout handling. Google Sheets gives you instant searchability without database setup.

API Integration Deep-Dive

Google Drive API (File Operations)

Authentication: OAuth2 with scope https://www.googleapis.com/auth/drive. Configure in n8n's credential manager.

Search Request (Finding New Cards):

GET https://www.googleapis.com/drive/v3/files
Params: {
  q: "'FOLDER_ID' in parents",
  fields: "files(id, name, mimeType)"
}
Enter fullscreen mode Exit fullscreen mode

Download Request (Fetching Image Binary):

GET https://www.googleapis.com/drive/v3/files/{fileId}?alt=media
Headers: { "Authorization": "Bearer {access_token}" }
Enter fullscreen mode Exit fullscreen mode

Key parameter: alt=media forces binary content response instead of metadata.

Move Request (Archiving Processed Files):

PATCH https://www.googleapis.com/drive/v3/files/{fileId}
Body: {
  "addParents": "PROCESSED_FOLDER_ID",
  "removeParents": "SOURCE_FOLDER_ID"
}
Enter fullscreen mode Exit fullscreen mode

Rate Limits: 1,000 requests per 100 seconds per user. This workflow processes files sequentially, so you'd need 1,000+ cards in a single run to hit limits.

Gemini Vision API (OCR Extraction)

Authentication: API key from Google Cloud Console. Enable the Generative Language API.

Analysis Request:

POST https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent
Headers: { "x-goog-api-key": "YOUR_API_KEY" }
Body: {
  "contents": [{
    "parts": [
      { "text": "Extract business card data as JSON with fields: first_name, last_name, job_title, company_name, website, phone, email, address, linkedin, twitter, instagram, facebook. Return only valid JSON." },
      { "inline_data": { "mime_type": "image/jpeg", "data": "BASE64_IMAGE" }}
    ]
  }]
}
Enter fullscreen mode Exit fullscreen mode

Response Structure:

{
  "candidates": [{
    "content": {
      "parts": [{
        "text": "{\"first_name\":\"John\",\"last_name\":\"Smith\",\"email\":\"john@example.com\"}"
      }]
    }
  }]
}
Enter fullscreen mode Exit fullscreen mode

Critical gotcha: Gemini sometimes wraps JSON responses in markdown code fences (json ...). Your parsing code must strip these.

Rate Limits: 60 requests per minute for free tier. For high-volume processing, batch uploads or use a paid tier.

Google Sheets API (Database Layer)

Authentication: OAuth2 with scope https://www.googleapis.com/auth/spreadsheets.

Append Request:

POST https://sheets.googleapis.com/v4/spreadsheets/{spreadsheetId}/values/{range}:append
Params: { "valueInputOption": "RAW" }
Body: {
  "values": [[
    "John", "Smith", "CEO", "Example Corp", "example.com",
    "+1-555-0100", "john@example.com", "123 Main St",
    "linkedin.com/in/johnsmith", "@johnsmith", "", ""
  ]]
}
Enter fullscreen mode Exit fullscreen mode

Key parameter: valueInputOption=RAW preserves exact formatting (vs USER_ENTERED which parses formulas).

Response:

{
  "spreadsheetId": "abc123",
  "updates": {
    "updatedRange": "Sheet1!A2:L2",
    "updatedRows": 1
  }
}
Enter fullscreen mode Exit fullscreen mode

Rate Limits: 100 requests per 100 seconds per user. Sequential processing prevents hitting limits.

Implementation Gotchas

Handling Missing Data

Business cards often lack social media handles or physical addresses. The AI prompt explicitly instructs Gemini to return empty strings for missing fields:

const contactData = {
  first_name: parsed.first_name || "",
  last_name: parsed.last_name || "",
  email: parsed.email || "",
  // ... etc
};
Enter fullscreen mode Exit fullscreen mode

Without these fallbacks, Google Sheets receives undefined values, breaking column alignment.

API Rate Limiting Strategy

For batch processing 100+ cards:

  1. Add delay between iterations: Insert a "Wait" node (2 seconds) after the Move File step
  2. Batch scheduling: Instead of processing all files in one run, limit the loop to 50 items and run hourly
  3. Error handling: Wrap the Gemini call in a try-catch to handle temporary API failures

Data Validation Challenges

Email format validation:

const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (parsed.email && !emailRegex.test(parsed.email)) {
  parsed.email = ""; // Clear invalid emails
}
Enter fullscreen mode Exit fullscreen mode

Phone number normalization: Gemini returns varied formats: (555) 123-4567, +1-555-123-4567, 555.123.4567. Consider adding a normalization function:

function normalizePhone(phone) {
  return phone.replace(/[^0-9+]/g, '');
}
Enter fullscreen mode Exit fullscreen mode

Authentication Errors and Debugging

Google API 401 Errors: OAuth tokens expire. n8n auto-refreshes, but if you see persistent 401s, delete and re-add your Google credentials.

Gemini API 403 Errors: Check that:

  • Generative Language API is enabled in Google Cloud Console
  • API key has no IP restrictions blocking your n8n instance
  • You haven't hit free tier quota limits

Debugging technique: Add a "Set" node after Gemini to log the raw response:

{{ $json.candidates[0].content.parts[0].text }}
Enter fullscreen mode Exit fullscreen mode

Inspect output to see if JSON structure changed.

Cost Optimization Tips

Google Drive API: Free tier covers typical usage. No optimization needed unless processing 10,000+ cards/month.

Gemini API:

  • Free tier: 60 requests/minute, ~1,500 requests/day
  • Paid tier: $0.00025 per image (4,000 images = $1)
  • Optimization: Batch-process cards weekly instead of daily to stay in free tier

Google Sheets API: Free. No cost considerations.

n8n Cloud: $20/month for 2,500 executions. Self-hosted = free (just server costs).

Common Failure Modes

Scenario: Workflow processes same cards repeatedly.
Cause: Move File step failed, cards remain in source folder.
Fix: Add error handling on Move File node. If move fails, append a "processed_date" column to Sheets and check for duplicates.

Scenario: JSON parsing fails with "Unexpected token" error.
Cause: Gemini returned markdown-wrapped JSON.
Fix: The workflow includes regex cleaning:

const cleaned = raw.replace(/```
{% endraw %}
json|
{% raw %}
```/g, '').trim();
Enter fullscreen mode Exit fullscreen mode

If failures persist, add more aggressive cleaning:

const cleaned = raw.replace(/```
{% endraw %}
[a-z]*\n?|
{% raw %}
```/g, '').trim();
Enter fullscreen mode Exit fullscreen mode

Scenario: Spreadsheet shows misaligned data.
Cause: Column headers don't match JSON field names exactly.
Fix: Headers must be: first_name, last_name, etc. No spaces, no capital letters, underscores required.

Prerequisites

Required Accounts:

  • n8n instance (cloud or self-hosted)
  • Google account with Drive, Sheets access
  • Google Cloud project with Gemini API enabled

API Keys and Credentials:

  • Google Drive OAuth2 credential in n8n
  • Google Sheets OAuth2 credential in n8n
  • Gemini API key from Google AI Studio

Setup Steps:

  1. Create "Business Cards - Inbox" folder in Google Drive (copy folder ID from URL)
  2. Create "Business Cards - Processed" folder (copy folder ID)
  3. Create Google Sheets spreadsheet with column headers: first_name, last_name, job_title, company_name, website, phone, email, address, linkedin, twitter, instagram, facebook
  4. Note spreadsheet ID from URL
  5. Enable Gemini API in Google Cloud Console, generate API key

Estimated API Costs:

  • Free tier: 0–1,500 cards/month at $0
  • Paid tier: $0.00025 per card processed

Documentation Links:

Get the Complete Workflow Configuration

This tutorial covers the API integration architecture for building an automated business card scanner. For the complete n8n workflow file (ready to import), detailed node configuration screenshots, and a video walkthrough, check out the full implementation guide.

Top comments (0)