The simplest way to extract document data in your n8n workflows

#ai #api #productivity #tutorial

Hi DEV.to community 👋

My name is Felix, and this project started out of pure frustration.

A while back I needed a simple way to extract structured data from PDF documents – invoices, contracts, IDs, that kind of thing. I tested tool after tool. Most were either way too complex to set up, required a data science background to configure, or locked you into vendor infrastructure that made GDPR compliance a headache.

So my team and I built what we wish had existed: easybits – the simplest way to extract structured data from documents, without the overhead.

How it works:

Upload an example document, map the fields you want to extract, and easybits generates a pipeline with its own API endpoint. That's it. No model training. No complex configuration. You can connect it to your existing workflows in minutes.

We just shipped auto-mapping – easybits now automatically detects and suggests the fields to extract based on your document, so you don't have to map anything manually. Upload a doc, confirm the suggestions, and your pipeline is ready. It's the fastest way we've found to go from "I have a document" to "I have structured data." If you want to see how auto-mapping works, feel free to check out this short video:

👉 https://youtu.be/BxWhMBgKXto

Why easybits is different:

→ Auto-mapping – fields are detected automatically, no manual setup needed

→ Works with PDFs, PNGs, and JPEGs – including poor quality scans

→ Fully GDPR compliant – your data stays where it should

→ Simple REST API – works with any automation tool (n8n, Make, Zapier, custom code)

→ Per-pipeline API keys – separate pipelines for invoices, contracts, IDs, whatever you need

Connecting it to n8n takes four steps:

1. Sign up at http://www.easybits.tech
2. Create your first pipeline – upload a doc (PDF/PNG/JPEG), let auto-mapping do its thing, confirm the fields.
3. Grab your Pipeline API URL and API Key from the pipeline details page.
4. Drop an HTTP Request node into n8n, set the method to POST, set up a Bearer Auth credential using your API Key, and send your document as a URL or base64-encoded file.
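
If you'd rather call the pipeline from custom code than from n8n, the last step can be sketched roughly like this in Python. Treat it as a minimal sketch under assumptions, not official client code: the function names are made up for illustration, the `{"files": [...]}` body shape is copied from the workflow JSON in this post, and the response format is not described here, so the reply is simply parsed as JSON.

```python
import json
import urllib.request
from typing import Any


def build_extraction_request(
    pipeline_url: str, api_key: str, data_uri: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request like the one the HTTP Request node makes.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Body shape taken from the workflow's jsonBody: {"files": ["<data URI>"]}
    body = json.dumps({"files": [data_uri]}).encode("utf-8")
    return urllib.request.Request(
        pipeline_url,
        data=body,
        method="POST",
        headers={
            # Bearer auth with the per-pipeline API key
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request) -> Any:
    """Send the request and parse the reply as JSON (response shape is an assumption)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, pass your own Pipeline API URL and API Key from the pipeline details page to `build_extraction_request`, then hand the result to `send`. Building and sending are kept separate so you can inspect the headers and body before anything leaves your machine.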

To make it even easier to get started, here's a ready-to-use n8n workflow you can import directly. It uses an On form submission trigger so you can upload a document and test extraction instantly:

{
  "name": "easybits' Extractor Workflow",
  "nodes": [
    {
      "parameters": {
        "operation": "binaryToPropery",
        "binaryPropertyName": "image",
        "options": {}
      },
      "type": "n8n-nodes-base.extractFromFile",
      "typeVersion": 1.1,
      "position": [
        224,
        16
      ],
      "id": "5c7b1c55-dd54-4574-89c5-fc71605cf9fd",
      "name": "Extract from File"
    },
    {
      "parameters": {
        "formTitle": "Image Upload",
        "formFields": {
          "values": [
            {
              "fieldLabel": "image",
              "fieldType": "file"
            }
          ]
        },
        "options": {}
      },
      "type": "n8n-nodes-base.formTrigger",
      "typeVersion": 2.5,
      "position": [
        -64,
        16
      ],
      "id": "45f05103-ac27-4f18-8ae9-9e7b2a4e807b",
      "name": "On form submission"
    },
    {
      "parameters": {
        "assignments": {
          "assignments": [
            {
              "id": "540141e7-42d3-4011-b681-8335d9105044",
              "name": "data",
              "value": "=data:{{ $('On form submission').first().binary.image.mimeType }};base64,{{ $json.data }}",
              "type": "string"
            }
          ]
        },
        "options": {}
      },
      "type": "n8n-nodes-base.set",
      "typeVersion": 3.4,
      "position": [
        512,
        16
      ],
      "id": "e03f7ddb-5f0a-43dd-8424-b3dac8ac2875",
      "name": "Edit Fields"
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://extractor.easybits.tech/...",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "httpBearerAuth",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"files\": [\n    \"{{ $json.data }}\"\n  ]\n} ",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.3,
      "position": [
        800,
        16
      ],
      "id": "e89dbb99-b7bd-401e-9af2-3d9ba868c810",
      "name": "easybits Extractor"
    },
    {
      "parameters": {
        "content": "## 📋 Form Upload\nAccepts a file upload via a **web form**. Supports **PDF, PNG, and JPEG**.",
        "height": 304,
        "width": 256,
        "color": 7
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -144,
        -128
      ],
      "typeVersion": 1,
      "id": "07b79e87-5d7f-4e32-8560-4584f0c19367",
      "name": "Sticky Note"
    },
    {
      "parameters": {
        "content": "## 📄 Extract to Base64\nConverts the uploaded **binary file** into a base64-encoded string stored in `data`.",
        "height": 304,
        "width": 256,
        "color": 7
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        144,
        -128
      ],
      "typeVersion": 1,
      "id": "1e15fc8c-dd32-4c7e-a1f3-14a5ebcccc8a",
      "name": "Sticky Note1"
    },
    {
      "parameters": {
        "content": "## 🔗 Build Data URI\nDynamically reads the **MIME type** from the uploaded file and prepends it as a base64 data URI.",
        "height": 304,
        "width": 256,
        "color": 7
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        432,
        -128
      ],
      "typeVersion": 1,
      "id": "51c1cbfa-9b63-444b-9e65-31fb7752715e",
      "name": "Sticky Note2"
    },
    {
      "parameters": {
        "content": "## 🚀 Send to easybits\nPOSTs the data URI to the **easybits Extractor API** pipeline for processing. Uses **Bearer token** auth.",
        "height": 304,
        "width": 256,
        "color": 7
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        720,
        -128
      ],
      "typeVersion": 1,
      "id": "96a53f26-788e-4e2e-afb5-2d6a2e36acb5",
      "name": "Sticky Note3"
    },
    {
      "parameters": {
        "content": "# 📤 easybits' Extractor Workflow\n\n## How It Works\nThis workflow lets users upload a file (PDF, PNG, JPEG) via a web form. The file is automatically converted to base64, wrapped in a data URI with the correct MIME type, and sent to the **easybits Extractor API** for structured data extraction.\n\n**Flow overview:**\n1. A user uploads a file through the hosted web form\n2. The binary file is extracted and converted to base64\n3. The MIME type is detected automatically and a data URI is built\n4. The data URI is sent to your easybits Extractor pipeline\n\n---\n\n## Setup Guide\n\n### 1. Set Up Your easybits Extractor Pipeline\n1. Go to [extractor.easybits.tech](https://extractor.easybits.tech), sign up and click **\"Create a Pipeline\"** on your dashboard.\n2. Fill in the **Pipeline Name** and **Description** – describe the type of document you're processing.\n3. Upload a **sample document** as your reference.\n4. Click **\"Map Fields\"** and define the fields you want to extract or use our auto-mapping feature to map all fields automatically.\n5. Click **\"Save & Test Pipeline\"** to verify the extraction works correctly.\n6. Go to **Pipeline Details → View Pipeline** and copy your **Pipeline ID** and **API Key**.\n\n### 2. Connect the easybits Node in n8n\n1. Open the **easybits Extractor** node in the workflow.\n2. Replace the Pipeline ID in the URL with your own.\n3. Set up a **Bearer Auth** credential with your API Key.\n\n### 3. Activate the Workflow\n1. Click the **\"Active\"** toggle in the top-right corner of n8n.\n2. Open the form URL and upload a test file.\n3. Check the execution log to verify the extracted data comes back correctly.",
        "height": 896,
        "width": 624
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -800,
        -416
      ],
      "typeVersion": 1,
      "id": "9c78086c-0d4f-4c8d-9487-8ca60629b1dc",
      "name": "Sticky Note4"
    }
  ],
  "pinData": {},
  "connections": {
    "Extract from File": {
      "main": [
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "On form submission": {
      "main": [
        [
          {
            "node": "Extract from File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "easybits Extractor",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1",
    "availableInMCP": false,
    "timeSavedMode": "fixed",
    "callerPolicy": "workflowsFromSameOwner"
  },
  "meta": {},
  "tags": []
}

To use it, just import the JSON into n8n, add your Pipeline API URL and API Key, and run it. Submit a document through the form, and the extraction output will appear directly in the workflow – no additional configuration needed.
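
For anyone curious what the "Extract from File" and "Edit Fields" nodes do together, here is a rough Python equivalent: base64-encode the file and prefix it with its MIME type to form a data URI. This is a sketch only. The helper names are hypothetical, and guessing the MIME type from the file extension is my stand-in for n8n reading it from the uploaded binary.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(file_bytes: bytes, mime_type: str) -> str:
    """Mirror the Edit Fields expression: data:<mimeType>;base64,<base64 data>."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_uri(path: str) -> str:
    """Read a local file and wrap it in a data URI.

    Assumption: the MIME type is guessed from the file extension here,
    whereas the workflow takes it from the form upload's binary metadata.
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Could not determine a MIME type for {path!r}")
    return to_data_uri(Path(path).read_bytes(), mime_type)
```

The resulting string is what the workflow places inside the `files` array of the request body.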

We've tested this against some genuinely painful documents – crumpled receipts, 36-page contracts with hand-drawn marker annotations baked into the scan, you name it. The goal was always to build something robust enough for the real world, not just clean demo PDFs.

If you're spending time manually copying data out of documents, or you've tried other extraction tools and bounced off the complexity – give easybits a shot. We have a free tier, and I'm personally happy to help you get your first pipeline running.

Drop any questions in the comments – I'll be here all day. 🙌 Also, I'd love to hear from you: what data extraction tools have you tried before, and what made you move on from them? Always looking to learn from what others have been through.

Best,
Felix
