DEV Community

CDCSaaS
CDCSaaS

Posted on

How to Enrich a Google Sheet with Company Data Using n8n (No-Code Pipeline)

You have a spreadsheet full of website URLs. You want company names, industries, tech stacks, and contact emails — without spending hours on manual research.

Here's how to wire up a fully automated enrichment pipeline in about 20 minutes using n8n and a RapidAPI enrichment endpoint.


What you'll build

A 4-node n8n workflow that:

  1. Reads a list of website URLs from a Google Sheet
  2. Calls a company enrichment API for each URL
  3. Writes back structured data (name, sector, description, contact email, tech stack) into the same sheet

No code. No server. Runs on demand or on a schedule.


Prerequisites

  • An n8n cloud account (free trial works)
  • A Google account (for Sheets)
  • A RapidAPI account subscribed to the AI Live Company Enrichment & Tech Detector API

Step 1 — Set up your Google Sheet

Create a new spreadsheet with these columns in row 1:

website | domain_status | fetched_at | company_name | sector | description | contact_email | tech_stack | linkedin
Enter fullscreen mode Exit fullscreen mode

Add a few website URLs in the website column to test with:

https://stripe.com
https://notion.so
https://linear.app
Enter fullscreen mode Exit fullscreen mode

Note your Spreadsheet ID from the URL — it looks like:
https://docs.google.com/spreadsheets/d/YOUR_SPREADSHEET_ID/edit


Step 2 — Connect credentials in n8n

You need two credentials before building the workflow.

Google Sheets OAuth2

  1. In n8n, go to Credentials → New → Google Sheets OAuth2
  2. Follow the OAuth flow to connect your Google account
  3. Name it Google Sheets account

RapidAPI Header Auth

  1. Go to Credentials → New → Header Auth
  2. Set Name to x-rapidapi-key (exactly — this is the HTTP header name)
  3. Set Value to your RapidAPI key
  4. Save as RapidAPI – Company Enrichment

⚠️ Common mistake: setting the credential "Name" field to a display name like "My RapidAPI Key" instead of the actual header name x-rapidapi-key. n8n uses this field as the HTTP header name, so it must match exactly.


Step 3 — Build the workflow

Node 1: Manual Trigger

Add a Manual Trigger node. This lets you run the workflow on demand by clicking "Execute workflow." You can swap it for a Schedule trigger later.

Node 2: Google Sheets — Read

Add a Google Sheets node:

  • Operation: Read rows
  • Document ID: paste your Spreadsheet ID
  • Sheet: Sheet1 (tab index 0)
  • Credential: Google Sheets account

This node outputs one item per row — each item is a JSON object with your column values.

Node 3: HTTP Request — Enrich

Add an HTTP Request node:

  • Method: GET
  • URL:
  https://ai-live-company-enrichment-tech-detector.p.rapidapi.com/v1/enrich?url={{ $json.website }}
Enter fullscreen mode Exit fullscreen mode
  • Authentication: Generic Credential Type → Header Auth
  • Credential: RapidAPI – Company Enrichment
  • Additional header: x-rapidapi-host = ai-live-company-enrichment-tech-detector.p.rapidapi.com

The {{ $json.website }} expression dynamically injects the URL from the previous node for each row. n8n automatically loops this node over all rows from the Sheets node.

Node 4: Google Sheets — Update

Add a second Google Sheets node:

  • Operation: Update row
  • Document ID: same Spreadsheet ID
  • Sheet: Sheet1
  • Matching column: website (used to find the right row)
  • Columns to update:
Column Value
website {{ $json.url }}
domain_status {{ $json.domain_status }}
fetched_at {{ $json.fetched_at }}
company_name {{ $json.data ? $json.data.company_name : "" }}
sector {{ $json.data ? $json.data.sector : "" }}
description {{ $json.data ? $json.data.description : "" }}
contact_email {{ $json.data ? $json.data.contact_email : "" }}
tech_stack {{ $json.data && Array.isArray($json.data.tech_stack) ? $json.data.tech_stack.join(", ") : "" }}
linkedin {{ $json.data && $json.data.social_links ? $json.data.social_links.linkedin : "" }}

The $json.data ? guards handle cases where the API returns null for a field — the cell just stays empty instead of crashing the workflow.

Connect the nodes

Wire them in sequence:

Manual Trigger → Google Sheets (Read) → HTTP Request (Enrich) → Google Sheets (Update)
Enter fullscreen mode Exit fullscreen mode

Step 4 — Run it

Click Execute workflow. Watch the green checkmarks light up node by node.

After it completes, open your Google Sheet — each row should now have company name, sector, description, contact email, and tech stack filled in.


What the API returns

The enrichment endpoint returns a JSON object like this:

{
  "url": "https://stripe.com",
  "cached": false,
  "fetched_at": "2026-06-13T12:31:10.566368+00:00",
  "domain_status": "active",
  "data": {
    "company_name": "Stripe",
    "sector": "Financial Infrastructure and Payments",
    "description": "Stripe provides financial infrastructure enabling businesses to accept payments, offer financial services, and implement custom revenue models globally.",
    "contact_email": null,
    "tech_stack": ["Nginx", "Next.js"],
    "social_links": {
      "linkedin": "https://www.linkedin.com/company/stripe",
      "twitter": "https://twitter.com/stripe",
      "github": "https://github.com/stripe"
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Three fields make this API stand out, and the workflow above captures them:

  • fetched_at — an ISO timestamp of when the site was actually read. Your sheet records exactly how fresh each row is, instead of trusting an invisible database refresh cycle.
  • domain_statusactive, parked, or dead. Filter your sheet on this column to instantly drop dead or parked domains before they reach your CRM. A dead domain returns a clean 200 with data: null, so the workflow never crashes on a bad URL.
  • social_links — extracted deterministically from the live page, never guessed. A network that isn't present comes back null.

For well-known companies the data is usually complete. For smaller or less-indexed sites, some fields may be null — the API returns null rather than inventing a value, which is exactly why the guards in the expressions matter.


Going further

Schedule it: Swap the Manual Trigger for a Schedule Trigger to enrich new rows automatically every night.

Filter only empty rows: Add an IF node between the Read and Enrich nodes to skip rows that already have a company_name, saving API calls.

Drop dead domains automatically: Add an IF node after Enrich that keeps only rows where domain_status is active — parked and dead domains never reach your CRM.

Add error handling: Wrap the HTTP Request in an Error Trigger workflow to catch failed enrichments and log them to a separate sheet.

Export to a CRM: Add a HubSpot or Airtable node after the update step to push enriched companies straight into your sales pipeline.


The workflow JSON

You can import this directly into n8n (Workflows → Import from file). Replace YOUR_GOOGLE_SHEET_ID with your actual spreadsheet ID, then connect your credentials.

{
  "name": "Company Enrichment Pipeline",
  "nodes": [
    {
      "id": "node-001",
      "name": "Start",
      "type": "n8n-nodes-base.manualTrigger",
      "typeVersion": 1,
      "position": [0, 0],
      "parameters": {}
    },
    {
      "id": "node-002",
      "name": "Google Sheets – Read",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.7,
      "position": [224, 0],
      "parameters": {
        "documentId": { "__rl": true, "mode": "id", "value": "YOUR_GOOGLE_SHEET_ID" },
        "sheetName": { "__rl": true, "mode": "id", "value": "0" },
        "options": {}
      }
    },
    {
      "id": "node-003",
      "name": "HTTP Request – Enrich",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [448, 0],
      "parameters": {
        "method": "GET",
        "url": "=https://ai-live-company-enrichment-tech-detector.p.rapidapi.com/v1/enrich?url={{ $json.website }}",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [{ "name": "x-rapidapi-host", "value": "ai-live-company-enrichment-tech-detector.p.rapidapi.com" }]
        },
        "options": {}
      },
      "credentials": { "httpHeaderAuth": { "name": "RapidAPI – Company Enrichment" } }
    },
    {
      "id": "node-004",
      "name": "Google Sheets – Update",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.7,
      "position": [672, 0],
      "parameters": {
        "operation": "update",
        "documentId": { "__rl": true, "mode": "id", "value": "YOUR_GOOGLE_SHEET_ID" },
        "sheetName": { "__rl": true, "mode": "id", "value": "0" },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "website": "={{ $json.url }}",
            "domain_status": "={{ $json.domain_status }}",
            "fetched_at": "={{ $json.fetched_at }}",
            "company_name": "={{ $json.data ? $json.data.company_name : '' }}",
            "sector": "={{ $json.data ? $json.data.sector : '' }}",
            "description": "={{ $json.data ? $json.data.description : '' }}",
            "contact_email": "={{ $json.data ? $json.data.contact_email : '' }}",
            "tech_stack": "={{ $json.data && Array.isArray($json.data.tech_stack) ? $json.data.tech_stack.join(', ') : '' }}"
          },
          "matchingColumns": ["website"]
        },
        "options": {}
      },
      "credentials": { "googleSheetsOAuth2Api": { "name": "Google Sheets account" } }
    }
  ],
  "connections": {
    "Start": { "main": [[{ "node": "Google Sheets – Read", "type": "main", "index": 0 }]] },
    "Google Sheets – Read": { "main": [[{ "node": "HTTP Request – Enrich", "type": "main", "index": 0 }]] },
    "HTTP Request – Enrich": { "main": [[{ "node": "Google Sheets – Update", "type": "main", "index": 0 }]] }
  },
  "active": false,
  "settings": { "executionOrder": "v1" }
}
Enter fullscreen mode Exit fullscreen mode

Why I built this

I'm building CDCSaaS — an API for enriching company profiles from a URL. This pipeline is one of the most common use cases: you have a list of leads or prospects as websites, and you need structured data about each one fast.

The n8n workflow makes it accessible to anyone — no Python, no ETL tooling, just a visual canvas and a few credential inputs.

If you try it, let me know in the comments what dataset you used it on. 👇


Part of the "Building CDCSaaS" series — documenting the build of a bootstrapped API product in public.

Top comments (0)