Marcos Marx

Enrich HubSpot Companies with Apollo, Output.ai, and the Zapier SDK: No OAuth Required

If you've ever wired a workflow into HubSpot, you know the pain: OAuth flows, token refresh, scopes, and an SDK you have to keep current with every HubSpot API change. This post walks through a different approach — one that combines a direct REST API for rich enrichment data, the Zapier SDK for the long tail of CRM writes, and an LLM as the semantic glue between them.

The workflow takes a company website, enriches it with Apollo, maps the returned industry string to a valid HubSpot enum via Claude Haiku, and upserts the record into HubSpot. Four steps, two of which run in parallel, and zero HubSpot auth code.

Why this shape?

HubSpot's industry field is a predefined dropdown, not free text. Apollo returns industries as freeform strings like "Internet Software & Services". Hardcoding a mapping table is brittle — HubSpot adds and removes choices, and Apollo's taxonomy is huge.

So the workflow fetches the current HubSpot enum list at runtime via the Zapier SDK and asks Claude Haiku to pick the closest semantic match. Swap HubSpot for Salesforce, Pipedrive, or any other Zapier-supported CRM by changing an app key — the mapping pattern still works.

The four steps

  1. Enrich with Apollo — Extract the domain from the input website and call the Apollo organization API for industry, employee count, funding stage, LinkedIn, location, and keywords.
  2. Fetch HubSpot industries — Use the Zapier SDK's listInputFieldChoices helper to pull the live set of valid HubSpot industry enum values.
  3. Map industry with Claude — Send Apollo's raw industry string and the full HubSpot enum list to claude-haiku-4-5 for the single best semantic match. Skipped when Apollo returns no industry.
  4. Upsert to HubSpot — Hand the enriched payload (with the mapped HubSpot industry) to the Zapier SDK's search_or_write action.

Steps 1 and 2 run in parallel — they're independent calls, so there's no reason to serialize them.

File structure

zapier_hubspot_company_enrichment/
├── workflow.ts          # Orchestration — parallel fetch, then map, then upsert
├── steps.ts             # 4 steps: enrich, fetchIndustries, mapIndustry, upsert
├── types.ts             # Zod schemas and TypeScript types
├── prompts/
│   └── map_hubspot_industry@v1.prompt  # Haiku — semantic industry mapping
└── scenarios/
    └── stripe.json      # Test input: Stripe

Two shared clients do the I/O:

  • apollo.ts: enrichOrganization enriches a company by domain via the Apollo API
  • zapier.ts: createZapierClient instantiates the Zapier SDK with credentials loaded from @outputai/credentials

workflow.ts

The Apollo enrichment and HubSpot industry fetch run in parallel with Promise.all. The LLM mapping step is conditional: if Apollo didn't return an industry, there's nothing to map, and the workflow passes undefined straight to the upsert. Every call is wrapped in step() so Output can trace, retry, and cache each one.

import { workflow } from '@outputai/core';
import {
  enrichCompanyWithApollo,
  fetchHubspotIndustries,
  mapHubspotIndustry,
  upsertHubspotCompany,
} from './steps.js';
import { workflowInputSchema, workflowOutputSchema } from './types.js';

export default workflow({
  name: 'zapier_company_enrichment',
  description:
    'Enriches a company profile using Apollo via REST API and upserts the result into HubSpot via Zapier SDK',
  inputSchema: workflowInputSchema,
  outputSchema: workflowOutputSchema,
  fn: async (input) => {
    // Steps 1 + 2 -- Enrich via Apollo and fetch HubSpot industry choices in parallel
    const [apolloData, { industries }] = await Promise.all([
      enrichCompanyWithApollo({ website: input.website }),
      fetchHubspotIndustries(),
    ]);

    // Step 3 -- Map Apollo's raw industry string to a valid HubSpot enum (if any)
    const hubspotIndustry = apolloData.industry
      ? (
          await mapHubspotIndustry({
            industry: apolloData.industry,
            hubspotIndustries: industries,
          })
        ).hubspotIndustry
      : undefined;

    // Step 4 -- Upsert the enriched + mapped company into HubSpot via Zapier
    const { hubspotCompanyId, action } = await upsertHubspotCompany({
      ...apolloData,
      hubspotIndustry,
    });

    return {
      companyName: apolloData.name,
      website: input.website,
      hubspotCompanyId,
      apolloData,
      action,
    };
  },
});

steps.ts

The Apollo step extracts the domain from the input URL before calling the client — Apollo lookups are keyed by domain, not full URL. fetchHubspotIndustries paginates through the live enum values for the industry dropdown on HubSpot's company object. mapHubspotIndustry delegates to a prompt file rather than inlining the system/user text. The upsert step uses search_or_write with name as the search key, so Zapier either updates an existing HubSpot company by name or creates a new one. The connectionId identifies which user's HubSpot account the write goes to.

import { step } from '@outputai/core';
import { generateText, Output } from '@outputai/llm';
import { enrichOrganization } from '../../shared/clients/apollo.js';
import { createZapierClient } from '../../shared/clients/zapier.js';
import {
  enrichCompanyInputSchema,
  apolloCompanySchema,
  fetchHubspotIndustriesOutputSchema,
  mapHubspotIndustryInputSchema,
  mapHubspotIndustryOutputSchema,
  hubspotUpsertInputSchema,
  hubspotUpsertOutputSchema,
  zapierHubspotResponseSchema,
} from './types.js';

const HUBSPOT_CONNECTION_ID = 'your-hubspot-connection-id';

function extractDomain(website: string): string {
  const url = new URL(website);
  return url.hostname.replace(/^www\./, '');
}

// --- Step 1: Enrich company via Apollo REST API ---

export const enrichCompanyWithApollo = step({
  name: 'enrich_company_with_apollo',
  description: 'Enriches company data using Apollo REST API directly',
  inputSchema: enrichCompanyInputSchema,
  outputSchema: apolloCompanySchema,
  fn: async ({ website }) => {
    const domain = extractDomain(website);
    const org = await enrichOrganization(domain);

    if (!org?.name) {
      throw new Error(`Apollo returned no data for domain: ${domain}`);
    }

    return {
      name: org.name,
      website: org.website_url ?? website,
      domain: org.primary_domain ?? domain,
      industry: org.industry ?? undefined,
      employeeCount: org.estimated_num_employees ?? undefined,
      estimatedRevenue: org.annual_revenue_printed ?? undefined,
      description: org.short_description ?? undefined,
      linkedinUrl: org.linkedin_url ?? undefined,
      city: org.city ?? undefined,
      country: org.country ?? undefined,
      keywords: Array.isArray(org.keywords) ? org.keywords : undefined,
      totalFunding: org.total_funding ?? undefined,
      latestFundingRound: org.latest_funding_round_date ?? undefined,
      fundingStage: org.latest_funding_stage ?? undefined,
    };
  },
});

// --- Step 2: Fetch HubSpot industry enum choices via the Zapier SDK ---

export const fetchHubspotIndustries = step({
  name: 'fetch_hubspot_industries',
  description: 'Fetches available HubSpot industry field choices via Zapier SDK',
  outputSchema: fetchHubspotIndustriesOutputSchema,
  fn: async () => {
    const zapier = createZapierClient();

    const industries: string[] = [];
    for await (const item of zapier
      .listInputFieldChoices({
        appKey: 'hubspot',
        actionType: 'search_or_write',
        actionKey: 'company_crmSearch',
        inputFieldKey: 'industry',
        connectionId: HUBSPOT_CONNECTION_ID,
      })
      .items()) {
      const value = item.value ?? item.key ?? item.label;
      if (value) industries.push(value);
    }

    return { industries };
  },
});

// --- Step 3: Map Apollo's industry string to a HubSpot enum via LLM ---

export const mapHubspotIndustry = step({
  name: 'map_hubspot_industry',
  description: 'Maps a raw industry string to a valid HubSpot industry enum value using an LLM',
  inputSchema: mapHubspotIndustryInputSchema,
  outputSchema: mapHubspotIndustryOutputSchema,
  fn: async ({ industry, hubspotIndustries }) => {
    const { output } = await generateText({
      prompt: 'map_hubspot_industry@v1',
      variables: {
        industry,
        hubspotIndustries: hubspotIndustries.join(', '),
      },
      output: Output.object({ schema: mapHubspotIndustryOutputSchema }),
    });

    return output;
  },
});

// --- Step 4: Upsert into HubSpot via the Zapier SDK ---

export const upsertHubspotCompany = step({
  name: 'upsert_hubspot_company',
  description:
    'Creates or updates a HubSpot company record using enriched Apollo data via Zapier SDK',
  inputSchema: hubspotUpsertInputSchema,
  outputSchema: hubspotUpsertOutputSchema,
  fn: async (input) => {
    const zapier = createZapierClient();

    // new URL('') throws, so only parse the website when one is present
    const domain = input.domain ?? (input.website ? extractDomain(input.website) : '');

    const inputs = {
      first_search_property_name: 'name',
      first_search_property_value: input.name,
      name: input.name,
      domain: domain ?? '',
      website: input.website ?? '',
      city: input.city ?? '',
      country: input.country ?? '',
      industry: input.hubspotIndustry ?? '',
      numberofemployees: input.employeeCount ? String(input.employeeCount) : '',
      description: input.description ?? '',
      linkedin_company_page: input.linkedinUrl ?? '',
      total_money_raised: input.totalFunding ? String(input.totalFunding) : '',
    };

    const { data: result } = await zapier.apps.hubspot.search_or_write.company_crmSearch({
      inputs,
      connectionId: HUBSPOT_CONNECTION_ID,
    });

    const [record] = zapierHubspotResponseSchema.parse(result);

    return {
      hubspotCompanyId: record.id,
      action: record.isNew ? 'created' : 'updated',
    };
  },
});
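
To make the domain extraction concrete, here is a small standalone sketch. The first function is copied from steps.ts. The second, extractDomainLenient, is a hypothetical variant that is not part of the original code: it tolerates input without a scheme, because new URL throws on a bare string like stripe.com. The workflow input schema already requires a full URL, so the lenient version only matters if the helper is reused elsewhere.

```typescript
// Copied from steps.ts: hostname without a leading "www."
function extractDomain(website: string): string {
  const url = new URL(website);
  return url.hostname.replace(/^www\./, '');
}

// Hypothetical variant (not in the original): prepend https:// when the
// input has no scheme, then reuse extractDomain.
function extractDomainLenient(website: string): string {
  const trimmed = website.trim();
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  return extractDomain(hasScheme ? trimmed : `https://${trimmed}`);
}

console.log(extractDomain('https://www.stripe.com/payments')); // stripe.com
console.log(extractDomainLenient('stripe.com')); // stripe.com
```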

types.ts

The Apollo response has many optional fields, so the schema uses .optional() generously. hubspotUpsertInputSchema extends the Apollo schema with a single hubspotIndustry field, the LLM-mapped value. The workflow output includes an action discriminator (created or updated) so callers know whether the upsert inserted a new record, which is useful for downstream triggers like "notify sales when a new company lands." The excerpt below covers the core schemas; the remaining step-level schemas that steps.ts imports are not shown here.

import { z } from '@outputai/core';

export const workflowInputSchema = z.object({
  companyName: z.string().describe('The name of the company to enrich'),
  website: z.string().url().describe('The company website URL (e.g. https://acme.com)'),
});

export const apolloCompanySchema = z.object({
  name: z.string(),
  website: z.string().optional(),
  domain: z.string().optional(),
  industry: z.string().optional(),
  employeeCount: z.number().optional(),
  estimatedRevenue: z.string().optional(),
  description: z.string().optional(),
  linkedinUrl: z.string().optional(),
  city: z.string().optional(),
  country: z.string().optional(),
  keywords: z.array(z.string()).optional(),
  totalFunding: z.number().optional().describe('Total funding raised in USD'),
  latestFundingRound: z.string().optional(),
  fundingStage: z.string().optional(),
});

export const workflowOutputSchema = z.object({
  companyName: z.string(),
  website: z.string(),
  hubspotCompanyId: z.string(),
  apolloData: apolloCompanySchema,
  action: z.enum(['created', 'updated']),
});

export const hubspotUpsertInputSchema = apolloCompanySchema.extend({
  hubspotIndustry: z.string().optional(),
});
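
The excerpt above does not include zapierHubspotResponseSchema, which steps.ts uses to validate the Zapier result before reading record.id and record.isNew. As a rough, dependency-free sketch of what that validation has to guarantee, the function below checks the same two fields by hand. The name parseZapierHubspotResponse, the exact record shape, and the numeric-id coercion are all assumptions inferred from how steps.ts reads the result, not the schema from the repository.

```typescript
// Assumed shape, inferred from steps.ts: an array whose first record
// carries an id and an isNew flag.
type ZapierHubspotRecord = { id: string; isNew: boolean };

// Hypothetical stand-in for zapierHubspotResponseSchema.parse().
function parseZapierHubspotResponse(result: unknown): ZapierHubspotRecord[] {
  if (!Array.isArray(result) || result.length === 0) {
    throw new Error('Expected a non-empty array of HubSpot records');
  }
  return result.map((item, index) => {
    if (typeof item !== 'object' || item === null) {
      throw new Error(`Record ${index} is not an object`);
    }
    const { id, isNew } = item as { id?: unknown; isNew?: unknown };
    if (typeof id !== 'string' && typeof id !== 'number') {
      throw new Error(`Record ${index} has no usable id`);
    }
    // Assumption: a missing isNew flag is treated as "updated".
    return { id: String(id), isNew: isNew === true };
  });
}

const [record] = parseZapierHubspotResponse([{ id: 123, isNew: true }]);
console.log(record.id, record.isNew ? 'created' : 'updated'); // 123 created
```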

The prompt

claude-haiku-4-5 is plenty for a constrained-vocabulary classification task. temperature: 0 keeps the mapping as repeatable as possible for the same input. The full HubSpot enum list is interpolated into the system message at runtime, so the model sees the exact vocabulary it must pick from, which sharply reduces the chance of it returning an industry value that HubSpot will reject. The prompt alone does not guarantee a valid value, so it is still worth checking the returned string against the list before the upsert.

---
provider: anthropic
model: claude-haiku-4-5
temperature: 0
maxTokens: 256
---

<system>
You are an expert at mapping company industry strings to HubSpot's predefined industry ENUM values.

Given an industry category, return the single best-matching HubSpot industry value.

Valid HubSpot industry values:
{{ hubspotIndustries }}

Rules:
- Return EXACTLY one value from the list above
- Pick the closest semantic match even if it's not an exact string match
- If no reasonable match exists, return the closest category
</system>

<user>
Map this industry to a HubSpot industry value:

Industry: {{ industry }}
</user>
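
Even with the enum list in the prompt, a model can occasionally answer with a value that differs in case or spacing, or is not in the list at all. One way to guard against that is a small check between the mapping step and the upsert. The sketch below is hypothetical (coerceToAllowedIndustry is not part of the original workflow), and the sample list is for illustration only; in the workflow the list would come from fetchHubspotIndustries.

```typescript
// Hypothetical guard: return the canonical enum value when the model's
// answer matches one (ignoring case, spaces, and hyphens), else undefined.
function coerceToAllowedIndustry(
  candidate: string,
  allowed: readonly string[],
): string | undefined {
  const normalize = (s: string) => s.trim().toUpperCase().replace(/[\s-]+/g, '_');
  const wanted = normalize(candidate);
  return allowed.find((value) => normalize(value) === wanted);
}

// Sample values for illustration; the real list is fetched at runtime.
const allowed = ['COMPUTER_SOFTWARE', 'FINANCIAL_SERVICES', 'INTERNET'];
console.log(coerceToAllowedIndustry(' computer software ', allowed)); // COMPUTER_SOFTWARE
console.log(coerceToAllowedIndustry('Quantum Basket Weaving', allowed)); // undefined
```

On a miss, passing undefined to the upsert leaves the industry blank, the same path the workflow already takes when Apollo returns no industry.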

About the Zapier SDK

The Zapier SDK is a TypeScript library that provides programmatic access to Zapier's 9,000+ app integrations. Instead of managing OAuth flows, token refresh, and per-app API quirks yourself, the SDK runs actions through the user's existing Zapier connections.

| Concept | Description |
| --- | --- |
| App Key | Identifier for an integrated app (hubspot, slack, google_calendar, ...) |
| Connection | A user-authenticated account linked to a specific app, identified by connection ID |
| Action | search (find), write (create/update), read (list), or search_or_write (upsert) |

Authentication uses a client ID and secret pair:

import { createZapierSdk } from '@zapier/zapier-sdk';

const zapier = createZapierSdk({
  credentials: { clientId: '...', clientSecret: '...' },
});

Actions are invoked through a chained zapier.apps.<appKey>.<actionType>.<actionKey>() pattern:

const { data: result } = await zapier.apps.hubspot.search_or_write.company_crmSearch({
  inputs: {
    first_search_property_name: 'name',
    first_search_property_value: 'Stripe',
    name: 'Stripe',
    domain: 'stripe.com',
    industry: 'COMPUTER_SOFTWARE',
  },
  connectionId: HUBSPOT_CONNECTION_ID,
});

Beyond running actions, the SDK exposes metadata helpers. listInputFieldChoices paginates through the live set of accepted values for a dropdown/enum input — so the workflow always sees the current vocabulary instead of a stale hardcoded list.
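
The loop in fetchHubspotIndustries consumes the paginated result as an async iterable. To show that consumption pattern in isolation, the sketch below swaps the real SDK call for a fake async generator. fakeChoicePages, collectChoiceValues, and the Choice shape are assumptions for illustration; only the value ?? key ?? label fallback is taken from steps.ts.

```typescript
// Assumed item shape, based on the fields steps.ts reads from each choice.
type Choice = { value?: string; key?: string; label?: string };

// Stand-in for zapier.listInputFieldChoices(...).items(): yields choices
// from two fake "pages" one item at a time.
async function* fakeChoicePages(): AsyncGenerator<Choice> {
  const pages: Choice[][] = [
    [{ value: 'ACCOUNTING' }, { key: 'BANKING' }],
    [{ label: 'COMPUTER_SOFTWARE' }, {}],
  ];
  for (const page of pages) {
    for (const choice of page) yield choice;
  }
}

// Drain any async iterable of choices into a flat list of usable strings,
// skipping items that carry none of the three fields.
async function collectChoiceValues(items: AsyncIterable<Choice>): Promise<string[]> {
  const values: string[] = [];
  for await (const item of items) {
    const value = item.value ?? item.key ?? item.label;
    if (value) values.push(value);
  }
  return values;
}

collectChoiceValues(fakeChoicePages()).then((values) => console.log(values));
```

Because the caller only sees an async iterable, page boundaries never leak into the workflow code, and the same helper works against a test double like the one above.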

The pattern worth stealing

Any time you're writing to a downstream system with dropdown fields (lifecycle stage, lead source, deal stage, ticket priority), listInputFieldChoices + a cheap model makes your integration survive upstream vocabulary changes without a code release.

The broader pattern — direct API for rich enrichment data, Zapier SDK for the long tail of CRM writes, LLM for the semantic glue between them — scales to any integration where you need deep data from one source and broad reach into another.

Swap HubSpot for Salesforce, Pipedrive, or any of Zapier's 9,000+ integrated apps by changing a single app key. Or keep HubSpot and add a Slack step that notifies sales when action === 'created', so reps learn about new accounts the moment they land.
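
As a rough sketch of the Slack idea, the decision of whether to notify can live in a small pure function that reads the workflow output. newCompanyMessage and the message wording are hypothetical; the actual posting would go through whichever Slack action your Zapier connection exposes, which is not shown here.

```typescript
// Subset of the workflow output that the notification needs.
type EnrichmentResult = {
  companyName: string;
  website: string;
  hubspotCompanyId: string;
  action: 'created' | 'updated';
};

// Hypothetical helper: returns the text to post for a newly created
// company, or undefined when the record was only updated.
function newCompanyMessage(result: EnrichmentResult): string | undefined {
  if (result.action !== 'created') return undefined;
  return `New company in HubSpot: ${result.companyName} (${result.website}), id ${result.hubspotCompanyId}`;
}

console.log(
  newCompanyMessage({
    companyName: 'Stripe',
    website: 'https://stripe.com',
    hubspotCompanyId: '123',
    action: 'created',
  }),
); // New company in HubSpot: Stripe (https://stripe.com), id 123
```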

The complete code is here: https://github.com/growthxai/output-examples/tree/main/src/workflows/zapier_hubspot_company_enrichment
If you liked this tutorial, check out Output: https://github.com/growthxai/output
