Iurii Rogulia

Posted on • Originally published at iurii.rogulia.fi

Publishing One Package to Five Registries with GitHub Actions

Every developer building for European markets hits the same wall eventually. EU VAT rates are public data — the European Commission publishes them — but getting that data into your codebase reliably is a mess. You hardcode the rates, they change, your invoices are wrong. You scrape a website, the HTML structure changes, your scraper breaks. You pay $50–200/month for vatstack or vatlayer to access data the EC gives away for free.

I built eu-vat-rates-data to fix this: a free, open-source dataset of EU VAT rates, published as native packages for npm, PyPI, Go Module, RubyGems, and Packagist — updated daily from the official source, with zero manual steps in the publishing pipeline. The full project overview is in the eu-vat-rates-data project card.

This is how it works, and what I learned building a single dataset across five language ecosystems.

Why Five Registries

The first question I get: why not just publish JSON and let developers fetch it themselves?

Because that turns every consumer into a maintenance burden. They have to write the fetch logic, handle errors, decide on caching, deal with network failures at build time, and parse the response format. A native package means import { getRate } from "eu-vat-rates-data" — the data is bundled, typed, and versioned. No network calls, no parsing, no surprises.

The second question: why not just npm? Because not every EU-facing project is JavaScript. The finance team's tooling might be PHP (common in European accounting software). The microservice doing VAT calculation might be Go. The data pipeline might be Python. Publishing to one registry excludes everyone else.

So: five registries. Same data, same logical API, each package idiomatic for its ecosystem.

The Source of Truth

The data comes from the European Commission TEDB — Taxes in Europe Database — a SOAP web service at ec.europa.eu/taxation_customs/tedb/ws/. Each day, a Python script sends a typed XML request for all 28 countries (EU-27 plus UK) with situationOn set to today's date:

soap_body = f"""<v1:retrieveVatRatesReqMsg>
  <types:memberStates>
    {''.join(f'<types:isoCode>{c}</types:isoCode>' for c in COUNTRY_CODES)}
  </types:memberStates>
  <types:situationOn>{today}</types:situationOn>
</v1:retrieveVatRatesReqMsg>"""

The response is parsed into a canonical JSON structure stored in data/eu-vat-rates-data.json. This file is the single source of truth — all five packages read from it.

A few non-obvious edge cases the script handles:

Greece uses EL in TEDB, not the ISO standard GR. The response comes back with EL codes. Explicit mapping: TEDB_TO_ISO = {"EL": "GR"}. Miss this and your Greek VAT lookups will silently return nothing.

The UK is not in TEDB after Brexit. GB rates (20% standard, 5% reduced) are hardcoded as a static fallback in the script and updated manually when Westminster changes them.

TEDB returns non-numeric rate types. EXEMPTED, OUT_OF_SCOPE, NOT_APPLICABLE — all filtered out. Only positive floats make it into the dataset.

France, Portugal, and Spain have territorial special rates that appear as duplicates in the SOAP response. Aggregated into a Python set() before writing.

SOAP namespace stripping is manual. XML tags arrive as {urn:ec.europa.eu:taxud:tedb:services:types}vatRateResults. They are stripped with el.tag.split("}")[-1] instead of a namespace-aware third-party XML library, because the script is meant to stay dependency-free.
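The edge cases above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual script: only the namespace URI, the vatRateResults tag, and the TEDB_TO_ISO mapping come from the text, while the sample XML and its child elements (memberState, type, value) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified response. The child element names are
# assumptions made for this sketch, not the real TEDB schema.
SAMPLE = """
<root xmlns:t="urn:ec.europa.eu:taxud:tedb:services:types">
  <t:vatRateResults><t:memberState>EL</t:memberState><t:type>REDUCED</t:type><t:value>13.0</t:value></t:vatRateResults>
  <t:vatRateResults><t:memberState>EL</t:memberState><t:type>REDUCED</t:type><t:value>13.0</t:value></t:vatRateResults>
  <t:vatRateResults><t:memberState>EL</t:memberState><t:type>REDUCED</t:type><t:value>6.0</t:value></t:vatRateResults>
  <t:vatRateResults><t:memberState>EL</t:memberState><t:type>EXEMPTED</t:type><t:value></t:value></t:vatRateResults>
</root>
"""

TEDB_TO_ISO = {"EL": "GR"}  # TEDB uses EL for Greece; ISO 3166 uses GR


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.split("}")[-1]


def parse_rates(xml_text: str) -> dict[str, list[float]]:
    """Collect the numeric rates per ISO country code, without duplicates."""
    rates: dict[str, set[float]] = {}
    for el in ET.fromstring(xml_text).iter():
        if local_name(el.tag) != "vatRateResults":
            continue
        fields = {local_name(child.tag): (child.text or "").strip() for child in el}
        try:
            value = float(fields.get("value", ""))
        except ValueError:
            continue  # non-numeric entries (e.g. EXEMPTED) carry no usable rate
        if value <= 0:
            continue  # only positive rates go into the dataset
        code = fields["memberState"]
        country = TEDB_TO_ISO.get(code, code)
        rates.setdefault(country, set()).add(value)  # the set drops duplicates
    return {country: sorted(values) for country, values in rates.items()}
```

With the sample above, the duplicate 13.0 entry collapses, the EXEMPTED entry is skipped, and the result is keyed by GR instead of EL.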

Repository Architecture

The JavaScript/TypeScript package is the primary repo: vatnode/eu-vat-rates-data. It owns the Python ingestion script, the canonical JSON file, and the npm publishing workflow.

The other four language packages are separate repositories that each pull the canonical JSON directly from the primary repo:

vatnode/eu-vat-rates-data          ← source of truth, npm
vatnode/eu-vat-rates-data-python   ← PyPI
vatnode/eu-vat-rates-data-php      ← Packagist
vatnode/eu-vat-rates-data-go       ← Go Module (pkg.go.dev)
vatnode/eu-vat-rates-data-ruby     ← RubyGems

I considered a monorepo but rejected it. The Go and Ruby toolchains expect certain directory structures and version tagging conventions that conflict with each other and with the JavaScript toolchain. Separate repos with separate workflows are more work to set up but far less brittle to maintain.

The Publishing Workflow

Primary Repo (JavaScript, runs at 07:00 UTC)

# .github/workflows/update-rates.yml
name: Update VAT Rates

on:
  schedule:
    - cron: "0 7 * * *"
  workflow_dispatch:

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
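      - uses: actions/checkout@v4

      # setup-node writes the .npmrc that lets npm publish read NODE_AUTH_TOKEN
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          registry-url: "https://registry.npmjs.org"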

      - name: Fetch rates from EC TEDB
        run: python scripts/fetch-rates.py

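      - name: Check for rate changes
        id: diff
        run: |
          # Compare rates only, not the version date field.
          # Steps run under `bash -e`, so a bare non-zero exit would fail the
          # step before the echo runs; capture the exit code explicitly.
          changed=0
          python scripts/compare-rates.py || changed=$?
          echo "changed=$changed" >> "$GITHUB_OUTPUT"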

      - name: Bump version and publish
        if: steps.diff.outputs.changed == '1'
        run: |
          npm version patch --no-git-tag-version
          npm run build
          npm publish --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

      - name: Commit and push
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/eu-vat-rates-data.json package.json
          git diff --staged --quiet || git commit -m "chore: update VAT rates $(date +%Y-%m-%d)"
          git push

The key design decision: the workflow checks whether the actual rates changed, not just whether the date changed. compare-rates.py extracts the rates object from both the old and new JSON and does a deep comparison — if the numbers are identical, no version bump happens and the downstream repos won't trigger either. This prevents a flood of patch versions on days when nothing changed.
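A minimal sketch of that comparison, assuming the canonical JSON keeps its rates under a top-level "rates" key next to a date field (called "version" here as an assumption). The real compare-rates.py is not shown in this article, so treat the function name and the sample values as illustrative.

```python
import json


def rates_changed(old_json: str, new_json: str) -> bool:
    """Return True when the 'rates' objects differ; every other field is ignored."""
    old_rates = json.loads(old_json).get("rates", {})
    new_rates = json.loads(new_json).get("rates", {})
    # Dict comparison in Python is deep, and key order does not matter.
    return old_rates != new_rates


# Illustrative snapshots: same rates with a new date, then a real rate change.
old = '{"version": "2026-03-14", "rates": {"XX": {"standard": 20.0, "reduced": [5.0]}}}'
same = '{"version": "2026-03-15", "rates": {"XX": {"reduced": [5.0], "standard": 20.0}}}'
new = '{"version": "2026-03-15", "rates": {"XX": {"standard": 21.0, "reduced": [5.0]}}}'
```

A script built on this could end with `sys.exit(1 if rates_changed(old, new) else 0)`, which matches a workflow that reads the exit code to decide whether to publish.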

Dependent Repos (run at 08:00 UTC, one hour later)

Each dependent repo has a simple workflow that fetches the canonical JSON from the primary repo via curl and runs the publish script:

# .github/workflows/sync.yml (Python example)
- name: Fetch canonical data
  run: |
    # -f makes curl fail on HTTP errors instead of saving an error page as data.json
    curl -fsSL https://raw.githubusercontent.com/vatnode/eu-vat-rates-data/main/data/eu-vat-rates-data.json \
      -o eu_vat_rates_data/data.json

- name: Check for changes
  id: diff
  # Assumed here: the script itself appends changed=0 or changed=1 to $GITHUB_OUTPUT
  run: python scripts/compare_versions.py

- name: Publish to PyPI
  if: steps.diff.outputs.changed == '1'
  run: |
    python -m build
    python -m twine upload dist/*
  env:
    TWINE_USERNAME: __token__
    TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}

The one-hour delay is intentional: the primary repo's workflow needs to commit the updated JSON before the dependent repos try to fetch it. GitHub Actions schedule times are approximate, and scheduled runs are often delayed under load (the start of an hour is a known busy period), so one hour gives a comfortable margin.

Language-Specific Implementation Details

All five packages expose the same logical API: getRate(countryCode), getStandardRate(countryCode), isEUMember(countryCode), dataVersion. But the implementation differs by ecosystem idioms.
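To make the shared surface concrete, here is a rough Python rendering of those four names. The snake_case spellings, the uppercase normalization, and the tiny in-memory dataset are assumptions for illustration; the published packages may differ in detail.

```python
from typing import Optional

# Small stand-in for the bundled JSON. The field names follow the VatRate
# shape used elsewhere in this article; the "version" key is an assumption.
_DATASET = {
    "version": "2026-03-15",
    "rates": {
        "DE": {
            "country": "Germany",
            "currency": "EUR",
            "standard": 19.0,
            "reduced": [7.0],
            "super_reduced": None,
            "parking": None,
        },
    },
}


def get_rate(country_code: str) -> Optional[dict]:
    """Full rate record for a country, or None for unknown codes."""
    return _DATASET["rates"].get(country_code.upper())


def get_standard_rate(country_code: str) -> Optional[float]:
    """Standard rate only, or None for unknown codes."""
    rate = get_rate(country_code)
    return None if rate is None else rate["standard"]


def is_eu_member(country_code: str) -> bool:
    """True when the dataset has an entry for this code."""
    return country_code.upper() in _DATASET["rates"]


data_version = _DATASET["version"]
```

Each function is a plain dictionary lookup, so nothing touches the network or the file system after import.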

TypeScript — Overloaded Signatures for Type Safety

The TypeScript package uses overloaded function signatures so the return type narrows based on whether you pass a known CountryCode or a plain string:

// packages/js/src/index.ts
export type CountryCode =
  | "AT"
  | "BE"
  | "BG"
  | "CY"
  | "CZ"
  | "DE"
  | "DK"
  | "EE"
  | "ES"
  | "FI"
  | "FR"
  | "GB"
  | "GR"
  | "HR"
  | "HU"
  | "IE"
  | "IT"
  | "LT"
  | "LU"
  | "LV"
  | "MT"
  | "NL"
  | "PL"
  | "PT"
  | "RO"
  | "SE"
  | "SI"
  | "SK";

export function getRate(countryCode: CountryCode): VatRate;
export function getRate(countryCode: string): VatRate | undefined;
export function getRate(countryCode: string): VatRate | undefined {
  return data.rates[countryCode as CountryCode];
}

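// isEUMember is a type guard: it narrows string to CountryCode.
// hasOwnProperty avoids false positives from inherited keys such as "toString".
export function isEUMember(code: string): code is CountryCode {
  return Object.prototype.hasOwnProperty.call(data.rates, code);
}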

This means if you've already called isEUMember, TypeScript knows the subsequent getRate call returns VatRate, not VatRate | undefined. The user doesn't need a null check they're going to forget anyway.

Published as dual CJS + ESM with .d.ts declarations so it works in both Node.js and browser bundlers.

Go — Embedded Data, No File I/O at Runtime

Go uses //go:embed to bundle the JSON into the binary at compile time:

// eu_vat_rates.go
import (
    _ "embed" // required for the //go:embed directive below
    "encoding/json"
)

//go:embed data/eu-vat-rates-data.json
var rawData []byte

var dataset Dataset

func init() {
    if err := json.Unmarshal(rawData, &dataset); err != nil {
        panic("eu-vat-rates-data: failed to parse embedded data: " + err.Error())
    }
}

// Nullable fields use pointer types — standard Go pattern
type VatRate struct {
    Country      string    `json:"country"`
    Currency     string    `json:"currency"`
    Standard     float64   `json:"standard"`
    Reduced      []float64 `json:"reduced"`
    SuperReduced *float64  `json:"super_reduced"` // nil if not applicable
    Parking      *float64  `json:"parking"`        // nil if not applicable
}

The init() panic on parse failure is intentional. If the embedded JSON is somehow corrupted at build time, you want to find out immediately — not at the moment a production service tries to look up a VAT rate at 2am.

Python — importlib.resources for Wheel Compatibility

The Python package uses importlib.resources.files() to locate the bundled JSON, not __file__-relative path construction:

# eu_vat_rates_data/__init__.py
import importlib.resources
import json
from typing import TypedDict, Optional

class VatRate(TypedDict):
    country: str
    currency: str
    standard: float
    reduced: list[float]
    super_reduced: Optional[float]
    parking: Optional[float]

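def _load_data() -> dict:
    # read_text() goes through the import system, so no real file path is needed;
    # the explicit encoding avoids platform-dependent defaults.
    ref = importlib.resources.files("eu_vat_rates_data") / "data.json"
    return json.loads(ref.read_text(encoding="utf-8"))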

_dataset = _load_data()

The __file__-relative approach (os.path.join(os.path.dirname(__file__), "data.json")) breaks when the package is imported from a zip archive, because the data file then has no real path on disk. importlib.resources is the standard-library answer: it resolves package data through the import system, so it works for regular installs, zip imports, and frozen executables whose loaders support it.


Versioning Strategy

Version format: YYYY.M.D — for example 2026.3.15. If the workflow runs twice on the same day (manual trigger plus scheduled), a counter suffix: 2026.3.15.1, 2026.3.15.2.
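The numbering rule can be sketched as a small pure function. The name next_version and the idea of passing in the set of already published versions are assumptions made for this example, not the project's actual code.

```python
from datetime import date


def next_version(today: date, published: set[str]) -> str:
    """Date-based version, with a counter suffix for repeat runs on one day."""
    base = f"{today.year}.{today.month}.{today.day}"  # no zero padding: 2026.3.15
    if base not in published:
        return base
    counter = 1
    while f"{base}.{counter}" in published:
        counter += 1
    return f"{base}.{counter}"
```

Registries differ in which version strings they accept. Strict semantic versioning allows exactly three numeric components, so the four-part suffix form may need adapting for a registry that enforces it.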

I deliberately chose date-based versioning rather than semantic versioning for this package. The "change" is always data, never API. There are no breaking changes in the API across versions — the function signatures are stable. So semver's major.minor.patch distinction is meaningless here. A date version tells users exactly when the data was last updated without them having to read a changelog.

One consequence: the dependent repos need to match the version before publishing. Each dependent workflow parses the version string from the primary repo's package.json, checks if it already published that version, and skips if so.
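A hedged sketch of that check follows. How a dependent repo learns which versions it has already published (registry API, git tags, a local file) is not described here, so the sketch takes that set as an input; the function name is hypothetical.

```python
import json
from typing import Optional


def version_to_publish(package_json_text: str, already_published: set[str]) -> Optional[str]:
    """Return the upstream version if it still needs publishing, else None."""
    version = json.loads(package_json_text)["version"]
    return None if version in already_published else version
```

Returning None for an already published version gives the workflow a simple skip condition.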

Gotchas From Production

The TEDB SOAP endpoint is occasionally unavailable. Around EU public holidays, it returns 503. The fetch script has a retry loop — three attempts with 60-second sleep between them — before it fails the workflow. A failed workflow generates a GitHub Actions email notification. The data from the previous successful run remains valid; most EU VAT rate changes happen on January 1st or a country-specific date, not randomly.
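The retry behaviour described above might look roughly like this. The function name and the injectable sleep parameter are choices made for this sketch (the latter so the logic can be exercised without waiting); a real script would probably catch a narrower exception type than Exception.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, pausing between failed tries."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # sketch only: narrow this in real code
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)  # no pause after the final failure
    # Raising here makes the calling script exit non-zero, which fails the workflow.
    raise RuntimeError(f"fetch failed after {attempts} attempts") from last_error
```

Three attempts with the default delay add at most two minutes to a run before the job is marked as failed.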

GitHub Actions schedule is not precise. The 07:00 UTC trigger might fire at 07:12. The 08:00 UTC trigger for dependent repos might fire at 07:58. When I originally set the gap at 30 minutes, I got intermittent failures where the Python workflow pulled the old JSON. One hour is the safe minimum.

Packagist (PHP) takes its versions from git tags, and a version field in composer.json has to match the tag. The other registries accept a version number in a config file without a corresponding tag. For Packagist, I have to tag the commit (v2026.3.15) before the composer.json version bump is visible. The workflow creates the tag first, then commits.

PyPI and npm reject re-publishing the same version. If the publish step reports a transient failure (network timeout, registry hiccup) after the upload has actually gone through, re-running the workflow fails again at publish because the version already exists. I handle this by catching the specific "version already exists" error and treating it as success: if the version is on the registry, the goal is achieved.
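One way to express "already exists counts as success" is to classify the publish command's exit code and output. The marker strings below are assumptions about how the registries word this error and should be checked against real logs; the function name is hypothetical.

```python
# Substrings assumed to appear in "version already exists" errors.
# Verify both against actual registry output before relying on them.
ALREADY_EXISTS_MARKERS = (
    "file already exists",  # assumed wording from PyPI via twine
    "cannot publish over the previously published versions",  # assumed npm wording
)


def publish_succeeded(returncode: int, output: str) -> bool:
    """Treat a clean exit, or an 'already published' error, as success."""
    if returncode == 0:
        return True
    lowered = output.lower()
    return any(marker in lowered for marker in ALREADY_EXISTS_MARKERS)
```

The inputs would typically come from `subprocess.run(cmd, capture_output=True, text=True)`, combining stdout and stderr. For PyPI specifically, twine's `--skip-existing` flag is an alternative worth considering.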

Results

| Metric | Value |
| --- | --- |
| Packages published | 5 (npm, PyPI, Packagist, Go, RubyGems) |
| Countries covered | 28 (EU-27 + UK) |
| Rate types | 4 (standard, reduced, super_reduced, parking) |
| Update frequency | Daily (automated) |
| Manual intervention | Zero (unless EC TEDB changes its API) |
| Publishing cost | €0/month |

The git history of data/eu-vat-rates-data.json is a side-effect that turned out to be genuinely useful: it's a complete, automatically maintained audit trail of every EU VAT rate change since the project launched. Finland's rate changed from 24% to 25.5% in September 2024 — there's a commit for it with the exact date.
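Reading that audit trail programmatically could look like the sketch below, which compares the rates objects of two snapshots. The snapshots would come from two revisions of the file (for example via `git show <rev>:data/eu-vat-rates-data.json`); the function name and the focus on the standard rate are choices made for this illustration.

```python
def standard_rate_changes(old_rates: dict, new_rates: dict) -> list[str]:
    """Describe every country whose standard rate differs between two snapshots."""
    changes = []
    for country in sorted(set(old_rates) | set(new_rates)):
        before = old_rates.get(country, {}).get("standard")
        after = new_rates.get(country, {}).get("standard")
        if before != after:
            changes.append(f"{country}: standard {before} -> {after}")
    return changes
```

Applied to the snapshots on either side of the Finnish change mentioned above, this would report the move from 24.0 to 25.5.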


If you're building a product that handles EU pricing, VAT calculation, or tax compliance, the package is free to use. github.com/vatnode/eu-vat-rates-data.

If you need a developer who understands EU VAT compliance at the code level — from VIES validation to multi-country rate handling to accurate invoice generation — the production VAT SaaS is vatnode.dev. For automation workflows or API integrations that feed data between systems reliably — get in touch. I'm available for freelance projects and long-term engagements.


Related projects: eu-vat-rates-data open source dataset — the project described in this article. vatnode.dev VAT Validation API — production SaaS built on top of this dataset.

Related reading: Data Format Comparison: JSON, YAML, TOML, CSV, Protobuf — the format decisions that underpin a multi-ecosystem package. | UUID v7 in Production: Why Your Database Hates v4 — the identifier strategy used in the vatnode SaaS that consumes this dataset.
