Jani Giannoudis

Posted on Mar 21

Payroll Engine: From Open-Source Prototype to Production-Ready

#opensource #dotnet #payroll #architecture

Two and a half years ago, I introduced Payroll Engine in this series — an open-source payroll calculation framework written in C#. The architecture was in place, regulation layers worked, the first Swiss payroll calculations were running. A working prototype.

A lot has happened since. On the road to version 1.0, I had to answer the question every framework developer eventually faces: What's still missing between "it works" and "it's production-ready"?

The answer was: quite a lot.

Security as the Foundation

A payroll system without robust security is not software — it's a liability. This was the area that demanded the biggest overhaul.

Authentication and Authorization

The backend now supports three configurable authentication modes:

None — for local development and testing
ApiKey — simple header-based authentication with environment variable fallback
OAuth — full OAuth 2.0 integration with configurable authority, audience, and client secret

On startup, the system validates the OAuth configuration to prevent token confusion. This sounds like an edge case, but it's exactly the kind of bug that becomes a disaster in multi-tenant environments.

{
  "Authentication": {
    "Mode": "OAuth",
    "Authority": "https://auth.example.com",
    "Audience": "payroll-api"
  }
}
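The post does not spell out what the startup check covers. As a rough sketch, assuming it at least requires an HTTPS authority and a non-empty audience, the validation could look like this. The `AuthenticationSettings` record and the validator are hypothetical names for illustration, not the engine's actual types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings type mirroring the JSON above; not the engine's actual class.
public enum AuthenticationMode { None, ApiKey, OAuth }

public sealed record AuthenticationSettings(
    AuthenticationMode Mode,
    string? Authority = null,
    string? Audience = null);

public static class AuthenticationSettingsValidator
{
    // Returns a list of problems; an empty list means the settings look usable.
    public static IReadOnlyList<string> Validate(AuthenticationSettings settings)
    {
        var errors = new List<string>();
        if (settings.Mode != AuthenticationMode.OAuth)
        {
            return errors; // nothing to check for None or ApiKey in this sketch
        }

        // Assumption: the authority must be an absolute https URL.
        if (!Uri.TryCreate(settings.Authority, UriKind.Absolute, out var authority) ||
            authority.Scheme != Uri.UriSchemeHttps)
        {
            errors.Add("OAuth authority must be an absolute https URL.");
        }

        // Without an audience, a token issued for a different API could be accepted.
        if (string.IsNullOrWhiteSpace(settings.Audience))
        {
            errors.Add("OAuth audience must be set.");
        }
        return errors;
    }
}
```

Failing fast at startup when the list is non-empty keeps a misconfigured server from ever accepting a request.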

Script Safety Analysis

Payroll Engine executes C# scripts at runtime — that's the core of its regulation flexibility. But flexibility without control is a security risk. The new Script Safety Analysis statically checks every script for banned API calls:

System.IO — no file access
System.Net — no network access
System.Diagnostics — no process control
System.Reflection — no metaprogramming

The feature is opt-in (ScriptSafetyAnalysis: true) because static analysis slows down compilation. For production environments, however, it's strongly recommended.
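To illustrate the idea, here is a deliberately naive sketch that scans script source text for the four banned namespaces. This is not how the engine does it: a real analyzer would inspect the compiler's syntax tree or semantic model, while a text scan like this one can be fooled by aliases and will also flag comments. The class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BannedNamespaceScan
{
    // The four namespaces listed above.
    private static readonly string[] Banned =
    {
        "System.IO", "System.Net", "System.Diagnostics", "System.Reflection"
    };

    // Returns each banned namespace that appears in the source text, either
    // on its own or as the prefix of a longer dotted name.
    public static IReadOnlyList<string> FindViolations(string source) =>
        Banned.Where(ns => ContainsNamespace(source, ns)).ToList();

    private static bool ContainsNamespace(string source, string ns)
    {
        var index = source.IndexOf(ns, StringComparison.Ordinal);
        while (index >= 0)
        {
            var end = index + ns.Length;
            // Require identifier boundaries so "System.IOx" or "MySystem.IO" do not match.
            var startsAtBoundary = index == 0 || !IsIdentifierChar(source[index - 1]);
            var endsAtBoundary = end == source.Length || !IsIdentifierChar(source[end]);
            if (startsAtBoundary && endsAtBoundary)
            {
                return true;
            }
            index = source.IndexOf(ns, index + 1, StringComparison.Ordinal);
        }
        return false;
    }

    private static bool IsIdentifierChar(char c) => char.IsLetterOrDigit(c) || c == '_';
}
```

A script would be rejected before compilation whenever `FindViolations` returns a non-empty list.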

Cryptography and Input Validation

Under the hood, SHA1 hashes were replaced with SHA256, combined with constant-time comparisons to prevent timing attacks. Password validation was hardened with regex timeouts against ReDoS attacks. Small changes, but essential in a system handling payroll data.
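The three techniques map onto standard .NET APIs. The following is a minimal sketch, not the engine's code: the helper names and the password rule (at least eight characters, one letter, one digit) are assumptions chosen for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

public static class SecurityHelpers
{
    // Hypothetical password rule, compiled with a 100 ms match timeout.
    private static readonly Regex PasswordRule = new(
        @"^(?=.*[A-Za-z])(?=.*\d).{8,}$",
        RegexOptions.CultureInvariant,
        TimeSpan.FromMilliseconds(100));

    // SHA-256 hash of a UTF-8 string (32 bytes).
    public static byte[] Hash(string value) =>
        SHA256.HashData(Encoding.UTF8.GetBytes(value));

    // Compares two hashes in time that does not depend on where they first differ.
    public static bool HashEquals(byte[] left, byte[] right) =>
        CryptographicOperations.FixedTimeEquals(left, right);

    // A pattern that exceeds the timeout is treated as a rejection instead of
    // tying up a thread on a crafted input.
    public static bool IsValidPassword(string password)
    {
        try
        {
            return PasswordRule.IsMatch(password);
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }
}
```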

Scaling: From 10 to 10,000 Employees

The prototype calculated payroll sequentially — employee by employee, synchronously. For a company with 20 employees, that's fine. For one with 10,000, it's a showstopper.

Asynchronous Job Processing

Payrun jobs now run through an asynchronous background queue:

The job is pre-persisted in the database
A bounded channel queue (capacity: 100) provides backpressure
A BackgroundService dequeues and processes jobs
The API call immediately returns HTTP 202 with a location header
A webhook fires on completion or abort

On unhandled exceptions or server shutdown, the job is cleanly aborted — no orphaned state in the database.
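The queue part of this flow can be sketched with `System.Threading.Channels`. The types below are hypothetical stand-ins: in a hosted application the consumer loop would sit inside `BackgroundService.ExecuteAsync`, and the abort handling would update the persisted job and fire the webhook rather than write to the console.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical job descriptor; the real job is a persisted database record.
public sealed record PayrunJobRequest(int JobId);

public sealed class PayrunJobQueue
{
    private readonly Channel<PayrunJobRequest> channel =
        Channel.CreateBounded<PayrunJobRequest>(new BoundedChannelOptions(100)
        {
            // Backpressure: writers wait asynchronously while the queue is full.
            FullMode = BoundedChannelFullMode.Wait
        });

    // Called after the job is persisted; the API can then answer with HTTP 202.
    public ValueTask EnqueueAsync(PayrunJobRequest job, CancellationToken ct = default) =>
        channel.Writer.WriteAsync(job, ct);

    // Signals that no more jobs will be written, so the consumer loop can end.
    public void Complete() => channel.Writer.Complete();

    // Consumer loop: processes jobs one by one until the channel is completed.
    public async Task ProcessAsync(Func<PayrunJobRequest, Task> handler, CancellationToken ct = default)
    {
        await foreach (var job in channel.Reader.ReadAllAsync(ct))
        {
            try
            {
                await handler(job);
            }
            catch (Exception ex)
            {
                // Placeholder for "mark job as aborted and notify via webhook".
                Console.Error.WriteLine($"Job {job.JobId} aborted: {ex.Message}");
            }
        }
    }
}
```

Because the job is persisted before it is enqueued, a full queue or a crash never loses the request itself, only delays or aborts its execution.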

Parallel Employee Processing

The new MaxParallelEmployees configuration controls the degree of parallelism:

| Value | Behavior |
| --- | --- |
| `0` / `off` | Sequential (default) |
| `half` | Half of available CPU cores |
| `max` / `-1` | All available cores |
| `1`–`N` | Explicit count |
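Read as code, the table amounts to a small mapping from the setting to a degree of parallelism. The sketch below is one possible interpretation: the `ParallelSettings` helper is hypothetical, a missing value is assumed to mean the sequential default, and explicit counts are assumed not to be capped at the core count.

```csharp
using System;
using System.Globalization;

public static class ParallelSettings
{
    // Maps a MaxParallelEmployees setting to a degree of parallelism (1 = sequential).
    public static int Resolve(string? setting, int coreCount)
    {
        var value = (setting ?? string.Empty).Trim().ToLowerInvariant();
        switch (value)
        {
            case "":      // assumption: a missing setting means the default
            case "0":
            case "off":
                return 1;
            case "half":
                return Math.Max(1, coreCount / 2);
            case "max":
            case "-1":
                return Math.Max(1, coreCount);
        }

        // Explicit count; parsed culture-independently.
        if (int.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out var count) && count >= 1)
        {
            return count;
        }
        throw new ArgumentException($"Unsupported MaxParallelEmployees value '{setting}'.", nameof(setting));
    }
}
```

The result could feed `ParallelOptions.MaxDegreeOfParallelism`, for example `ParallelSettings.Resolve("half", Environment.ProcessorCount)`.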

The key to thread safety was introducing PayrunEmployeeScope, an isolated state envelope per employee. Two further pieces complete the picture: thread-safe progress reporting with batched DB persistence (every 10 employees), and a payroll calculator cache using Lazy<T> with a composite key (calendar + culture).
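The calculator cache follows a common .NET pattern: a `ConcurrentDictionary` whose values are `Lazy<T>`, so that concurrent callers asking for the same key share one instance and the expensive factory runs only once. The sketch below uses a hypothetical `Calculator` record in place of the engine's real calculator type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the engine's payroll calculator.
public sealed record Calculator(string Calendar, string Culture);

public sealed class CalculatorCache
{
    // Composite key: calendar name plus culture name.
    private readonly ConcurrentDictionary<(string Calendar, string Culture), Lazy<Calculator>> cache = new();
    private int created;

    // Number of calculators actually constructed (for diagnostics).
    public int CreatedCount => Volatile.Read(ref created);

    public Calculator Get(string calendar, string culture) =>
        cache.GetOrAdd((calendar, culture), key => new Lazy<Calculator>(() =>
        {
            // GetOrAdd may build several Lazy wrappers under contention, but only
            // the one stored in the dictionary ever has its factory executed.
            Interlocked.Increment(ref created);
            return new Calculator(key.Calendar, key.Culture);
        })).Value;
}
```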

Bulk Operations

For initial onboarding of large tenants, there's now a bulk endpoint:

POST .../employees/bulk

Internally, it uses SqlBulkCopy in 5,000-item chunks. This isn't REST purism — it's pragmatism. Inserting 10,000 employees through individual requests simply isn't practical.
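The chunking itself needs nothing beyond LINQ. Here is a minimal sketch, assuming the caller supplies the actual database write; the `BulkImport` helper is hypothetical, and the SqlBulkCopy call is left as a comment because it needs a database connection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BulkImport
{
    public const int ChunkSize = 5000;

    // Splits the items into chunks of at most ChunkSize and hands each chunk
    // to the writer. Returns the number of chunks written.
    public static int WriteInChunks<T>(IEnumerable<T> items, Action<IReadOnlyList<T>> writeChunk)
    {
        var chunks = 0;
        foreach (var chunk in items.Chunk(ChunkSize))
        {
            // In a real import this would fill a DataTable and call SqlBulkCopy.
            writeChunk(chunk);
            chunks++;
        }
        return chunks;
    }
}
```

With 12,000 employees this produces three writes of 5,000, 5,000, and 2,000 rows, which keeps memory use and transaction size bounded.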

Developer Experience

A framework is only as good as it feels to work with. Three features fundamentally improved the developer experience.

Payrun Job Preview

Perhaps the most useful new feature: a synchronous preview endpoint that runs a payroll calculation for a single employee — without writing anything to the database.

POST .../payruns/jobs/preview
→ PayrollResultSet (wage types, collectors, payrun results)

The preview accepts any RetroPayMode but responds with HTTP 422 if a retroactive calculation would be triggered. This gives developers immediate feedback without touching the dataset — ideal for regulation development and UI integration.

Excel-Based Regulation Import

Regulations could always be imported as JSON exchange files. But not every payroll specialist thinks in JSON. The new Excel import supports all regulation objects:

Cases, case fields, case relations, collectors, wage types, lookups, lookup values, reports, report parameters, report templates, and scripts.

This significantly lowers the barrier to entry for professionals defining regulations.

CI/CD and Docker

The entire release pipeline has been automated:

Wave-based builds — dependencies are built in the correct order
Version guard — prevents accidental overwrites of existing releases
Single-click release — one GitHub Actions workflow for all libraries and applications
Docker images — Backend, Console, and WebApp as Linux containers on ghcr.io/payroll-engine/*
Swagger — auto-generated and attached to every release

A dry-run mode allows testing the pipeline without side effects.

MCP Server

Payroll Engine now ships a Model Context Protocol (MCP) server: a lightweight bridge that exposes the Payroll Engine (PE) REST API as tools for any MCP-compatible AI client, such as Claude Desktop or Cursor.

The v0.1-preview ships with seven read-only tools: GetTenants, GetEmployees, GetPayrolls, GetPayrunJobs, GetPayrunResults, GetEmployeeCaseValues, and GetWageTypes. Built on the existing Client.Services interfaces, it adds no new business logic — it simply makes the existing API queryable in natural language.

Claude Desktop / WebApp
        │  MCP Protocol (stdio)
        ▼
PayrollEngine.McpServer
        │  PE HTTP Client
        ▼
PayrollEngine Backend (REST API)

Local setup takes one entry in claude_desktop_config.json. Hosted deployment via SSE transport is planned for a later release.

To make this concrete, here's a real session with Claude Desktop against a live payroll database:

Q: What was the gross salary of all employees as of December 31, 2024?
A: (structured list of employees with their wage type results for the period)

Q: What changed in the employee data of mario.nunez in January 2025?
A: (list of case value mutations with effective dates)

Q: What is the tax rate for an income of 85,000 in the TaxRates lookup?
A: (resolved lookup value across all regulation layers — no SQL, no join, no code)

The last example is particularly interesting: the AI resolves the lookup across all stacked regulation layers automatically. There is no configuration required on the client side.

It's still early: the server is stdio-only and read-only. But it's already useful enough to change how payroll data is explored during development and support.

Integrated Load Tests

Dedicated load test commands are now built into the Console:

LoadTestGenerate — scales exchange files from any regulation template
LoadTestSetup — imports employees via the bulk API
PayrunLoadTest — executes payruns with warmup, measured repetitions, and CSV report

This enables reproducible performance measurements integrated directly into the development cycle.
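The measurement pattern behind such a command (warm up, then time a fixed number of repetitions, then emit CSV) can be sketched in a few lines. This is only an illustration of the pattern, not the Console's implementation; the `LoadTestRunner` name and the CSV columns are assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Text;

public static class LoadTestRunner
{
    // Runs the action `warmups` times unmeasured, then `repetitions` times
    // measured, and returns a CSV report with one line per measured run.
    public static string Measure(Action run, int warmups, int repetitions)
    {
        for (var i = 0; i < warmups; i++)
        {
            run(); // warmup: lets JIT compilation and caches settle
        }

        var csv = new StringBuilder("Run;ElapsedMs\n");
        for (var i = 1; i <= repetitions; i++)
        {
            var watch = Stopwatch.StartNew();
            run();
            watch.Stop();
            csv.Append(i)
               .Append(';')
               .Append(watch.Elapsed.TotalMilliseconds.ToString("F3", CultureInfo.InvariantCulture))
               .Append('\n');
        }
        return csv.ToString();
    }
}
```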

Stability: The Invisible Work

No release post is just about new features. The truth is: a large part of the work was finding and fixing bugs that never surface under normal conditions — but lead to catastrophic failures under load or in edge cases.

Some examples:

Inverted filter logic — GetCaseValuesAsync was excluding values matching the requested slot instead of keeping them. The classic off-by-negation bug that doesn't show up in single-slot tests.

Race condition in code cache — CodeFactory.CodeFiles used a Dictionary that corrupted under concurrent access. Replaced with ConcurrentDictionary.

Non-deterministic culture fallback — Payroll calculation used CultureInfo.CurrentCulture, which can vary by thread and server. Now deterministically falls back to en-US.

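Sync-over-async in the scripting layer: blocking .Result calls caused deadlocks under load. They were replaced with .ConfigureAwait(false).GetAwaiter().GetResult(), which does not capture the synchronization context and rethrows the original exception instead of wrapping it in an AggregateException.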

Timer leak in assembly cache — Missing thread-safe initialization led to duplicate timers that were never cleaned up.
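Of these fixes, the culture fallback is the easiest to show in isolation. The sketch below assumes a single optional configured culture name; the `CultureResolver` helper is hypothetical. The point is only that the result never depends on `CultureInfo.CurrentCulture`, so the same input yields the same culture on every thread and server.

```csharp
using System.Globalization;

public static class CultureResolver
{
    // Fixed fallback, independent of the thread's or server's current culture.
    private static readonly CultureInfo Fallback = CultureInfo.GetCultureInfo("en-US");

    // Uses the configured culture name if one is given, otherwise en-US.
    public static CultureInfo Resolve(string? configuredCulture) =>
        string.IsNullOrWhiteSpace(configuredCulture)
            ? Fallback
            : CultureInfo.GetCultureInfo(configuredCulture);
}
```

Passing the resolved culture explicitly to every format and parse call, for example `amount.ToString("N2", culture)`, removes the remaining implicit dependency.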

These aren't glamorous fixes. But they're the difference between "works on my machine" and "works reliably in production."

Additional Improvements

Beyond the core themes, there are several further enhancements:

Rate limiting — configurable limits per endpoint, with a dedicated policy for the payrun start endpoint
CORS configuration — disabled by default, fine-grained configuration available
Granular audit trail — separate controls for Script, Lookup, Input, Payrun, and Report instead of a single toggle
Database collation check — verified on startup before the schema check to prevent silent data integrity issues
Retro payrun limit — MaxRetroPayrunPeriods as a safety net against runaway retroactive calculations
MySQL support — a separate DbContext implementation alongside the existing SQL Server persistence layer, making self-hosted deployments more accessible across different infrastructure environments
Employee timing logs — per-employee duration and summary for performance analysis

What's Next

Payroll Engine 1.0 ships in April 2026. The architecture is stable, the API is solid, and the regulation layer is battle-tested.

The project is open source (MIT license) and targets developers who want to embed payroll functionality into existing HR and ERP systems — not as SaaS, but as a self-hosted framework where payroll logic lives in configurable regulations rather than hardcoded rules.

Feedback and contributions are welcome. If you have questions about the architecture or integration, open an issue on GitHub or drop a comment below.
