
SysLayer


Coding Cat Oran S2 Ep6 — The Abstract Table and the Gateway

While three departments argued about column names,
IT built the thing anyway.
It passed the audit. Barely.
Barely is enough.


Day 87. Three days before re-audit.

In Conference Room B, the departments were in a meeting
about whether the report should show defect count or defect rate.
This meeting had been running, in various forms,
for eleven days.

Oran was not in the meeting.

He was in the server room
with his two developers —
Marcus, who knew SQL,
and Kevin, who was very enthusiastic —
and a whiteboard
covered in three tables.

Three tables, Oran had said on Day 60.
That's all we're building.
Everything else is a column.


The design.

The problem with every schema they had tried
was the same problem:
it assumed the data had a fixed shape.

Fixed columns. Fixed relationships. Fixed definitions.

But the data didn't have a fixed shape.
QA's data had a different shape than Engineering's.
Product A's data had a different shape than Product B's.
The shape kept changing because the specs kept changing
because the products kept evolving
because that's what trial production to mass production actually looks like.

You cannot design a fixed schema for a moving target.

So Oran stopped trying to design for the data.
He designed for the events.

-- Table 1: Something happened.
CREATE TABLE production_events (
  event_id      BIGINT        IDENTITY(1,1) PRIMARY KEY,
  batch_id      VARCHAR(50)   NOT NULL,
  event_type    VARCHAR(50)   NOT NULL,   -- 'inspection', 'process', 'output', 'material_intake'
  occurred_at   DATETIME      NOT NULL,
  recorded_by   VARCHAR(50),
  source        VARCHAR(50)   NOT NULL    -- 'csv', 'excel', 'api', 'manual'
);

-- Table 2: Here's what we know about it.
CREATE TABLE event_attributes (
  attribute_id  BIGINT        IDENTITY(1,1) PRIMARY KEY,
  event_id      BIGINT        NOT NULL REFERENCES production_events(event_id),
  [key]         VARCHAR(100)  NOT NULL,   -- bracketed: KEY is a reserved word in T-SQL
  value         NVARCHAR(MAX),
  unit          VARCHAR(20)
);

-- EAV reads pivot on (event_id, key); without this index they crawl.
CREATE INDEX ix_event_attributes_event_key ON event_attributes (event_id, [key]);

-- Table 3: Here's what each key means — and when that meaning changed.
CREATE TABLE attribute_definitions (
  definition_id BIGINT        IDENTITY(1,1) PRIMARY KEY,
  event_type    VARCHAR(50)   NOT NULL,
  [key]         VARCHAR(100)  NOT NULL,
  label         VARCHAR(200),
  data_type     VARCHAR(20),
  valid_from    DATE          NOT NULL,
  valid_to      DATE,                     -- NULL = currently valid
  added_by      VARCHAR(50),
  notes         NVARCHAR(MAX)             -- TEXT is deprecated in SQL Server
);

Three tables.
Every department's data fits.

QA's inspection: a production_event of type inspection,
with attributes for stage, result, defect_type, defect_count, inspector_id.

Engineering's process data: a production_event of type process,
with attributes for temperature, pressure, spec_version, compliance.

Manufacturing's output: a production_event of type output,
with attributes for quantity, shift, line_id, operator_id.

New column needed? Add a new key to event_attributes. No migration.
Spec changed? Add a new row to attribute_definitions with a new valid_from. Old data stays valid under the old definition.
New product line in Q3 with completely different parameters? Same three tables. Different keys.
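Concretely, a single QA inspection lands as one event row plus a handful of attribute rows. A sketch (the batch ID, event ID, and values here are invented for illustration):

```sql
-- One QA inspection on batch PA-0042 (all values illustrative).
INSERT INTO production_events (batch_id, event_type, occurred_at, recorded_by, source)
VALUES ('PA-0042', 'inspection', '2024-03-14 09:30', 'qa_li', 'excel');

-- Its attributes, keyed to the new event (assume it was assigned event_id 1001).
INSERT INTO event_attributes (event_id, [key], value, unit)
VALUES (1001, 'stage',        'final',  NULL),
       (1001, 'final_result', 'pass',   NULL),
       (1001, 'defect_count', '2',      'pcs'),
       (1001, 'inspector_id', 'QA-017', NULL);
```

When QA invents a fifth inspection stage, nothing above changes except the strings.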

Kevin said: "This is like... a database inside a database."
Marcus said: "It's EAV. Entity-Attribute-Value. It has tradeoffs."
Kevin said: "What tradeoffs?"
Marcus said: "Querying is slower. You have to pivot the attributes to get a report format.
You lose type safety at the column level.
It's harder to write ad-hoc queries."

Kevin said: "So why are we doing it?"

Oran looked at the whiteboard.
He thought about the spec that changed eleven months ago with no documentation.
He thought about QA's four inspection stages that might become five.
He thought about the new product line in Q3.
He thought about Conference Room B, where the meeting was still running.

"Because the alternative," he said,
"is redesigning the schema every time something changes.
And something always changes."


The gateway.

The schema solved the storage problem.
The gateway solved the input problem.

Three departments. Three data formats. Three ways of working
that were not going to change just because IT asked nicely.

Manufacturing would never use an API.
Their shift supervisors had been using Excel since 2009
and would continue using Excel until they retired.

QA had fourteen Excel templates. Different ones for different products.
They were not going to consolidate them.
Oran had stopped asking.

Engineering's test equipment could output to CSV automatically
but the engineers had also written some internal tools
that could call an HTTP endpoint if one existed.

So Oran built three doors to the same room.

[ Manufacturing Excel files ]  →  CSV drop folder  ┐
[ QA Excel templates        ]  →  Excel endpoint   ├→ Gateway → production_events
[ Engineering test equipment]  →  HTTP API         ┘            event_attributes

The gateway parsed each format,
validated the data against attribute_definitions,
rejected rows that didn't match,
logged everything,
and wrote clean events and attributes to the database.

Each department thought they were using their own system.
They were using the same system.
They didn't need to know.
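The validation step is where attribute_definitions earns its keep. A sketch of the check the gateway might run per incoming row (the parameter names are illustrative, not from the actual gateway code): a row is accepted only if its key was defined for its event type at the moment the event occurred.

```sql
-- Is this key valid for this event type, at this point in time?
-- Zero rows back = reject the incoming row and log it.
SELECT COUNT(*) AS is_valid
FROM attribute_definitions
WHERE event_type = @event_type                            -- e.g. 'inspection'
  AND [key]      = @key                                   -- e.g. 'defect_count'
  AND valid_from <= @occurred_at
  AND (valid_to IS NULL OR valid_to >= @occurred_at);     -- NULL = still valid
```

Rejected rows go to the log, not the database. The database only ever sees data that matched a definition.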


Day 89. 11:47pm.

Marcus pushed the last fix.
Kevin ran the import script for the sixth time.
It finished without errors.

Oran ran the audit report query.

SELECT
  pe.batch_id,
  pe.occurred_at,
  MAX(CASE WHEN ea.[key] = 'output_qty'   THEN ea.value END) AS output_qty,
  MAX(CASE WHEN ea.[key] = 'defect_count' THEN ea.value END) AS defect_count,
  MAX(CASE WHEN ea.[key] = 'final_result' THEN ea.value END) AS inspection_result,
  MAX(CASE WHEN ea.[key] = 'material_lot' THEN ea.value END) AS material_lot,
  MAX(CASE WHEN ea.[key] = 'spec_version' THEN ea.value END) AS spec_version
FROM production_events pe
JOIN event_attributes ea ON ea.event_id = pe.event_id
WHERE pe.batch_id LIKE 'PA-%'
  AND pe.occurred_at >= '2024-01-01'
GROUP BY pe.batch_id, pe.occurred_at
ORDER BY pe.occurred_at;

It returned 847 rows.
Each one traceable from raw material to finished goods.
Each one stamped with the spec version that was in effect at the time.
Each one showing which system it came from, which operator recorded it, when.

Oran saved the result as a PDF.

He did not celebrate.
He had learned not to celebrate before the audit.

He closed his laptop.
He went home.
He slept four hours.


Day 90.

Ms. Chen arrived at 9am.
She opened her iPad.
She ran through the checklist.

Single source of truth: pass.
Traceability, raw material to finished goods: pass.
Audit trail for changes: pass.
Data in non-editable, auditable system: pass.
Report format matches customer requirements: pass.

She closed her iPad.

"This is sufficient for certification," she said.
"For now."

For now.

The CEO shook Oran's hand.
The VP of Manufacturing said: "See? We told you we'd cooperate."
The VP of QA was already asking about new fields for Phase 2.
The VP of Engineering wanted to know about the Q3 product line.

Oran opened his notebook.
He wrote the date.
He drew a line under Phase 1.
Below it, he wrote: Phase 2.
Below that: ?

He walked back to his desk.

On the production floor, he passed the QA station.
He passed the line supervisor's workstation.
He passed the shift handover desk.

On the shift handover desk,
open on the screen,
was a new Excel file.

It had three columns.
Someone had just started filling in the second row.

Oran looked at it for a moment.

Then he kept walking.

He had work to do.


The audit was passed.
The system was live.
The data was clean.

And somewhere in the factory,
someone had already started a new Excel file.

That wasn't a failure.
That was just the beginning of Phase 2.


What Oran learned — and what comes next

The EAV pattern solved the immediate problem.
But it introduced new ones that Oran hasn't solved yet:

The permission problem.
QA should not see Engineering's process parameters.
Engineering should not see QA's internal scoring.
The current system has no row-level permissions.
Everyone can query everything.
Oran knows exactly how to fix this.
He wrote a guide about it.
Nobody has read it yet.

The spec versioning problem.
attribute_definitions tracks when each key's meaning changed.
But querying "what did this batch comply with, at the time of production"
still requires careful joins that are easy to get wrong.
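The join in question looks roughly like this (a sketch against the schema above; the batch ID is illustrative). The part that is easy to get wrong is the NULL case on valid_to: forget that OR and every currently-valid definition silently disappears from the result.

```sql
-- What definition was each attribute recorded under, at the time it happened?
SELECT pe.batch_id, ea.[key], ea.value, ad.label, ad.valid_from
FROM production_events pe
JOIN event_attributes ea ON ea.event_id = pe.event_id
JOIN attribute_definitions ad
  ON  ad.event_type  = pe.event_type
  AND ad.[key]       = ea.[key]
  AND ad.valid_from <= pe.occurred_at
  AND (ad.valid_to IS NULL OR ad.valid_to > pe.occurred_at)  -- the easy-to-forget clause
WHERE pe.batch_id = 'PA-0042';   -- illustrative batch ID
```

Every report that claims "compliant at time of production" has to carry that temporal join correctly, every time.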

The real-time problem.
Everything is batch import. Every four hours.
Engineering wants live process data from the machines.
That's a stream. That's a different architecture.

The customer portal.
The big customer now wants direct access.
Not a report. A live portal. Their login. Their data only.
By end of Q3.

Oran opened his notebook to a fresh page.

Phase 2, he wrote.

Then he started a new list.


Ep1: The Excel Republic
Ep2: The Big Customer
Ep3: The Auditor Arrives · Ep4 · Ep5*

Coding Cat Oran is a serialized fiction about building real production systems inside real companies.
The EAV schema is real. The gateway pattern is real. The Excel file on the handover desk is very, very real.
The cat is fictional. Phase 2 is not.

By SysLayer · dev.to/syslayer

