TLDR:
- Use json_build_object to convert relations (tables or CTE/WITH clauses) to JSON.
- Write a helper JavaScript function that abstracts the approach.
Background and problem
Imagine you are working on something exciting, colorful, and wonderful like a warehouse management system.
As the system evolves, new features, requirements, and processing logic keep expanding, and SQL queries grow more complex, so you resort to WITH clauses to keep everything organized and maintainable. It works great.
However, when it comes to debugging intermediate steps/states/variables, SQL offers no obvious equivalent of the structured logging libraries you would reach for in the application layer, such as JavaScript's Pino or Golang's slog.
The idea
Postgres has advanced JSON support, so why not do something like the following?
JSON.stringify({
data1: data1,
data2: data2
});
In SQL, that would translate to:
SELECT json_build_object(
'data1', data1,
'data2', data2
)
FROM data1, data2;
Or, in a form that avoids the cross join and uses JSONB (supposedly both better for performance):
SELECT jsonb_build_object(
'data1', (SELECT jsonb_agg(data1) FROM data1),
'data2', (SELECT jsonb_agg(data2) FROM data2)
);
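Here, jsonb_agg folds each relation's rows into a JSON array of objects, so the query returns a single row shaped roughly like this (column names and values are placeholders, not real output):

{
  "data1": [ { "id": 1, "name": "..." }, ... ],
  "data2": [ { "id": 2, "value": "..." }, ... ]
}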
Example
Imagine your project manager assigns the following user story to you:
User Story: Inventory Reorder Alert
As a warehouse manager,
I need a report identifying products requiring immediate reorder based on low total stock quantities across all locations.
Requirements:
- Use the provided sample data (locations: S1, S2, S3; products and quantities).
- Calculate the total quantity per product across all locations.
- Flag products as Reorder if total stock is below 50 units; otherwise, mark as Sufficient.
- Display only products needing reorder in the final output.
Acceptance Criteria:
- Output columns: product, stock_status.
- Only include rows where stock_status is Reorder.
You could solve it with a single SQL query. Notice that we are using CTEs and are only interested in logging step1 and step2.
-- Define raw data using the VALUES clause.
WITH raw_data(location, product, quantity) AS (
VALUES
('S1', 'Cocoa beans', 40),
('S2', 'Cocoa beans', 10),
('S2', 'Sugar', 20),
('S3', 'Vanilla', 75)
),
-- Aggregation: Get total sum per product.
step1 AS (
SELECT product,
SUM(quantity) AS quantity
FROM raw_data
GROUP BY product
),
-- Transformation: Add a stock status based on total quantity.
step2 AS (
SELECT
product,
quantity,
CASE
WHEN quantity < 50 THEN 'Reorder'
ELSE 'Sufficient'
END AS stock_status
FROM step1
),
-- Filter
step3 AS (
SELECT
product,
stock_status
FROM step2
WHERE stock_status = 'Reorder'
),
logged_steps AS (
SELECT jsonb_build_object(
'step1', (SELECT jsonb_agg(step1) FROM step1),
'step2', (SELECT jsonb_agg(step2) FROM step2)
) AS json_payload
)
-- logged_steps has exactly one row, so this cross join simply
-- appends json_payload to every step3 row.
SELECT *
FROM step3, logged_steps;
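Given the sample data, only Sugar (total 20) falls below the 50-unit threshold, so the result is a single row, roughly like this (array order inside json_payload is not guaranteed):

product | stock_status | json_payload
--------+--------------+--------------
Sugar   | Reorder      | {"step1": [{"product": "Cocoa beans", "quantity": 50}, ...], "step2": [{"product": "Sugar", "quantity": 20, "stock_status": "Reorder"}, ...]}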
Better DX
We can reuse this logic with the help of query builders like Knex and its modify method.
Helper:
// Assumes a configured Knex instance named `knex` is in scope.
function withLoggedSteps(qb, ...relations) {
  // logged_steps has exactly one row, so the cross join just
  // appends json_payload to every result row.
  qb.crossJoin('logged_steps');
  qb.select('logged_steps.json_payload');
  // One "'name', (SELECT jsonb_agg(name) FROM name)" pair per relation.
  const pairs = relations
    .map(rel => `'${rel}', (SELECT jsonb_agg(${rel}) FROM ${rel})`)
    .join(', ');
  qb.with(
    'logged_steps',
    knex.raw(`SELECT jsonb_build_object(${pairs}) AS json_payload`)
  );
}
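For example, .modify(withLoggedSteps, 'step1', 'step2') injects a CTE equivalent to the logged_steps clause from the earlier query:

WITH logged_steps AS (
  SELECT jsonb_build_object(
    'step1', (SELECT jsonb_agg(step1) FROM step1),
    'step2', (SELECT jsonb_agg(step2) FROM step2)
  ) AS json_payload
)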
Usage:
knex
  .select('step3.*')
  .with('generated_data', knex.raw('(VALUES ...)'))
  .with('step1', knex.raw('...'))
  .with('step2', knex.raw('...'))
  .with('step3', knex.raw('...'))
  .from('step3')
  .modify(withLoggedSteps, 'step1', 'step2')
  .then(rows => console.log(rows));
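Each returned row then carries step3's columns plus the json_payload column with the intermediate steps, ready to be pretty-printed or shipped to your application logger.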
Interactive demo
See: Live demo
Using:
- PGLite in-memory embedded Postgres database
- Knex and a PGLite "driver"
- Some REPL library (vue-live)
Performance Disclaimer
Let's be honest - in my day-to-day development, I'm typically working with small, filtered datasets.
Most of the time, I'm debugging queries for a specific product or a narrow time range.
So while these JSON logging techniques work wonderfully in my development environment, I haven't extensively tested them on massive datasets that might process millions of rows.
When you're in development mode, logging overhead is practically irrelevant.
The goal is to understand why a query behaves strangely and to figure out where a complex query goes wrong.
I am aware of alternative approaches like using RAISE NOTICE
or inserting intermediate results into temporary tables. I have not tested them.
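For reference, a minimal sketch of the RAISE NOTICE route, assuming the intermediate rows live in a hypothetical temporary table step1_tmp (a DO block cannot see another statement's WITH clause):

DO $$
DECLARE
  r record;
BEGIN
  FOR r IN SELECT * FROM step1_tmp LOOP
    -- Emits one NOTICE per intermediate row to the client's message stream.
    RAISE NOTICE 'step1: %', to_json(r);
  END LOOP;
END $$;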
Conclusion
And there you have it: JSON structured logging in SQL, thanks to Postgres's JSON features.
END.