Confluence Docs Lie. Tie Your Documentation to Code Instead📘

#java #documentation #postgres #kafka

Every team has that Confluence page. The one that was carefully written to explain what the service does, what the API looks like, what each DB column means. Someone spent real time on it. It was accurate when it was written.

Six months later, it's fiction.

This isn't a discipline problem. I've seen it at teams with strong engineering culture, with good processes, with people who genuinely care about documentation. The Confluence page still goes stale. The root cause is structural, and until you treat it that way, nothing changes.

Why external docs always drift

When documentation lives in a different place than the code, there's no mechanism that forces them to stay in sync. It relies entirely on people remembering to update two separate things every time anything changes. A column gets renamed, an endpoint response adds a field, a Kafka message format evolves — the ticket gets closed, the code gets merged, and the Confluence page stays where it was.

AI writing more of your code makes this worse. More code ships faster now. The documentation debt compounds faster too.

The fix isn't better processes or more reminders in the PR template. It's to stop maintaining documentation as a separate artifact and tie it directly to the code.

Three tools that keep docs honest

Swagger on your REST controllers

If you're using Spring Boot with springdoc-openapi, you already have the infrastructure. You just need to use it properly.

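import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.Parameter;
import io.swagger.v3.oas.annotations.responses.ApiResponse;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/orders")
@Tag(name = "Orders", description = "Order lifecycle management")
public class OrderController {

    // OrderService and OrderDto are the application's own types.
    // findById is assumed to return Optional<OrderDto>.
    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @Operation(
        summary = "Get order by ID",
        description = "Returns a single order. 404 if the order doesn't exist or belongs to a different customer."
    )
    @ApiResponse(responseCode = "200", description = "Order found")
    @ApiResponse(responseCode = "404", description = "Order not found")
    @GetMapping("/{id}")
    public ResponseEntity<OrderDto> getOrder(
        @Parameter(description = "Internal order ID") @PathVariable Long id
    ) {
        return orderService.findById(id)
            .map(ResponseEntity::ok)
            .orElse(ResponseEntity.notFound().build());
    }
}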

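The Swagger UI becomes a living contract. The endpoints, parameters, and response types are always current because they're generated from the code, and the descriptions sit next to the code they describe, so they change in the same PR. Frontend devs, QA, and external teams can check it themselves without asking you anything.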

If your controllers are getting buried in annotations, the previous post on moving Swagger to interfaces covers that pattern.

PostgreSQL column comments in DDL

PostgreSQL supports COMMENT ON natively. It's been there forever, and almost nobody uses it.

CREATE TABLE orders (
    id          BIGSERIAL    PRIMARY KEY,
    customer_id BIGINT       NOT NULL,
    status      VARCHAR(20)  NOT NULL,
    total_cents BIGINT       NOT NULL,
    created_at  TIMESTAMPTZ  NOT NULL DEFAULT now()
);

COMMENT ON TABLE orders IS 'Customer purchase orders. One row per order, regardless of item count.';

COMMENT ON COLUMN orders.status IS
    'Order lifecycle state. Valid values: PENDING, CONFIRMED, SHIPPED, DELIVERED, CANCELLED. '
    'Transitions are enforced in OrderService, not at the DB level.';

COMMENT ON COLUMN orders.total_cents IS
    'Order total in cents. Always positive. Divide by 100 for display. '
    'Never store as decimal — rounding bugs.';

These comments ship with the schema. They live in your migration files. When the column changes, the comment gets updated in the same commit, by the same person, in the same PR review.

You can query them directly:

SELECT
    c.column_name,
    pgd.description
FROM pg_catalog.pg_statio_all_tables AS st
JOIN pg_catalog.pg_description AS pgd ON pgd.objoid = st.relid
JOIN information_schema.columns AS c
    ON c.table_schema = st.schemaname  -- without this, same-named tables in other schemas leak in
    AND c.table_name = st.relname
    AND c.ordinal_position = pgd.objsubid
WHERE c.table_schema = 'public'
  AND c.table_name = 'orders';

And any decent DB client (DBeaver, DataGrip) surfaces them automatically when you hover over a column. No page to open. The schema explains itself.
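
If you want a safety net beyond code review, a small check in CI can flag new columns that arrive without a comment. Here's a minimal sketch of that idea in plain Java. The class name MigrationCommentCheck and its method are hypothetical, and the regex parsing is deliberately naive: it assumes unquoted names without a schema prefix, and it only looks at CREATE TABLE, not ALTER TABLE ... ADD COLUMN.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical CI helper: lists columns created in a migration that have no COMMENT ON COLUMN. */
final class MigrationCommentCheck {

    // Captures the table name and the parenthesised column list of a CREATE TABLE statement.
    private static final Pattern CREATE_TABLE = Pattern.compile(
        "CREATE\\s+TABLE\\s+(?:IF\\s+NOT\\s+EXISTS\\s+)?(\\w+)\\s*\\((.*?)\\)\\s*;",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Captures table and column from "COMMENT ON COLUMN table.column IS".
    private static final Pattern COLUMN_COMMENT = Pattern.compile(
        "COMMENT\\s+ON\\s+COLUMN\\s+(\\w+)\\.(\\w+)\\s+IS",
        Pattern.CASE_INSENSITIVE);

    // A definition starting with one of these is a table constraint, not a column.
    private static final Set<String> CONSTRAINT_KEYWORDS =
        Set.of("primary", "foreign", "unique", "check", "constraint", "exclude", "like");

    /** Returns "table.column" for every created column without a comment, in declaration order. */
    static List<String> uncommentedColumns(String sql) {
        Set<String> commented = new HashSet<>();
        Matcher comment = COLUMN_COMMENT.matcher(sql);
        while (comment.find()) {
            commented.add((comment.group(1) + "." + comment.group(2)).toLowerCase(Locale.ROOT));
        }

        List<String> missing = new ArrayList<>();
        Matcher table = CREATE_TABLE.matcher(sql);
        while (table.find()) {
            String tableName = table.group(1).toLowerCase(Locale.ROOT);
            for (String definition : splitTopLevel(table.group(2))) {
                String first = definition.trim().split("\\s+")[0].toLowerCase(Locale.ROOT);
                if (first.isEmpty() || CONSTRAINT_KEYWORDS.contains(first)) {
                    continue;
                }
                String qualified = tableName + "." + first;
                if (!commented.contains(qualified)) {
                    missing.add(qualified);
                }
            }
        }
        return missing;
    }

    // Splits on commas outside parentheses, so NUMERIC(10,2) stays in one piece.
    private static List<String> splitTopLevel(String body) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char ch : body.toCharArray()) {
            if (ch == '(') depth++;
            if (ch == ')') depth--;
            if (ch == ',' && depth == 0) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Run against the orders migration above, this would report orders.id, orders.customer_id, and orders.created_at, since only status and total_cents have comments. Whether a primary key needs a comment is your call; an allowlist would be an easy extension.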

doc attribute in Avro schemas

If you're using Avro for Kafka messages, the doc attribute is part of the spec, both for records and for individual fields. The spec makes it optional, but if you care about your consumers understanding the contract, treat it as required.

{
  "type": "record",
  "name": "OrderCreatedEvent",
  "namespace": "com.example.events",
  "doc": "Published when a new order is placed. Consumed by inventory-service, billing-service, and notification-service.",
  "fields": [
    {
      "name": "orderId",
      "type": "long",
      "doc": "Internal order ID from the orders table."
    },
    {
      "name": "customerId",
      "type": "long",
      "doc": "References users.id. The customer who placed the order."
    },
    {
      "name": "totalCents",
      "type": "long",
      "doc": "Order total in cents. Always positive. Same semantics as orders.total_cents."
    },
    {
      "name": "status",
      "type": "string",
      "doc": "Initial status at publish time. Always PENDING for this event."
    }
  ]
}

When another team's developer needs to understand what this event carries, they read the schema. If the schema is registered in a Schema Registry (Confluent or otherwise), the docs are browsable there. No Confluence page, no hunting down who owns the topic.
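
Since doc is optional in the spec, nothing stops a field from shipping without one. Here's a rough sketch of a check that walks a schema and lists the fields with no doc. To stay dependency-free, it assumes the .avsc JSON has already been parsed into nested Map and List objects by whatever JSON library you use. AvroDocCheck and fieldsWithoutDoc are made-up names, not part of Avro.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical check: reports Avro record fields that lack a non-empty "doc" attribute. */
final class AvroDocCheck {

    /**
     * @param schema a record schema already parsed from JSON into nested Maps and Lists
     * @return names like "OrderCreatedEvent.orderId" for every field without a doc
     */
    static List<String> fieldsWithoutDoc(Map<String, Object> schema) {
        List<String> missing = new ArrayList<>();
        collect(schema, missing);
        return missing;
    }

    private static void collect(Object node, List<String> missing) {
        if (node instanceof List) {               // union: check every branch
            for (Object branch : (List<?>) node) {
                collect(branch, missing);
            }
            return;
        }
        if (!(node instanceof Map)) {
            return;                               // primitive or named reference, e.g. "long"
        }
        Map<?, ?> schema = (Map<?, ?>) node;
        collect(schema.get("items"), missing);    // array element type, if present
        collect(schema.get("values"), missing);   // map value type, if present
        if (!"record".equals(schema.get("type")) || !(schema.get("fields") instanceof List)) {
            return;
        }
        String recordName = String.valueOf(schema.get("name"));
        for (Object item : (List<?>) schema.get("fields")) {
            if (!(item instanceof Map)) {
                continue;
            }
            Map<?, ?> field = (Map<?, ?>) item;
            Object doc = field.get("doc");
            if (!(doc instanceof String) || ((String) doc).trim().isEmpty()) {
                missing.add(recordName + "." + field.get("name"));
            }
            collect(field.get("type"), missing);  // descend into nested record types
        }
    }
}
```

For the OrderCreatedEvent schema above it returns an empty list, because every field carries a doc. Wired into a build step, a non-empty result could fail the build before the schema ever reaches the registry.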

Who else benefits

The obvious beneficiaries are engineers. Less time spent answering "what does this field mean" questions, better context when onboarding, less cognitive overhead when reading unfamiliar code.

But the compounding effect shows up with less-technical people.

Give your BA a dump of the database DDL (just the schema files, no actual data). Or give them read-only access to a dev environment DB. With an AI assistant they can load those schemas and start asking questions: what do we store, what are the valid states for this field, how are orders and customers related. They get answers without pinging a developer. The answers are current because the comments live in the same migration files that build the database.

Same with Swagger. A product manager who can open Swagger UI and see the actual endpoints, parameters, and response shapes during a design discussion is a product manager who isn't blocked waiting for you to write up an email.

Your AI coding assistant also benefits. An LLM reading your schema with doc fields and COMMENT ON COLUMN entries has far more context than one reading bare column names, and the code it generates tends to be better for it.

The honest trade-off

Writing proper Swagger annotations, DDL comments, and Avro doc fields takes time upfront. Real time. When you're under pressure to ship the feature by Friday, adding good COMMENT ON COLUMN entries to the migration doesn't feel like a priority.

It's also harder to enforce than a Confluence page. With Confluence, you can at least point to a URL and say "write it here." With code-tied docs, you need to make it part of the review culture — PRs that add columns without comments don't get merged.

That's a real overhead, and I won't pretend otherwise.

But the alternative is paying that cost continuously. Every time someone joins the team and has to ask what a column means. Every time a BA opens a Jira ticket that could have been answered by looking at the schema. Every time your Confluence page sends someone down the wrong path because nobody updated it after the last refactor.

The upfront investment is paid once per artifact, plus a small edit in the same PR whenever that artifact changes. Stale docs are a recurring cost forever.

Write them once. Keep them with the code. They won't lie.

Do you have a practice for keeping docs in sync with code, or is it still Confluence pages and hope? Curious what's actually working at different team sizes.