Efficient Progress Aggregation in a Deeply Nested Task Hierarchy with Real-Time Sync (PostgreSQL + PowerSync)

#database #help #powersync

I’m developing a task management application with a deeply nested task hierarchy (mother-child tasks). Each task has the following structure (PostgreSQL schema):

sql
id UUID PRIMARY KEY,
creator_id UUID,
task TEXT,
is_done BOOLEAN,
mother_task UUID REFERENCES tasks(id),
estimated_progress_duration_in_seconds INTEGER,
current_progress_duration INTEGER

📂 Example Task Hierarchy

- x project (100h)
  - website (5h)
    - design (1h)
    - code (3h)
      - home (30m)
        - side menu (10m)
        - center screen (20m)
  - deploy (1h)

🧩 The Problem

Let’s say I work on a leaf task like “side menu” and update its current_progress_duration every few seconds.
Should I immediately propagate this progress update to all parent tasks (e.g., home → code → website → x project)?

This may result in 5–10 write operations per update, which can lead to performance issues — both in terms of local disk IO and syncing overhead (see next section).

⸻

⚙️ Context
• I’m using PowerSync, which keeps a local client-side database in sync with the server-side PostgreSQL DB.
• The client app handles most of the logic locally, and all local writes are synced automatically to the server via PowerSync.
• Frequent updates to nested tasks may cause excessive writes and degrade performance, especially when propagating changes upward in the hierarchy.

⸻

🧠 Design Alternatives I’ve Considered
• Eager Propagation: Immediately update all parent tasks up the chain on every child progress update.
• Lazy Aggregation: Only compute and propagate totals on-demand (e.g., when rendering UI or fetching aggregate data).
• Partial Propagation: Only update the immediate parent and optionally batch or throttle updates to higher-level tasks.
• Selective Caching: Cache progress only for the currently visible or expanded branches of the task tree.
• Triggers or Recursive CTEs: Use SQL-side recursive logic to compute aggregate durations only when queried.
• Hierarchy Flattening: Maintain a materialized path or denormalized ancestry column to simplify recursive operations.

⸻

❓ My Question

What are the best architectural patterns or strategies to manage progress aggregation efficiently in a deeply nested task hierarchy, especially in a real-time syncing setup like PowerSync?

Any architectural tips, data modeling suggestions, or real-world experiences are welcome!