Filipp Mishchenko

Posted on Jun 13

What I Learned After Running My AI Task Queue on Real Work

#python #ai #productivity #opensource

I previously open-sourced Personal Task Assistant:

https://github.com/J3d1-fm/Personal-Task-Assistant

The first version was built around a simple idea:

Stop manually deciding what to delegate to AI.

The app separates work into queues:

human-owned tasks;
AI-ready tasks;
tasks waiting for review;
blocked tasks;
unassigned tasks that still need triage.

After running it on real work streams, the biggest lesson is that the task list
is not the hard part.

The hard part is ingestion.

Real work streams are messy

Tasks do not arrive as clean task objects.

They appear in:

emails;
Slack threads;
Telegram-style chats;
GitHub notifications;
documents;
follow-up messages inside old conversations.

If I import everything, the task system becomes noise.

If I import too little, important work stays hidden in inboxes.

So the product direction changed from "store tasks" to "decide what deserves to
become a task."

The current ingestion contract

The import flow now follows a stricter pattern:

Read recent source items.
Compare each item's exact timestamp against the cutoff instead of trusting broad date filters.
Compare against existing tasks, including completed ones.
Create only normalized actionable tasks.
Report source status separately.

That last point matters.

A source can be:

checked, with no actionable work;
unavailable because auth is broken;
failed because the connector did not start;
checked and converted into new tasks.

Those are very different outcomes.

An AI system should not flatten them into "nothing found."
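
A minimal sketch of how these four outcomes could be kept apart in Python. The names `SourceStatus`, `SourceReport`, `AuthExpired`, and `check_source` are assumptions made for illustration, not the project's actual API, and a real adapter would raise its own error types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class SourceStatus(Enum):
    EMPTY = "checked, no actionable work"
    CREATED = "checked, new tasks created"
    UNAVAILABLE = "unavailable, auth is broken"
    FAILED = "failed, connector did not start"


class AuthExpired(Exception):
    """Hypothetical error an adapter raises when credentials are rejected."""


@dataclass
class SourceReport:
    source: str
    status: SourceStatus
    new_tasks: int = 0
    detail: str = ""


def check_source(name: str, fetch_actionable: Callable[[], list[str]]) -> SourceReport:
    """Run one source and keep the four outcomes distinct."""
    try:
        items = fetch_actionable()
    except AuthExpired as exc:
        return SourceReport(name, SourceStatus.UNAVAILABLE, detail=str(exc))
    except Exception as exc:  # the connector crashed or never started
        return SourceReport(name, SourceStatus.FAILED, detail=str(exc))
    if not items:
        return SourceReport(name, SourceStatus.EMPTY)
    return SourceReport(name, SourceStatus.CREATED, new_tasks=len(items))
```

The point of the shape is that an empty list and an exception can never collapse into the same report: only a successful fetch can produce `EMPTY`.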

Dedupe matters more than it looks

The first naive version of this kind of workflow is easy:

"Search recent messages and create tasks."

That works until the same work surfaces more than once, or stale work slips past the filter:

a GitHub review notification and an email;
a Slack follow-up and an existing task;
a same-day email that is older than the actual cutoff;
a reminder in a thread that already produced a task.

The system now dedupes against existing tasks before creating new ones.

It also treats follow-up messages carefully: a follow-up inside an old blocker
thread should update the existing task unless it introduces a materially new
action item.

This makes the queue smaller, but much more useful.
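
Here is a rough sketch of that decision order, under some loud assumptions: every source item carries a stable `source_id`, and some earlier normalization step has produced a `work_key` (for example a thread id or a PR URL) that is the same for the GitHub notification and the email about it. `SourceItem`, `Task`, `ingest`, and the `new_action` flag are hypothetical names, not the project's real data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SourceItem:
    source_id: str            # stable id of the message or notification
    work_key: str             # assumed normalized key for the underlying work
    title: str
    received_at: datetime
    new_action: bool = False  # set by triage when a follow-up adds new work


@dataclass
class Task:
    title: str
    work_key: str
    status: str = "open"      # completed tasks stay in the comparison set
    source_ids: set[str] = field(default_factory=set)
    notes: list[str] = field(default_factory=list)


def ingest(item: SourceItem, tasks: list[Task], cutoff: datetime) -> str:
    """Return what happened: 'skipped', 'duplicate', 'updated', or 'created'."""
    # Compare the exact timestamp; a same-day item can still predate the cutoff.
    if item.received_at < cutoff:
        return "skipped"
    # An item that already fed any task, open or completed, is a duplicate.
    for task in tasks:
        if item.source_id in task.source_ids:
            return "duplicate"
    # A follow-up on known work updates that task unless it adds a new action.
    for task in tasks:
        if task.work_key == item.work_key and not item.new_action:
            task.source_ids.add(item.source_id)
            task.notes.append(item.title)
            return "updated"
    tasks.append(Task(item.title, item.work_key, source_ids={item.source_id}))
    return "created"


cutoff = datetime(2024, 6, 13, 12, 0, tzinfo=timezone.utc)
tasks = [Task("Review PR", "pr-1", status="completed", source_ids={"gh-1"})]
reminder = SourceItem("mail-2", "pr-1", "Reminder: review PR",
                      datetime(2024, 6, 13, 13, 0, tzinfo=timezone.utc))
print(ingest(reminder, tasks, cutoff))  # updated
```

Deciding whether a follow-up is "materially new" is the hard part and is deliberately left outside this sketch as a flag on the item.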

"Unavailable" is a valid result

One of the most useful changes was making source health explicit.

If Gmail auth is expired, that should be reported as an unavailable source.

If a chat connector cannot start, that is a failed source.

If a source was checked and had no actionable messages, that is a normal empty
result.

Those states should not look the same.

For a personal AI workflow, this is important because the user needs to know
whether the assistant actually checked the source or silently skipped it.

The human remains part of the loop

The execution side already supports an agent handoff:

a worker can claim AI-ready tasks;
multiple workers can run in parallel;
each worker can report structured results;
a worker that needs a decision, access, or more context returns a needs_input question.

Running the product on real work made the same principle clearer for ingestion:

The assistant should act when the task is clear.

It should ask when a human decision is needed.

It should say exactly what is blocked when it cannot continue.
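
One way to picture the three rules is a structured result that refuses to be vague. This is a sketch only: `needs_input` comes from the handoff described above, while `WorkerResult`, `next_queue`, the other outcome names, and the queue each outcome maps to are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerResult:
    task_id: str
    outcome: str                      # "done", "needs_input", or "blocked"
    summary: str = ""
    question: Optional[str] = None    # required when outcome == "needs_input"
    blocked_on: Optional[str] = None  # required when outcome == "blocked"


def next_queue(result: WorkerResult) -> str:
    """Map a worker result to the queue the task should move to."""
    if result.outcome == "done":
        return "review"
    if result.outcome == "needs_input":
        if not result.question:
            raise ValueError("needs_input requires a concrete question")
        return "human"
    if result.outcome == "blocked":
        if not result.blocked_on:
            raise ValueError("blocked requires naming what is blocked")
        return "blocked"
    raise ValueError(f"unknown outcome: {result.outcome}")
```

Rejecting a `needs_input` result that has no question is the design choice that matters here: the worker cannot hand work back without saying what it needs.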

What I would build next

The most useful next work is not another generic AI chat interface.

It is small, reliable source adapters:

Slack;
Telegram;
Gmail;
Linear;
Jira;
Asana;
YouTrack;
Trello;
Google Calendar.

Each adapter should be boring in the best way:

read only what it needs;
dedupe before creating tasks;
avoid private credentials inside the core app;
report source status clearly;
create only actionable work.
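
As a sketch of what such a contract might look like, here is a small `typing.Protocol` with a toy in-memory adapter. Everything named here (`SourceAdapter`, `RawItem`, `ListAdapter`, `collect`, the `token_provider` callable) is hypothetical, and the keyword check in `is_actionable` is a placeholder for real triage.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Protocol


@dataclass(frozen=True)
class RawItem:
    source_id: str
    text: str
    received_at: datetime


class SourceAdapter(Protocol):
    name: str

    def fetch_since(self, cutoff: datetime) -> Iterable[RawItem]:
        """Read only items received at or after the cutoff."""
        ...

    def is_actionable(self, item: RawItem) -> bool:
        """Decide whether an item deserves to become a task."""
        ...


class ListAdapter:
    """Toy adapter over an in-memory list, standing in for a real connector."""
    name = "demo"

    def __init__(self, items: Iterable[RawItem], token_provider: Callable[[], str]):
        self._items = list(items)
        # The secret is fetched through a callable, so the core app never stores it.
        self._token_provider = token_provider

    def fetch_since(self, cutoff: datetime) -> list[RawItem]:
        self._token_provider()  # a real adapter would authenticate here
        return [item for item in self._items if item.received_at >= cutoff]

    def is_actionable(self, item: RawItem) -> bool:
        return "please" in item.text.lower()  # placeholder heuristic


def collect(adapter: SourceAdapter, cutoff: datetime, seen_ids: set[str]) -> list[RawItem]:
    """Dedupe against already-seen ids before anything becomes a task."""
    return [
        item for item in adapter.fetch_since(cutoff)
        if item.source_id not in seen_ids and adapter.is_actionable(item)
    ]
```

Because `SourceAdapter` is a structural protocol, a Slack or Gmail adapter would only need to provide the same two methods; it would not have to inherit from anything in the core app.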

The current takeaway

After dogfooding the project, I think the useful abstraction is not just a task
tracker.

It is a human/AI work surface with explicit boundaries:

what the human owns;
what an AI agent can take;
what waits for review;
what is blocked;
which source was checked;
which source needs reconnecting;
what question needs a human answer.

That is where AI agents become more practical for real work.

Not by pretending to be autonomous everywhere.

By knowing when to act, when to dedupe, when to report, and when to ask.

I would like feedback from people building with coding agents or personal
automation:

Which adapter would be most useful first: Slack, Telegram, Gmail, Linear, Jira,
Asana, YouTrack, Trello, or Calendar?

And where should the line be between "agent can act" and "agent must ask"?

Top comments (2)

HARD IN SOFT OUT • Jun 13

This is such a grounded take. The shift from “store everything” to “decide what deserves to be a task” is the kind of boring insight that actually saves weekends. (Also, the bit about follow‑up messages inside old blocker threads — I felt that one in my notifications.)

A couple of things that might be worth poking at:

The “unavailable” source status is great, but I've seen similar systems where auth expires silently and users only notice days later. Would a simple heartbeat check — a periodic ping to each source's auth endpoint — let you flip the status proactively? Maybe even trigger a reauth flow via a webhook, so the assistant doesn't just report “unavailable” but actually tries to fix it.
Dedupe against completed tasks is smart, but what about tasks that were rejected (manually closed as “not actionable”)? If the same Slack thread surfaces again, the assistant might re‑create the same noise. A small “ignore list” of permanently skipped source‑item IDs could save future cycles.

One tiny suggestion (almost a joke): the “boring adapters” are exactly what every tool claims to have but secretly struggles with. If you ever open‑source a reference adapter with the full status‑reporting contract, I'd happily steal it. 😄

I asked my AI to triage my inbox.

It created 14 tasks, flagged 2 as “unavailable”, and asked for clarification on “lunch with mom”.

I closed the laptop. The AI sent a follow‑up: “Task ‘close laptop’ is blocked — please specify which hand you used.”

Anyway, this is the kind of real‑world engineering that rarely gets written up. Thanks for sharing the actual lessons, not just the demo.

Filipp Mishchenko • Jun 29

Thanks, this is very useful feedback.

I agree on both points. Source health should probably be proactive, not only reported after a failed run. A small heartbeat per adapter makes sense: last successful check, auth status, last error, and whether the system can suggest or trigger a reauth flow.

The rejected/not-actionable case is also a good catch. Dedupe against completed tasks is not enough if the same source item keeps resurfacing. I think the adapter contract should include a durable source item id plus a terminal decision like ignored/rejected, so the system can avoid recreating the same noise unless the source item materially changes.

And yes, the reference adapter is probably the next useful artifact. A boring adapter that shows source status, dedupe, ignored items, and normalized task creation would explain the idea better than another generic demo.