mlaminekane

Hawkeye update: multi-agent orchestration, remote tasks, and local model support

A few weeks ago, I posted Hawkeye here for the first time.

At the time, the core idea was simple:

Hawkeye is a flight recorder for AI agents.

It records what an agent does in a repo, helps detect drift, adds guardrails, and gives you a dashboard to inspect what happened when a run goes wrong.

Since then, I have shipped a much bigger update. Hawkeye now feels less like a passive recorder and more like an operational layer for agent workflows.

What Hawkeye is trying to solve

Once agents start doing real work in a repository, basic questions become surprisingly hard to answer:

  • What exactly did the agent do?
  • When did it start drifting?
  • Why did it fail?
  • What files did it touch?
  • How much did that run cost?
  • Was this run better or worse than the last one?

That is the problem Hawkeye is built for.

It is still local-first, still SQLite-backed, and still focused on making agent work inspectable instead of magical.
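To make those questions concrete, here is a minimal sketch of how a recorded run could be reduced to answers. The `RunEvent` shape and the `summarizeRun` helper are assumptions made for illustration; they are not Hawkeye's actual schema or API.

```typescript
// Hypothetical event shape for a recorded run; Hawkeye's real schema may differ.
type RunEvent =
  | { kind: "file_write"; path: string; at: number }
  | { kind: "llm_call"; costUsd: number; at: number }
  | { kind: "error"; message: string; at: number };

interface RunSummary {
  filesTouched: string[];
  totalCostUsd: number;
  firstError: string | null;
}

// Reduce a list of events to "what files, how much, and where did it first fail".
function summarizeRun(events: RunEvent[]): RunSummary {
  const files = new Set<string>();
  let totalCostUsd = 0;
  let firstError: string | null = null;

  // Sort a copy by timestamp so "first error" is well defined.
  const ordered = events.slice().sort((a, b) => a.at - b.at);
  for (const e of ordered) {
    if (e.kind === "file_write") {
      files.add(e.path);
    } else if (e.kind === "llm_call") {
      totalCostUsd += e.costUsd;
    } else if (firstError === null) {
      firstError = e.message;
    }
  }

  return { filesTouched: Array.from(files).sort(), totalCostUsd, firstError };
}
```

In a SQLite-backed tool, the same summary would more likely come from a query over an events table; the in-memory version only shows the shape of the answer.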

What changed since the first post

1. Multi-agent orchestration

This is probably the biggest shift.

Hawkeye now has a real Swarm mode for coordinating multiple agents in parallel.

Instead of treating each run as an isolated session, you can now work with multiple agents at once and monitor them from one place.

What that unlocks:

  • multiple agents working in parallel
  • isolated responsibilities
  • live output monitoring
  • live drift and cost visibility
  • conflict-aware orchestration
  • a better overview of what the whole room is doing

The interesting part is not just “spawn more agents”.
It is being able to see what each one is doing and keep the system legible while they are running.

That became a recurring theme while building Hawkeye:
the moment you have more than one agent, visibility matters more than raw generation quality.
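One way to picture the "conflict-aware" part is a check for files that more than one agent has touched. The sketch below assumes a simple map from agent name to file paths; `AgentFiles` and `findConflicts` are hypothetical names, and Hawkeye's own conflict handling may work quite differently.

```typescript
// Hypothetical input: the files each agent has claimed or modified so far.
type AgentFiles = Record<string, string[]>;

// Return only the contested files, mapped to the agents touching them.
function findConflicts(agents: AgentFiles): Map<string, string[]> {
  const owners = new Map<string, string[]>();

  for (const agent of Object.keys(agents)) {
    // De-duplicate so an agent touching a file twice does not conflict with itself.
    const unique = Array.from(new Set(agents[agent]));
    for (const file of unique) {
      const list = owners.get(file) ?? [];
      list.push(agent);
      owners.set(file, list);
    }
  }

  const conflicts = new Map<string, string[]>();
  owners.forEach((list, file) => {
    if (list.length > 1) conflicts.set(file, list);
  });
  return conflicts;
}
```

An orchestrator could run a check like this before letting a second agent write to a path, and surface the result in the live view instead of discovering the clash at merge time.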

2. Remote tasks + daemon

I also added a stronger task workflow.

You can now queue tasks, let them run in the background or overnight, and inspect the results from the dashboard.

That includes:

  • a task daemon
  • retry flows
  • cancel support
  • better failure reporting
  • more useful output handling
  • cleaner separation between live execution and finished results

This matters because a lot of real agent usage is not interactive.
Sometimes you want to fire off a job, come back later, and understand what happened without tailing a terminal the whole time.
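The retry and cancel flows can be thought of as a small state machine per task. The following is an illustrative sketch under assumed names (`Task`, `TaskEvent`, `step`); it is not taken from the Hawkeye daemon.

```typescript
type TaskState = "queued" | "running" | "succeeded" | "failed" | "cancelled";
type TaskEvent = "start" | "success" | "failure" | "cancel";

// Hypothetical task record; field names are assumptions for this sketch.
interface Task {
  id: string;
  state: TaskState;
  attempts: number;
  maxAttempts: number;
}

// Pure transition function: returns the next task record for a given event.
function step(task: Task, event: TaskEvent): Task {
  const finished =
    task.state === "succeeded" || task.state === "failed" || task.state === "cancelled";
  // Finished results are never mutated, which keeps them separate from live execution.
  if (finished) return task;

  switch (event) {
    case "cancel":
      return { ...task, state: "cancelled" };
    case "start":
      return task.state === "queued"
        ? { ...task, state: "running", attempts: task.attempts + 1 }
        : task;
    case "success":
      return task.state === "running" ? { ...task, state: "succeeded" } : task;
    case "failure":
      if (task.state !== "running") return task;
      // Re-queue while attempts remain; otherwise the failure is final.
      return { ...task, state: task.attempts < task.maxAttempts ? "queued" : "failed" };
  }
}
```

Keeping the transition pure makes it easy to persist each state change and to replay a task's history later from the dashboard.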

3. Better local model support

Hawkeye now supports Ollama and LM Studio more cleanly across the product.

That includes:

  • local runtime configuration
  • local model selection in the relevant CLI flows
  • better handling in task/agent workflows
  • cleaner integration with the dashboard settings

This was important to me because I did not want Hawkeye to be tied to one cloud provider or one commercial runtime.

The goal is:
if you want to run a local model, Hawkeye should still give you observability, drift detection, and control.
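As a rough sketch of what local runtime configuration involves: Ollama and LM Studio both expose an OpenAI-compatible HTTP API on localhost (port 11434 and port 1234 by default, respectively), so a tool can resolve one endpoint per runtime. The `RuntimeConfig` type and `chatCompletionsUrl` helper below are hypothetical and do not reflect Hawkeye's real settings format; the model name in the usage note is only a placeholder.

```typescript
type LocalRuntime = "ollama" | "lmstudio";

// Hypothetical config shape; not Hawkeye's actual settings schema.
interface RuntimeConfig {
  runtime: LocalRuntime;
  model: string;
  baseUrl?: string; // optional override, e.g. a server on another machine
}

// Default OpenAI-compatible base URLs for each local runtime.
const DEFAULT_BASE_URLS: Record<LocalRuntime, string> = {
  ollama: "http://localhost:11434/v1",
  lmstudio: "http://localhost:1234/v1",
};

// Resolve the chat completions endpoint, tolerating trailing slashes in overrides.
function chatCompletionsUrl(cfg: RuntimeConfig): string {
  const base = (cfg.baseUrl ?? DEFAULT_BASE_URLS[cfg.runtime]).replace(/\/+$/, "");
  return `${base}/chat/completions`;
}
```

For example, `chatCompletionsUrl({ runtime: "ollama", model: "llama3" })` resolves to the default Ollama endpoint, while passing `baseUrl` points the same config at a different host.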

4. The dashboard is much more capable now

A lot of work in the past weeks went into turning the dashboard into something you can actually operate from, not just inspect after the fact.

The biggest improvements were around:

  • Compare

    proper visual comparison between runs, links to sessions, export, top cost files, cleaner highlighting

  • Firewall

    better initial loading, clearer live feed behavior, review feedback, clear/reset actions, less stale state

  • Tasks

    retry, cancel, daemon status, clearer runtime selection, better handling of provider/runtime errors

  • Agents

    follow-ups while running, relaunch/clone flows, better runtime choices, cleaner launch studio

  • Settings

    cleaner structure, more obvious save feedback, better local provider handling

A lot of this work was not “new feature” work.
It was cleanup.
Removing confusion.
Making the product more coherent when you actually use it every day.

5. GitHub PR reporting

I also pushed the GitHub reporting side further.

hawkeye ci can now post a structured report back to a PR, including things like:

  • drift
  • cost
  • files touched
  • run summary

That closes an important loop for me:
not just what happened while the agent ran, but also how that run gets communicated back into a normal engineering workflow.
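To give a feel for what a structured PR report might contain, here is a small sketch that renders those four items as a Markdown comment. The `RunReport` fields (including a numeric `driftScore`) and the `renderPrComment` function are assumptions for illustration; the actual output of hawkeye ci may look different.

```typescript
// Hypothetical report shape; field names are assumptions for this sketch.
interface RunReport {
  summary: string;
  driftScore: number; // assumed to be a number where lower means less drift
  costUsd: number;
  filesTouched: string[];
}

// Render the report as a Markdown comment body suitable for a pull request.
function renderPrComment(r: RunReport): string {
  const files = r.filesTouched.length
    ? r.filesTouched.map((f) => `- \`${f}\``).join("\n")
    : "- (none)";

  return [
    "## Agent run report",
    "",
    r.summary,
    "",
    "| Metric | Value |",
    "| --- | --- |",
    `| Drift | ${r.driftScore.toFixed(2)} |`,
    `| Cost | $${r.costUsd.toFixed(2)} |`,
    "",
    "### Files touched",
    files,
  ].join("\n");
}
```

The resulting string could then be posted with whatever GitHub client a CI job already uses.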

6. Cleaner runtime story

Another thing I spent time on was making the runtime choices more honest.

Some agent runtimes are great for broad repo understanding.
Some are better for focused patching.
Some local models work fine for one task and fall apart for another.

A lot of recent work went into making Hawkeye expose those tradeoffs more cleanly instead of pretending all runtimes behave the same.

That also meant removing or de-emphasizing flows that were creating more confusion than value.

What did not change

The core philosophy is still the same:

  • local-first
  • SQLite
  • no Hawkeye cloud dependency
  • useful even if the underlying agent changes

Right now Hawkeye is most aligned with:

  • Claude Code
  • Codex
  • Cline
  • custom agent CLIs

That feels like a better, cleaner foundation than trying to support everything equally at once.

What I learned while building this

A few things became clearer over time.

Observability matters more than autonomy

People talk a lot about how capable agents are becoming.

But once an agent starts touching a real codebase, debuggability becomes just as important as capability.

Raw autonomy without visibility stops being impressive very quickly.

Local-first is still worth it

Keeping Hawkeye local-first forced some tradeoffs, but I still think it is the right call.

A lot of people experimenting seriously with agent workflows do not want another hosted black box sitting on top of their existing black boxes.

UX matters a lot in agent tooling

A huge amount of this update was not about adding “AI”.
It was about removing friction:

  • too much noise
  • stale state
  • misleading labels
  • unclear provider behavior
  • heavy layouts
  • confusing runtime defaults

That work is less flashy, but it matters more than it looks.

What still needs work

There is still a lot I want to improve.

A few areas I am actively thinking about:

  • even better runtime selection defaults
  • stronger live output UX across all task types
  • more polished reporting/export flows
  • further CLI cleanup
  • more consistency between task, agent, and session workflows

So this is definitely not “done”.
But it feels much more like a real product now than it did in the first post.

Try it

GitHub:

github.com/MLaminekane/hawkeye

npm:

https://www.npmjs.com/package/hawkeye-ai

Install:


```bash
npm install -g hawkeye-ai
```
