Your Agent Doesn't Know When to Stop. Give It a Deadline.

#hermeschallenge #ai #python #agents

The user clicked "stop." The button turned gray. The spinner kept spinning for two more minutes.

The agent was mid-loop. The cancellation signal came in but nothing in the loop checked for it. Each tool call, each LLM request, each retry went through as planned. The agent was polite about finishing what it started.

The problem: there was no deadline. No shared deadline that both the agent loop and the tool calls could check. Just a flag somewhere in the outer handler that nobody inside the loop knew about.

The Shape of the Fix

from agent_deadline import Deadline, DeadlineExceeded

# Create a deadline 30 seconds from now
deadline = Deadline.from_now(seconds=30)

for step in plan:
    deadline.check_or_raise()  # raises DeadlineExceeded if expired

    # Pass sub-deadline to nested calls
    sub = deadline.intersect(Deadline.from_now(seconds=5))
    result = call_tool(step, deadline=sub)

    results.append(result)

check_or_raise() is a single call that fits anywhere in the loop. When the deadline expires, it raises DeadlineExceeded. Your loop stops at the next check point. No threads. No signals. Cooperative cancellation.

What It Does NOT Do

agent-deadline does not cancel inflight network requests. If your LLM call is already in progress when the deadline expires, that call runs to completion. The deadline only stops the loop from starting the next step.

It does not integrate with asyncio.timeout or threading.Event. It is a pure monotonic-clock check. For true async cancellation, you need asyncio.timeout or similar. This library handles the cooperative case: a well-behaved agent loop that checks before each step.

It does not store state across process restarts. A deadline is a runtime object. If the process dies and restarts, the new process needs to create a new deadline.

Inside the Library

The deadline is a monotonic timestamp plus a few helpers:

import time

class Deadline:
    def __init__(self, expires_at: float):
        self._expires_at = expires_at  # monotonic time

    @classmethod
    def from_now(cls, seconds: float) -> "Deadline":
        return cls(time.monotonic() + seconds)

    def check_or_raise(self) -> None:
        if time.monotonic() >= self._expires_at:
            raise DeadlineExceeded(f"Deadline exceeded at {self._expires_at}")

    def remaining_seconds(self) -> float:
        return max(0.0, self._expires_at - time.monotonic())

    def intersect(self, other: "Deadline") -> "Deadline":
        return Deadline(min(self._expires_at, other._expires_at))

    def is_expired(self) -> bool:
        return time.monotonic() >= self._expires_at

intersect() is the key method for nested calls. If the outer deadline is 10 seconds from now and you want each tool call to have at most 5 seconds, intersect(Deadline.from_now(5)) gives you a deadline that expires at whichever comes first.

Monotonic clock matters: wall clock can go backward (NTP adjustments, sleep/wake). Monotonic never goes backward. Deadlines built on wall clock can spuriously expire or never expire after a clock jump.

The 27 tests cover creation, check_or_raise() before and after expiry, remaining_seconds(), intersect() (outer shorter, inner shorter), is_expired(), and the zero-remaining edge case.

When to Use It

Use it for any agent loop that can run for a variable amount of time and needs to respect an external time bound. User-facing latency SLAs. Background job time windows. Per-request server timeouts.

Pass deadlines through your call stack rather than checking a global. This keeps the logic local: each function that accepts a deadline knows it needs to cooperate. Functions that do not accept a deadline will not cooperate, which is an honest expression of the design.

Skip it if your agent loop has a fixed upper bound from other constraints. If llm-stop-conditions already caps iterations and each iteration has a known max latency, you may have an implicit deadline without needing an explicit one.

Install

pip install git+https://github.com/MukundaKatta/agent-deadline

from agent_deadline import Deadline, DeadlineExceeded

def run_agent(task: str, timeout_seconds: float = 60.0) -> str:
    deadline = Deadline.from_now(seconds=timeout_seconds)
    results = []

    while not is_complete(results):
        deadline.check_or_raise()

        plan_deadline = deadline.intersect(Deadline.from_now(seconds=10))
        step = plan_next_step(task, results, deadline=plan_deadline)

        exec_deadline = deadline.intersect(Deadline.from_now(seconds=15))
        result = execute_step(step, deadline=exec_deadline)

        results.append(result)

    return summarize(results)

try:
    output = run_agent("Analyze this dataset", timeout_seconds=45.0)
except DeadlineExceeded:
    output = "Task timed out. Partial results: ..."

Sibling Libraries

Library	What it solves
`llm-stop-conditions`	Composable stop conditions (MaxIters, MaxUsd, MaxTokens, NoProgress)
`agent-loop-bound`	Hard cap on loop iteration count
`tool-timeout-wrap`	Per-tool timeout enforcement
`llm-cost-cap`	Pre-flight USD cost gate
`token-budget-pool`	Shared token/USD budget across concurrent agents

The practical combination: agent-deadline for wall-clock bounds, llm-stop-conditions for semantic stop conditions (no progress, max cost), and tool-timeout-wrap for per-call network timeouts. Together they give you three layers of loop control.

What's Next

The async story needs work. Right now check_or_raise() is synchronous. An async loop that awaits each step needs to check the deadline between awaits, which works, but an async_check_or_raise() that integrates with asyncio.CancelledError would be cleaner.

Propagation helpers would be useful. Right now you thread the deadline through your call stack manually. A context-variable approach (Deadline.set_current() / Deadline.get_current()) would let helper functions check the current deadline without it being passed explicitly. The risk: implicit context is harder to reason about. The benefit: libraries that do not take a deadline argument can still participate. Both patterns can coexist.

A with deadline: context manager that automatically calls check_or_raise() on exit from each block is another direction. Less manual than calling check_or_raise() in every loop iteration.

Built as part of the agent-stack family: composable Python primitives for production LLM agents.