Shane Johnson

Java Agent Frameworks Are Code-First. Agentican Isn't.

I've been immersed in agentic workflows for the past year, and I had a persistent feeling that something was missing — not in any particular product, but in how the field was thinking about agents.

Most of the conversation was about chatbots and assistants. Which is fine, but it's a narrow slice of what agents can do. What I kept coming back to was the idea of an AI-native workforce — not AI as a tool you prompt, but AI as colleagues you delegate to. Agents that can handle complex, multi-step tasks, coordinate with each other, use tools as needed, check in with humans, and work reliably.

So I started building an AI workforce platform. I kept running into problems that existing frameworks hadn't addressed — not because they were bad, but because they were built with different goals in mind. I open-sourced what I'd learned as the Agentican Framework, a multi-agent framework for Java.

Java was a no-brainer. I'm a former Red Hatter, so Java is home. And Java has always lagged Python in AI — I want to help change that.

There are agent frameworks for Java. They just weren't what I had in mind.

Most of them — LangChain4j and Embabel are good examples — are code-first. Agents are annotated classes or interfaces. Workflows are built programmatically. Everything lives in the application. That's a reasonable approach, but it couples the definition of agents and workflows to the applications that run them. Every service that needs a Market Analyst defines its own. Every workflow that involves one is a new class hierarchy. The more you build, the more you duplicate.

The model I kept coming back to was simpler: agents, skills, and plans should exist in repositories, independent of any application. Define a Market Analyst. Let it participate in any workflow that needs it, across any service in the organization. Same for plans — a Competitive Research workflow shouldn't be owned by the application that first needed it. It should live somewhere it can be versioned, reused, and evolved independently. Separation of concerns, applied to agentic systems.

This is the core idea behind Agentican. Agents, skills, and plans — and increasingly tools, via MCP and platforms like Composio — are managed as first-class artifacts outside the application. The framework accesses them through repositories (in-memory or persistent) and makes embedding and running them in your service nearly codeless. Extension points exist where they should — custom code steps, custom tools, custom agent loops, custom event listeners — but orchestration is the framework's job. The focus shifts from building agents to embedding them.

Agentican

Here's the fast path to see it in action:

try (var agentican = AgenticanRuntime.builder()
        .llm(LlmConfig.builder()
            .apiKey(System.getenv("ANTHROPIC_API_KEY"))
            .build())
        .build()) {

    var researchTask = agentican.run("Research the top 5 LLMs");

    System.out.println(researchTask.result().output());
}

No agents, skills, or tools are defined. The built-in Planner reads the task description, creates or reuses the agents and skills needed, chooses from available tools, builds a plan, and executes it.

It's the shortest path from "I wish agents could do this" to agents doing it — but it's not how most teams will use the framework in production. There, you maintain a catalog of agents, skills, and plans — and embed them into your services with next to no code:

@Inject @AgenticanPlan("Competitive Research")
Agentican<ResearchParams, ResearchSummary> competitiveResearch;

var researchParams = new ResearchParams("realtime CDC");

ResearchSummary researchSummary = 
        competitiveResearch.runAndAwait(researchParams);

Here's how the Competitive Research plan is defined.

Agentic workflows

A competitive research workflow that serves product, marketing, and sales. Product wants to know if market changes warrant a strategic response. Marketing wants per-competitor positioning intel. Sales wants to know who they're likely to run into and what to say.

Agents and skills

Agents and skills are defined independently. Multiple agents can share the same skill, and an agent doesn't need every skill for every task — skills are assigned at the step level when defining the plan.

Define agents and skills in application.yaml:

agentican:
  llm:
    - name: default
      provider: anthropic
      api-key: ${ANTHROPIC_API_KEY}
      model: claude-sonnet-4-5

  agents:
    - external-id: market-analyst
      name: Market Analyst
      role: |
        Analyst on a corporate research team. Profiles companies, maps
        market categories, and produces structured findings for internal
        decision-makers. Works from evidence; comfortable saying "I don't
        know" when the data is thin.

    - external-id: research-manager
      name: Research Manager
      role: |
        Senior contributor on a research team. Takes multiple analysts'
        findings and combines them into briefings fit for leadership
        consumption. Prioritizes signal and structure over
        comprehensiveness.

    - external-id: product-strategy-manager
      name: Product Strategy Manager
      role: |
        Strategic partner embedded with product leadership. Evaluates the
        competitive landscape and translates observation into actionable
        recommendations on roadmap, positioning, and bets. Names tradeoffs
        explicitly; does not hedge when a call is needed.

    - external-id: communications-manager
      name: Communications Manager
      role: |
        Owns outbound messaging on behalf of leadership. Decides how
        findings reach stakeholders — who, when, through what channel, in
        what form. Final quality gate before anything ships externally.

  skills:
    - external-id: primary-source-preference
      name: Primary Source Preference
      instructions: |
        Prefer primary sources: press releases, product pages, engineering
        blogs, SEC filings, analyst-report abstracts. Cite sources inline
        with URLs when possible. Only use secondary coverage when no primary
        source exists, and label it as such. Do not fabricate sources or
        quotes.

    - external-id: momentum-signals
      name: Momentum Signals
      instructions: |
        Evaluate momentum from concrete signals: funding rounds in the last
        12 months, headcount growth, named customer logos, analyst coverage,
        meaningful product releases. Discount vanity signals (press mentions,
        award lists, "fastest-growing" claims) unless corroborated.

    - external-id: tiered-comparison
      name: Tiered Comparison
      instructions: |
        When comparing multiple entities, organize them into meaningful tiers
        before narrating. Group items that share a core property, surface
        what distinguishes one tier from another, and call out outliers
        within a tier. Favor bullets or tables for the inventory; reserve
        prose for synthesis.

    - external-id: threat-rubric
      name: Threat Rubric
      instructions: |
        When classifying competitive threat level, apply this rubric:
          high-threat: clear product overlap with our roadmap AND at least
            one of (recent Series B+ funding, named enterprise wins,
            2× headcount growth in the past year)
          low-threat: otherwise
        When in doubt, choose low-threat.

    - external-id: exec-memo-voice
      name: Exec Memo Voice
      instructions: |
        When writing for a busy executive audience: one page max, open with
        the bottom line in a single sentence, bullets for evidence, close
        with concrete recommendations when the memo calls for action. No
        hedging, no marketing language.

    - external-id: multi-channel-publishing
      name: Multi Channel Publishing
      instructions: |
        Adapt content to the delivery channel. Email: full artifact,
        professional tone. Slack: 1-2 sentence teaser plus a link to the
        archive; conversational but concrete. Notion (or other archival
        surface): full artifact, titled for searchability. Never send the
        same text verbatim across surfaces.

Tools

Agents are only as capable as the tools available to them. Agentican supports four kinds of tools:

  1. Provider tools. Most major providers — Anthropic, OpenAI, Google — now ship built-in tools like web search.
  2. MCP servers. The Model Context Protocol has become the standard for exposing tools (Notion, Slack, Linear, GitHub, and more).
  3. Composio. Gives agents access to hundreds of tools across the SaaS ecosystem — Gmail, Salesforce, HubSpot, Jira, and more.
  4. Custom toolkits. A Toolkit API lets developers build their own tools and make them available to any agent.

MCP servers and Composio credentials are configured in application.yaml alongside the agents and skills:

agentican:
  # llms...
  # agents...
  # skills...

  mcp:
    - slug: slack
      endpoint: https://mcp.slack.com/mcp
      headers:
        Authorization: Bearer ${SLACK_MCP_TOKEN}

    - slug: notion
      endpoint: https://mcp.notion.com/mcp
      headers:
        Authorization: Bearer ${NOTION_MCP_TOKEN}

  composio:
    api-key: ${COMPOSIO_API_KEY}
    user-id: ${COMPOSIO_USER_ID}
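The post doesn't show the Toolkit API itself, so here's a rough, self-contained sketch of the idea behind a custom tool: a name, a description the model sees, and a deterministic execution function over named arguments. The `Tool` record and `count_words` tool below are hypothetical stand-ins, not Agentican types — the real API's shape will differ:

```java
import java.util.Map;
import java.util.function.Function;

public class WordCountToolkit {

    // Minimal stand-in for a tool: a name, a description for the LLM, and
    // an execution function over named arguments. Hypothetical shape only.
    record Tool(String name, String description, Function<Map<String, Object>, Object> execute) {}

    // A deterministic tool an agent could call instead of estimating.
    static final Tool COUNT_WORDS = new Tool(
            "count_words",
            "Count the number of whitespace-separated words in a text",
            args -> args.get("text").toString().trim().split("\\s+").length);

    public static void main(String[] args) {
        // 4 words in the sample text.
        System.out.println(COUNT_WORDS.execute().apply(Map.of("text", "realtime CDC market overview")));
    }
}
```

The point is the separation: the description is prompt-facing metadata, while the function body stays plain, testable Java.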

With agents, skills, and tools configured, it's time to define the plan.

Plan

A plan is a named, versioned workflow — params, a sequence of steps, and an output-step that designates which step's result becomes the typed return value. Plans are registered at runtime and can be invoked by name, reused across the application, and updated through config without code changes.

agentican:
  # llms...
  # agents...
  # skills...
  # tools...

  plans:
    - external-id: competitive-research
      name: Competitive Research
      description: Research the competitive landscape for a category
      output-step: Review Brief   # forward reference
      params:
        - name: category
          description: Category to analyze
          default-value: realtime CDC
          required: false
      steps:
        # filled in below

Agent steps

Agent steps are the core building block. Each one assigns a task to a named agent, optionally activating a subset of that agent's skills for that step only. Steps with no dependencies — or whose dependencies are already satisfied — run in parallel.

steps:
  - name: Find Leaders
    type: agent
    agent: Market Analyst
    skills: [Primary Source Preference]
    instructions: |
      Identify 3-5 established leaders in {{param.category}}.
      Respond with a JSON array of objects with two fields each:
        { "name": "Company", "summary": "1-2 lines on what they sell and market position" }

  - name: Find Challengers
    type: agent
    agent: Market Analyst
    skills: [Primary Source Preference, Momentum Signals]
    instructions: |
      Identify 3-5 emerging or recently-funded challengers in {{param.category}}.
      Respond with the same JSON shape as above:
        { "name": "Company", "summary": "1-2 lines on what makes them notable" }

  - name: Synthesize Findings
    type: agent
    agent: Research Manager
    skills: [Tiered Comparison]
    dependencies: [Find Leaders, Find Challengers]
    instructions: |
      Synthesize findings on {{param.category}}.

      Leaders (JSON):
      {{step.Find Leaders.output}}

      Emerging (JSON):
      {{step.Find Challengers.output}}

      Synthesize into a coherent one-page overview, grouping by leader vs emerging.

Find Leaders and Find Challengers have no dependencies, so they run in parallel. Synthesize Findings waits for both.

Code steps

Not everything should go through an LLM. Code steps let you drop into typed Java for deterministic work — data transformation, API calls, computation — and wire the result back into the plan graph. They implement a simple CodeStep<I, O> interface; input and output are any Jackson-serializable types.

Here we merge two separate JSON arrays into one deduplicated list, ready to iterate. An LLM could do this, but you'd pay for a model call per run and risk hallucinated duplicates.

import java.util.LinkedHashMap;
import java.util.List;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
// plus the framework's CodeStep and StepContext imports

record MergeLists(String leaders, String emerging) {}
record Competitor(String name, String summary) {}

public class MergeCompetitorLists implements CodeStep<MergeLists, List<Competitor>> {

    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final TypeReference<List<Competitor>> TYPE = new TypeReference<>() {};

    @Override
    public List<Competitor> execute(MergeLists in, StepContext ctx) throws Exception {

        // Keyed by name; putIfAbsent keeps the first occurrence, deduplicating
        // companies that appear in both lists.
        var merged = new LinkedHashMap<String, Competitor>();

        MAPPER.<List<Competitor>>readValue(in.leaders(),  TYPE).forEach(c -> merged.putIfAbsent(c.name(), c));
        MAPPER.<List<Competitor>>readValue(in.emerging(), TYPE).forEach(c -> merged.putIfAbsent(c.name(), c));

        return List.copyOf(merged.values());
    }
}

Reference it from the plan by slug:

steps:
  # Find Leaders
  # Find Challengers
  # Synthesize Findings

  - name: Merge Competitors
    type: code
    code-slug: merge-lists
    dependencies: [Find Leaders, Find Challengers]
    code-input:
      leaders: "{{step.Find Leaders.output}}"
      emerging: "{{step.Find Challengers.output}}"

Merge Competitors runs concurrently with Synthesize Findings — both depend on the research steps and nothing else. Its output is a List<Competitor> serialized as a JSON array, ready to iterate.
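One subtlety worth noting: because the merge uses putIfAbsent and processes the leaders list first, the Leaders entry wins when a company appears in both lists. A stdlib-only sketch of that first-wins behavior (no Jackson; class and data are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.List;

public class FirstWinsMerge {

    record Competitor(String name, String summary) {}

    // putIfAbsent keeps the first entry per name, so the leaders list takes
    // precedence when a company shows up in both lists.
    static List<Competitor> merge(List<Competitor> leaders, List<Competitor> emerging) {
        var merged = new LinkedHashMap<String, Competitor>();
        leaders.forEach(c -> merged.putIfAbsent(c.name(), c));
        emerging.forEach(c -> merged.putIfAbsent(c.name(), c));
        return List.copyOf(merged.values());
    }

    public static void main(String[] args) {
        var leaders  = List.of(new Competitor("Acme", "established leader"));
        var emerging = List.of(new Competitor("Acme", "fast-moving challenger"),
                               new Competitor("Nimbus", "recently funded"));
        // "Acme" keeps its leaders-list summary; "Nimbus" is appended.
        merge(leaders, emerging).forEach(c -> System.out.println(c.name() + ": " + c.summary()));
    }
}
```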

Loop steps

Loop steps fan out over a JSON array, running the body once per element — all in parallel. Inside the body, {{item}} resolves to the current element and {{item.field}} to a specific field. The loop's aggregate output is available downstream as a single step reference.

steps:
  # Find Leaders
  # Find Challengers
  # Synthesize Findings
  # Merge Competitors

  - name: Research Competitors
    type: loop
    over: Merge Competitors
    steps:
      - name: Profile Competitor
        type: agent
        agent: Market Analyst
        skills: [Primary Source Preference, Momentum Signals]
        instructions: |
          Research {{item.name}} in the context of {{param.category}}:
          pricing model, GTM motion, recent funding or headcount signals,
          product differentiators.

          Build on what we already know:
          {{item.summary}}

          Write 3-5 sentences.

Each competitor gets its own Market Analyst run in parallel, seeded with what we already know from the merge step.
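Conceptually, a loop step behaves like fanning out one future per element and joining them into an ordered aggregate for downstream steps. A plain-Java sketch of those semantics — not the framework's actual implementation:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class LoopFanOut {

    // Roughly what a loop step does: run the body once per element, all in
    // parallel, then aggregate into a single output in input order.
    static List<String> profileAll(List<String> competitors) {
        var futures = competitors.stream()
                .map(name -> CompletableFuture.supplyAsync(() -> "profile of " + name))
                .toList();
        // Joining in stream order preserves the input ordering.
        return futures.stream().map(CompletableFuture::join).toList();
    }

    public static void main(String[] args) {
        System.out.println(profileAll(List.of("Acme", "Nimbus")));
        // [profile of Acme, profile of Nimbus]
    }
}
```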

Branch steps

Branch steps execute one of several named paths based on an upstream step's output. Each path is its own subgraph. When paths share a terminal step name — as they do here — downstream steps and output-step resolve cleanly regardless of which path ran.

steps:
  # Find Leaders
  # Find Challengers
  # Synthesize Findings
  # Merge Competitors
  # Research Competitors

  - name: Assess Threat
    type: agent
    agent: Product Strategy Manager
    skills: [Threat Rubric]
    dependencies: [Research Competitors]
    instructions: |
      Given these competitor deep dives:
      {{step.Research Competitors.output}}

      Reply with exactly two words: 'High Threat' or 'Low Threat'.

  - name: Route By Threat
    type: branch
    from: Assess Threat
    default-path: "Low Threat"
    path-configs:
      - path-name: "High Threat"
        steps:
          - name: Draft Memo
            type: agent
            agent: Product Strategy Manager
            skills: [Exec Memo Voice]
            instructions: |
              Write an urgent strategic memo.
              Research: {{step.Research Competitors.output}}

              Call out the 2-3 most pressing threats, why they matter now,
              and finish with 3 concrete recommendations for our product team.

      - path-name: "Low Threat"
        steps:
          - name: Draft Memo
            type: agent
            agent: Product Strategy Manager
            skills: [Exec Memo Voice]
            instructions: |
              Write a brief routine update.
              Research: {{step.Research Competitors.output}}

              Two paragraphs — no recommendations needed.

Human-in-the-loop

HITL in Agentican is a checkpoint, not a callback. Flag any agent step with hitl: true and the workflow suspends after that step completes — the step's output sits waiting, state is persisted, a checkpoint event fires over SSE, and nothing downstream proceeds until a human approves or rejects via the REST API. Suspended tasks survive restarts and resume exactly where they left off. No custom state management; just a flag.

Because HITL suspends after a step runs, the flag has to go on the step whose output you want approved — not on the step that acts on that output. Two different responsibilities, two steps:

First, a review step. Communications Manager applies the exec-comms guidelines (the Exec Memo Voice skill), produces the final shape of the brief, and the task suspends for human sign-off before anything goes further:

steps:
  # Find Leaders
  # Find Challengers
  # Synthesize Findings
  # Merge Competitors
  # Research Competitors
  # Assess Threat
  # Route By Threat
  # Draft Memo

  - name: Review Brief
    type: agent
    agent: Communications Manager
    skills: [Exec Memo Voice]
    dependencies: [Draft Memo]
    hitl: true
    instructions: |
      Review the draft memo against our exec-comms guidelines: one-line
      bottom line, specific-named evidence, no hedging, recommendations
      only when a call to action is warranted.

      Draft: {{step.Draft Memo.output}}

      Output the final brief as a ResearchSummary
      ({headline, assessment, recommendations}). If the draft already
      conforms, pass it through; otherwise, correct it.

When Review Brief finishes, the task transitions to SUSPENDED, its full state (plan graph, turn history, the draft itself) is persisted to Postgres, and a checkpoint event fires over the Quarkus REST SSE stream. A reviewer sees the draft in whatever UI you've built — the framework ships 18 REST endpoints for exactly this purpose — and either approves or rejects with feedback. On rejection, Review Brief re-runs with the reviewer's feedback appended to its instructions. A task can sit suspended across multiple deploys for days and resume the moment a human clicks approve.
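The post doesn't enumerate those endpoints, so treat the route below as purely illustrative. A sketch of what a reviewer UI's approval call might look like built with java.net.http, assuming a hypothetical /tasks/{id}/approve path:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ApproveCheckpoint {

    // Hypothetical endpoint: the actual REST routes aren't listed in the
    // post, so this URI is illustrative only.
    static HttpRequest approvalRequest(String taskId) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/tasks/" + taskId + "/approve"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"feedback\": null}"))
                .build();
    }

    public static void main(String[] args) {
        var req = approvalRequest("task-123");
        System.out.println(req.method() + " " + req.uri());
        // POST http://localhost:8080/tasks/task-123/approve
    }
}
```

A rejection call would look the same, with the reviewer's feedback in the body instead of null.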

This is also why the plan's output-step is Review Brief (set in the skeleton at the top): the reviewed, approved version is what the typed invoker parses into ResearchSummary. The framework attaches a JSON Schema generated from that record to Review Brief's LLM call via the provider's native structured-output mode, so the output is schema-constrained and Jackson-parsable.

Second, a publish step. No HITL — the brief has already been approved:

steps:
  # Find Leaders
  # Find Challengers
  # Synthesize Findings
  # Merge Competitors
  # Research Competitors
  # Assess Threat
  # Route By Threat
  # Draft Memo
  # Review Brief

  - name: Publish Brief
    type: agent
    agent: Communications Manager
    skills: [Multi Channel Publishing]
    tools: [gmail_send_email, slack_post_message, notion_create_page]
    dependencies: [Review Brief]
    instructions: |
      Distribute the approved brief across three channels:

      1. Email the exec-distro@ list with the full memo.
      2. Post a 1-2 sentence teaser to the #competitive-intel Slack channel.
      3. Create a page in the "Competitive Briefs" Notion database titled
         with today's date, body = the full memo.

      Brief:
      {{step.Review Brief.output}}

tools lists exactly which tool names this step is allowed to call — one from Composio (Gmail) and one from each of the two MCP servers (Slack and Notion). The registry resolves each tool name to its owning toolkit at dispatch, so the agent can't accidentally reach for any other tool in the registered toolkits.
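Conceptually, that dispatch is a name-to-toolkit lookup guarded by the step's allow-list. A hypothetical sketch — the real registry's API isn't shown in the post:

```java
import java.util.Map;
import java.util.Set;

public class ToolDispatchSketch {

    // Hypothetical mapping from tool name to owning toolkit.
    static final Map<String, String> TOOL_TO_TOOLKIT = Map.of(
            "gmail_send_email",   "composio",
            "slack_post_message", "mcp:slack",
            "notion_create_page", "mcp:notion");

    // Resolve a tool only if the step's allow-list permits it; anything
    // outside the list is rejected before it reaches a toolkit.
    static String resolve(String toolName, Set<String> allowedForStep) {
        if (!allowedForStep.contains(toolName)) {
            throw new IllegalArgumentException("Tool not allowed for this step: " + toolName);
        }
        return TOOL_TO_TOOLKIT.get(toolName);
    }

    public static void main(String[] args) {
        var allowed = Set.of("gmail_send_email", "slack_post_message", "notion_create_page");
        System.out.println(resolve("gmail_send_email", allowed));
        // composio
    }
}
```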

Two steps, two responsibilities, one HITL gate in the right place: humans approve before any email lands in an inbox.

That's the plan. The injection from the top of the post now has everything it needs:

@Inject @AgenticanPlan("Competitive Research")
Agentican<ResearchParams, ResearchSummary> competitiveResearch;

var researchParams = new ResearchParams("realtime CDC");

ResearchSummary researchSummary = 
        competitiveResearch.runAndAwait(researchParams);

Why this matters

The shift toward AI-native workflows is happening. Teams are already delegating real work to agents — research, analysis, drafting, routing, publishing. The question isn't whether to build agentic systems. It's how.

For Java teams, the answer has mostly been "assemble it yourself from lower-level primitives" or "wait." Agentican is a bet on a third option: a framework built around the idea that agents, skills, and plans are durable organizational assets — not application scaffolding. That the right model is declarative, repository-based, and designed to support workflows that span services, teams, and time.

It's alpha. The APIs are stabilizing. But the foundations are real and the direction is clear. If you're building agentic systems on the JVM — or thinking about it — I'd love for you to take a look.

Try it

<dependency>
    <groupId>ai.agentican</groupId>
    <artifactId>agentican-framework-core</artifactId>
    <version>0.1.0-alpha.2</version>
</dependency>

Repo, docs, and examples: github.com/Agentican/agentican-framework

Feedback welcome — open an issue, drop a comment, or find me on GitHub.
