Most "AI automation" demos fall apart the moment a workflow needs to run longer than a single request. An agent makes a few tool calls, the process crashes or times out, and you lose all state. I wanted something that could drive real, multi-step work inside Atlassian (Jira and Confluence) and survive restarts, retries, and failures. So I built an open-source platform around two ideas: MCP for tool access and Temporal for durable execution.
Repo: https://github.com/ahmet-ozel/atlassian-ai-workflow-platform
The problem with one-shot agents
A typical agent loop looks like: read a ticket, decide on an action, call a tool, repeat. This is fine for short tasks. It breaks down when a workflow spans minutes or hours, depends on external systems that fail intermittently, or needs to be resumed after a deploy. If your orchestration lives in a single Python process, any crash means you start over. For business workflows that touch real Jira issues, that is not acceptable.
Why MCP for tools
The Model Context Protocol (MCP) standardizes how an agent discovers and calls tools. Instead of hard-coding Jira API calls into the agent, I expose Jira and Confluence as MCP tools. The agent sees a clean, typed tool surface (create issue, transition status, search, comment, fetch a Confluence page) and the protocol handles the wiring.
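To make the "clean, typed tool surface" concrete, here is a minimal stdlib-only sketch of the idea. It is not the MCP SDK and not the repo's code: `Tool`, `ToolRegistry`, `jira_add_comment`, and `fake_add_comment` are hypothetical names, and the handler is a placeholder where a real Jira call would go. It only illustrates the two operations the agent relies on: discovering tools and calling one by name with validated arguments.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    """A tool as the agent sees it: a name, a description, and typed inputs."""
    name: str
    description: str
    input_schema: dict[str, type]
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical stand-in for an MCP server's tool surface (not the MCP SDK)."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def list_tools(self) -> list[dict[str, Any]]:
        # Discovery: the agent gets names, descriptions, and argument types only.
        return [
            {
                "name": t.name,
                "description": t.description,
                "input_schema": {k: v.__name__ for k, v in t.input_schema.items()},
            }
            for t in self._tools.values()
        ]

    def call_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        tool = self._tools[name]
        # Validate arguments against the declared schema before touching a backend.
        if set(arguments) != set(tool.input_schema):
            raise ValueError(f"{name} expects arguments {sorted(tool.input_schema)}")
        for key, expected in tool.input_schema.items():
            if not isinstance(arguments[key], expected):
                raise TypeError(f"{name}.{key} must be {expected.__name__}")
        return tool.handler(**arguments)


def fake_add_comment(issue_key: str, body: str) -> dict[str, str]:
    # Placeholder backend: a real handler would call the Jira REST API here.
    return {"issue_key": issue_key, "comment": body}


registry = ToolRegistry()
registry.register(
    Tool(
        name="jira_add_comment",
        description="Add a comment to a Jira issue.",
        input_schema={"issue_key": str, "body": str},
        handler=fake_add_comment,
    )
)

result = registry.call_tool(
    "jira_add_comment", {"issue_key": "DEMO-1", "body": "Triaged."}
)
```

Because the agent only ever sees `list_tools` and `call_tool`, swapping the placeholder handler for a real one does not change anything on the agent side.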
The practical benefit is decoupling. I can add or change tools without touching the agent logic, and the same tools work with any MCP-compatible client. It also keeps the agent prompt focused on intent rather than API mechanics.
Why Temporal for orchestration
Temporal gives you durable workflows. The workflow code looks like ordinary Python, but Temporal records the result of every completed step in the workflow's event history. If a worker dies, another worker replays that history and the workflow continues from the last completed step. Retries, timeouts, and backoff are declarative.
This maps well onto agent workflows. Each LLM call and each tool call becomes a Temporal activity. If an LLM provider rate-limits you or a Jira call fails, Temporal retries that single activity instead of rerunning the whole reasoning chain. Long-running approvals (wait for a human to review before transitioning a ticket) become a normal part of the workflow instead of a hack.
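The per-activity retry behavior can be illustrated with a small stdlib-only sketch. This is a toy model, not the Temporal SDK and not the repo's code: `run_activity`, its parameters, and the two fake activities are hypothetical, and in Temporal the equivalent policy is declared on the activity call rather than hand-written. The point is only that a failing Jira step is retried on its own while the earlier LLM step is not repeated.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_activity(
    fn: Callable[[], T],
    *,
    max_attempts: int = 3,
    initial_interval: float = 0.01,
    backoff_coefficient: float = 2.0,
) -> T:
    """Retry one side-effecting step with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= backoff_coefficient
    raise AssertionError("unreachable")


calls = {"llm": 0, "jira": 0}


def plan_with_llm() -> str:
    # Stand-in for an LLM call; succeeds on the first attempt.
    calls["llm"] += 1
    return "transition DEMO-1 to Done"


def flaky_jira_transition() -> str:
    # Stand-in for a Jira call that fails twice before succeeding.
    calls["jira"] += 1
    if calls["jira"] < 3:
        raise ConnectionError("simulated Jira outage")
    return "DEMO-1 -> Done"


plan = run_activity(plan_with_llm)
outcome = run_activity(flaky_jira_transition)
```

After this runs, the LLM step has executed once and the Jira step three times, which is the cost profile you want when LLM calls are the expensive part.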
The tradeoff is added infrastructure. Temporal is one more service to run, and you have to think in terms of deterministic workflow code versus side-effecting activities. For short, stateless tasks it is overkill. For anything that has to be reliable, it pays for itself quickly.
Architecture
The stack ties together a few pieces:
- An MCP integration layer that exposes Atlassian tools to the agent
- Temporal workers that run the durable workflows and activities
- A webhook gateway that turns Jira events into workflow triggers
- An admin dashboard plus a Streamlit UI for running and inspecting workflows
- Multi-provider LLM support (OpenAI, Anthropic, Gemini, and self-hosted vLLM)
Everything runs in a single Docker Compose stack, so you can bring the whole system up locally and see the moving parts together. Provider choice is config-driven, which makes it easy to swap a hosted model for a local one during development.
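As a rough illustration of config-driven provider choice, the sketch below reads the provider from environment-style settings. The variable names (`LLM_PROVIDER`, `LLM_MODEL`, `LLM_BASE_URL`), the `ProviderConfig` class, and the model names are assumptions made for this example; the repo's actual configuration keys may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Providers named in the architecture list above.
SUPPORTED_PROVIDERS = {"openai", "anthropic", "gemini", "vllm"}


@dataclass(frozen=True)
class ProviderConfig:
    provider: str
    model: str
    base_url: Optional[str] = None  # only needed for self-hosted endpoints


def load_provider_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Build a provider config from environment-style settings (hypothetical keys)."""
    provider = env.get("LLM_PROVIDER", "openai").lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")

    model = env.get("LLM_MODEL")
    if not model:
        raise ValueError("LLM_MODEL must be set")

    base_url = env.get("LLM_BASE_URL")
    if provider == "vllm" and not base_url:
        # A self-hosted server has no default address, so require one.
        raise ValueError("LLM_BASE_URL is required for vllm")

    return ProviderConfig(provider=provider, model=model, base_url=base_url)


# Development: point at a local vLLM server.
dev = load_provider_config(
    {
        "LLM_PROVIDER": "vllm",
        "LLM_MODEL": "example-local-model",
        "LLM_BASE_URL": "http://localhost:8000/v1",
    }
)

# Production: switch to a hosted provider by changing config only.
prod = load_provider_config(
    {"LLM_PROVIDER": "anthropic", "LLM_MODEL": "example-hosted-model"}
)
```

The agent and workflow code never branch on the provider name; only the code that constructs the client does.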
What I learned
Separating "what to do" from "how to survive doing it" was the key insight. The agent reasons about intent and picks tools. Temporal owns reliability. MCP owns the tool boundary. Keeping those three responsibilities apart made each one much simpler to reason about and test.
The other lesson: deterministic workflow code is a discipline. Anything non-deterministic (network calls, timestamps, random values) has to live in an activity, or come from the SDK's replay-safe helpers, and never from a direct call in the workflow body. Once that clicked, debugging got a lot easier because the workflow history is a precise, replayable log of what happened.
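A toy model makes the replay idea easier to see. The sketch below is not how Temporal is implemented and does not use its SDK; `WorkflowContext`, `triage_workflow`, and the two activities are hypothetical. It shows why non-deterministic work must be recorded: on replay, results come from the history instead of being recomputed, so a timestamp taken inside an activity is identical before and after a simulated restart.

```python
import time
from typing import Any, Callable, Optional


class WorkflowContext:
    """Records activity results on first run and reuses them on replay."""

    def __init__(self, history: Optional[list] = None) -> None:
        self.history: list = list(history or [])
        self._cursor = 0

    def execute_activity(self, name: str, fn: Callable[[], Any]) -> Any:
        if self._cursor < len(self.history):
            # Replay: return the recorded result without running the activity.
            recorded_name, result = self.history[self._cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic workflow: expected {recorded_name!r}, got {name!r}"
                )
            self._cursor += 1
            return result
        # First execution: run the side effect and record its result.
        result = fn()
        self.history.append((name, result))
        self._cursor += 1
        return result


executions: list = []


def fetch_ticket() -> dict:
    executions.append("fetch_ticket")
    return {"key": "DEMO-1", "status": "Open"}


def stamp_time() -> float:
    executions.append("stamp_time")
    return time.time()


def triage_workflow(ctx: WorkflowContext) -> dict:
    # The workflow body only sequences activities; it has no side effects itself.
    ticket = ctx.execute_activity("fetch_ticket", fetch_ticket)
    started_at = ctx.execute_activity("stamp_time", stamp_time)
    return {"key": ticket["key"], "started_at": started_at}


first = WorkflowContext()
first_result = triage_workflow(first)

# Simulate a worker restart: a fresh context replays the saved history.
replayed = WorkflowContext(first.history)
replay_result = triage_workflow(replayed)
```

If `triage_workflow` called `time.time()` directly instead of going through an activity, the replayed result would differ from the first run, which is exactly the class of bug the discipline prevents.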
It currently targets Atlassian, but the tool layer is designed to extend to other platforms.
Feedback welcome
I would like to hear how others handle long-running agent workflows. Are you using Temporal, a queue plus your own state machine, or a custom orchestration loop? And for MCP users: how are you structuring tools when one agent needs access to several systems at once?
Repo and setup instructions: https://github.com/ahmet-ozel/atlassian-ai-workflow-platform