<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Valentina</title>
    <description>The latest articles on DEV Community by Valentina (@vhalasi).</description>
    <link>https://dev.to/vhalasi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3761871%2F6fbd46a2-62a6-4261-90f7-79ed56cadf05.png</url>
      <title>DEV Community: Valentina</title>
      <link>https://dev.to/vhalasi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vhalasi"/>
    <language>en</language>
    <item>
      <title>Runs vs. Threads: When to Use Which</title>
      <dc:creator>Valentina</dc:creator>
      <pubDate>Thu, 05 Mar 2026 11:00:00 +0000</pubDate>
      <link>https://dev.to/vhalasi/runs-vs-threads-when-to-use-which-3e58</link>
      <guid>https://dev.to/vhalasi/runs-vs-threads-when-to-use-which-3e58</guid>
      <description>&lt;p&gt;Crewship has two ways to execute a deployed crew: the &lt;strong&gt;Run API&lt;/strong&gt; and the &lt;strong&gt;Thread API&lt;/strong&gt;. If you've used the platform at all, you've already used runs. Threads are newer and less obvious, and the question we keep hearing is: when should I use which?&lt;/p&gt;

&lt;p&gt;Runs are for one-shot tasks. Threads are for conversations. That's the short version. The rest of this post is the long version.&lt;/p&gt;

&lt;h2&gt;Runs: the default&lt;/h2&gt;

&lt;p&gt;A run is a single execution of your crew. You send input, it does its thing, you get output. Each run gets its own container, shares nothing with other runs, and the environment gets torn down when it finishes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship invoke &lt;span class="nt"&gt;--input&lt;/span&gt; &lt;span class="s1"&gt;'{"topic": "AI agents in logistics"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or via the API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/runs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"deployment_id": "dep_abc123", "input": {"topic": "AI agents in logistics"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get back a run ID. The run moves through &lt;code&gt;pending&lt;/code&gt;, &lt;code&gt;running&lt;/code&gt;, then lands on &lt;code&gt;succeeded&lt;/code&gt;, &lt;code&gt;failed&lt;/code&gt;, or &lt;code&gt;canceled&lt;/code&gt;. You can stream events in real time to watch your agents work, or just poll for the result.&lt;/p&gt;

&lt;p&gt;No setup, no cleanup, no state to manage.&lt;/p&gt;
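&lt;p&gt;The lifecycle is small enough to model directly. Here's a toy sketch of the legal transitions — the state names come from the Run API above, but the exact rules (for example, whether a pending run can be canceled) are my assumption:&lt;/p&gt;

```python
# Toy model of the run lifecycle: pending -> running -> terminal state.
# State names match the Run API; the transition table itself is an
# illustrative assumption, not Crewship's implementation.
TERMINAL = {"succeeded", "failed", "canceled"}

ALLOWED = {
    "pending": {"running", "canceled"},  # assume cancel-before-start is allowed
    "running": TERMINAL,
}

def can_transition(current: str, target: str) -> bool:
    """Return True if a run may legally move from current to target."""
    return target in ALLOWED.get(current, set())
```

&lt;p&gt;Terminal states have no outgoing transitions, which is exactly why a finished run can't be "resumed" — that's what threads are for.&lt;/p&gt;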

&lt;h3&gt;When runs make sense&lt;/h3&gt;

&lt;p&gt;Runs are the right choice when your crew's job starts and ends in a single execution. "Write a blog post about X" — input in, content out, done. "Analyze this dataset and generate a report" — same deal. The same goes for batch operations where you process a list of items independently, background jobs triggered by a webhook or a cron schedule, and pipeline steps where your crew is one stage in a larger workflow. The work is self-contained every time.&lt;/p&gt;

&lt;p&gt;The common pattern: the crew doesn't need to ask clarifying questions, doesn't need to remember what happened last time. It takes input and produces output, and that's the whole interaction.&lt;/p&gt;
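&lt;p&gt;That isolation is also what makes batch workloads trivial to parallelize: runs share nothing, so you can fan them out freely. A sketch with the API call stubbed out — &lt;code&gt;start_run&lt;/code&gt; here is a placeholder, not a Crewship SDK function:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

def start_run(topic: str) -> str:
    """Placeholder for a POST to /v1/runs with this topic as input.
    In real code this would be an HTTP call; stubbed for illustration."""
    return f"run started for {topic!r}"

topics = ["logistics", "healthcare", "retail"]

# Each run is isolated, so firing them concurrently is safe.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(start_run, topics))
```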

&lt;h3&gt;What runs don't do&lt;/h3&gt;

&lt;p&gt;Runs are stateless. When a run finishes, it's gone. If you kick off another run with the same deployment, it has zero context about the previous one. It doesn't know you ran it five minutes ago with slightly different input. It doesn't know the output last time was almost right but needed one small tweak.&lt;/p&gt;

&lt;p&gt;For a lot of workloads, that's exactly what you want. But for some, it's a problem.&lt;/p&gt;

&lt;h2&gt;Threads: when you need memory&lt;/h2&gt;

&lt;p&gt;A thread is a persistent conversation context scoped to a deployment. You create it once, then run your crew inside it as many times as you need. Each run receives the thread's current state, and when the run finishes, it can update that state. The next run picks up where the last one left off.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a thread&lt;/span&gt;
crewship thread create dep_abc123

&lt;span class="c"&gt;# Run inside it&lt;/span&gt;
crewship invoke dep_abc123 &lt;span class="nt"&gt;--thread&lt;/span&gt; thr_xyz789 &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'{"message": "Research AI agents in healthcare"}'&lt;/span&gt;

&lt;span class="c"&gt;# Follow up — the crew remembers the first message&lt;/span&gt;
crewship invoke dep_abc123 &lt;span class="nt"&gt;--thread&lt;/span&gt; thr_xyz789 &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'{"message": "Now focus on diagnostic applications"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Via the API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create the thread&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/threads &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"deployment_id": "dep_abc123"}'&lt;/span&gt;

&lt;span class="c"&gt;# Run inside it&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/threads/thr_xyz789/runs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"input": {"message": "Research AI agents in healthcare"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thread tracks state through a &lt;code&gt;values&lt;/code&gt; field — a JSON object that your crew can read and write. After each run, Crewship saves a checkpoint, so you have a full history of how the state changed over time.&lt;/p&gt;
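&lt;p&gt;To make the mechanics concrete, here's a toy model of that behavior — illustrative only, since the real checkpoint format isn't specified here:&lt;/p&gt;

```python
import copy

class ToyThread:
    """Minimal model of a thread's `values` field plus per-run checkpoints."""

    def __init__(self):
        self.values = {}       # current state the crew reads and writes
        self.checkpoints = []  # snapshot appended after every run

    def finish_run(self, updates: dict) -> None:
        """Apply a run's state updates, then checkpoint the result."""
        self.values.update(updates)
        self.checkpoints.append(copy.deepcopy(self.values))
```

&lt;p&gt;Two runs produce two checkpoints, so you can always see what the state looked like after any given turn.&lt;/p&gt;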

&lt;h3&gt;Thread lifecycle&lt;/h3&gt;

&lt;p&gt;Threads have their own status:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;idle&lt;/code&gt; — ready for a new run&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;busy&lt;/code&gt; — a run is executing; new run requests get a 409 until the current one finishes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;interrupted&lt;/code&gt; — the run was interrupted&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;error&lt;/code&gt; — the last run failed, but the thread still accepts new runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only one run can execute in a thread at a time. That's by design — it keeps state consistent. No risk of two concurrent runs stepping on each other's updates.&lt;/p&gt;
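&lt;p&gt;The serialization rule is easy to picture as a gate on the thread's status — a sketch of the behavior described above, not Crewship's actual implementation:&lt;/p&gt;

```python
class ThreadGate:
    """Model of the one-run-at-a-time rule: a busy thread rejects new runs."""

    def __init__(self):
        self.status = "idle"

    def start_run(self) -> None:
        if self.status == "busy":
            # The API surfaces this rejection as an HTTP 409 Conflict.
            raise RuntimeError("409: a run is already executing in this thread")
        self.status = "busy"

    def finish_run(self) -> None:
        self.status = "idle"
```

&lt;p&gt;Once the current run finishes, the thread goes back to &lt;code&gt;idle&lt;/code&gt; and accepts the next one.&lt;/p&gt;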

&lt;h3&gt;When threads make sense&lt;/h3&gt;

&lt;p&gt;Threads are useful anywhere that context carries over between interactions. The most obvious case is a conversational agent — a chatbot or support assistant where the user sends a message, the crew responds, the user follows up, and so on. The thread holds the full conversation history so each response accounts for everything that came before.&lt;/p&gt;

&lt;p&gt;But it goes beyond chat. Iterative refinement is another good fit: "Generate a marketing plan." Then: "Make the budget section more detailed." Then: "Add a timeline." Each run builds on the previous output instead of starting from scratch. You could try to stuff the entire prior result into the next run's input, but that gets unwieldy fast.&lt;/p&gt;

&lt;p&gt;Multi-step workflows with human approval also work well with threads. The crew does research, presents findings, and waits. The user reviews, gives direction, and kicks off the next run. The thread holds the intermediate state between steps without you having to manage it yourself.&lt;/p&gt;

&lt;h3&gt;Checkpoints&lt;/h3&gt;

&lt;p&gt;Every time a run inside a thread finishes, Crewship saves a checkpoint — a snapshot of the thread's state at that moment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.crewship.dev/v1/threads/thr_xyz789/history &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you an audit trail. It's also useful for debugging: if the crew's response went sideways on turn 5, you can look at the checkpoint from turn 4 to see what state it was working with.&lt;/p&gt;

&lt;h3&gt;Thread metadata&lt;/h3&gt;

&lt;p&gt;Threads support a &lt;code&gt;metadata&lt;/code&gt; field that's separate from the conversation state. Use it for things like user IDs, channels, tags — anything you want to filter or search by later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/threads &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "deployment_id": "dep_abc123",
    "metadata": {"user_id": "user_42", "channel": "web", "priority": "high"}
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters once you have hundreds of threads across different users and use cases.&lt;/p&gt;
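&lt;p&gt;Client-side, filtering a pile of thread records by metadata is a few lines — an illustrative helper over the JSON shape shown above, not part of any Crewship SDK:&lt;/p&gt;

```python
def filter_threads(threads: list, **criteria) -> list:
    """Keep threads whose metadata matches every given key/value pair."""
    return [
        t for t in threads
        if all(t.get("metadata", {}).get(k) == v for k, v in criteria.items())
    ]
```

&lt;p&gt;So &lt;code&gt;filter_threads(threads, user_id="user_42")&lt;/code&gt; pulls out one user's conversations, and adding &lt;code&gt;channel="web"&lt;/code&gt; narrows further.&lt;/p&gt;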

&lt;h2&gt;Side by side&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Runs&lt;/th&gt;
&lt;th&gt;Threads&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;None — each run is isolated&lt;/td&gt;
&lt;td&gt;Persistent across runs via &lt;code&gt;values&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifecycle&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pending&lt;/code&gt; → &lt;code&gt;running&lt;/code&gt; → terminal state&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;idle&lt;/code&gt; → &lt;code&gt;busy&lt;/code&gt; → &lt;code&gt;idle&lt;/code&gt; (repeats)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;Unlimited parallel runs&lt;/td&gt;
&lt;td&gt;One run at a time per thread&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;History&lt;/td&gt;
&lt;td&gt;Individual run records&lt;/td&gt;
&lt;td&gt;Checkpoints after each run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;None — just create a run&lt;/td&gt;
&lt;td&gt;Create thread first, then run inside it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cleanup&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;td&gt;Manual — delete thread when done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost profile&lt;/td&gt;
&lt;td&gt;Predictable per run&lt;/td&gt;
&lt;td&gt;Grows with conversation length&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;How other platforms handle this&lt;/h2&gt;

&lt;p&gt;If you've used other agent platforms, you'll recognize the split. OpenAI's Assistants API had Threads and Runs — literally the same names. They've since replaced it with the Responses API and Conversations, but it's the same idea: a stateless execution primitive and an optional persistence layer.&lt;/p&gt;

&lt;p&gt;LangGraph does the same thing. Call &lt;code&gt;graph.invoke()&lt;/code&gt; for a one-shot execution. Pass a &lt;code&gt;thread_id&lt;/code&gt; and you get persistence, checkpoints, and the ability to resume from any point. CrewAI has &lt;code&gt;kickoff()&lt;/code&gt; for one-shot execution and a separate conversational mode for multi-turn interactions.&lt;/p&gt;

&lt;p&gt;The pattern across all of these: runs and threads are independent concerns. Runs are always the execution primitive. Threads optionally chain runs together with shared state. You don't need threads until you do.&lt;/p&gt;

&lt;h2&gt;So which one?&lt;/h2&gt;

&lt;p&gt;One question gets you most of the way there: &lt;strong&gt;does the crew need to remember anything from previous executions?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If no, use a run. Adding a thread just adds complexity you don't need.&lt;/p&gt;

&lt;p&gt;If yes, use a thread. Trying to fake statefulness by jamming prior context into run inputs gets ugly fast, and you lose checkpoints, history, and the concurrency guarantees that threads give you.&lt;/p&gt;

&lt;p&gt;A few more signals that point toward threads: your users will interact with the crew multiple times per session, the crew's output depends on conversation history rather than just the current input, or you need an audit trail of how state evolved over time.&lt;/p&gt;

&lt;p&gt;And toward runs: the crew can do its job with a single set of inputs, you want to fire off many executions in parallel, or the workload is triggered by an automated system rather than a person sending messages.&lt;/p&gt;

&lt;h2&gt;Getting started with threads&lt;/h2&gt;

&lt;p&gt;If you've been using runs and want to try threads, there's nothing to change about your deployment. Threads work with any deployed crew — CrewAI, LangGraph Python, or LangGraph JS.&lt;/p&gt;

&lt;p&gt;Create a thread, run inside it, and your crew receives the thread state. Update the state from within your crew, and the next run picks up where you left off.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a thread&lt;/span&gt;
crewship thread create dep_abc123 &lt;span class="nt"&gt;--metadata&lt;/span&gt; &lt;span class="s1"&gt;'{"user_id": "demo"}'&lt;/span&gt;

&lt;span class="c"&gt;# First turn&lt;/span&gt;
crewship invoke dep_abc123 &lt;span class="nt"&gt;--thread&lt;/span&gt; thr_xyz789 &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'{"message": "What can you help me with?"}'&lt;/span&gt;

&lt;span class="c"&gt;# Second turn — crew receives context from the first&lt;/span&gt;
crewship invoke dep_abc123 &lt;span class="nt"&gt;--thread&lt;/span&gt; thr_xyz789 &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;'{"message": "Tell me more about option 2"}'&lt;/span&gt;

&lt;span class="c"&gt;# Check the history&lt;/span&gt;
crewship thread &lt;span class="nb"&gt;history &lt;/span&gt;thr_xyz789
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full API reference is in the &lt;a href="https://docs.crewship.dev/guides/threads" rel="noopener noreferrer"&gt;threads documentation&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about runs, threads, or anything else? Check the &lt;a href="https://docs.crewship.dev" rel="noopener noreferrer"&gt;docs&lt;/a&gt; or reach out at &lt;a href="mailto:mail@crewship.dev"&gt;mail@crewship.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>guide</category>
      <category>api</category>
      <category>runs</category>
      <category>threads</category>
    </item>
    <item>
      <title>Multiple Deployments, One Config File</title>
      <dc:creator>Valentina</dc:creator>
      <pubDate>Thu, 05 Mar 2026 11:00:00 +0000</pubDate>
      <link>https://dev.to/vhalasi/multiple-deployments-one-config-file-2757</link>
      <guid>https://dev.to/vhalasi/multiple-deployments-one-config-file-2757</guid>
      <description>&lt;p&gt;If you're building with AI agents, you probably don't have just one. Say you're building a lead aggregation pipeline. You've got one agent that scrapes company websites, another that pulls leads from LinkedIn, and a third that mines Reddit and community forums. They all share the same data models and scoring logic, they all run on a schedule, and they all live in the same repo. But each one deploys independently, so each one needs its own &lt;code&gt;crewship.toml&lt;/code&gt; and its own deploy commands, which adds up fast.&lt;/p&gt;

&lt;p&gt;It works, but it's clunky. You end up duplicating build settings, keeping exclude lists in sync, and jumping between directories every time you deploy.&lt;/p&gt;

&lt;p&gt;We kept hearing this from teams building multi-agent systems, and honestly ran into it ourselves. So we fixed it.&lt;/p&gt;

&lt;h2&gt;One file, multiple deployments&lt;/h2&gt;

&lt;p&gt;You can now define multiple deployments in a single &lt;code&gt;crewship.toml&lt;/code&gt;. Instead of one &lt;code&gt;[deployment]&lt;/code&gt; section, use named &lt;code&gt;[deployments.&amp;lt;name&amp;gt;]&lt;/code&gt; sections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[build]&lt;/span&gt;
&lt;span class="py"&gt;exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"tests"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notebooks"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nn"&gt;[deployments.web-scraper]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.web_scraper.crew:WebScraperCrew"&lt;/span&gt;
&lt;span class="py"&gt;profile&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"browser"&lt;/span&gt;
&lt;span class="py"&gt;python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"3.11"&lt;/span&gt;

&lt;span class="nn"&gt;[deployments.linkedin-miner]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.linkedin.crew:LinkedInCrew"&lt;/span&gt;

&lt;span class="nn"&gt;[deployments.reddit-miner]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.reddit.crew:RedditCrew"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each named section becomes its own deployment on Crewship, with the name as the project name. The &lt;code&gt;[build]&lt;/code&gt; config is shared across all of them, so you only declare your exclude list once.&lt;/p&gt;

&lt;p&gt;That's it. No wrapper scripts, no monorepo tooling, no separate directories. Three lead miners, one file.&lt;/p&gt;

&lt;h2&gt;Deploying and targeting&lt;/h2&gt;

&lt;p&gt;Every CLI command now takes a &lt;code&gt;--name&lt;/code&gt; (or &lt;code&gt;-n&lt;/code&gt;) flag to target a specific deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship deploy &lt;span class="nt"&gt;--name&lt;/span&gt; web-scraper
crewship deploy &lt;span class="nt"&gt;--name&lt;/span&gt; linkedin-miner
crewship deploy &lt;span class="nt"&gt;--name&lt;/span&gt; reddit-miner
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same for env vars, invocations, and schedules. For a lead pipeline where every source runs on its own schedule, that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship &lt;span class="nb"&gt;env set&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; linkedin-miner &lt;span class="nv"&gt;LINKEDIN_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...
crewship &lt;span class="nb"&gt;env set&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; reddit-miner &lt;span class="nv"&gt;REDDIT_CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;... &lt;span class="nv"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...

crewship schedule create &lt;span class="nt"&gt;--name&lt;/span&gt; web-scraper &lt;span class="s2"&gt;"Scrape targets"&lt;/span&gt; &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 */6 * * *"&lt;/span&gt;
crewship schedule create &lt;span class="nt"&gt;--name&lt;/span&gt; linkedin-miner &lt;span class="s2"&gt;"LinkedIn sync"&lt;/span&gt; &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 8 * * 1-5"&lt;/span&gt;
crewship schedule create &lt;span class="nt"&gt;--name&lt;/span&gt; reddit-miner &lt;span class="s2"&gt;"Reddit sweep"&lt;/span&gt; &lt;span class="nt"&gt;--cron&lt;/span&gt; &lt;span class="s2"&gt;"0 9 * * *"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you skip &lt;code&gt;--name&lt;/code&gt; and there's only one deployment in the file, it gets picked automatically. If there are multiple, the CLI prompts you to choose. In CI where there's no TTY, it'll error and tell you to pass &lt;code&gt;--name&lt;/code&gt; explicitly, so you don't accidentally deploy the wrong thing.&lt;/p&gt;
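&lt;p&gt;Those resolution rules are worth internalizing before wiring up CI. Roughly, as a sketch of the behavior just described (the real CLI's internals will differ, and &lt;code&gt;prompt_choice&lt;/code&gt; is a stand-in for the interactive picker):&lt;/p&gt;

```python
def prompt_choice(names: list) -> str:
    """Stand-in for the interactive picker; the real CLI prompts on the TTY."""
    return names[0]

def resolve_deployment(names: list, flag: str = None, interactive: bool = True) -> str:
    """Pick which deployment a command targets, per the rules above."""
    if flag is not None:
        if flag not in names:
            raise ValueError(f"unknown deployment {flag!r}")
        return flag
    if len(names) == 1:
        return names[0]  # only one candidate: pick it automatically
    if not interactive:
        # CI has no TTY, so fail loudly instead of guessing.
        raise SystemExit("multiple deployments found; pass --name explicitly")
    return prompt_choice(names)
```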

&lt;h2&gt;Deployment IDs are tracked per deployment&lt;/h2&gt;

&lt;p&gt;After the first deploy, Crewship saves the &lt;code&gt;deployment_id&lt;/code&gt; back into the config for each deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[deployments.web-scraper]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.web_scraper.crew:WebScraperCrew"&lt;/span&gt;
&lt;span class="py"&gt;deployment_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"dep_abc123"&lt;/span&gt;   &lt;span class="c"&gt;# auto-populated after first deploy&lt;/span&gt;

&lt;span class="nn"&gt;[deployments.linkedin-miner]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.linkedin.crew:LinkedInCrew"&lt;/span&gt;
&lt;span class="py"&gt;deployment_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"dep_def456"&lt;/span&gt;   &lt;span class="c"&gt;# auto-populated after first deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means subsequent deploys know exactly which deployment to update without you having to track IDs manually. Commit the file to version control and your whole team stays in sync.&lt;/p&gt;

&lt;h2&gt;Nothing breaks&lt;/h2&gt;

&lt;p&gt;If you've got an existing &lt;code&gt;crewship.toml&lt;/code&gt; with a single &lt;code&gt;[deployment]&lt;/code&gt; section, nothing changes. That format works exactly as before. The new multi-deployment format is opt-in, and &lt;code&gt;crewship init&lt;/code&gt; still generates the single-deployment config by default.&lt;/p&gt;

&lt;p&gt;The two formats are mutually exclusive. If you accidentally mix &lt;code&gt;[deployment]&lt;/code&gt; and &lt;code&gt;[deployments.*]&lt;/code&gt; in the same file, the CLI catches it and tells you what to do.&lt;/p&gt;

&lt;h2&gt;When this matters&lt;/h2&gt;

&lt;p&gt;The lead aggregator setup is a good example, but it applies anywhere you have agents that share code but deploy separately. A few patterns that fall out of this naturally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Monorepo without the mess&lt;/strong&gt; — your lead miners share scoring logic, data models, and utility code. With multi-deployment, they stay in one repo and one config file instead of being split across separate projects that drift out of sync.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent schedules&lt;/strong&gt; — each source runs on its own cadence. The web scraper every 6 hours, LinkedIn on weekday mornings, Reddit once a day. Set them up with &lt;code&gt;crewship schedule create --name&lt;/code&gt; and they run independently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gradual rollout&lt;/strong&gt; — deploy one miner at a time, verify it works, then deploy the next. Each deployment has its own version history and rollback.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Getting started&lt;/h2&gt;

&lt;p&gt;If you're starting from scratch, &lt;code&gt;crewship init&lt;/code&gt; sets up a single-deployment config. When you're ready to add more agents, edit the file to use the named format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before&lt;/span&gt;
&lt;span class="nn"&gt;[deployment]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.web_scraper.crew:WebScraperCrew"&lt;/span&gt;

&lt;span class="c"&gt;# After&lt;/span&gt;
&lt;span class="nn"&gt;[deployments.web-scraper]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.web_scraper.crew:WebScraperCrew"&lt;/span&gt;

&lt;span class="nn"&gt;[deployments.linkedin-miner]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"leads.linkedin.crew:LinkedInCrew"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy them, set their env vars, invoke them. Everything else works the same.&lt;/p&gt;




&lt;p&gt;Full details are in the &lt;a href="https://docs.crewship.dev/configuration/crewship-toml" rel="noopener noreferrer"&gt;configuration docs&lt;/a&gt;. If you run into anything or have feedback, reach out; we'd like to hear how you're using it.&lt;/p&gt;

</description>
      <category>feature</category>
      <category>configuration</category>
      <category>multiagent</category>
      <category>monorepo</category>
    </item>
    <item>
      <title>How to Deploy LangGraph to Production</title>
      <dc:creator>Valentina</dc:creator>
      <pubDate>Mon, 09 Feb 2026 15:00:00 +0000</pubDate>
      <link>https://dev.to/vhalasi/how-to-deploy-langgraph-to-production-11cg</link>
      <guid>https://dev.to/vhalasi/how-to-deploy-langgraph-to-production-11cg</guid>
      <description>&lt;p&gt;You built a LangGraph agent. It runs locally. You've got nodes, edges, conditional routing, state that flows through the graph. Maybe it's a research assistant that searches the web and writes reports. Maybe it's a multi-step tool-calling agent with loops. Whatever it does, it works on your machine.&lt;/p&gt;

&lt;p&gt;Now you want to put it somewhere other people can use it. Or somewhere your backend can call it. And this is where LangGraph gets interesting, because it's not a simple stateless function you can throw behind a Lambda.&lt;/p&gt;

&lt;p&gt;LangGraph adds real complexity that plain LangChain chains don't have: state management, cycles, conditional edges, long-running executions with multiple tool calls. That complexity matters when you try to run it in production.&lt;/p&gt;

&lt;h2&gt;The state problem&lt;/h2&gt;

&lt;p&gt;LangGraph graphs are stateful by design. Your &lt;code&gt;StateGraph&lt;/code&gt; defines a typed state object, and every node reads from and writes to that state as execution flows through the graph. Nodes accumulate results, branch based on previous outputs, loop back when conditions aren't met.&lt;/p&gt;

&lt;p&gt;Here's a basic example of what that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;
    &lt;span class="n"&gt;next_step&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;research_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyze&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analyze_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_conditional_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;should_continue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Locally, this all lives in memory. State gets created, passed around, updated. When the graph finishes, you read the final state and move on. Simple.&lt;/p&gt;

&lt;p&gt;In production, that in-memory model breaks. You need state to survive process restarts. You need it isolated between concurrent runs so one user's execution doesn't bleed into another's. If you're running workers across multiple machines, state can't just sit in a local variable. And you need to be able to inspect the state of a running or failed graph to understand what happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  You still need an API
&lt;/h2&gt;

&lt;p&gt;Same problem as any agent framework: &lt;code&gt;graph.invoke()&lt;/code&gt; works fine in a script, but nobody can call a Python script running on your laptop. You need an HTTP API in front of it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid4&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;run_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;running&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;execute_graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
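&lt;p&gt;The handler above leans on two names it doesn't define: &lt;code&gt;store&lt;/code&gt; and &lt;code&gt;execute_graph&lt;/code&gt;. A minimal sketch of both, with the actual graph call stubbed out; in production the store would be Redis or a database rather than a process-local dict:&lt;/p&gt;

```python
store: dict = {}  # run_id -> status record; process-local for the sketch

def execute_graph(run_id: str, inputs: dict) -> None:
    try:
        # final_state = compiled_graph.invoke(inputs)  # your real graph here
        final_state = {"messages": inputs.get("messages", []), "next_step": "done"}
        store[run_id] = {"status": "completed", "output": final_state}
    except Exception as exc:
        # Record failures too, so clients polling the run ID see why it died
        store[run_id] = {"status": "failed", "error": str(exc)}
```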



&lt;p&gt;The catch with LangGraph is that runs tend to be long. A graph with conditional loops, multiple tool calls, and multi-step reasoning can easily run for several minutes. That's well past the default timeout on most reverse proxies and serverless platforms. So you can't just wait for the result inside the request handler. You need async execution, a way to store results, and a way for clients to check back or get notified when it's done.&lt;/p&gt;

&lt;p&gt;You also want streaming. One of LangGraph's strengths is that you can stream events as nodes execute: which node just ran, what it produced, what the state looks like at each step. Losing that in production means losing one of the main benefits of using LangGraph in the first place. So now you need SSE or WebSocket support in your API too.&lt;/p&gt;
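&lt;p&gt;The SSE half is less work than it sounds. Each item yielded by &lt;code&gt;graph.stream(..., stream_mode="updates")&lt;/code&gt; is one node's state update, and SSE only needs each one wrapped in a &lt;code&gt;data:&lt;/code&gt; line. A framework-agnostic sketch of that wrapping (the resulting generator would be handed to your framework's streaming response):&lt;/p&gt;

```python
import json
from typing import Iterable, Iterator

def sse_format(events: Iterable[dict]) -> Iterator[str]:
    # One SSE message per graph event; default=str handles non-JSON types
    # like message objects that often appear in LangGraph state.
    for event in events:
        yield f"data: {json.dumps(event, default=str)}\n\n"
```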

&lt;h2&gt;
  
  
  Containerization
&lt;/h2&gt;

&lt;p&gt;LangGraph pulls in &lt;code&gt;langgraph&lt;/code&gt;, &lt;code&gt;langchain-core&lt;/code&gt;, and depending on what tools and models you're using, potentially &lt;code&gt;langchain-community&lt;/code&gt;, &lt;code&gt;langchain-openai&lt;/code&gt;, &lt;code&gt;langchain-anthropic&lt;/code&gt;, and a handful of other packages. If your agents use custom tools, add those dependencies to the pile.&lt;/p&gt;

&lt;p&gt;Dependency resolution in the LangChain ecosystem can be painful. Version conflicts between &lt;code&gt;langchain-core&lt;/code&gt; and community packages are common. Pinning versions in a &lt;code&gt;requirements.txt&lt;/code&gt; or &lt;code&gt;pyproject.toml&lt;/code&gt; helps, but you'll still spend time debugging import errors that only show up in the container and not on your machine.&lt;/p&gt;

&lt;p&gt;Docker solves the environment consistency problem, but now you're maintaining Dockerfiles, dealing with image builds, and pushing to a registry. If your graph uses any tools that need system-level dependencies (browsers, ffmpeg, etc.), the Dockerfile gets more complex.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tracing graph execution
&lt;/h2&gt;

&lt;p&gt;This is where LangGraph really differs from other frameworks. When a CrewAI crew fails, you have a sequence of agent actions to trace through. When a LangGraph graph fails, you have a directed graph where node A called node B which conditionally routed back to node A, which then called node C.&lt;/p&gt;

&lt;p&gt;Without proper tracing, debugging a failed graph run is rough. Which node threw the error? What was the state when it happened? Did a conditional edge route to the wrong node? Was the state corrupted by a previous node? Did a cycle run more times than expected?&lt;/p&gt;

&lt;p&gt;You could pipe everything to stdout and read logs, but that gets unreadable fast with complex graphs. LangSmith exists for this, but it's a separate hosted service with its own pricing and setup. And even with LangSmith, you still need to wire up the integration and make sure traces are actually being captured in your production environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The same infrastructure problems
&lt;/h2&gt;

&lt;p&gt;Everything else from the &lt;a href="https://dev.to/blog/deploy-crewai-to-production"&gt;CrewAI deployment guide&lt;/a&gt; applies here too. Scaling workers, job queues, versioning deployments, authentication, rate limiting, secret management. These are the same problems regardless of whether you're running CrewAI or LangGraph.&lt;/p&gt;

&lt;p&gt;I won't rehash all of that here. The short version: you end up building a container orchestration system, a job queue, a versioning pipeline, an auth layer, and an observability stack. None of it is your actual product.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy with Crewship
&lt;/h2&gt;

&lt;p&gt;Crewship now supports LangGraph natively. If your project has a &lt;code&gt;langgraph.json&lt;/code&gt; file, Crewship auto-detects it and handles everything from there.&lt;/p&gt;

&lt;p&gt;Here's the full deployment flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install the CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://www.crewship.dev/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Log in
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Set up your project
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Crewship detects LangGraph from your &lt;code&gt;langgraph.json&lt;/code&gt; and generates a &lt;code&gt;crewship.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[deployment]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"langgraph"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"src.my_graph.graph:graph"&lt;/span&gt;
&lt;span class="py"&gt;python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"3.11"&lt;/span&gt;
&lt;span class="py"&gt;profile&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"slim"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;entrypoint&lt;/code&gt; points to your compiled &lt;code&gt;StateGraph&lt;/code&gt; object. Crewship uses this to invoke your graph without you needing to write any API code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your code gets packaged, built into a container, and deployed. You get a deployment URL and a link to the Crewship console where you can manage it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add your secrets
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship &lt;span class="nb"&gt;env set &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-... &lt;span class="nv"&gt;TAVILY_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or import from your &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship &lt;span class="nb"&gt;env &lt;/span&gt;import &lt;span class="nt"&gt;-f&lt;/span&gt; .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Secrets are encrypted and injected at runtime. Nothing gets baked into the container image.&lt;/p&gt;

&lt;h3&gt;
  
  
  Run it
&lt;/h3&gt;

&lt;p&gt;From the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship invoke &lt;span class="nt"&gt;--input&lt;/span&gt; &lt;span class="s1"&gt;'{"messages": [{"role": "user", "content": "Research the latest AI papers"}]}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI streams events as nodes execute, so you can watch your graph work through its steps in real time.&lt;/p&gt;

&lt;p&gt;From the REST API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/runs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"deployment": "my-graph", "input": {"messages": [{"role": "user", "content": "Research the latest AI papers"}]}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution traces in Crewship map to your graph nodes. You can see which node ran, what state it received, what it produced, how long each step took. When a node fails or a conditional edge routes somewhere unexpected, you'll see where and why.&lt;/p&gt;

&lt;h2&gt;
  
  
  LangGraph.js support
&lt;/h2&gt;

&lt;p&gt;If you're building with LangGraph.js (the TypeScript/JavaScript version), Crewship supports that too. The deployment experience is the same: same CLI workflow, same API, same execution traces. Check out the &lt;a href="https://dev.to/langgraphjs"&gt;LangGraph.js page&lt;/a&gt; for details.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you get out of the box
&lt;/h2&gt;

&lt;p&gt;All of the infrastructure above that you'd otherwise build yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Isolated execution&lt;/strong&gt; — every run gets its own environment, no interference between runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-scaling&lt;/strong&gt; — scales up when there's work, scales to zero when there isn't&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment versioning&lt;/strong&gt; — each &lt;code&gt;crewship deploy&lt;/code&gt; creates a new version, roll back to any previous one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph-aware execution traces&lt;/strong&gt; — see which nodes ran, state at each step, timing, token usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Webhooks&lt;/strong&gt; — trigger runs from CI/CD, cron jobs, or Zapier; get notified on completion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token auth&lt;/strong&gt; — API key authentication, generate and rotate keys from the console&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time SSE streaming&lt;/strong&gt; — watch graph execution live, or poll for the result&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;You can deploy your first LangGraph graph on Crewship in a few commands. No credit card required for the free tier.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://console.crewship.dev" rel="noopener noreferrer"&gt;Get started&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.crewship.dev" rel="noopener noreferrer"&gt;Read the docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/langgraph"&gt;LangGraph framework page&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're already using Crewship for CrewAI, the same account and CLI work for LangGraph. Just &lt;code&gt;crewship init&lt;/code&gt; in your LangGraph project and deploy.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about deploying LangGraph? Check the &lt;a href="https://docs.crewship.dev" rel="noopener noreferrer"&gt;docs&lt;/a&gt; or reach out at &lt;a href="mailto:mail@crewship.dev"&gt;mail@crewship.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>langchain</category>
      <category>ai</category>
      <category>agents</category>
      <category>devops</category>
    </item>
    <item>
      <title>How to Deploy CrewAI to Production</title>
      <dc:creator>Valentina</dc:creator>
      <pubDate>Mon, 09 Feb 2026 12:10:16 +0000</pubDate>
      <link>https://dev.to/vhalasi/how-to-deploy-crewai-to-production-445f</link>
      <guid>https://dev.to/vhalasi/how-to-deploy-crewai-to-production-445f</guid>
      <description>&lt;p&gt;So you've built a CrewAI crew. Maybe it researches topics and writes reports, or processes customer data and spits out insights. It works on your machine, the output looks good, and now you want other systems—or other people—to be able to use it.&lt;/p&gt;

&lt;p&gt;That's where things get interesting. There's a surprising amount of stuff between "it works on my laptop" and "it runs in production," and most of it has nothing to do with AI. This guide walks through the whole journey, step by step, in roughly the order you'd run into each problem yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: You Need an API
&lt;/h2&gt;

&lt;p&gt;Right now your crew runs as a Python script. You call &lt;code&gt;crew.kickoff()&lt;/code&gt;, wait, and get a result. That's fine for development, but no other service can call a Python script sitting on your machine.&lt;/p&gt;

&lt;p&gt;First order of business: stick an HTTP API in front of it. FastAPI is the go-to choice here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;your_project.crew&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;YourCrew&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;YourCrew&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks easy enough. But there's a catch you'll hit almost immediately—crew runs are slow. We're talking anywhere from one to ten minutes depending on how many agents you have, what they're doing, and how many LLM calls they need to make. Meanwhile, most HTTP clients and reverse proxies have timeouts way shorter than that. Nginx defaults to 60 seconds. Serverless platforms are often worse.&lt;/p&gt;

&lt;p&gt;Your crew that happily runs for 8 minutes on your laptop? In production, the request just dies with a timeout error.&lt;/p&gt;

&lt;p&gt;The fix is to not wait for the crew inside the request handler at all. Kick off the run in the background, hand back a run ID, and let the client check back later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;start_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;run_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;running&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;background_tasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;execute_crew&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/run/{run_id}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not_found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great, now you have an API. Next you need somewhere to actually run it.&lt;/p&gt;
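&lt;p&gt;For completeness, the client side of this pattern is just a loop against the status endpoint. A sketch, with &lt;code&gt;fetch&lt;/code&gt; standing in for a real HTTP GET to &lt;code&gt;/run/{run_id}&lt;/code&gt; so the shape stays framework-agnostic:&lt;/p&gt;

```python
import time
from typing import Callable

def poll_run(fetch: Callable[[str], dict], run_id: str, interval: float = 5.0) -> dict:
    # Keep asking until the run leaves the "running" state, then hand back
    # whatever the API returned (completed result or failure record).
    while True:
        data = fetch(run_id)
        if data.get("status") != "running":
            return data
        time.sleep(interval)
```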

&lt;h2&gt;
  
  
  Step 2: Containerize It
&lt;/h2&gt;

&lt;p&gt;Your API needs a consistent environment every time it starts up. CrewAI pulls in &lt;code&gt;langchain&lt;/code&gt;, &lt;code&gt;openai&lt;/code&gt;, &lt;code&gt;pydantic&lt;/code&gt;, and potentially dozens of other packages depending on what tools your agents use. All of those need to be installed, at the right versions, reliably.&lt;/p&gt;

&lt;p&gt;Docker is the standard answer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.11-slim&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If any of your agents use browser-based tools—web scraping with Playwright, for example—your Dockerfile gets a lot uglier. You'll need Chromium and all its system-level dependencies, which bloats the image size and makes builds slower.&lt;/p&gt;

&lt;p&gt;Then there's the question of where to host the container. Railway, Fly.io, AWS ECS, Google Cloud Run—there's no shortage of options, and each one comes with its own config format, networking quirks, and pricing model. Pick one, get it running, and move on. Because there's more.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: One Machine Isn't Enough
&lt;/h2&gt;

&lt;p&gt;Here's something you'll figure out pretty fast: crew runs eat resources. Each one loads a full agent context, fires off dozens of LLM calls, and might run tools that chew through a lot of memory. If you try to handle several concurrent runs on a single server, you're going to run into memory pressure, CPU contention, and eventually OOM kills.&lt;/p&gt;

&lt;p&gt;The obvious reaction is to just get a bigger server. That helps for a while, but you're also paying for all that capacity even when nothing is running.&lt;/p&gt;

&lt;p&gt;What you really want is to spin up an isolated environment for each run:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs can't step on each other&lt;/li&gt;
&lt;li&gt;When there's no work, you scale to zero and stop paying&lt;/li&gt;
&lt;li&gt;When a burst of requests comes in, you spin up more instances to match&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where it stops being "deploy an app" and starts being "build a container orchestration system." Kubernetes is the usual answer, but running Kubernetes well—even managed Kubernetes on EKS or GKE—is basically its own job.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Add a Job Queue
&lt;/h2&gt;

&lt;p&gt;Now that runs happen in their own containers, something needs to coordinate the work. You can't just spin up a container inline when a request comes in—you need a proper queue.&lt;/p&gt;

&lt;p&gt;The flow ends up looking like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your API gets a request and drops a job onto the queue&lt;/li&gt;
&lt;li&gt;A worker picks up the job&lt;/li&gt;
&lt;li&gt;The worker spins up a fresh environment, runs the crew, stores the result&lt;/li&gt;
&lt;li&gt;The client polls your API for the result (or you send a webhook)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the queue itself, you'll need a message broker—Redis, RabbitMQ, SQS, something like that. Plus a task framework like Celery to actually run the jobs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@celery_app.task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_crew_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;crew_config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;crew_config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;kickoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now you've got a whole new set of things to worry about: dead letter queues, retry policies, concurrency limits, monitoring. What happens when a worker crashes in the middle of a run? What if the queue starts backing up—how do you prioritize? All solvable, but none of it solves itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Versioning Gets Tricky
&lt;/h2&gt;

&lt;p&gt;This one sneaks up on you. Picture this: you have 10 jobs sitting in the queue, waiting to be processed. You deploy a new version of your crew—maybe you tweaked an agent's prompt or swapped out a tool. What happens to those 10 jobs?&lt;/p&gt;

&lt;p&gt;If your workers always pull the latest code, those queued jobs run on the new version. Sometimes that's fine. But if you changed the input format or removed a tool the old config depended on, those jobs are going to break.&lt;/p&gt;

&lt;p&gt;For production you need version awareness. Every job should be pinned to the version of the crew it was submitted against, and your workers need to be able to run older versions. That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tagging your container images properly (not just pushing to &lt;code&gt;latest&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Recording the version alongside every job in the queue&lt;/li&gt;
&lt;li&gt;Keeping older versions around so in-flight jobs can finish&lt;/li&gt;
&lt;li&gt;Having a way to roll back when a new version causes problems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point your "deployment" has turned into a proper system—container registry, version metadata, rollback procedures, maybe separate staging and production environments each with their own version history.&lt;/p&gt;
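&lt;p&gt;The core of the fix is small even if the plumbing isn't: record the version at submit time, and have workers resolve their image from the job rather than a floating tag. A sketch (names and the registry URL are illustrative):&lt;/p&gt;

```python
import uuid

def enqueue_run(queue: list, inputs: dict, deployed_version: str) -> str:
    job = {
        "id": str(uuid.uuid4()),
        "inputs": inputs,
        # Pinned when the job is submitted, so a later deploy doesn't
        # retroactively change what this job runs against.
        "version": deployed_version,
    }
    queue.append(job)
    return job["id"]

def image_for(job: dict, repo: str = "registry.example.com/crew") -> str:
    # Workers resolve the container image from the job's pin, not `latest`.
    return f"{repo}:{job['version']}"
```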

&lt;h2&gt;
  
  
  Step 6: Authentication and Security
&lt;/h2&gt;

&lt;p&gt;Your API is on the internet now. If someone finds the URL, they can start kicking off crew runs—and those cost real money because every run makes LLM API calls that show up on your bill.&lt;/p&gt;

&lt;p&gt;At bare minimum you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API keys&lt;/strong&gt; so only authorized clients can trigger runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secret management&lt;/strong&gt; for your LLM provider keys (OpenAI, Anthropic, etc.)—you don't want those hardcoded or scattered across worker configs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input validation&lt;/strong&gt; to catch garbage inputs and prompt injection attempts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting&lt;/strong&gt; so a buggy client can't accidentally blow through your API budget in an afternoon&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If multiple people or teams are using your crew, add per-user keys, usage tracking, and audit logs to the list. None of this is AI-specific—it's standard web security stuff. But it all needs to get built.&lt;/p&gt;
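&lt;p&gt;The API-key piece is the least glamorous and the most important, since it's what stands between the open internet and your LLM bill. A minimal sketch of the check itself (the key is illustrative, and a real deployment would load hashed keys from a secret store):&lt;/p&gt;

```python
import hmac

VALID_KEYS = {"sk-example-key"}  # illustrative; load from a secret store in practice

def is_authorized(auth_header) -> bool:
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    presented = auth_header[len("Bearer "):]
    # compare_digest keeps the comparison constant-time, so response timing
    # can't leak how much of a guessed key matched.
    return any(hmac.compare_digest(presented, key) for key in VALID_KEYS)
```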

&lt;h2&gt;
  
  
  Step 7: Observability
&lt;/h2&gt;

&lt;p&gt;Alright. Your crew is deployed, containerized, queued, versioned, and locked down. It's running in production. And then one morning a run fails.&lt;/p&gt;

&lt;p&gt;Why? Which agent hit the problem? Which task? What did the LLM actually respond with—was it a rate limit, a timeout, a weird tool output, or just a hallucination?&lt;/p&gt;

&lt;p&gt;Without proper observability, you're basically guessing. What you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Detailed logs&lt;/strong&gt; from each run—not just "started" and "finished," but the actual trace of agent decisions, tool calls, and LLM responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics&lt;/strong&gt; on run duration, token usage, cost per run, success rates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alerts&lt;/strong&gt; for when error rates spike or costs go past a threshold&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run history&lt;/strong&gt; so you can compare outputs across different inputs and versions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means integrating with whatever logging and monitoring stack you use (Datadog, CloudWatch, Grafana, etc.), building custom dashboards, and adding instrumentation throughout your code. It's work.&lt;/p&gt;
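
&lt;p&gt;The metrics half of that list is mostly bookkeeping around each run. As a rough sketch (the price table and the shape of the run result are assumptions for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time

# Illustrative per-1K-token prices; real prices vary by model and provider.
PRICES = {"gpt-4o": {"prompt": 0.0025, "completion": 0.01}}

def run_with_metrics(run_fn, model, **inputs):
    """Execute a crew run and return (result, metrics)."""
    start = time.monotonic()
    result = run_fn(**inputs)   # assumed to return a dict with token counts
    duration = time.monotonic() - start
    price = PRICES[model]
    cost = (result["prompt_tokens"] * price["prompt"]
            + result["completion_tokens"] * price["completion"]) / 1000
    metrics = {
        "duration_s": round(duration, 3),
        "prompt_tokens": result["prompt_tokens"],
        "completion_tokens": result["completion_tokens"],
        "cost_usd": round(cost, 6),
    }
    return result, metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Ship the metrics dict to whatever backend you use; the alerts and dashboards build on top of it.&lt;/p&gt;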

&lt;h2&gt;
  
  
  Take a Step Back
&lt;/h2&gt;

&lt;p&gt;Look at everything you've put together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An HTTP API in front of your crew&lt;/li&gt;
&lt;li&gt;A Docker container to package it&lt;/li&gt;
&lt;li&gt;Container orchestration for isolated execution&lt;/li&gt;
&lt;li&gt;A message queue to coordinate jobs&lt;/li&gt;
&lt;li&gt;A versioning system wired into your deploy pipeline&lt;/li&gt;
&lt;li&gt;Auth, rate limiting, and secret management&lt;/li&gt;
&lt;li&gt;Logging, metrics, and alerting&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's a lot of infrastructure. Every piece makes sense on its own, but stacked together it's a real system that needs real maintenance. And here's the thing—none of it is your actual product. All of it exists purely to let your CrewAI agents run somewhere other than your laptop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Or: Deploy With Crewship
&lt;/h2&gt;

&lt;p&gt;We built Crewship because we got tired of rebuilding this stack every time. All the infrastructure above—the containers, the queues, the versioning, the auth—it's handled for you.&lt;/p&gt;

&lt;p&gt;Here's what the same deployment looks like with Crewship, starting from a standard CrewAI project:&lt;/p&gt;

&lt;h3&gt;
  
  
  Install the CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://www.crewship.dev/install.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Log in
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens your browser for a one-time login. After that, API keys are managed through the Crewship console.&lt;/p&gt;

&lt;h3&gt;
  
  
  Set up your project
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This looks at your project, finds your CrewAI entrypoint, and creates a &lt;code&gt;crewship.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[deployment]&lt;/span&gt;
&lt;span class="py"&gt;framework&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crewai"&lt;/span&gt;
&lt;span class="py"&gt;entrypoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"your_project.crew:YourCrew"&lt;/span&gt;
&lt;span class="py"&gt;python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"3.11"&lt;/span&gt;
&lt;span class="py"&gt;profile&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"slim"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your agents need a browser, just set &lt;code&gt;profile = "browser"&lt;/code&gt; and Crewship takes care of the Chromium stuff.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Done. Your code gets packaged, built, and deployed. You get a deployment URL and a link to the console.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add your secrets
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship &lt;span class="nb"&gt;env set &lt;/span&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-... &lt;span class="nv"&gt;SERPER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or just import your &lt;code&gt;.env&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship &lt;span class="nb"&gt;env &lt;/span&gt;import &lt;span class="nt"&gt;-f&lt;/span&gt; .env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Everything is encrypted and injected at runtime. Nothing gets baked into the image.&lt;/p&gt;

&lt;h3&gt;
  
  
  Run it
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship invoke &lt;span class="nt"&gt;--input&lt;/span&gt; &lt;span class="s1"&gt;'{"topic": "AI agents in healthcare"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI streams events as they happen—you can watch your agents work through their tasks in real time. For programmatic access, there's a REST API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/v1/runs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"deployment": "your-crew", "input": {"topic": "AI agents"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
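
&lt;p&gt;The same call from Python is only a few lines. In this sketch the status-polling callable and the &lt;code&gt;status&lt;/code&gt;/&lt;code&gt;output&lt;/code&gt; field names are assumptions for illustration, not the documented response shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

API_URL = "https://api.crewship.dev/v1/runs"

def build_run_request(api_key, deployment, run_input):
    """Build the headers and JSON body for the curl call above."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"deployment": deployment, "input": run_input})
    return headers, body

def wait_for_result(fetch_status, pause=lambda: None):
    """Poll until the run leaves a pending state.

    fetch_status is any callable returning the run as a dict;
    pass something like lambda: time.sleep(2) as pause to throttle polling.
    """
    while True:
        run = fetch_status()
        if run["status"] not in ("queued", "running"):
            return run
        pause()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;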



&lt;h3&gt;
  
  
  Everything from steps 1–7, included
&lt;/h3&gt;

&lt;p&gt;All of that infrastructure you'd otherwise build yourself comes out of the box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Isolated execution&lt;/strong&gt; — every run gets its own environment, no interference between runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-scaling&lt;/strong&gt; — scales up when there's work, scales to zero when there isn't&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment versioning&lt;/strong&gt; — each &lt;code&gt;crewship deploy&lt;/code&gt; creates a new version, roll back to any previous one with a click, keep staging and production separate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution traces&lt;/strong&gt; — full visibility into agent actions, LLM calls, tool usage, token counts, and cost per run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Webhooks&lt;/strong&gt; — trigger runs from CI/CD, cron jobs, or Zapier with incoming webhooks; get notified on completion with outgoing webhooks to your backend, Slack, wherever; all signed with HMAC-SHA256&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth&lt;/strong&gt; — token-based API authentication, generate and rotate keys from the console&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time streaming&lt;/strong&gt; — watch runs happen live over Server-Sent Events, or just poll for the result&lt;/li&gt;
&lt;/ul&gt;
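
&lt;p&gt;The streaming side uses standard SSE framing: each event is a block of &lt;code&gt;event:&lt;/code&gt; and &lt;code&gt;data:&lt;/code&gt; lines terminated by a blank line. A minimal consumer can decode that with no extra dependencies (the event names below are made up for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

def parse_sse(lines):
    """Yield decoded events from an SSE stream, one per blank-line-terminated block."""
    event, data = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line == "":
            if data:
                payload = json.loads("\n".join(data))
                yield {"event": event or "message", "data": payload}
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;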

&lt;h3&gt;
  
  
  Push a new version
&lt;/h3&gt;

&lt;p&gt;Changed your agents? Just deploy again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crewship deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New version goes live. In-flight jobs keep running on the version they started with. If something's off, roll back from the console. No downtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hook it into your systems
&lt;/h3&gt;

&lt;p&gt;Create an incoming webhook in the console and trigger runs from anywhere:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.crewship.dev/webhooks/runs/YOUR_WEBHOOK_TOKEN &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"topic": "AI agents", "year": "2025"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up an outgoing webhook so your backend gets notified when a run finishes—no polling needed.&lt;/p&gt;
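
&lt;p&gt;On the receiving end, checking the HMAC-SHA256 signature takes a few lines of stdlib Python. This sketch assumes the signature is the hex digest of the raw request body; check the docs for the exact header name and encoding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import hashlib
import hmac

def verify_signature(secret, payload, signature_header):
    """Return True only if the signature matches the raw request body."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing leaks
    return hmac.compare_digest(expected, signature_header)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Reject anything that fails the check before you parse the body; otherwise anyone who finds your endpoint can spoof completion events.&lt;/p&gt;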




&lt;p&gt;Nobody's saying you can't build all of this yourself. Plenty of teams do. But it's a lot of engineering that doesn't move your actual product forward. Crewship handles the infrastructure so you can spend your time on the part that matters—making your agents better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://console.crewship.dev" rel="noopener noreferrer"&gt;Get started for free&lt;/a&gt;—no credit card required.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have questions about deploying your crew? Check out the &lt;a href="https://docs.crewship.dev" rel="noopener noreferrer"&gt;docs&lt;/a&gt; or reach out—we're happy to help.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>crewai</category>
      <category>agents</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
