<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jonathan Gastón Löwenstern</title>
    <description>The latest articles on DEV Community by Jonathan Gastón Löwenstern (@jonigl).</description>
    <link>https://dev.to/jonigl</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3134065%2F67c767ca-e297-454c-87bc-328e24e4506e.jpg</url>
      <title>DEV Community: Jonathan Gastón Löwenstern</title>
      <link>https://dev.to/jonigl</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jonigl"/>
    <language>en</language>
    <item>
      <title>Ollama’s New Thinking Mode in less than 5 Minutes</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Sat, 07 Jun 2025 11:03:13 +0000</pubDate>
      <link>https://dev.to/jonigl/ollamas-new-thinking-mode-in-less-than-5-minutes-13ig</link>
      <guid>https://dev.to/jonigl/ollamas-new-thinking-mode-in-less-than-5-minutes-13ig</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvrvthlrmk7yk0mk2yb2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvrvthlrmk7yk0mk2yb2.png" alt="Cover" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Get the most of Ollama’s reasoning models with the new thinking mode.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Ollama&lt;/strong&gt; &lt;code&gt;v0.9.0&lt;/code&gt; was just released with support for &lt;em&gt;thinking mode&lt;/em&gt;, and now the &lt;strong&gt;Ollama Python SDK&lt;/strong&gt; has reached parity with &lt;code&gt;v0.5.0&lt;/code&gt;. This means you can start using this powerful reasoning feature right away to build smarter local AI agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is exciting
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Benefits of Thinking Mode&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Improved performance on complex tasks&lt;/strong&gt;: thinking before responding leads to more accurate, step‑by‑step answers for reasoning and planning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better understanding of user instructions&lt;/strong&gt;: the model can unpack nuanced prompts and pinpoint key requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More creative and informative responses&lt;/strong&gt;: by exploring multiple possibilities internally, it surfaces fresh ideas and richer explanations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What you will learn
&lt;/h2&gt;

&lt;p&gt;Ollama's new thinking mode allows models to reason through complex tasks before providing a final answer. This is a game-changer for building local &lt;strong&gt;AI agents&lt;/strong&gt; that can think through problems, plan solutions, and provide more accurate responses.&lt;/p&gt;

&lt;p&gt;In this tutorial, I will guide you through setting up a simple interactive chat application that demonstrates this new feature using the Ollama Python SDK. You’ll see how to pull a thinking‑capable model, install the SDK, and run a chat that reveals the model's thought process in real-time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you will do
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Upgrade to the latest &lt;code&gt;Ollama&lt;/code&gt; release&lt;/li&gt;
&lt;li&gt;Pull thinking‑ready models (&lt;code&gt;qwen3:0.6b&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Install the brand‑new &lt;code&gt;Ollama Python SDK&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Run a fully interactive “thought bubble” chat in your terminal using the &lt;code&gt;rich&lt;/code&gt; library&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Python 3.10&lt;/code&gt; or higher&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/astral-sh/uv" rel="noopener noreferrer"&gt;uv&lt;/a&gt; for Python package management&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; version ≥ &lt;code&gt;v0.9.0&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;A thinking‑capable model like &lt;code&gt;qwen3:0.6b&lt;/code&gt; pulled from Ollama&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Heads‑up:&lt;/strong&gt; Only models trained to expose their reasoning support thinking today. Check the list of &lt;em&gt;thinking models&lt;/em&gt; that Ollama maintains.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let’s get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1. Let's Upgrade Ollama to v0.9.0
&lt;/h2&gt;

&lt;p&gt;We will need the latest Ollama release to use thinking mode. If you already have Ollama installed, ensure it is at least version &lt;code&gt;0.9.0&lt;/code&gt;. You can check your version with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama &lt;span class="nt"&gt;--version&lt;/span&gt;   &lt;span class="c"&gt;# should print 0.9.0 or higher&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you need to upgrade, you can do so with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# If you have the desktop app installed, it will prompt you to update.&lt;/span&gt;

&lt;span class="c"&gt;# macOS or Linux (Homebrew)&lt;/span&gt;
brew upgrade ollama

&lt;span class="c"&gt;# Windows&lt;/span&gt;
winget upgrade Ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you don’t have Ollama installed yet, check the official website: &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;https://ollama.com/&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2. Pull a thinking‑capable model
&lt;/h2&gt;

&lt;p&gt;We will use the &lt;code&gt;qwen3:0.6b&lt;/code&gt; model, since it is super small and fast, yet supports thinking mode. You can pull it with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen3:0.6b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's run it quickly to see the CLI in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama run qwen3:0.6b &lt;span class="s2"&gt;"Is 5 a Fibonacci number?"&lt;/span&gt; &lt;span class="nt"&gt;--think&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see two distinct sections: first, the dim &lt;em&gt;Thinking...&lt;/em&gt; output showing the model's internal reasoning, followed by the clean final answer. Because it is only 0.6B parameters, this tiny model blazes through tokens faster than you can read them 🤣&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3. Install the Python SDK with thinking support
&lt;/h2&gt;

&lt;p&gt;Now let's create a &lt;strong&gt;virtual environment&lt;/strong&gt; using &lt;code&gt;uv&lt;/code&gt; and install the latest Ollama Python SDK.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; If you don't have &lt;code&gt;uv&lt;/code&gt; installed, check the &lt;a href="https://github.com/astral-sh/uv?tab=readme-ov-file#installation" rel="noopener noreferrer"&gt;uv documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a new directory for the demo&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;ollama-thinking-demo

&lt;span class="c"&gt;# Change into the new directory&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;ollama-thinking-demo

&lt;span class="c"&gt;# Create a new virtual environment&lt;/span&gt;
uv venv
&lt;span class="c"&gt;# Activate the virtual environment&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate  &lt;span class="c"&gt;# On Windows use: .venv\Scripts\activate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's install the latest Ollama Python SDK, which adds support for thinking mode, along with the &lt;code&gt;rich&lt;/code&gt; library for pretty terminal output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv add ollama rich
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; Ollama Python SDK &lt;code&gt;0.5.0&lt;/code&gt; introduces the &lt;code&gt;think&lt;/code&gt; parameter in both the &lt;code&gt;generate&lt;/code&gt; and &lt;code&gt;chat&lt;/code&gt; helpers.&lt;/p&gt;
&lt;/blockquote&gt;
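
&lt;p&gt;Before wiring up the full demo, here is a minimal sketch of the &lt;code&gt;think&lt;/code&gt; parameter in a plain, non‑streaming &lt;code&gt;chat&lt;/code&gt; call. It assumes Ollama &lt;code&gt;v0.9.0&lt;/code&gt; is running locally and &lt;code&gt;qwen3:0.6b&lt;/code&gt; is already pulled; the reasoning should arrive in &lt;code&gt;message.thinking&lt;/code&gt; and the final answer in &lt;code&gt;message.content&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ollama

# Minimal sketch (no streaming): one question, thinking enabled.
# Assumes a local Ollama &amp;gt;= 0.9.0 with qwen3:0.6b pulled.
response = ollama.chat(
    model="qwen3:0.6b",
    messages=[{"role": "user", "content": "Is 5 a Fibonacci number?"}],
    think=True,  # ask the model to expose its reasoning
)

print("Thinking:", response.message.thinking)  # internal reasoning
print("Answer:", response.message.content)     # final answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;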

&lt;h2&gt;
  
  
  Step 4. Copy &amp;amp; paste the &lt;strong&gt;ThinkingChat&lt;/strong&gt; demo
&lt;/h2&gt;

&lt;p&gt;Create a new file called &lt;code&gt;ollama_thinking_chat.py&lt;/code&gt; and copy the following code into it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.console&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Console&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.live&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Live&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.markdown&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Markdown&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ThinkingChat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qwen3:0.6b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant that thinks through answers.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Ask a question and see the model think through the answer&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;think&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# &amp;lt;-- Enable thinking mode
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;thinking&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
        &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Live&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refresh_per_second&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;live&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

                &lt;span class="c1"&gt;# Show thinking process
&lt;/span&gt;                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thinking&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;thinking&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤔 **Thinking:**&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="n"&gt;thinking&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thinking&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="n"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dim&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

                &lt;span class="c1"&gt;# Show final answer
&lt;/span&gt;                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="n"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Simple chat loop&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[bold green]💭 Thinking Chat[/bold green] [dim](&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)[/dim]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[yellow]Ask me anything! Type &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; to exit.[/yellow]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Question: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Goodbye! 👋&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Add space after response
&lt;/span&gt;            &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;KeyboardInterrupt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;EOFError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Goodbye! 👋&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;


&lt;span class="c1"&gt;# Run the chat
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ThinkingChat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save the file as &lt;code&gt;ollama_thinking_chat.py&lt;/code&gt; and run it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run ollama_thinking_chat.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you have a fully interactive chat that shows the model's thought process in real-time! Let's try it out. Type a question like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Question: Is 5 a Fibonacci number?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the model's thinking process displayed in a dimmed format, followed by the final answer. Check out the demo in action:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlv6rwsozwcn9wdugevu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlv6rwsozwcn9wdugevu.png" alt="Demo example" width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What can you build next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Educational tutors&lt;/strong&gt; that teach by example, revealing step‑by‑step logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging dashboards&lt;/strong&gt; that compare the chain‑of‑thought across models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative assistants&lt;/strong&gt; that brainstorm ideas and show their reasoning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive agents&lt;/strong&gt; that explain their decisions in real-time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the complete code on my &lt;a href="https://github.com/jonigl/ai-publications/tree/03c66dd11c74955877bfbf94762be4198477500a/publications/2025/05/Ollama%E2%80%99s%20New%20Thinking%20Mode%C2%A0in%20less%20than%205%20Minutes/code" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. There you will find:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;ollama_thinking_chat.py&lt;/code&gt; file with the full implementation.&lt;/li&gt;
&lt;li&gt;The extended version, &lt;code&gt;ollama_thinking_chat_extended.py&lt;/code&gt;, with additional features and capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you like this repository, consider dropping a ⭐️&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;With Ollama's new thinking mode and the Ollama Python SDK, you can now build applications that leverage the model's &lt;strong&gt;reasoning capabilities&lt;/strong&gt;. This opens up exciting possibilities for creating &lt;strong&gt;more intelligent local AI agents&lt;/strong&gt; that can think through complex tasks and provide &lt;strong&gt;better answers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enjoy building!&lt;/strong&gt; If this guide saved you time, consider sharing a ❤️ on this post. Thank you for your support, and happy coding! 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ollama/ollama-python" rel="noopener noreferrer"&gt;Ollama Python SDK&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://rich.readthedocs.io/" rel="noopener noreferrer"&gt;Rich Library Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>howto</category>
      <category>ollama</category>
      <category>thinkingmode</category>
    </item>
    <item>
      <title>Using Ollama with TypeScript: A Simple Guide</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Wed, 14 May 2025 20:05:26 +0000</pubDate>
      <link>https://dev.to/jonigl/using-ollama-with-typescript-a-simple-guide-1nf4</link>
      <guid>https://dev.to/jonigl/using-ollama-with-typescript-a-simple-guide-1nf4</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5ezoxn4who7emw42kjl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5ezoxn4who7emw42kjl.jpg" alt="Cover article image" width="800" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you've installed Ollama and experimented with running models from the command line, the next logical step is to integrate these powerful AI capabilities into your TypeScript applications. This guide will show you how to use Ollama with TypeScript.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: If you prefer Python, check out my other guide: &lt;a href="https://dev.to/jonigl/using-ollama-with-python-a-simple-guide-27c2"&gt;Using Ollama with Python: A Simple Guide&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Setting Up
&lt;/h2&gt;

&lt;p&gt;First, make sure Ollama is installed and running on your system.&lt;/p&gt;

&lt;p&gt;You can check this other article &lt;a href="https://dev.to/jonigl/using-ollama-with-python-a-simple-guide-27c2"&gt;Getting Started with Ollama: Run LLMs on Your Computer&lt;/a&gt; if you are not familiar with Ollama yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Required Ollama Models
&lt;/h3&gt;

&lt;p&gt;Before running the TypeScript examples in this guide, make sure you have the necessary models pulled. You can pull them using the Ollama CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Pull the models used in these examples
ollama pull llama3.2:1b  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You only need to pull a model once. Check which models you already have with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
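
&lt;p&gt;Once your project is set up (next section), you can also verify model availability from code. Here is a small sketch, assuming the same local Ollama instance; &lt;code&gt;ollama.list()&lt;/code&gt; returns the same inventory as the CLI command above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { Ollama } from "ollama";

async function main() {
  const ollama = new Ollama();

  // List the locally available models, like `ollama list`
  const { models } = await ollama.list();

  const hasModel = models.some((m) =&amp;gt; m.name === "llama3.2:1b");
  console.log(hasModel ? "llama3.2:1b is ready" : "Run: ollama pull llama3.2:1b");
}

main().catch(console.error);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;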



&lt;h2&gt;
  
  
  Setting Up Your TypeScript Project
&lt;/h2&gt;

&lt;p&gt;Let's create a new TypeScript project to work with Ollama:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a new directory for your project&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;ollama-typescript
&lt;span class="nb"&gt;cd &lt;/span&gt;ollama-typescript

&lt;span class="c"&gt;# Initialize a new Node.js project&lt;/span&gt;
npm init &lt;span class="nt"&gt;-y&lt;/span&gt;

&lt;span class="c"&gt;# Install TypeScript and tsx for running TypeScript files&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;typescript @types/node &lt;span class="nt"&gt;--save-dev&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;tsx &lt;span class="nt"&gt;--save-dev&lt;/span&gt;

&lt;span class="c"&gt;# Install the Ollama library&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, create a TypeScript configuration file (&lt;code&gt;tsconfig.json&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"compilerOptions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ES2020"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"module"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"NodeNext"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"moduleResolution"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"NodeNext"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"esModuleInterop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"strict"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"outDir"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./dist"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"src/**/*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a &lt;code&gt;src&lt;/code&gt; directory for your TypeScript files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;src
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Basic Usage
&lt;/h2&gt;

&lt;p&gt;Let's start with a simple example using the Llama 3.2 1B model.&lt;/p&gt;

&lt;p&gt;Create a file named &lt;code&gt;src/generate.ts&lt;/code&gt; with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Ollama&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Ollama&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Regular response&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:1b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Why is the sky blue?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this example with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx tsx src/generate.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will output the model's explanation of why the sky is blue as a complete response.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkxgfi26h52aiohxi2sc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkxgfi26h52aiohxi2sc.png" alt="basic usage screenshot" width="800" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Streaming Responses
&lt;/h2&gt;

&lt;p&gt;For a more interactive experience, you can get the response as it's being generated.&lt;/p&gt;

&lt;p&gt;Create a file named &lt;code&gt;src/generate-stream.ts&lt;/code&gt; with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Ollama&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Ollama&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Streaming response:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Streaming response&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:1b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Why is the sky blue?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// New line at the end&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays the response incrementally as it's generated, creating a more interactive experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7egufzvz2yuw7yits0rx.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7egufzvz2yuw7yits0rx.gif" alt="streaming usage screenshot" width="1928" height="918"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How Streaming Works in TypeScript
&lt;/h2&gt;

&lt;p&gt;When you use the streaming functionality with Ollama in TypeScript, &lt;code&gt;ollama.generate()&lt;/code&gt; with &lt;code&gt;stream: true&lt;/code&gt; returns an &lt;code&gt;AsyncIterable&lt;/code&gt; that you can consume with a &lt;code&gt;for await...of&lt;/code&gt; loop, processing each chunk as it becomes available from the model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Each chunk contains a small piece of the response in &lt;code&gt;chunk.response&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Using &lt;code&gt;process.stdout.write()&lt;/code&gt; prevents adding newlines between chunks&lt;/li&gt;
&lt;li&gt;The chunks are displayed immediately as they arrive&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This creates the effect of watching the AI "think" in real-time, similar to watching someone type.&lt;/p&gt;
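
&lt;p&gt;To make the pattern reusable, here is a small helper sketch (same assumptions as the examples above: a local Ollama instance with &lt;code&gt;llama3.2:1b&lt;/code&gt; pulled) that prints each chunk as it arrives while also collecting the full response for later use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { Ollama } from "ollama";

// Sketch: stream a generation to stdout and return the full text
async function streamToString(prompt: string): Promise&amp;lt;string&amp;gt; {
  const ollama = new Ollama();

  const stream = await ollama.generate({
    model: "llama3.2:1b",
    prompt,
    stream: true,
  });

  let full = "";
  for await (const chunk of stream) {
    process.stdout.write(chunk.response); // print incrementally
    full += chunk.response;               // accumulate the pieces
  }
  console.log(); // final newline
  return full;
}

streamToString("Why is the sky blue?").catch(console.error);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;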

&lt;h2&gt;
  
  
  Using System Prompts
&lt;/h2&gt;

&lt;p&gt;The system prompt allows you to set context and instructions for the model before the conversation starts. It's a powerful way to define the model's behavior.&lt;/p&gt;

&lt;p&gt;Create a file named &lt;code&gt;src/chat-system-role.ts&lt;/code&gt; with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Ollama&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Ollama&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Define a system prompt&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You speak and sound like a pirate with short sentences.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Chat with a system prompt&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:1b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Tell me about your boat.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system prompt influences how the model responds to every user message in the request. Note that it persists across turns only if you include it in the &lt;code&gt;messages&lt;/code&gt; array of each call, as the next example does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdv2pxrj0m7fd17depkti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdv2pxrj0m7fd17depkti.png" alt="system prompt usage screenshot" width="800" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversational Context
&lt;/h2&gt;

&lt;p&gt;Maintain a conversation with context using streaming for a more interactive experience.&lt;/p&gt;

&lt;p&gt;Create a file named &lt;code&gt;src/chat-history-stream.ts&lt;/code&gt; with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Ollama&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ollama&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;readline&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;readline&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Ollama&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Initialize an empty message history&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;readline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createInterface&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;askQuestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;rl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Chat with history: &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;exit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;rl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="c1"&gt;// Get streaming response while maintaining conversation history&lt;/span&gt;
      &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;responseContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llama3.2:1b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful assistant. You only give a short sentence by answer.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;responseChunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;responseChunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nx"&gt;responseContent&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;responseChunk&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="c1"&gt;// Add the exchange to the conversation history&lt;/span&gt;
      &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;responseContent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Add space after response&lt;/span&gt;
      &lt;span class="nf"&gt;askQuestion&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="nf"&gt;askQuestion&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here you can see what this example looks like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsd3bsscn8dk4b9xo94z6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsd3bsscn8dk4b9xo94z6.png" alt="chat history usage screenshot" width="800" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Ollama TypeScript library makes it easy to integrate powerful language models into your TypeScript and JavaScript applications. Whether you're building a simple script, a Node.js application, or integrating AI into a web app, the library's straightforward API allows you to focus on creating value rather than managing the underlying AI infrastructure.&lt;/p&gt;

&lt;p&gt;As you become more comfortable with the basics, explore more advanced features and consider how you can use these capabilities to solve real-world problems in your projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;In this GitHub repository, you'll find working code examples: &lt;a href="https://github.com/jonigl/ai-publications/tree/main/publications/2025/03/Using%20Ollama%20with%20TypeScript%20-%20A%20Simple%20Guide/code" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/p&gt;

</description>
      <category>howto</category>
      <category>ollama</category>
      <category>ai</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Using Ollama with Python: A Simple Guide</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Mon, 12 May 2025 13:10:49 +0000</pubDate>
      <link>https://dev.to/jonigl/using-ollama-with-python-a-simple-guide-27c2</link>
      <guid>https://dev.to/jonigl/using-ollama-with-python-a-simple-guide-27c2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jd3b05xdxlecmm3bat0.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jd3b05xdxlecmm3bat0.webp" alt="Cover image" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you’ve installed Ollama and experimented with running models from the command line, the next logical step is to integrate these powerful AI capabilities into your Python applications. This guide will show you how to use Ollama with Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up
&lt;/h2&gt;

&lt;p&gt;First, make sure Ollama is installed and running on your system.&lt;/p&gt;

&lt;p&gt;You can check out this other article, &lt;a href="https://dev.to/jonigl/getting-started-with-ollama-run-llms-on-your-computer-35d6"&gt;Getting Started with Ollama: Run LLMs on Your Computer&lt;/a&gt;, if you are not familiar with Ollama yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Required Ollama Models
&lt;/h3&gt;

&lt;p&gt;Before running the Python examples in this guide, make sure you have the necessary models pulled. You can pull them using the Ollama CLI:&lt;/p&gt;

&lt;h4&gt;
  
  
  Pull the models used in these examples
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama pull llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You only need to pull these models once. Check which models you already have with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
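
&lt;p&gt;The output will look something like this (the model names, IDs, and sizes below are illustrative; yours will differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME           ID              SIZE      MODIFIED
llama3.2:1b    baf6a787fdff    1.3 GB    2 minutes ago
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;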



&lt;h3&gt;
  
  
  Creating a Virtual Environment
&lt;/h3&gt;

&lt;p&gt;It’s a good practice to use a virtual environment for your Python projects. This keeps your dependencies isolated and makes your project more portable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create a virtual environment  

python -m venv ollama-env  

# Activate the virtual environment  

# On Windows:  

ollama-env\Scripts\activate  

# On macOS/Linux:  
source ollama-env/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Installing Dependencies
&lt;/h4&gt;

&lt;p&gt;Install the Ollama Python library:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Creating a requirements.txt
&lt;/h4&gt;

&lt;p&gt;For better project management, create a requirements.txt file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip freeze &amp;gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To install from this file in the future:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Basic Usage
&lt;/h2&gt;

&lt;p&gt;Let’s start with a simple example using the Llama 3.2 1B model.&lt;/p&gt;

&lt;p&gt;Create a file named  &lt;code&gt;generate.py&lt;/code&gt;  with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ollama import generate  
# Regular response  
response = generate('llama3.2:1b', 'Why is the sky blue?')  
print(response['response'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will output the model’s explanation of why the sky is blue as a complete response.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d7sceoe697rairkd1m2.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d7sceoe697rairkd1m2.webp" alt="basic usage screenshot" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Streaming Responses
&lt;/h2&gt;

&lt;p&gt;For a more interactive experience, you can get the response as it’s being generated.&lt;/p&gt;

&lt;p&gt;Create a file named  &lt;code&gt;generate-stream.py&lt;/code&gt;  with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ollama import generate  
# Streaming response  
print("Streaming response:")  
for chunk in generate('llama3.2:1b', 'Why is the sky blue?', stream=True):  
    print(chunk['response'], end='', flush=True)  
print()  # New line at the end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays the response incrementally as it’s generated, creating a more interactive experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ac2mp2f001o18m3cosq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ac2mp2f001o18m3cosq.gif" alt="streaming usage screenshot" width="1408" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is &lt;code&gt;for chunk in generate&lt;/code&gt; used?
&lt;/h3&gt;

&lt;p&gt;When you use the streaming functionality with Ollama, the response isn’t returned all at once. Instead, it’s broken into small pieces (chunks) that arrive one at a time as they’re generated by the model.&lt;/p&gt;

&lt;p&gt;The  &lt;code&gt;generate()&lt;/code&gt;  function with  &lt;code&gt;stream=True&lt;/code&gt;  returns an iterator in Python. This iterator yields new chunks of text as they become available from the model. The  &lt;code&gt;for&lt;/code&gt;  loop processes these chunks one by one as they arrive:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Each chunk contains a small piece of the response in  &lt;code&gt;chunk['response']&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt; The  &lt;code&gt;end=''&lt;/code&gt;  parameter prevents adding newlines between chunks&lt;/li&gt;
&lt;li&gt; The  &lt;code&gt;flush=True&lt;/code&gt;  ensures text displays immediately&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This creates the effect of watching the AI “think” in real-time, similar to watching someone type.&lt;/p&gt;
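
&lt;p&gt;If you also want the complete answer afterwards, for logging or for chat history as in the example later in this guide, you can accumulate the chunks while streaming. A minimal sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ollama import generate

# Print each chunk as it arrives and keep the full text for later use
full_response = ""
for chunk in generate('llama3.2:1b', 'Why is the sky blue?', stream=True):
    piece = chunk['response']     # each chunk carries a small text fragment
    print(piece, end='', flush=True)
    full_response += piece
print()
print(f"Total characters received: {len(full_response)}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;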

&lt;h2&gt;
  
  
  Using System Prompts
&lt;/h2&gt;

&lt;p&gt;The system prompt allows you to set context and instructions for the model before the conversation starts. It’s a powerful way to define the model’s behavior.&lt;/p&gt;

&lt;p&gt;Create a file named  &lt;code&gt;chat-system-role.py&lt;/code&gt;  with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ollama import chat  

# Define a system prompt  
system_prompt = "You speaks and sounds like a pirate with short sentences."  
# Chat with a system prompt  
response = chat('llama3.2:1b',   
                messages=[  
                    {'role': 'system', 'content': system_prompt},  
                    {'role': 'user', 'content': 'Tell me about your boat.'}  
                ])  
print(response.message.content)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system prompt stays active throughout the conversation, influencing how the model responds to all user inputs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fafd1p7c0mj47tinlo118.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fafd1p7c0mj47tinlo118.webp" alt="system role usage screenshot" width="800" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversational Context
&lt;/h2&gt;

&lt;p&gt;Maintain a conversation with context using streaming for a more interactive experience.&lt;/p&gt;

&lt;p&gt;Create a file named  &lt;code&gt;chat-history-stream.py&lt;/code&gt;  with this content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ollama import chat  

# Initialize an empty message history  
messages = []  
while True:  
    user_input = input('Chat with history: ')  
    if user_input.lower() == 'exit':  
        break  
    # Get streaming response while maintaining conversation history  
    response_content = ""  
    for chunk in chat(  
        'llama3.2:1b',  
        messages=messages + [  
            {'role': 'system', 'content': 'You are a helpful assistant. You only give a short sentence by answer.'},  
            {'role': 'user', 'content': user_input},  
        ],  
        stream=True  
    ):  
        if chunk.message:  
            response_chunk = chunk.message.content  
            print(response_chunk, end='', flush=True)  
            response_content += response_chunk  
    # Add the exchange to the conversation history  
    messages += [  
        {'role': 'user', 'content': user_input},  
        {'role': 'assistant', 'content': response_content},  
    ]  
    print('\n')  # Add space after response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here you can see what this example looks like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2gcee8hq4qb8uyygzra.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2gcee8hq4qb8uyygzra.gif" alt="Chat usage screenshot" width="1190" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Ollama Python library makes it easy to integrate powerful language models into your Python applications. Whether you’re building a simple script or a complex application, the library’s straightforward API allows you to focus on creating value rather than managing the underlying AI infrastructure.&lt;/p&gt;

&lt;p&gt;As you become more comfortable with the basics, explore more advanced features and consider how you can use these capabilities to solve real-world problems in your projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;In this GitHub repository, you'll find working code examples: &lt;a href="https://github.com/jonigl/ai-publications/tree/main/publications/2025/02/Using%20Ollama%20with%20Python%20-%20A%20Simple%20Guide/code" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/p&gt;

</description>
      <category>howto</category>
      <category>ollama</category>
      <category>ai</category>
      <category>python</category>
    </item>
    <item>
      <title>Build an MCP Client in Minutes: Local AI Agents Just Got Real</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Fri, 09 May 2025 14:48:19 +0000</pubDate>
      <link>https://dev.to/jonigl/build-an-mcp-client-in-minutes-local-ai-agents-just-got-real-4gj6</link>
      <guid>https://dev.to/jonigl/build-an-mcp-client-in-minutes-local-ai-agents-just-got-real-4gj6</guid>
<description>&lt;p&gt;Built an MCP server already? Well done! But that's only half the story. Without a client, your model is shouting into the void.&lt;/p&gt;

&lt;p&gt;Give me &lt;strong&gt;8 minutes&lt;/strong&gt; and you'll:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ship a full MCP client in under 100 lines&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;First steps into the world of local AI agents&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plug it into any MCP server&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Keep every byte local. No cloud fees.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privacy first. No API keys.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why roll your own client?
&lt;/h2&gt;

&lt;p&gt;Cloud agents are fun until the bill hits, and who really knows where your data ends up. A local MCP client means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Full privacy &amp;amp; control.&lt;/strong&gt; Every token lives on your machine, nowhere else.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero API keys.&lt;/strong&gt; No provider lock‑in, no surprise invoices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Good‑enough performance today, better tomorrow.&lt;/strong&gt; Local models aren’t GPT‑4o yet, but they’re getting sharper every release.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The code below is &lt;em&gt;inspired&lt;/em&gt; by the &lt;a href="https://modelcontextprotocol.io/quickstart/client" rel="noopener noreferrer"&gt;official MCP client quickstart&lt;/a&gt; (which targets Anthropic models) and tweaked for Ollama so every byte runs offline. &lt;strong&gt;Let's build! 🚀&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Python 3.10+ (uv handles the virtual env)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ollama installed; follow the &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;official install guide&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A local model pulled; the client defaults to &lt;code&gt;llama3.2:3b&lt;/code&gt;, so run &lt;code&gt;ollama pull llama3.2:3b&lt;/code&gt; or switch with &lt;code&gt;--model&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick setup with uv
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. create project folder&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;simple-mcp-client &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;simple-mcp-client

&lt;span class="c"&gt;# 2. init uv (fast Python package manager)&lt;/span&gt;
uv init

&lt;span class="c"&gt;# 3. create &amp;amp; activate virtual env&lt;/span&gt;
uv venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate

&lt;span class="c"&gt;# 4. add deps&lt;/span&gt;
uv add mcp ollama rich

&lt;span class="c"&gt;# 5. drop in the two files&lt;/span&gt;
client.py   &lt;span class="c"&gt;# we will code this in this article&lt;/span&gt;
server.py   &lt;span class="c"&gt;# from this repo https://github.com/jonigl/mcp-client-for-ollama/blob/simple-client/server.py&lt;/span&gt;

&lt;span class="c"&gt;# 6. run it&lt;/span&gt;
uv run client.py &lt;span class="nt"&gt;--mcp-server&lt;/span&gt; server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five commands and you're chatting locally ⚡️&lt;/p&gt;

&lt;h2&gt;
  
  
  The 93‑line client
&lt;/h2&gt;

&lt;p&gt;Below is the heart of our client. Copy it, paste it, run it. I'll highlight the spicy parts right after.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# client.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;contextlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncExitStack&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatResponse&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StdioServerParameters&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.client.stdio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stdio_client&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.console&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Console&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.markdown&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Markdown&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MCPClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Initialize the MCP client with a model and console
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3.2:3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;              
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncExitStack&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ollama&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="c1"&gt;# Connect to the MCP server using the provided script path
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;connect_to_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_script_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;        
        &lt;span class="n"&gt;is_python&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;server_script_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.py&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;is_js&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;server_script_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.js&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_python&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;is_js&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Server script must be a .py or .js file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_python&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;server_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;server_script_path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;stdio_transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enter_async_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stdio_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stdio_transport&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enter_async_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# List tools
&lt;/span&gt;        &lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Server connected. Tools:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dim green&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Process a query by sending it to the model and handling tool calls
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;tools_meta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Initial call
&lt;/span&gt;        &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ChatResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools_meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;final&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Check for tool calls
&lt;/span&gt;        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;# Call the tool
&lt;/span&gt;                &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;                
                &lt;span class="c1"&gt;# Call the model again with the tool result, so we can get the final answer. Max tokens is set to 500 speed up the process. Adjust as needed.
&lt;/span&gt;                &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools_meta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="n"&gt;final&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Main loop for user interaction
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[bold green]MCP Client Started![/bold green] [cyan]Model: {}[/cyan]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[yellow]Type your queries below or [bold]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[/bold] to exit.[/yellow]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[bold blue]Query:[/bold blue] &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;ans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ans&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;orange3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bold red&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aclose&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Main function to parse arguments and run the client
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--mcp-server&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3.2:3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MCPClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect_to_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mcp_server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Run it
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run client.py &lt;span class="nt"&gt;--mcp-server&lt;/span&gt; server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ask something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What's the weather in Tokyo?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your server exposes &lt;code&gt;get_weather&lt;/code&gt;, the model silently calls it and answers like a pro. That's &lt;strong&gt;MCP + Ollama&lt;/strong&gt; working together on your desk.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbvb37xobqqbzri8ojk5q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbvb37xobqqbzri8ojk5q.png" alt="terminal execution" width="800" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Here you can see what it looks like!&lt;/em&gt; 🤩&lt;/p&gt;
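
&lt;p&gt;For context, the server side of &lt;code&gt;get_weather&lt;/code&gt; can be tiny. Here's a hypothetical minimal sketch using &lt;code&gt;FastMCP&lt;/code&gt; with canned data; the real, ready-to-run &lt;code&gt;server.py&lt;/code&gt; is in the repo linked below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# server-sketch.py: hypothetical minimal MCP server exposing get_weather
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_weather(city: str) -&amp;gt; str:
    """Return a (canned) weather report for the given city."""
    return f"It's sunny and 22°C in {city}."  # stub data for the demo

if __name__ == "__main__":
    mcp.run(transport="stdio")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;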

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;connect_to_server&lt;/strong&gt; launches the mcp server (&lt;code&gt;python&lt;/code&gt; or &lt;code&gt;node&lt;/code&gt;) and chats over stdio.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;list_tools&lt;/strong&gt; grabs available tools and hands them to Ollama as function specs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;process_query&lt;/strong&gt; lets the model decide, executes the chosen tool, then loops back for the final reply.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's the same pattern the big providers use, just running locally; latency depends on your hardware and the model you load. A minimal sketch of the tool-spec mapping follows.&lt;/p&gt;
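
&lt;p&gt;The bridge between the two worlds is the &lt;code&gt;tools_meta&lt;/code&gt; translation in &lt;code&gt;process_query&lt;/code&gt;: each MCP tool becomes a function spec Ollama understands. Pulled out as a standalone helper (a sketch, mirroring the client code above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Sketch: translate MCP tool metadata into Ollama function specs.
# `meta` is the object returned by session.list_tools() in the client above.
def to_ollama_tools(meta):
    return [
        {
            "type": "function",
            "function": {
                "name": t.name,
                "description": t.description,
                "parameters": t.inputSchema,  # JSON Schema provided by the MCP server
            },
        }
        for t in meta.tools
    ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;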

&lt;h2&gt;
  
  
  Want turbo mode?
&lt;/h2&gt;

&lt;p&gt;I’ve expanded this simple MCP client into a more feature-rich CLI tool called &lt;code&gt;ollmcp&lt;/code&gt;. Check it out in my repo 👉 &lt;strong&gt;&lt;a href="https://github.com/jonigl/mcp-client-for-ollama" rel="noopener noreferrer"&gt;mcp-client-for-ollama&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quickstart with &lt;code&gt;ollmcp&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Run it instantly with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uvx ollcmp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or install it globally for easy access:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ollcmp &lt;span class="nt"&gt;--upgrade&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ollmcp&lt;/code&gt; ships with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;strong&gt;Multi‑Server Support&lt;/strong&gt; — connect to several MCP servers at once&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;Rich Terminal UI&lt;/strong&gt; — slick interactive console&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;Dynamic Model Switching&lt;/strong&gt; — swap any installed Ollama model on the fly&lt;/li&gt;
&lt;li&gt;🛠️ &lt;strong&gt;Tool Management&lt;/strong&gt; — toggle tools or whole servers mid‑chat&lt;/li&gt;
&lt;li&gt;🔄 &lt;strong&gt;Cross‑Language Servers&lt;/strong&gt; — Python &lt;em&gt;and&lt;/em&gt; JavaScript MCP servers work out of the box&lt;/li&gt;
&lt;li&gt;🔌 &lt;strong&gt;Plug‑and‑Play&lt;/strong&gt; — point at any MCP‑compliant server and go&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And if you want, just fork it, bend it, contribute to it.&lt;/p&gt;

&lt;p&gt;If you found this helpful, consider giving it a reaction ❤️ to show your support!&lt;/p&gt;

&lt;h2&gt;
  
  
  Next moves
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Browse the &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;official MCP server list&lt;/a&gt; and plug one in.&lt;/li&gt;
&lt;li&gt;Try other Ollama models (for example qwen2.5:7b).&lt;/li&gt;
&lt;li&gt;Wrap your own script in MCP and watch your LLM gain a superpower.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Clone it, hack it, show it off.&lt;/strong&gt; Drop a link to your build below and let’s continue learning together.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Grab the full example
&lt;/h3&gt;

&lt;p&gt;All the code, including &lt;code&gt;client.py&lt;/code&gt;, a ready-to-run &lt;code&gt;server.py&lt;/code&gt;, and the &lt;code&gt;uv&lt;/code&gt; project scaffold, is on GitHub:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/jonigl/mcp-client-for-ollama/tree/simple-client" rel="noopener noreferrer"&gt;https://github.com/jonigl/mcp-client-for-ollama/tree/simple-client&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clone it, run it, and start wiring your own tools in minutes 🙌&lt;/p&gt;

&lt;h3&gt;
  
  
  Check out the asciinema demo
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://asciinema.org/a/718592" rel="noopener noreferrer"&gt;https://asciinema.org/a/718592&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Want to know how to build an MCP server?
&lt;/h3&gt;

&lt;p&gt;You can check out my article on how to build an MCP server in minutes here: &lt;a href="https://dev.to/jonigl/your-first-mcp-server-quick-35eg"&gt;Your first MCP Server (quick)&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ollama</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>MCP filesystem: Server disconnected</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Wed, 07 May 2025 15:23:10 +0000</pubDate>
      <link>https://dev.to/jonigl/mcp-filesystem-server-disconnected-44db</link>
      <guid>https://dev.to/jonigl/mcp-filesystem-server-disconnected-44db</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiiq0pc68s9u4vprr9lfg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiiq0pc68s9u4vprr9lfg.png" alt="MCP filesystem: Server disconnected screenshot image" width="720" height="510"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’re excited about using your Claude Desktop app with the new Model Context Protocol (MCP) but keep running into frustrating configuration errors, you’re not alone. This is especially common if you’re a Node Version Manager (nvm) user.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You followed the “Quickstart guide: For Claude Desktop User,” carefully copied the JSON configuration into your own &lt;code&gt;claude_desktop_config.json&lt;/code&gt; file, but it keeps failing.&lt;/p&gt;

&lt;p&gt;You’ll likely see this specific error alert:  &lt;strong&gt;“MCP filesystem: Server disconnected”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After some investigation, I discovered the root cause from a comment in the project’s GitHub repository (see Sources section below):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The MCP Server Commands environment cannot directly access Node.js executables installed via nvm unless they are properly configured.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Create a Wrapper Script
&lt;/h2&gt;

&lt;p&gt;The solution is elegant: create a wrapper script that ensures the correct Node.js environment is used.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Find your Node.js path
&lt;/h2&gt;

&lt;p&gt;First, determine your nvm-installed Node.js path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;which node
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will return something like:  &lt;code&gt;/Users/username/.nvm/versions/node/v16.x.x/bin/node&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Create a wrapper script
&lt;/h2&gt;

&lt;p&gt;Create a new file at  &lt;code&gt;/usr/local/bin/npx-for-claude&lt;/code&gt;  with the following content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/Users/YOUR-USERNAME/.nvm/versions/node/YOUR-NODE-VERSION/bin:&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;  
&lt;span class="nb"&gt;exec &lt;/span&gt;npx &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure to replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;code&gt;YOUR-USERNAME&lt;/code&gt;  with your actual username&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;YOUR-NODE-VERSION&lt;/code&gt;  with your actual Node.js version (e.g., v16.20.0)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 3: Make the wrapper script executable
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x /usr/local/bin/npx-for-claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
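

&lt;p&gt;Optionally, sanity-check the wrapper before touching Claude Desktop. Running &lt;code&gt;npx-for-claude --version&lt;/code&gt; in a terminal works; the Python equivalent below is just an illustrative sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import subprocess

# The wrapper should behave exactly like npx; --version is a harmless probe.
result = subprocess.run(["npx-for-claude", "--version"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # prints the npx version if the wrapper works
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;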



&lt;h2&gt;
  
  
  Step 4: Configure Claude Desktop
&lt;/h2&gt;

&lt;p&gt;Edit your  &lt;code&gt;claude_desktop_config.json&lt;/code&gt;  file to use the wrapper script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
    &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx-for-claude"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
            &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;  
                &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
                &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
                &lt;/span&gt;&lt;span class="s2"&gt;"/Users/username/path/to/allowed/directory"&lt;/span&gt;&lt;span class="w"&gt;  
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Final Step
&lt;/h2&gt;

&lt;p&gt;Restart your Claude Desktop app, and it should start working with the new configuration!&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;p&gt;This solution works because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; The wrapper script sets up the correct PATH environment that includes your nvm-installed Node.js binaries&lt;/li&gt;
&lt;li&gt; It then executes the  &lt;code&gt;npx&lt;/code&gt;  command with all the arguments passed to it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The wrapper script essentially bridges the gap between Claude Desktop’s execution environment and your nvm setup, allowing you to use the standard configuration pattern recommended in the documentation.&lt;/p&gt;
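
&lt;p&gt;You can watch the same mechanism at work from Python: prepending a directory to &lt;code&gt;PATH&lt;/code&gt; changes which binary gets resolved. A small sketch (the nvm path is a placeholder, exactly as in the wrapper script):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os
import shutil

print(shutil.which("npx"))  # whatever a bare environment resolves (maybe None)

# Prepend the nvm bin directory, exactly as the wrapper script does
os.environ["PATH"] = (
    "/Users/YOUR-USERNAME/.nvm/versions/node/YOUR-NODE-VERSION/bin:"
    + os.environ["PATH"]
)
print(shutil.which("npx"))  # now resolves to the nvm-managed npx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;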

&lt;h2&gt;
  
  
  Source
&lt;/h2&gt;

&lt;p&gt;This solution was inspired by some comments in the ModelContextProtocol GitHub repository:  &lt;a href="https://github.com/modelcontextprotocol/servers/issues/64#issuecomment-2611032995" rel="noopener noreferrer"&gt;Issue #64&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have also added this solution as a comment on this GitHub issue:  &lt;a href="https://github.com/modelcontextprotocol/servers/issues/64#issuecomment-2730913259" rel="noopener noreferrer"&gt;comment&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>howtofix</category>
      <category>howto</category>
    </item>
    <item>
      <title>Getting Started with Ollama: Run LLMs on Your Computer</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Wed, 07 May 2025 15:17:24 +0000</pubDate>
      <link>https://dev.to/jonigl/getting-started-with-ollama-run-llms-on-your-computer-35d6</link>
      <guid>https://dev.to/jonigl/getting-started-with-ollama-run-llms-on-your-computer-35d6</guid>
      <description>&lt;p&gt;Ollama makes it easy to run large language models (LLMs) locally on your own computer. This simple guide will show you how to install Ollama, run your first model, and use it in a Python script.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Ollama
&lt;/h2&gt;

&lt;h3&gt;
  
  
  macOS
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Download the installer from  &lt;a href="https://ollama.ai/" rel="noopener noreferrer"&gt;ollama.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; Open the downloaded file and drag Ollama to your Applications folder&lt;/li&gt;
&lt;li&gt; Open Ollama from your Applications&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Windows
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Download the installer from  &lt;a href="https://ollama.ai/" rel="noopener noreferrer"&gt;ollama.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; Run the .exe file and follow the installation wizard&lt;/li&gt;
&lt;li&gt; Ollama will start automatically when installation completes&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Linux
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Just run this command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -fsSL https://ollama.ai/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Docker
&lt;/h3&gt;

&lt;p&gt;Ollama is also available as a Docker container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker pull ollama/ollama  
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
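

&lt;p&gt;Once the container is up, you can check that the API is reachable on port 11434. A minimal Python check (Ollama should answer a plain GET on the root path with a short status string):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import urllib.request

# The Ollama server listens on port 11434 by default;
# a GET on / should return "Ollama is running".
print(urllib.request.urlopen("http://localhost:11434/").read().decode())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;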



&lt;h2&gt;
  
  
  Running Your First Model
&lt;/h2&gt;

&lt;p&gt;Let’s try Llama 3.2 1B, a compact but capable model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ollama run llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first time you run this command, Ollama will download the model. Once it’s ready, you’ll see a prompt where you can start chatting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; Why is the sky blue?  
The sky appears blue due to a phenomenon called Rayleigh scattering. As sunlight travels through the atmosphere, the shorter blue wavelengths of light are scattered more by air molecules than the longer red wavelengths. This scattered blue light comes to us from all directions in the sky, making the sky appear blue during the day.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To exit the Ollama terminal, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Type  &lt;code&gt;/bye&lt;/code&gt;  and press Enter&lt;/li&gt;
&lt;li&gt;  Press Ctrl+D (on macOS/Linux)&lt;/li&gt;
&lt;li&gt;  Press Ctrl+C twice&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Basic Ollama Commands
&lt;/h2&gt;

&lt;p&gt;Here are some useful commands to get you started:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# List all your downloaded models  
ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Download a model without running it  
ollama pull llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Remove a model you no longer need  
ollama rm llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Get information about a model  
ollama show llama3.2:1b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;p&gt;Once you’re comfortable with the basics, you can try the Ollama Python library to integrate it into your Python applications. Check out this article: &lt;a href="https://medium.com/@jonigl/using-ollama-with-python-a-simple-guide-0752369e1e55" rel="noopener noreferrer"&gt;Using Ollama with Python: A Simple Guide&lt;/a&gt;.&lt;/p&gt;
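
&lt;p&gt;As a quick taste, here’s a minimal sketch of what that looks like, assuming you’ve installed the library with &lt;code&gt;pip install ollama&lt;/code&gt; and already pulled &lt;code&gt;llama3.2:1b&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import ollama  # pip install ollama

# Send one chat message to the locally running model and print the reply.
response = ollama.chat(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;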

&lt;p&gt;Enjoy the freedom of running AI locally with Ollama!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>ollama</category>
      <category>howto</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How to Build Your First MCP Server (Fast)</title>
      <dc:creator>Jonathan Gastón Löwenstern</dc:creator>
      <pubDate>Wed, 07 May 2025 13:59:00 +0000</pubDate>
      <link>https://dev.to/jonigl/your-first-mcp-server-quick-35eg</link>
      <guid>https://dev.to/jonigl/your-first-mcp-server-quick-35eg</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;If you’ve landed on this article, you’re probably wondering: &lt;em&gt;“What’s all this MCP stuff about?”&lt;/em&gt; and &lt;em&gt;“Why is it getting so much hype lately?”&lt;/em&gt; Or maybe you already have an idea and just want to build your own MCP server to let LLMs interact with your tools. So, let’s quickly answer the basics to get on the same page and then jump right into building your first MCP Server.&lt;/p&gt;

&lt;h3&gt;
  
  
  What’s all this MCP stuff about?
&lt;/h3&gt;

&lt;p&gt;At this point you probably know what &lt;strong&gt;MCP&lt;/strong&gt; is, but if you don’t, no worries - I’ve got you covered. MCP was introduced last year by Anthropic (the company behind Claude) and stands for &lt;strong&gt;Model Context Protocol&lt;/strong&gt;. It might sound complex at first, but it’s actually quite simple. MCP is a way to let &lt;strong&gt;LLMs&lt;/strong&gt; (&lt;strong&gt;Large Language Models&lt;/strong&gt;) interact with tools in an open, standard way, allowing them to get context from different data sources and even execute tasks on your behalf, so building agents is just around the corner.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is it getting so much hype lately?
&lt;/h3&gt;

&lt;p&gt;Answering this isn’t easy, because everyone has their own take. But here’s the deal: MCP gives the &lt;strong&gt;community&lt;/strong&gt; a &lt;strong&gt;standard&lt;/strong&gt; way to build connectors for AI apps, so you don’t have to build everything yourself. Most tools can be &lt;strong&gt;plug-and-play.&lt;/strong&gt; Third-party services can run their own servers, and you can use them without needing custom implementations. That &lt;strong&gt;simplicity&lt;/strong&gt; is what’s getting everyone excited and backing MCP as the go-to standard.&lt;/p&gt;

&lt;h2&gt;
  
  
  The clock is ticking! How can I build my first MCP Server (fast)?
&lt;/h2&gt;

&lt;p&gt;First of all, we need to decide what tool we want to make available to LLMs. Let’s keep it simple and create a get-weather-by-city tool, which is kind of a “hello world” example. For this we will use wttr.in, which, as they put it, is “the right way to check curl the weather!”, and it will make building this weather server really straightforward!&lt;/p&gt;
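
&lt;p&gt;If you want to preview what wttr.in returns before wiring it into MCP, a couple of lines of plain Python will do (the &lt;code&gt;format=%C+%t&lt;/code&gt; query asks for the condition plus the temperature, the same format our server will use):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import urllib.parse
import urllib.request

# %C = weather condition, %t = temperature (wttr.in one-line format)
city = urllib.parse.quote_plus("Ibiza")
url = f"https://wttr.in/{city}?format=%C+%t"
print(urllib.request.urlopen(url).read().decode("utf-8"))  # e.g. "Sunny +20°C"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;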

&lt;p&gt;Now that we know what tool we want to build, we need to select one of the &lt;strong&gt;SDKs&lt;/strong&gt; available. I will pick one for you this time, selecting the &lt;strong&gt;python&lt;/strong&gt; one for our first server.&lt;/p&gt;

&lt;p&gt;We will use &lt;a href="https://github.com/astral-sh/uv?tab=readme-ov-file#installation" rel="noopener noreferrer"&gt;&lt;strong&gt;uv&lt;/strong&gt;&lt;/a&gt; to manage our &lt;strong&gt;Python project&lt;/strong&gt;. Open your favorite terminal and follow me on these commands. Let’s create a project together:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you are not familiar with &lt;strong&gt;uv&lt;/strong&gt; you can check &lt;a href="https://github.com/astral-sh/uv?tab=readme-ov-file#installation" rel="noopener noreferrer"&gt;this page&lt;/a&gt; and learn how to use it and install it.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv init weather-mcp-server  
&lt;span class="nb"&gt;cd &lt;/span&gt;weather-mcp-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the mcp dependency to your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv add &lt;span class="s2"&gt;"mcp[cli]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we need a file to write our MCP server code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch &lt;/span&gt;weather-server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nice, we are ready to start writing code! Open this file in your favorite code editor. If you’re new to &lt;strong&gt;decorators&lt;/strong&gt; or &lt;strong&gt;docstrings&lt;/strong&gt;, pay extra attention to those parts, they’re key to how tools are defined and described for the LLM. So, don’t forget to take a look at the comments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FastMCP is all we need from the mcp dependency  
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;  
&lt;span class="c1"&gt;# We will use these modules to request the weather from wttr.in   
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;  

&lt;span class="c1"&gt;# Now let's create an MCP Server  
&lt;/span&gt;&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="c1"&gt;# Now let's register a tool with this decorator,   
&lt;/span&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# define a function with city argument  
&lt;/span&gt;  &lt;span class="c1"&gt;# And now we will document this function using Python docstrings  
&lt;/span&gt;  &lt;span class="c1"&gt;# FastMCP will add this documentation to the LLM so it can decide when to use  
&lt;/span&gt;  &lt;span class="c1"&gt;# this tool and how to use it.  
&lt;/span&gt;  &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;  
  Get the current weather for a given city  
  Args:  
    city (str): The name of the city  
  Returns:  
    str: The current weather in the city, for example, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sunny +20°C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  
  &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;  
  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
    &lt;span class="c1"&gt;# URL-encode the city name.  
&lt;/span&gt;    &lt;span class="n"&gt;url_encoded_city&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quote_plus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="c1"&gt;# Prepare the wttr.in URL request  
&lt;/span&gt;    &lt;span class="n"&gt;wttr_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://wttr.in/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url_encoded_city&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;?format=%C+%t&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  
    &lt;span class="c1"&gt;# Request weather  
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wttr_url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;    
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
    &lt;span class="c1"&gt;# If something goes wrong we let the LLM know about it  
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error fetching weather data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  

&lt;span class="c1"&gt;# And here we add the main entry point for the server  
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
  &lt;span class="c1"&gt;# Here we initialize and run the server  
&lt;/span&gt;  &lt;span class="c1"&gt;# We select stdio transport for process-based communication.   
&lt;/span&gt;  &lt;span class="c1"&gt;# This allows the server (a child process) to communicate with its   
&lt;/span&gt;  &lt;span class="c1"&gt;# parent process (the client) through pipes using standard input/output.  
&lt;/span&gt;  &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stdio&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yay! We have our first MCP server ready!&lt;/p&gt;
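
&lt;p&gt;Before plugging it into a full client, you can smoke-test the server from Python with the same &lt;code&gt;mcp&lt;/code&gt; package we installed. This is a minimal sketch using the SDK’s stdio client; the directory path is the same placeholder you’ll see in the JSON config below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server the same way an MCP client would (stdio transport)
    params = StdioServerParameters(
        command="uv",
        args=["--directory", "/REPLACE/ME/WITH/THE/dir/path/to/your/mcp/weather-server/",
              "run", "weather-server.py"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])
            result = await session.call_tool("get_weather", {"city": "Ibiza"})
            print("Result:", result.content)

asyncio.run(main())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;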

&lt;h2&gt;
  
  
  Using your MCP Server
&lt;/h2&gt;

&lt;p&gt;Now we need to prepare a configuration &lt;strong&gt;JSON file&lt;/strong&gt; to let local &lt;strong&gt;MCP clients&lt;/strong&gt; like Claude Desktop, VS Code, etc. know how to execute your server.&lt;/p&gt;

&lt;p&gt;Let’s create a file named &lt;code&gt;mcp-servers-config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
    &lt;/span&gt;&lt;span class="nl"&gt;"weather"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;  
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"uv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="s2"&gt;"--directory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="s2"&gt;"/REPLACE/ME/WITH/THE/dir/path/to/your/mcp/weather-server/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="s2"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  
        &lt;/span&gt;&lt;span class="s2"&gt;"weather-server.py"&lt;/span&gt;&lt;span class="w"&gt;  
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;  
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Make sure you are replacing &lt;code&gt;/REPLACE/ME/WITH/THE/dir/path/to/your/mcp/weather-server/&lt;/code&gt; with the directory where you have your &lt;code&gt;weather-server.py&lt;/code&gt; file.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At this point we only need to find an &lt;strong&gt;MCP client&lt;/strong&gt; to test it. I have written one that works with &lt;strong&gt;Ollama&lt;/strong&gt; called &lt;strong&gt;&lt;a href="https://github.com/jonigl/mcp-client-for-ollama" rel="noopener noreferrer"&gt;ollmcp&lt;/a&gt;&lt;/strong&gt;. Let’s use it here for simplicity, since it can be executed in any terminal and it is &lt;strong&gt;open source&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you are not familiar with Ollama you can check my article &lt;a href="https://dev.to/jonigl/getting-started-with-ollama-run-llms-on-your-computer-35d6"&gt;Getting Started with Ollama: Run LLMs on Your Computer&lt;/a&gt;&lt;br&gt;
Ollama is a great starting point because it’s easy to set up locally and supports models that run on your machine and can use tools through MCP.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let’s install the MCP client&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ollmcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I recommend using the &lt;strong&gt;&lt;a href="https://ollama.com/library/qwen2.5" rel="noopener noreferrer"&gt;qwen2.5:7b&lt;/a&gt;&lt;/strong&gt; model, but you can use any Ollama model that supports tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5:7b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the MCP client using our JSON configuration pointing to our MCP Server and the model we want to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# if you are in the same directory as the json file execute it like this  &lt;/span&gt;
ollmcp &lt;span class="nt"&gt;--servers-json&lt;/span&gt; mcp-servers-config.json &lt;span class="nt"&gt;--model&lt;/span&gt; qwen2.5:7b &lt;span class="c"&gt;# run it  &lt;/span&gt;

&lt;span class="c"&gt;# Pro tip: ollmcp --help&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will run the &lt;code&gt;ollmcp&lt;/code&gt; MCP client, showing the MCP server and the tools available to your Ollama model. You can interact by writing queries, and the LLM will decide when to use your MCP server. Here is how the interface looks:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2n08286wpwtyyaiqqls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2n08286wpwtyyaiqqls.png" alt="ollmcp interface screenshot image" width="800" height="299"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;This is what the &lt;strong&gt;ollmcp&lt;/strong&gt; interface looks like when started&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As you can see in the screenshot, we have &lt;code&gt;1/1 tools enabled&lt;/code&gt;, which is great! You can see that our &lt;code&gt;weather&lt;/code&gt; MCP Server is listed with one tool called &lt;code&gt;get_weather&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Try writing something like: &lt;code&gt;What is the weather in Ibiza?&lt;/code&gt; and hit return. It might take some time depending on the model and your machine’s resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F983bkzsyfndbdfqvy5ux.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F983bkzsyfndbdfqvy5ux.png" alt="ollmcp query screenshot image" width="800" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The LLM will seamlessly call your tool, grab the response from your MCP Server, and craft a complete answer for you. Congrats, you have your first MCP Server working!&lt;/p&gt;

&lt;p&gt;If you like &lt;code&gt;ollmcp&lt;/code&gt;, you can check out the repo here: &lt;a href="https://github.com/jonigl/mcp-client-for-ollama" rel="noopener noreferrer"&gt;https://github.com/jonigl/mcp-client-for-ollama&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;MCP servers are super easy to build and let you share your APIs, apps, or tools with different AI applications without much hassle. And if you’re making an AI app yourself, you can tap into the huge number of ready-to-go MCP servers out there, no need to build or maintain everything from scratch!&lt;/p&gt;

&lt;p&gt;Now you’re ready to build even more powerful tools for your AI!&lt;/p&gt;

&lt;p&gt;If you liked this article, you can leave a ⭐️&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  Try different MCP Servers. Check out the &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;official and community list of MCP servers&lt;/a&gt; to find new servers to connect with.&lt;/li&gt;
&lt;li&gt;  Explore different MCP Clients too! You can continue using &lt;a href="https://github.com/jonigl/mcp-client-for-ollama" rel="noopener noreferrer"&gt;ollmcp&lt;/a&gt; as we did earlier, or try other clients like &lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;Claude Desktop&lt;/a&gt;, &lt;a href="https://code.visualstudio.com/" rel="noopener noreferrer"&gt;Visual Studio Code&lt;/a&gt;, and more to see how different environments interact with your server.&lt;/li&gt;
&lt;li&gt;  Build your own MCP Server. Base it on what you learned in this article, or dive deeper with the &lt;a href="https://modelcontextprotocol.io/introduction" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  Write your own MCP Client! I’ll be publishing a new article focused on this topic soon. Let me know in the comments if you’re interested!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  MCP Server code
&lt;/h3&gt;

&lt;p&gt;You can also find this MCP Server code in the following &lt;a href="//code/README.md"&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Acknowledgments
&lt;/h2&gt;

&lt;p&gt;Big thanks to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;Ollama&lt;/strong&gt;&lt;/a&gt; for providing an awesome local platform to run LLMs easily.&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;&lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;/a&gt; for creating a powerful open standard to connect LLMs with external tools.&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://wttr.in/" rel="noopener noreferrer"&gt;&lt;strong&gt;wttr.in&lt;/strong&gt;&lt;/a&gt; for offering a simple and reliable weather API service used in our example.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>ollama</category>
      <category>mcp</category>
      <category>howto</category>
    </item>
  </channel>
</rss>
