<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zak Mandhro</title>
    <description>The latest articles on DEV Community by Zak Mandhro (@zak_mandhro).</description>
    <link>https://dev.to/zak_mandhro</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2603892%2F85f00279-6838-4593-97b5-f3467d5d16ca.png</url>
      <title>DEV Community: Zak Mandhro</title>
      <link>https://dev.to/zak_mandhro</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zak_mandhro"/>
    <language>en</language>
    <item>
      <title>GitHub Copilot just crushed every AI review startup (40.3M PR analysis)</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Fri, 19 Dec 2025 16:00:00 +0000</pubDate>
      <link>https://dev.to/zak_mandhro/github-copilot-crushed-every-code-review-startup-40m-pr-analysis-2no6</link>
      <guid>https://dev.to/zak_mandhro/github-copilot-crushed-every-code-review-startup-40m-pr-analysis-2no6</guid>
      <description>&lt;p&gt;I analyzed 40.3 million public pull requests from 2022-2025.&lt;/p&gt;

&lt;p&gt;The data is brutal: GitHub Copilot now dominates organizational adoption despite CodeRabbit processing more total PRs.&lt;/p&gt;

&lt;p&gt;This isn't a hot take. It's what 40M PRs told us.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Uncomfortable Truth
&lt;/h2&gt;

&lt;p&gt;CodeRabbit was built from the ground up for AI code review. It's genuinely good at what it does.&lt;/p&gt;

&lt;p&gt;👑 &lt;strong&gt;CodeRabbit: #1 in PR volume for 2025.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But that crown is slipping. Copilot overtook them in monthly PRs in November, and already leads in org adoption.&lt;/p&gt;

&lt;p&gt;Copilot wasn't built for code review. It started as autocomplete. Yet here we are.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9t697rioxwmniv9zko0w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9t697rioxwmniv9zko0w.png" alt="Cumulative Org Adoption by Agent - shows Copilot's purple line at 29K+ crushing everyone else" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Platform Usually Wins
&lt;/h2&gt;

&lt;p&gt;Here's what the data actually shows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copilot's advantage isn't the model. It's the distribution.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-installed in every GitHub org that pays for it&lt;/li&gt;
&lt;li&gt;Zero friction to enable&lt;/li&gt;
&lt;li&gt;Shows up in the workflow developers already use&lt;/li&gt;
&lt;li&gt;No new vendor to approve, no new tool to learn&lt;/li&gt;
&lt;li&gt;Can bundle pricing into existing subscriptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CodeRabbit requires you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find and evaluate it&lt;/li&gt;
&lt;li&gt;Get approval and install it&lt;/li&gt;
&lt;li&gt;Configure and maintain it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That friction compounds. Copilot wins by being &lt;em&gt;there&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Consolidation Already Happened
&lt;/h2&gt;

&lt;p&gt;Look at this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Top 3 AI review agents control 72% of all activity.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Share&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CodeRabbit&lt;/td&gt;
&lt;td&gt;~33%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Copilot&lt;/td&gt;
&lt;td&gt;~29%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;td&gt;~10%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;Everyone else&lt;/td&gt;
&lt;td&gt;~28%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And it's getting worse for the long tail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Korbit&lt;/strong&gt;, which raised money, had traction, and was purpose-built for code review, shut down this year.&lt;/p&gt;

&lt;p&gt;The market consolidated before most people realized there was a market.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Long Tail Is Crowded
&lt;/h2&gt;

&lt;p&gt;Beyond the top 10, dozens of AI review agents are fighting for what's left:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Orgs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cubic AI&lt;/td&gt;
&lt;td&gt;~197&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ellipsis&lt;/td&gt;
&lt;td&gt;~160&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenHands&lt;/td&gt;
&lt;td&gt;~160&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bolt AI&lt;/td&gt;
&lt;td&gt;~114&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Based on observable public repo installs&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And there are dozens more. All competing for the 28% not owned by the top 3.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Platform Giants Are Coming
&lt;/h2&gt;

&lt;p&gt;What happens when OpenAI and Google actually start pushing their code review features?&lt;/p&gt;

&lt;p&gt;Look at org adoption growth from October to November alone:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Oct 2025&lt;/th&gt;
&lt;th&gt;Nov 2025&lt;/th&gt;
&lt;th&gt;Growth&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;4,875&lt;/td&gt;
&lt;td&gt;5,911&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+21%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;td&gt;1,220&lt;/td&gt;
&lt;td&gt;2,788&lt;/td&gt;
&lt;td&gt;+128%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both are accelerating: ChatGPT added over 1,000 orgs in a single month, while Gemini more than doubled.&lt;/p&gt;

&lt;p&gt;And &lt;strong&gt;Gemini still grew 43x this year&lt;/strong&gt; - from basically nothing to #3 in PR volume globally.&lt;/p&gt;

&lt;p&gt;Neither OpenAI nor Google is even trying yet. No big marketing push. No deep IDE integration. No bundling with Workspace or Cloud.&lt;/p&gt;

&lt;p&gt;And they have the same platform leverage as Microsoft. Millions already subscribe to ChatGPT and Google Workspace for other use cases. Code review becomes a free add-on, not a new line item.&lt;/p&gt;

&lt;p&gt;When they flip that switch, what happens to Sourcery, Greptile, Ellipsis, and the rest of the long tail?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Adoption Curve Is Insane
&lt;/h2&gt;

&lt;p&gt;Let me show you how fast this moved:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fubtqow0pelhfonrju6zy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fubtqow0pelhfonrju6zy.png" alt="AI Agent Participation - Feb 2024: 1.1% to Nov 2025: 14.9%" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Roughly 14x in 21 months.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;1 in 7 PRs now has an AI reviewer participating. Not a prediction. Already happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means If You're Building Dev Tools
&lt;/h2&gt;

&lt;p&gt;I'll be direct:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Distribution &amp;gt; Features.&lt;/strong&gt; Copilot proved it. Being native to the platform beats being best-in-class at the feature.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The platform tax is real.&lt;/strong&gt; If GitHub/Microsoft decides your feature is worth building, you're competing against free + pre-installed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consolidation is faster than you think.&lt;/strong&gt; Some players won't survive 2026.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenAI and Google are the wildcards.&lt;/strong&gt; Both growing fast with virtually zero marketing effort for code reviews.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  My 2026 Prediction
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Copilot, ChatGPT, and Gemini lock in the top 3. Claude and Cursor climb to #4 and #5.&lt;/li&gt;
&lt;li&gt;AI agents author 2M+ PRs (up from 335K in 2025).&lt;/li&gt;
&lt;li&gt;AI-reviewing-AI becomes a thing - agents reviewing agent-authored code grows 10x.&lt;/li&gt;
&lt;li&gt;At least 2 long-tail players shut down or get acqui-hired.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Full Data
&lt;/h2&gt;

&lt;p&gt;Everything I cited is from our State of AI Code Review 2025 report:&lt;/p&gt;

&lt;p&gt;📊 &lt;strong&gt;&lt;a href="https://pullflow.com/state-of-ai-code-review-2025?utm_source=social&amp;amp;utm_medium=dev-to&amp;amp;utm_campaign=soacr-2025" rel="noopener noreferrer"&gt;Full report + methodology&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Includes monthly breakdowns, full agent rankings, and the methodology for how we identified and classified AI agents.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Methodology note: We filtered for active repos only - at least 10 PRs per month and at least 0.3 feedback events per PR. This removes noise and surfaces repos with real development activity.&lt;/em&gt;&lt;/p&gt;
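&lt;p&gt;&lt;em&gt;For the curious, that activity threshold is simple enough to sketch in a few lines of Python (illustrative only - the function and argument names here are not from the report's actual pipeline):&lt;/em&gt;&lt;/p&gt;

```python
def is_active_repo(prs_per_month, feedback_events, total_prs):
    """Keep repos with at least 10 PRs/month and at least
    0.3 feedback events per PR, per the methodology note."""
    if total_prs == 0:
        return False
    return prs_per_month >= 10 and feedback_events / total_prs >= 0.3

print(is_active_repo(12, 30, 60))   # 0.5 events/PR: kept
print(is_active_repo(8, 100, 100))  # too few PRs/month: filtered out
```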




&lt;h2&gt;
  
  
  I Want the Counterarguments
&lt;/h2&gt;

&lt;p&gt;Seriously. Tell me where I'm wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is "platform beats product" too simplistic?&lt;/li&gt;
&lt;li&gt;Are there code review startups that can survive the consolidation?&lt;/li&gt;
&lt;li&gt;Is the Copilot dominance overstated because of GitHub's visibility bias?&lt;/li&gt;
&lt;li&gt;Am I underestimating how much enterprises care about best-in-class vs. good-enough-and-integrated?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I've been staring at this data for weeks. I want someone to challenge it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I work on &lt;a href="https://pullflow.com?utm_source=social&amp;amp;utm_medium=dev-to" rel="noopener noreferrer"&gt;PullFlow&lt;/a&gt; - we're agent-agnostic, building unified code review across GitHub, Slack, and AI agents. This research came from trying to understand where the market is actually going.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>discuss</category>
      <category>startup</category>
    </item>
    <item>
      <title>How to Run Your Own OpenAI GPT OSS Server for Fun and Profit</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Thu, 07 Aug 2025 16:40:20 +0000</pubDate>
      <link>https://dev.to/pullflow/how-to-run-your-own-openai-gpt-oss-server-for-fun-and-profit-1amj</link>
      <guid>https://dev.to/pullflow/how-to-run-your-own-openai-gpt-oss-server-for-fun-and-profit-1amj</guid>
      <description>&lt;p&gt;&lt;em&gt;Deploy GPT-OSS locally on a commodity gaming PC and watch your API bills disappear while your team's productivity soars&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The game changed in August 2025 when OpenAI dropped GPT-OSS—their first open-weight models since GPT-2. These aren't toy models; gpt-oss-120b matches OpenAI's proprietary o4-mini on reasoning benchmarks while gpt-oss-20b rivals o3-mini, and both can run on hardware you can order from Amazon today.&lt;/p&gt;

&lt;p&gt;This isn't just about having cool tech on your desk. This is about fundamentally changing the economics of AI for your team, gaining complete control over your models, and having unlimited access to enterprise-grade reasoning capabilities without the monthly subscription anxiety.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Your Team Needs Local AI (Spoiler: It Pays for Itself in Months)
&lt;/h2&gt;

&lt;p&gt;Let's talk numbers that matter to your bottom line. If you have 10 team members using AI tools at an average of $30 per month each (ChatGPT Plus, Claude Pro, or API costs), you're spending $3,600 annually. A capable GPT-OSS server costs $2,700-$3,100 upfront, meaning &lt;strong&gt;your hardware investment pays for itself in 9-11 months&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But the economics get even better:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Year 2 savings&lt;/strong&gt;: $3,600 &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Year 3 savings&lt;/strong&gt;: Another $3,600&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost per additional user&lt;/strong&gt;: Nearly zero&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Business Case Beyond Cost Savings&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;GPT-OSS comes with Apache 2.0 licensing, which means you can fine-tune these models on your proprietary data, customize behavior for your industry, and create competitive advantages that API-based solutions simply can't offer. Your legal team processes contracts differently than your marketing team writes copy—local AI lets you optimize for both without compromise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fympwtdd727v7gjaey49v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fympwtdd727v7gjaey49v.png" alt="Network Setup"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data sovereignty&lt;/strong&gt; becomes critical when you're processing sensitive information. Client data, internal strategies, and proprietary code never leave your network. Compare this to cloud APIs where your data travels through external servers, potentially triggering compliance headaches in regulated industries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No rate limits&lt;/strong&gt; means your developers can iterate freely, your content team can brainstorm without throttling, and your data analysts can process large datasets without worrying about quota exhaustion. The psychological shift from "conserving API calls" to "unlimited experimentation" unlocks creativity and productivity gains that are difficult to quantify but impossible to ignore.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shopping Made Easy: Your GPT-OSS Powerhouse from Amazon
&lt;/h2&gt;

&lt;p&gt;The sweet spot for GPT-OSS deployment centers around NVIDIA RTX 4080/4080 Super GPUs with 16GB+ VRAM, capable of delivering up to 250 tokens per second for the gpt-oss-20b model. These systems handle the computational demands while remaining accessible to small teams and growing businesses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three Proven Amazon Options
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dbl5w2ig26a8tvrb4zw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dbl5w2ig26a8tvrb4zw.png" alt="Example Desktop Server - iBUYPOWER Y40"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Budget Leader: iBUYPOWER Y40 (~$2,400-2,700)&lt;/strong&gt;&lt;br&gt;
Intel Core i7-14700KF, RTX 4080 Super 16GB, 32GB DDR5, 2TB NVMe SSD&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perfect for teams of 5-15 users&lt;/li&gt;
&lt;li&gt;Handles gpt-oss-20b with room for growth&lt;/li&gt;
&lt;li&gt;Professional build quality with warranty support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Performance Pick: Skytech Legacy (~$2,700-3,100)&lt;/strong&gt;&lt;br&gt;
Intel i7-14700K, RTX 4080 Super, 32GB DDR5 RGB, 2TB Gen4 NVMe&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Optimized cooling for sustained workloads&lt;/li&gt;
&lt;li&gt;Premium components for reliability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Current Deal: Skytech O11 (~$2,699, down from $3,099)&lt;/strong&gt;&lt;br&gt;
Intel i7-14700K, RTX 4080 Super, 32GB DDR5, 2TB Gen4 SSD, 1000W Gold PSU&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;13% discount makes this exceptional value&lt;/li&gt;
&lt;li&gt;Enterprise-grade power supply&lt;/li&gt;
&lt;li&gt;Excellent thermal design&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Future-Proofing Considerations
&lt;/h3&gt;

&lt;p&gt;These systems are architected for growth. While they handle the gpt-oss-20b model well, running the larger gpt-oss-120b model requires about 80GB of VRAM. This is a significant step up, typically requiring a multi-GPU configuration. The robust power supplies and cooling in the recommended builds can support adding a second GPU, but always verify component compatibility and physical space before upgrading.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hardware ROI Calculator&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;5 users: Break-even at 18 months&lt;/li&gt;
&lt;li&gt;10 users: Break-even at 9 months&lt;/li&gt;
&lt;li&gt;15 users: Break-even at 6 months&lt;/li&gt;
&lt;li&gt;20+ users: Break-even in under 6 months&lt;/li&gt;
&lt;/ul&gt;
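&lt;p&gt;Those break-even figures follow from simple arithmetic - monthly savings are the user count times the ~$30 subscription each replaces. A quick sketch, assuming a ~$2,700 server as in the budget build above:&lt;/p&gt;

```python
import math

def break_even_months(hardware_cost, users, monthly_cost_per_user=30):
    """Months until the one-time hardware cost is covered by
    the subscription spend it replaces."""
    return math.ceil(hardware_cost / (users * monthly_cost_per_user))

for users in (5, 10, 15, 20):
    print(f"{users} users: break even in {break_even_months(2700, users)} months")
```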
&lt;h2&gt;
  
  
  Server Setup: Install and Run Your First Model
&lt;/h2&gt;

&lt;p&gt;Installing Ollama on Windows transforms complex LLM deployment into a streamlined process. The entire setup takes less than 30 minutes from download to first inference.&lt;/p&gt;
&lt;h3&gt;
  
  
  Download and Install Ollama
&lt;/h3&gt;

&lt;p&gt;Navigate to &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;ollama.com&lt;/a&gt; and download the Windows installer. The installation process automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creates a system service for background operation&lt;/li&gt;
&lt;li&gt;Configures the API server on &lt;code&gt;localhost:11434&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Installs both GUI and command-line tools&lt;/li&gt;
&lt;li&gt;Sets up automatic startup with Windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9gdah1khv96w2jfpdrk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs9gdah1khv96w2jfpdrk.png" alt="Download Ollama"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run the installer with administrator privileges. The setup wizard handles service registration and initial configuration without requiring manual intervention.&lt;/p&gt;
&lt;h3&gt;
  
  
  Launch Ollama Desktop App
&lt;/h3&gt;

&lt;p&gt;After installation, launch the Ollama desktop application from your Start menu or desktop shortcut. The app provides a clean, user-friendly interface for managing models and server settings.&lt;/p&gt;

&lt;p&gt;The application automatically starts the Ollama service in the background and displays available models. The service runs on port 11434 and starts automatically with Windows.&lt;/p&gt;
&lt;h3&gt;
  
  
  Download GPT-OSS Models
&lt;/h3&gt;

&lt;p&gt;GPT-OSS models appear in Ollama's model library with native MXFP4 support. Click on the "Library" or "Models" tab to browse available models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Download gpt-oss:20b:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Search for "gpt-oss" in the model library&lt;/li&gt;
&lt;li&gt;Click on "gpt-oss:20b" &lt;/li&gt;
&lt;li&gt;Click "Download" or "Pull"&lt;/li&gt;
&lt;li&gt;Monitor download progress in the app&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The download retrieves approximately 16GB of model weights. Ollama's native MXFP4 support eliminates additional quantization overhead, ensuring optimal performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Higher Performance (Optional):&lt;/strong&gt;&lt;br&gt;
Similarly download "gpt-oss:120b" if your system has sufficient resources (roughly 80GB of VRAM, typically a multi-GPU setup). This larger model delivers enhanced reasoning capabilities for complex tasks.&lt;/p&gt;
&lt;h3&gt;
  
  
  First Chat Test
&lt;/h3&gt;

&lt;p&gt;Once download completes, test the model directly in the Ollama app:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click on "gpt-oss:20b" in your downloaded models list&lt;/li&gt;
&lt;li&gt;Click "Chat" or "Run" to start a conversation&lt;/li&gt;
&lt;li&gt;Type a test prompt in the chat interface&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ub410cuwtvqh8sc4x6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ub410cuwtvqh8sc4x6x.png" alt="Ollama Interface"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Test prompt example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Explain the computational complexity of merge sort and why it's preferred for external sorting algorithms.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model should respond with detailed, technically accurate explanations, confirming successful deployment. The chat interface provides an easy way to validate model functionality before connecting other applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Open It Up: Connect Your Whole Office in Minutes
&lt;/h2&gt;

&lt;p&gt;Sharing your AI server across the office requires three configuration steps: enabling network access, configuring Windows Firewall, and setting a fixed IP address for reliable connectivity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enable Network Access in Ollama
&lt;/h3&gt;

&lt;p&gt;Open the Ollama desktop application and navigate to Settings. Locate the "Expose to Network" option and enable it. This configuration change allows Ollama to accept connections from other devices on your local network, not just localhost requests.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84918tqmi67ki9gs0aq5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84918tqmi67ki9gs0aq5.png" alt="Ollama network settings"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The setting takes effect immediately—no service restart required. Ollama now listens on all network interfaces (0.0.0.0:11434) instead of just localhost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Windows Defender Firewall
&lt;/h3&gt;

&lt;p&gt;Windows Defender blocks inbound connections to port 11434 by default. Add a firewall exception to allow team access:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9crdat4on5x9yolpse9u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9crdat4on5x9yolpse9u.png" alt="Windows Defender 1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpz4piwom15a14pyk6s9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdpz4piwom15a14pyk6s9.png" alt="Windows Defender 2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Windows Security → Firewall &amp;amp; network protection&lt;/li&gt;
&lt;li&gt;Click "Advanced settings" to open Windows Defender Firewall&lt;/li&gt;
&lt;li&gt;Select "Inbound Rules" in the left panel&lt;/li&gt;
&lt;li&gt;Click "New Rule..." in the right panel&lt;/li&gt;
&lt;li&gt;Choose "Port" → Next&lt;/li&gt;
&lt;li&gt;Select "TCP" and enter "11434" in Specific local ports&lt;/li&gt;
&lt;li&gt;Choose "Allow the connection" → Next&lt;/li&gt;
&lt;li&gt;Apply to Domain, Private, and Public networks → Next
&lt;/li&gt;
&lt;li&gt;Name the rule "Ollama AI Server" → Finish&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Set Fixed IP Address
&lt;/h3&gt;

&lt;p&gt;Configure your router to assign a consistent IP address to your AI server, ensuring team members can rely on the same connection string daily.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Router Configuration (varies by manufacturer):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Access your router's admin panel (typically 192.168.1.1 or 192.168.0.1)&lt;/li&gt;
&lt;li&gt;Navigate to DHCP settings or LAN configuration&lt;/li&gt;
&lt;li&gt;Locate "DHCP Reservation" or "Static IP Assignment"
&lt;/li&gt;
&lt;li&gt;Find your AI server by hostname or MAC address&lt;/li&gt;
&lt;li&gt;Assign a static IP (e.g., 192.168.86.24)&lt;/li&gt;
&lt;li&gt;Save configuration and restart router if required&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Alternative: Windows Static IP&lt;/strong&gt;&lt;br&gt;
Configure static IP directly on the server:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Network Settings → Change adapter options&lt;/li&gt;
&lt;li&gt;Right-click your network adapter → Properties&lt;/li&gt;
&lt;li&gt;Select "Internet Protocol Version 4 (TCP/IPv4)" → Properties&lt;/li&gt;
&lt;li&gt;Choose "Use the following IP address"&lt;/li&gt;
&lt;li&gt;Enter IP: 192.168.86.24, Subnet: 255.255.255.0, Gateway: 192.168.86.1&lt;/li&gt;
&lt;li&gt;DNS servers: 8.8.8.8, 8.8.4.4&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Test Network Connectivity
&lt;/h3&gt;

&lt;p&gt;From another device on your network, verify connectivity by opening a web browser and navigating to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://192.168.86.24:11434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see a simple response indicating the Ollama server is running and accessible. Alternatively, any of the client applications (WaveTerm, Sidekick, etc.) can test the connection when you configure them.&lt;/p&gt;
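&lt;p&gt;If you'd rather script the check, a small Python probe works from any machine on the network (the 192.168.86.24 address is this guide's example static IP - substitute your own):&lt;/p&gt;

```python
import socket

def ollama_reachable(host, port=11434, timeout=2.0):
    """Return True if a TCP connection to the Ollama port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(ollama_reachable("192.168.86.24"))  # True once the firewall rule is in place
```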

&lt;p&gt;Your AI server is now ready for team-wide deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use It: WaveTerm, Sidekick (macOS), Open WebUI, or any app that lets you override the OpenAI endpoint
&lt;/h2&gt;

&lt;p&gt;The beauty of Ollama's OpenAI-compatible API lies in its universal compatibility. Any application supporting custom OpenAI endpoints can immediately leverage your local GPT-OSS deployment.&lt;/p&gt;
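&lt;p&gt;To see what "OpenAI-compatible" means in practice, here's a minimal stdlib-only sketch of a chat request against the local server (the IP is this guide's example; the API key is a placeholder, since Ollama ignores it by default):&lt;/p&gt;

```python
import json
from urllib import request

BASE_URL = "http://192.168.86.24:11434/v1"  # your server's static IP

def build_payload(prompt, model="gpt-oss:20b"):
    """Standard OpenAI chat-completions request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt):
    """Send one chat request to the local Ollama server and return the reply text."""
    req = request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer ollama",  # placeholder; ignored by Ollama
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

&lt;p&gt;Calling &lt;code&gt;chat("Summarize this PR")&lt;/code&gt; returns the assistant's reply as a string; the same request shape works with the official OpenAI client libraries by overriding the base URL.&lt;/p&gt;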

&lt;h3&gt;
  
  
  WaveTerm: Cross-Platform Excellence
&lt;/h3&gt;

&lt;p&gt;WaveTerm (&lt;a href="https://waveterm.dev" rel="noopener noreferrer"&gt;waveterm.dev&lt;/a&gt;) provides a sophisticated terminal interface with built-in AI integration across Windows, macOS, and Linux.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferip1c98unia6vd959p0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferip1c98unia6vd959p0.png" alt="Waveterm app"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation:&lt;/strong&gt;&lt;br&gt;
Download and install WaveTerm for your operating system. The application includes native AI configuration options designed for local LLM deployments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Configuration:&lt;/strong&gt;&lt;br&gt;
Create or edit your &lt;code&gt;ai.json&lt;/code&gt; configuration file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ai@gpt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display:name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GPT-OSS 20B (Ollama)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display:order"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:*"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-oss"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gpt-oss:20b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:baseurl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://192.168.86.24:11434/v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:apitoken"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ollama"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:top_p"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:presence_penalty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ai:frequency_penalty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The configuration enables seamless AI interaction within your terminal environment, perfect for developers who live in command-line interfaces.&lt;/p&gt;
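&lt;p&gt;Before pointing a client at it, the settings above can be sanity-checked with a few lines of Python. A minimal sketch (the key names mirror the config shown; the validation rules themselves are assumptions, not part of any tool):&lt;/p&gt;

```python
# Sanity-check the "ai:*" settings shown above before wiring up a client.
# Key names mirror the config; the validation rules are assumptions.
REQUIRED_KEYS = {"ai:name", "ai:model", "ai:baseurl", "ai:apitoken"}

def validate_ai_settings(cfg: dict) -> list:
    """Return a list of problems found (an empty list means the config looks OK)."""
    problems = ["missing " + k for k in sorted(REQUIRED_KEYS - cfg.keys())]
    if not str(cfg.get("ai:baseurl", "")).startswith(("http://", "https://")):
        problems.append("ai:baseurl should be an http(s) URL ending in /v1")
    if not 0.0 <= float(cfg.get("ai:temperature", 0.7)) <= 2.0:
        problems.append("ai:temperature should be between 0.0 and 2.0")
    return problems

settings = {
    "ai:name": "gpt-oss",
    "ai:model": "gpt-oss:20b",
    "ai:baseurl": "http://192.168.86.24:11434/v1",
    "ai:apitoken": "ollama",
    "ai:temperature": 0.7,
}
print(validate_ai_settings(settings))  # []
```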

&lt;h3&gt;
  
  
  Sidekick: Native macOS Integration
&lt;/h3&gt;

&lt;p&gt;Mac users benefit from Sidekick's native integration and optimized user experience. Download from &lt;a href="https://github.com/johnbean393/Sidekick/releases" rel="noopener noreferrer"&gt;github.com/johnbean393/Sidekick/releases&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install Sidekick from the GitHub releases page&lt;/li&gt;
&lt;li&gt;Open preferences and navigate to AI settings&lt;/li&gt;
&lt;li&gt;Add custom provider with your Ollama endpoint&lt;/li&gt;
&lt;li&gt;Configure model name and API key (use "ollama" as placeholder)&lt;/li&gt;
&lt;li&gt;Test connection to verify functionality&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sidekick's macOS-native interface provides excellent integration with system services, notifications, and keyboard shortcuts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open WebUI: Browser-Based Access
&lt;/h3&gt;

&lt;p&gt;For teams preferring web interfaces, Open WebUI delivers a ChatGPT-like experience through your browser.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker Installation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:8080 &lt;span class="nt"&gt;--add-host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;host.docker.internal:host-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; open-webui:/app/backend/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; open-webui ghcr.io/open-webui/open-webui:main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to &lt;code&gt;http://localhost:3000&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Complete initial setup and create admin account&lt;/li&gt;
&lt;li&gt;Access Settings → Connections&lt;/li&gt;
&lt;li&gt;Add Ollama server: &lt;code&gt;http://192.168.86.24:11434&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Verify model detection and availability&lt;/li&gt;
&lt;/ol&gt;
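&lt;p&gt;Step 5 can also be done from a script: Ollama's OpenAI-compatible API exposes &lt;code&gt;GET /v1/models&lt;/code&gt;, which returns an OpenAI-style model list. A stdlib-only sketch (host and port from this article's setup; adjust to yours):&lt;/p&gt;

```python
import json
import urllib.request

BASE_URL = "http://192.168.86.24:11434/v1"  # your Ollama server

def model_ids(payload: dict) -> list:
    """Extract model ids from an OpenAI-style /models response."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url: str = BASE_URL) -> list:
    with urllib.request.urlopen(base_url + "/models", timeout=5) as resp:
        return model_ids(json.load(resp))

# list_models() -> e.g. ['gpt-oss:20b', ...] when the server is reachable.
```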

&lt;h3&gt;
  
  
  Universal Compatibility
&lt;/h3&gt;

&lt;p&gt;The OpenAI-compatible API means virtually any AI-enabled application can connect to your server:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Development Tools:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cursor IDE (detailed in next section)&lt;/li&gt;
&lt;li&gt;GitHub Copilot alternatives&lt;/li&gt;
&lt;li&gt;VS Code extensions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Productivity Apps:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Raycast (macOS)&lt;/li&gt;
&lt;li&gt;Alfred workflows&lt;/li&gt;
&lt;li&gt;Custom business applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;API Integration Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://192.168.86.24:11434/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Placeholder for compatibility
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-oss:20b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze our Q3 sales data trends.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Code With It: Supercharge Cursor IDE with Your Local AI
&lt;/h2&gt;

&lt;p&gt;Cursor IDE supports custom API endpoints through its Models settings, enabling developers to leverage local GPT-OSS for unlimited coding assistance without subscription costs or usage anxiety.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Custom Endpoint
&lt;/h3&gt;

&lt;p&gt;Navigate to Cursor Settings → Models to access API configuration options. Cursor requires OpenAI-compatible providers, making Ollama integration straightforward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step-by-Step Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Disable Default Models&lt;/strong&gt;&lt;br&gt;
Uncheck existing models (GPT-4, Claude, etc.) to avoid confusion during setup and prevent accidental cloud API usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Add Custom Model&lt;/strong&gt;&lt;br&gt;
Click "+ Add model" to create a new model configuration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model name: &lt;code&gt;gpt-oss:20b&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Display name: &lt;code&gt;GPT-OSS 20B Local&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Override Base URL&lt;/strong&gt;&lt;br&gt;
Enable "Override OpenAI Base URL" and enter:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://192.168.86.24:11434/v1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set API Key&lt;/strong&gt;&lt;br&gt;
Enter &lt;code&gt;ollama&lt;/code&gt; as the API key (required for compatibility, but not validated by the local server).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verify Connection&lt;/strong&gt;&lt;br&gt;
Click "Verify" to test the connection. Cursor sends a test request to validate the endpoint and model availability.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
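&lt;p&gt;Before clicking "Verify," you can confirm the endpoint answers chat completions at all. A minimal stdlib sketch that posts one tiny request (host and model from this article's setup; the &lt;code&gt;Bearer ollama&lt;/code&gt; header is the same placeholder key):&lt;/p&gt;

```python
import json
import urllib.request

BASE_URL = "http://192.168.86.24:11434/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Payload for a one-shot chat completion against an OpenAI-style API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 16,  # keep the smoke test cheap
    }

def smoke_test(model: str = "gpt-oss:20b") -> str:
    body = json.dumps(build_chat_request(model, "Say OK.")).encode()
    req = urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer ollama"},  # placeholder key
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# smoke_test() returns a short reply when the endpoint is healthy.
```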

&lt;h3&gt;
  
  
  Development Workflow Integration
&lt;/h3&gt;

&lt;p&gt;Once configured, GPT-OSS becomes available throughout Cursor's interface:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chat Interface (Cmd/Ctrl+L):&lt;/strong&gt;&lt;br&gt;
Access the AI chat sidebar for code discussions, architecture questions, and debugging assistance. The local model provides unlimited conversations without rate limiting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inline Assistance (Cmd/Ctrl+K):&lt;/strong&gt;&lt;br&gt;
Highlight code and invoke AI assistance for refactoring, optimization, or explanation. The model understands context and provides relevant suggestions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code Generation:&lt;/strong&gt;&lt;br&gt;
Describe functionality in comments, then request implementation. GPT-OSS generates code based on your specific patterns and preferences.&lt;/p&gt;
&lt;h3&gt;
  
  
  Benefits and Limitations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unlimited usage&lt;/strong&gt;: No token limits or monthly quotas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt;: Code never leaves your network&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost control&lt;/strong&gt;: No per-request charges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customization&lt;/strong&gt;: Fine-tune models for your coding style&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offline capability&lt;/strong&gt;: Work without internet connectivity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Current Limitations:&lt;/strong&gt;&lt;br&gt;
Tab completion still routes through Cursor's built-in models. This feature depends on specialized fill-in-the-middle completion models that are not yet available in the GPT-OSS release.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance Optimization:&lt;/strong&gt;&lt;br&gt;
For optimal coding assistance, configure the model with appropriate parameters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"top_p"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lower temperature ensures more deterministic code generation, while higher token limits accommodate larger code blocks and detailed explanations.&lt;/p&gt;
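&lt;p&gt;One way to keep those values organized is a small preset table that pairs low-temperature settings with code tasks and looser defaults with chat. A sketch (the preset names and pairings are illustrative, not a Cursor or Ollama convention):&lt;/p&gt;

```python
# Task presets for the sampling parameters discussed above.
# The preset names and pairings are illustrative, not a Cursor/Ollama convention.
PRESETS = {
    "code": {"temperature": 0.1, "max_tokens": 2048, "top_p": 0.9},
    "chat": {"temperature": 0.7, "max_tokens": 800, "top_p": 1.0},
}

def params_for(task: str) -> dict:
    """Return a copy of the sampling parameters for a task type."""
    return dict(PRESETS.get(task, PRESETS["chat"]))

# Usage with the OpenAI client shown earlier:
#   client.chat.completions.create(model="gpt-oss:20b",
#                                  messages=messages, **params_for("code"))
print(params_for("code")["temperature"])  # 0.1
```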

&lt;h3&gt;
  
  
  Team Development Scenarios
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Code Reviews:&lt;/strong&gt;&lt;br&gt;
Paste code snippets into Cursor's chat interface for automated review, security analysis, and optimization suggestions. The AI identifies potential issues and suggests improvements without external dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation Generation:&lt;/strong&gt;&lt;br&gt;
Select functions or classes and request documentation generation. GPT-OSS analyzes code structure and creates comprehensive documentation matching your existing style.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging Assistance:&lt;/strong&gt;&lt;br&gt;
Describe error messages or unexpected behavior to receive debugging guidance. The model suggests investigation approaches and potential solutions based on code context.&lt;/p&gt;
&lt;h2&gt;
  
  
  Win: GPT Without Limits, Other Models, Fine-Tuning, and More
&lt;/h2&gt;

&lt;p&gt;Local GPT-OSS deployment transcends simple cost savings—it unlocks capabilities and flexibility impossible with cloud-based solutions.&lt;/p&gt;
&lt;h3&gt;
  
  
  Unlimited Experimentation
&lt;/h3&gt;

&lt;p&gt;Without rate limits or usage charges, your team can explore AI capabilities without financial constraints:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developers&lt;/strong&gt; iterate freely on code generation, testing multiple approaches without quota anxiety. Complex refactoring tasks that might require dozens of API calls become economically feasible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Content Teams&lt;/strong&gt; brainstorm extensively, generate multiple variations, and refine messaging through iterative AI collaboration. The psychological shift from "conserving tokens" to "unlimited exploration" fundamentally changes creative workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Scientists&lt;/strong&gt; process large datasets, generate synthetic data, and experiment with different analysis approaches without worrying about API costs scaling with data volume.&lt;/p&gt;
&lt;h3&gt;
  
  
  Model Diversity and Customization
&lt;/h3&gt;

&lt;p&gt;Apache 2.0 licensing enables unrestricted fine-tuning and customization. Your GPT-OSS deployment becomes the foundation for specialized AI systems tailored to your business requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Industry-Specific Fine-Tuning:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Legal firms: Train on case law and legal precedents&lt;/li&gt;
&lt;li&gt;Healthcare: Customize for medical terminology and protocols&lt;/li&gt;
&lt;li&gt;Finance: Optimize for regulatory compliance and analysis&lt;/li&gt;
&lt;li&gt;Education: Adapt for curriculum and pedagogical approaches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Domain Expertise Development:&lt;/strong&gt;&lt;br&gt;
Fine-tune models on your proprietary documentation, code repositories, and institutional knowledge. Create AI assistants that understand your specific terminology, processes, and quality standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-Model Ecosystem:&lt;/strong&gt;&lt;br&gt;
Ollama supports dozens of open-source models beyond GPT-OSS. Deploy specialized models for different tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code generation: CodeLlama, StarCoder&lt;/li&gt;
&lt;li&gt;Creative writing: Mistral, Llama&lt;/li&gt;
&lt;li&gt;Analysis: Specialized reasoning models&lt;/li&gt;
&lt;/ul&gt;
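&lt;p&gt;With several models pulled onto one server, a tiny routing table keeps task-to-model choices in one place. A sketch (the pairings are examples, not recommendations; pull each model first, e.g. &lt;code&gt;ollama pull mistral&lt;/code&gt;):&lt;/p&gt;

```python
# Route tasks to specialist models on the same Ollama server.
# The pairings are examples; pull each model first (e.g. `ollama pull mistral`).
MODEL_FOR_TASK = {
    "code": "codellama",
    "writing": "mistral",
    "general": "gpt-oss:20b",
}

def pick_model(task: str) -> str:
    """Return the model name for a task, defaulting to the general model."""
    return MODEL_FOR_TASK.get(task, MODEL_FOR_TASK["general"])

print(pick_model("code"))      # codellama
print(pick_model("analysis"))  # falls back to the general model
```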
&lt;h3&gt;
  
  
  Scaling Strategies
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Horizontal Scaling:&lt;/strong&gt;&lt;br&gt;
Deploy multiple servers as team size grows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50+ users: 2-3 servers with load balancing&lt;/li&gt;
&lt;li&gt;100+ users: Dedicated servers by department&lt;/li&gt;
&lt;li&gt;Enterprise: Multi-location deployment with model synchronization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Vertical Scaling:&lt;/strong&gt;&lt;br&gt;
Upgrade hardware for enhanced performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPU upgrades: RTX 4090, RTX 5080 for increased throughput&lt;/li&gt;
&lt;li&gt;Memory expansion: Support larger models and batch processing&lt;/li&gt;
&lt;li&gt;Storage optimization: NVMe RAID for faster model loading&lt;/li&gt;
&lt;/ul&gt;
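&lt;p&gt;At the two-to-three server stage, even client-side round-robin goes a long way before a dedicated load balancer is warranted. A minimal sketch (the second address is hypothetical; a production setup would more likely put nginx or HAProxy in front):&lt;/p&gt;

```python
import itertools

# Client-side round-robin over several Ollama servers.
# The second address is hypothetical; production setups would more likely
# use a reverse proxy (nginx, HAProxy) as the load balancer.
SERVERS = [
    "http://192.168.86.24:11434/v1",
    "http://192.168.86.25:11434/v1",  # hypothetical second server
]
_rotation = itertools.cycle(SERVERS)

def next_base_url() -> str:
    """Return the next server to send a request to."""
    return next(_rotation)

print(next_base_url())  # first server
print(next_base_url())  # second server
```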
&lt;h3&gt;
  
  
  Advanced Features Roadmap
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Function Calling and Tool Integration:&lt;/strong&gt;&lt;br&gt;
GPT-OSS supports native function calling capabilities, enabling integration with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internal APIs and databases&lt;/li&gt;
&lt;li&gt;Business intelligence tools&lt;/li&gt;
&lt;li&gt;Custom automation workflows&lt;/li&gt;
&lt;li&gt;External service integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reasoning Effort Configuration:&lt;/strong&gt;&lt;br&gt;
Configurable reasoning effort levels allow optimization for different use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low effort: Quick responses for simple queries&lt;/li&gt;
&lt;li&gt;Medium effort: Balanced performance for general use&lt;/li&gt;
&lt;li&gt;High effort: Maximum quality for complex problem-solving&lt;/li&gt;
&lt;/ul&gt;
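&lt;p&gt;For gpt-oss, the effort level is conveyed through the system prompt (a &lt;code&gt;Reasoning: low|medium|high&lt;/code&gt; directive per the model card; confirm the exact mechanism against your gpt-oss and Ollama versions). A small helper sketch:&lt;/p&gt;

```python
# Prepend a reasoning-effort directive to a conversation.
# gpt-oss reads effort from the system prompt ("Reasoning: low|medium|high");
# verify this mechanism against the model card for your gpt-oss release.
VALID_EFFORTS = ("low", "medium", "high")

def with_reasoning(messages: list, effort: str = "medium") -> list:
    if effort not in VALID_EFFORTS:
        raise ValueError("effort must be one of " + ", ".join(VALID_EFFORTS))
    system = {"role": "system", "content": "Reasoning: " + effort}
    return [system] + list(messages)

msgs = with_reasoning([{"role": "user", "content": "Prove 2+2=4."}], "high")
print(msgs[0]["content"])  # Reasoning: high
```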
&lt;h3&gt;
  
  
  ROI Calculation Framework
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Break-Even Analysis by Team Size:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Team Size&lt;/th&gt;
&lt;th&gt;Monthly API Cost&lt;/th&gt;
&lt;th&gt;Hardware Investment&lt;/th&gt;
&lt;th&gt;Break-Even Period&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5 users&lt;/td&gt;
&lt;td&gt;$150/month&lt;/td&gt;
&lt;td&gt;$2,700&lt;/td&gt;
&lt;td&gt;18 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10 users&lt;/td&gt;
&lt;td&gt;$300/month&lt;/td&gt;
&lt;td&gt;$2,700&lt;/td&gt;
&lt;td&gt;9 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15 users&lt;/td&gt;
&lt;td&gt;$450/month&lt;/td&gt;
&lt;td&gt;$2,700&lt;/td&gt;
&lt;td&gt;6 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;25 users&lt;/td&gt;
&lt;td&gt;$750/month&lt;/td&gt;
&lt;td&gt;$5,400 (2 servers)&lt;/td&gt;
&lt;td&gt;7.2 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50 users&lt;/td&gt;
&lt;td&gt;$1,500/month&lt;/td&gt;
&lt;td&gt;$8,100 (3 servers)&lt;/td&gt;
&lt;td&gt;5.4 months&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
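&lt;p&gt;The break-even column is simply hardware investment divided by the monthly cloud bill it replaces. A quick check of the table's arithmetic (figures from the table, which assumes roughly $30 per user per month in API spend):&lt;/p&gt;

```python
# Break-even = hardware investment / monthly API spend it replaces.
# Figures from the table above (~$30 per user per month assumed).
def break_even_months(hardware_cost: float, monthly_api_cost: float) -> float:
    return round(hardware_cost / monthly_api_cost, 1)

print(break_even_months(2700, 150))   # 18.0 months (5 users)
print(break_even_months(2700, 450))   # 6.0 months (15 users)
print(break_even_months(8100, 1500))  # 5.4 months (50 users)
```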

&lt;p&gt;&lt;strong&gt;Total Cost of Ownership (3 Years):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloud APIs (10 users): $10,800&lt;/li&gt;
&lt;li&gt;Local deployment: $3,500 (hardware + maintenance)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Net savings: $7,300&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Future-Proofing Considerations
&lt;/h3&gt;

&lt;p&gt;The AI landscape evolves rapidly, but local deployment provides stability and control:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Updates:&lt;/strong&gt; &lt;br&gt;
Download and test new models without disrupting existing workflows. Rollback capabilities ensure stability during transitions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance Evolution:&lt;/strong&gt;&lt;br&gt;
Maintain control over data handling as regulations evolve. Local deployment simplifies compliance audits and documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Technology Independence:&lt;/strong&gt;&lt;br&gt;
Reduce dependency on external providers and their policy changes. Your AI infrastructure remains under your control regardless of market dynamics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Innovation Platform:&lt;/strong&gt;&lt;br&gt;
Local deployment becomes a platform for AI innovation within your organization. Experiment with emerging techniques, develop proprietary capabilities, and maintain competitive advantages.&lt;/p&gt;


&lt;h2&gt;
  
  
  Getting Started Today
&lt;/h2&gt;

&lt;p&gt;Your journey to AI independence begins with a single order on Amazon. Choose a system that fits your budget and team size, knowing that the investment pays for itself within months while unlocking capabilities that cloud APIs simply cannot provide.&lt;/p&gt;

&lt;p&gt;The future of AI belongs to organizations that control their own destiny. GPT-OSS and Ollama make that future accessible today, transforming expensive cloud dependencies into owned infrastructure that grows stronger and more valuable over time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Ready to deploy? Share your experience in the comments below and join the growing community of developers running their own AI infrastructure.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;--&lt;br&gt;
Tired of &lt;strong&gt;fragmented workflows&lt;/strong&gt; breaking your &lt;em&gt;flow state&lt;/em&gt;? &lt;strong&gt;PullFlow&lt;/strong&gt; bridges the gap, enabling seamless code review collaboration across &lt;strong&gt;GitHub&lt;/strong&gt;, &lt;strong&gt;Slack&lt;/strong&gt;, and &lt;strong&gt;VS Code&lt;/strong&gt; (plus Cursor, Windsurf, and more).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pullflow.com" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Try PullFlow - Unified Code-Review Collaboration&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>beginners</category>
      <category>tutorial</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Forked by Cursor: The Hidden Cost of VS Code Fragmentation</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Thu, 24 Jul 2025 15:58:00 +0000</pubDate>
      <link>https://dev.to/pullflow/forked-by-cursor-the-hidden-cost-of-vs-code-fragmentation-4p1</link>
      <guid>https://dev.to/pullflow/forked-by-cursor-the-hidden-cost-of-vs-code-fragmentation-4p1</guid>
      <description>&lt;p&gt;It's the story of the year in developer tooling. &lt;strong&gt;Visual Studio Code&lt;/strong&gt;, the open-source editor that became a &lt;em&gt;unifying force&lt;/em&gt; for millions of developers, is now the foundation for a new, revolutionary wave of &lt;strong&gt;AI-powered tools&lt;/strong&gt;. At the forefront is &lt;strong&gt;Cursor&lt;/strong&gt;, a VS Code fork that has taken the world by storm with its promise of a faster, smarter, &lt;strong&gt;AI-first workflow&lt;/strong&gt;. Along with Windsurf, Firebase Studio, and the just released &lt;strong&gt;Kiro IDE by AWS&lt;/strong&gt;, these VS Code forks provide a tantalizing glimpse into the &lt;em&gt;future of coding&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;But for every developer celebrating this new era, there's another who feels the growing pains. The very ecosystem that made VS Code a beloved and reliable standard is now &lt;strong&gt;fracturing&lt;/strong&gt;, leaving developers caught between a &lt;em&gt;familiar present&lt;/em&gt; and a &lt;em&gt;promising but chaotic future&lt;/em&gt;. This isn't just a story about competing tools; it's about the difficult, necessary, and often messy process of &lt;strong&gt;progress&lt;/strong&gt;, and the hidden cost of &lt;strong&gt;fragmentation&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Open-Source Rocket Ship
&lt;/h2&gt;

&lt;p&gt;Before we dive into the growing pains, it's crucial to remember how &lt;strong&gt;VS Code&lt;/strong&gt; became such a phenomenon. When Microsoft launched it in 2015, few could have predicted its trajectory. For the next decade, it became the &lt;em&gt;undisputed editor of choice&lt;/em&gt;, dominating the development landscape. This success wasn't an accident; it was a &lt;strong&gt;masterclass in open-source strategy&lt;/strong&gt;. Microsoft made a brilliant architectural bet by building the editor on a foundation of web technologies: &lt;strong&gt;Monaco&lt;/strong&gt; for the core editor, and &lt;strong&gt;Electron&lt;/strong&gt; to bring it to the desktop.&lt;/p&gt;

&lt;p&gt;This wasn't just a technical choice; it was a &lt;em&gt;community-building&lt;/em&gt; one. By using familiar technologies like &lt;strong&gt;TypeScript&lt;/strong&gt; and &lt;strong&gt;Node.js&lt;/strong&gt;, Microsoft lowered the &lt;em&gt;barrier to entry&lt;/em&gt; to almost zero. Any web developer could jump in and contribute. The advent of the &lt;strong&gt;Language Server Protocol&lt;/strong&gt; and a vibrant ecosystem of over &lt;strong&gt;70,000 extensions&lt;/strong&gt; transformed VS Code from a simple editor into a powerful, universal workbench—a &lt;em&gt;full-fledged IDE&lt;/em&gt; for any language. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VS Code didn't just win; it won by &lt;em&gt;empowering its community&lt;/em&gt;. And that's what makes the current situation so complex.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Price of a Bolder Vision
&lt;/h2&gt;

&lt;p&gt;Let's be clear: the frustrations are &lt;em&gt;real and deeply felt&lt;/em&gt;. When you switch from the stable ground of &lt;strong&gt;VS Code&lt;/strong&gt; to a fork like &lt;strong&gt;Cursor&lt;/strong&gt; or &lt;strong&gt;Windsurf&lt;/strong&gt;, you're not just getting new features; you're also leaving a &lt;em&gt;carefully curated environment&lt;/em&gt; behind. The consequences of this divergence are &lt;strong&gt;tangible and disruptive&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Broken Workflows:&lt;/strong&gt; It's more than just a few extensions not working. It's the sudden inability to use a critical tool like &lt;strong&gt;Live Share&lt;/strong&gt; for a pair programming session, forcing you to switch back to VS Code mid-task. This constant &lt;em&gt;context-switching&lt;/em&gt; shatters the &lt;strong&gt;"flow state"&lt;/strong&gt; that is so crucial for productive development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retrained Muscle Memory:&lt;/strong&gt; Core keyboard shortcuts—the ones &lt;em&gt;burned into your brain&lt;/em&gt; for deleting lines, navigating files, or running tests—are often repurposed for AI features. This isn't just a minor annoyance; it's a constant &lt;strong&gt;tax on your cognitive load&lt;/strong&gt;, forcing you to think about the &lt;em&gt;mechanics of editing&lt;/em&gt; instead of the &lt;em&gt;logic of your code&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A Loss of Flexibility:&lt;/strong&gt; The highly configurable UI was one of &lt;strong&gt;VS Code's greatest strengths&lt;/strong&gt;. It allowed developers from all backgrounds to feel at home. Now, that flexibility is often gone, replaced by a &lt;em&gt;locked-down interface&lt;/em&gt; that prioritizes a specific, opinionated &lt;strong&gt;AI-centric layout&lt;/strong&gt;. The editor no longer bends to you; &lt;em&gt;you have to bend to it&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't minor inconveniences to be brushed aside. They are &lt;strong&gt;significant disruptions&lt;/strong&gt; that can make it feel like you're taking &lt;em&gt;one step forward and two steps back&lt;/em&gt;. It's fair to ask if the promise of tomorrow's &lt;strong&gt;AI-powered productivity&lt;/strong&gt; is worth the &lt;em&gt;friction of today&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  But What If Fragmentation is a Feature, Not a Bug?
&lt;/h2&gt;

&lt;p&gt;It's easy to view this situation as a &lt;em&gt;problem to be solved&lt;/em&gt;, a flaw in the open-source model. But what if it's not? What if this &lt;strong&gt;messy, divergent phase&lt;/strong&gt; is a &lt;em&gt;necessary and even healthy&lt;/em&gt; part of a much larger evolution?&lt;/p&gt;

&lt;p&gt;The teams behind &lt;strong&gt;Cursor&lt;/strong&gt; and other forks didn't set out to break your favorite extensions. They ran into the &lt;strong&gt;natural limits of a mature platform&lt;/strong&gt;. A platform like &lt;strong&gt;VS Code&lt;/strong&gt;, celebrated for its &lt;em&gt;stability and backward compatibility&lt;/em&gt;, cannot make &lt;strong&gt;radical architectural changes&lt;/strong&gt; without alienating its massive user base. Its &lt;strong&gt;extension API&lt;/strong&gt;, while powerful, was never designed for the kind of &lt;strong&gt;deep, system-level AI integration&lt;/strong&gt; these new tools envision—an AI that is &lt;em&gt;aware of the entire codebase&lt;/em&gt;, can &lt;em&gt;interact with the terminal&lt;/em&gt;, and can perform &lt;em&gt;complex, multi-file refactors&lt;/em&gt;. To build that truly &lt;strong&gt;AI-native experience&lt;/strong&gt;, they had to go beyond the existing APIs and modify the core architecture.&lt;/p&gt;

&lt;p&gt;They chose to &lt;strong&gt;innovate&lt;/strong&gt;, even if it meant breaking compatibility. This is the classic &lt;strong&gt;innovator's dilemma&lt;/strong&gt;: do you stay within the safe confines of the existing system and make &lt;em&gt;incremental improvements&lt;/em&gt;, or do you push beyond it to create something &lt;em&gt;radically new&lt;/em&gt;, even if it means leaving the old world behind?&lt;/p&gt;

&lt;h2&gt;
  
  
  A Glimpse Into the Future
&lt;/h2&gt;

&lt;p&gt;Viewed through this lens, these forks are not just competing products; they are &lt;strong&gt;live-action prototypes of the future&lt;/strong&gt;. They are like the &lt;em&gt;concept cars of the automotive world&lt;/em&gt;—bold, exciting, and not always practical for today's roads, but they show us exactly &lt;strong&gt;where the industry is headed&lt;/strong&gt;. They are &lt;em&gt;stress-testing new ideas&lt;/em&gt; in the real world, revealing what's possible when an editor is &lt;strong&gt;built around AI from the ground up&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This period of &lt;strong&gt;fragmentation&lt;/strong&gt;, while painful for some, is &lt;em&gt;incredibly valuable&lt;/em&gt;. It provides &lt;strong&gt;crucial data points&lt;/strong&gt; for the entire community. The successes of these forks are a &lt;em&gt;powerful signal&lt;/em&gt; to the &lt;strong&gt;VS Code team&lt;/strong&gt;, highlighting exactly where the demand is and what kinds of &lt;strong&gt;deeper integrations&lt;/strong&gt; developers are hungry for. At the same time, their failures and compatibility issues create a &lt;strong&gt;clear roadmap&lt;/strong&gt; for the future of VS Code's own extension APIs, showing what needs to be built to accommodate this &lt;em&gt;next generation of tooling&lt;/em&gt; without requiring a full fork.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Path to Convergence
&lt;/h2&gt;

&lt;p&gt;This doesn't have to be a &lt;strong&gt;zero-sum game&lt;/strong&gt; where one side wins at the other's expense. The current divergence should not be the destination; it should be a &lt;em&gt;temporary, albeit necessary&lt;/em&gt;, detour on the road to a more powerful future. The ideal path forward is one of &lt;strong&gt;collaboration and eventual convergence&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The innovations pioneered by forks like &lt;strong&gt;Cursor&lt;/strong&gt; and &lt;strong&gt;Windsurf&lt;/strong&gt; should serve as a &lt;em&gt;clear guide&lt;/em&gt; for the evolution of &lt;strong&gt;VS Code&lt;/strong&gt; itself. As these new &lt;strong&gt;AI patterns and workflows&lt;/strong&gt; become standardized, the VS Code team can build the &lt;strong&gt;next-generation hooks and APIs&lt;/strong&gt; that allow these powerful features to be implemented as &lt;em&gt;extensions, not forks&lt;/em&gt;. This would allow &lt;strong&gt;innovation to flourish&lt;/strong&gt; within the stable, unified ecosystem that everyone values.&lt;/p&gt;

&lt;p&gt;We are already seeing &lt;strong&gt;VS Code embracing this innovation&lt;/strong&gt; by open-sourcing the &lt;a href="https://github.com/microsoft/vscode-copilot-chat" rel="noopener noreferrer"&gt;&lt;strong&gt;VS Code Copilot Chat Extension&lt;/strong&gt;&lt;/a&gt;. This move signals exactly the kind of &lt;strong&gt;convergence&lt;/strong&gt; we need: taking &lt;em&gt;AI-first features&lt;/em&gt; that were previously exclusive to specialized tools and making them available as extensions within the core VS Code ecosystem.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;true spirit of open source&lt;/strong&gt; in action. It's a &lt;em&gt;dynamic cycle&lt;/em&gt;: a stable platform enables &lt;strong&gt;radical experimentation&lt;/strong&gt; on its fringes, and that experimentation, in turn, &lt;em&gt;informs and strengthens&lt;/em&gt; the evolution of the core platform. The future doesn't have to be a choice between a &lt;em&gt;stable, universal editor&lt;/em&gt; and a &lt;em&gt;fragmented landscape&lt;/em&gt; of powerful but incompatible tools. &lt;strong&gt;We can have both&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Let's innovate with speed but build towards convergence.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But what do you think? Is this &lt;strong&gt;fragmentation&lt;/strong&gt; a &lt;em&gt;necessary price for progress&lt;/em&gt;, or a &lt;em&gt;threat to the open-source community&lt;/em&gt; that &lt;strong&gt;VS Code&lt;/strong&gt; helped build? &lt;/p&gt;

&lt;p&gt;--&lt;br&gt;
Tired of &lt;strong&gt;fragmented workflows&lt;/strong&gt; breaking your &lt;em&gt;flow state&lt;/em&gt;? &lt;strong&gt;PullFlow&lt;/strong&gt; bridges the gap, enabling seamless code review collaboration across &lt;strong&gt;GitHub&lt;/strong&gt;, &lt;strong&gt;Slack&lt;/strong&gt;, and any editor—whether you're team &lt;strong&gt;VS Code&lt;/strong&gt; or team &lt;strong&gt;Cursor&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Learn more at &lt;a href="https://pullflow.com" rel="noopener noreferrer"&gt;&lt;strong&gt;PullFlow.com&lt;/strong&gt;&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://pullflow.com" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Try PullFlow - Unified Code Review Collaboration&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>programming</category>
      <category>vscode</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Perplexity Comet, Dia Browser, and Opera Neon - How Agentic Browsers Will Change The Web</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Thu, 10 Jul 2025 16:00:00 +0000</pubDate>
      <link>https://dev.to/pullflow/perplexity-comet-dia-browser-and-opera-neon-how-agentic-browsers-will-change-the-web-3cbc</link>
      <guid>https://dev.to/pullflow/perplexity-comet-dia-browser-and-opera-neon-how-agentic-browsers-will-change-the-web-3cbc</guid>
      <description>&lt;p&gt;The web browser is evolving from a document viewer into an intelligent agent that acts on your behalf. This shift from passive browsing to active assistance represents one of the most significant changes in how we interact with the internet since the 1990s.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Agentic Browsing?
&lt;/h2&gt;

&lt;p&gt;Agentic browsing transforms your browser from a passive tool into an intelligent assistant that can understand context, perform tasks, and make decisions. Instead of just displaying web pages, agentic browsers use AI to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Understand user intent&lt;/strong&gt; beyond simple keyword searches.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Perform automated tasks&lt;/strong&gt; like filling forms, making bookings, and shopping.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Provide contextual assistance&lt;/strong&gt; with writing, learning, and research.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Synthesize information&lt;/strong&gt; across multiple sources in real-time.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Adapt to user preferences&lt;/strong&gt; and work patterns over time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They serve as intelligent intermediaries between users and the web, capable of reasoning about content and taking action based on natural language commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  The New Generation of Agentic Browsers
&lt;/h2&gt;

&lt;p&gt;Several companies are leading the charge in reimagining the browser experience:&lt;/p&gt;

&lt;h3&gt;
  
  
  Opera: From Aria to Neon
&lt;/h3&gt;

&lt;p&gt;Opera’s &lt;strong&gt;Aria&lt;/strong&gt; provides AI-powered tab management and content generation in its existing browser. &lt;strong&gt;Opera Neon&lt;/strong&gt;, by contrast, is a premium, standalone agentic browser designed to "turn user intent into action." Neon features a native AI that automates web tasks locally for privacy and includes a powerful AI engine that can create reports, code, and even websites from user requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Browser Company's Dia
&lt;/h3&gt;

&lt;p&gt;Dia, from the creators of the Arc browser, focuses on contextual AI assistance. It acts as a writing partner, learning tutor, planning assistant, and shopping concierge, allowing users to "chat with their tabs" to get information about any open webpage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Perplexity's Comet
&lt;/h3&gt;

&lt;p&gt;Positioned as a "thought partner," Comet aims to transform browsing into active collaboration. It learns user thinking patterns to surface relevant content proactively and can acquire knowledge from one website to apply it to actions on another—for instance, gathering product specs from a manufacturer's site and using them to complete a purchase on a retailer's site.&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenAI's Potential Entry into the Browser Market
&lt;/h3&gt;

&lt;p&gt;In a significant but still speculative development, OpenAI is reportedly building its own web browser. This move, if confirmed, would be based on the open-source Chromium project and could represent a major challenge to Google Chrome's dominance. Developing a native solution would allow OpenAI to design the browser architecture specifically for AI-powered features from the ground up. However, unlike the other browsers mentioned, OpenAI has not officially announced this product, so it remains a development to watch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Cases and Benefits
&lt;/h2&gt;

&lt;p&gt;Agentic browsers excel in scenarios where traditional browsing is cumbersome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Productivity&lt;/strong&gt;: Accelerate research by automatically gathering and synthesizing findings from multiple sources.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Learning&lt;/strong&gt;: Get real-time explanations of complex topics and see different viewpoints on controversial subjects.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Commerce&lt;/strong&gt;: Streamline shopping by comparing products, analyzing reviews, and handling complex bookings automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Publisher and Advertiser Disruption
&lt;/h2&gt;

&lt;p&gt;The rise of agentic browsers creates a significant disruption to web economics, fundamentally changing how content is consumed and monetized.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Paradigm Shift for Publishers
&lt;/h3&gt;

&lt;p&gt;Web publishers who rely on page views and on-site engagement are confronting a significant challenge to their business models. Recent data shows the impact is already being felt from existing AI-powered search features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traffic Reduction is No Longer Hypothetical&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI-Powered Summaries&lt;/strong&gt;: Agentic browsers and AI search tools provide comprehensive summaries, reducing the need for users to click through to the original source. According to a June 2025 report, some publishers have already seen traffic from Google search fall by as much as 60% since the rollout of its AI Overviews.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct Answers Kill Clicks&lt;/strong&gt;: Another analysis found that when AI Overviews appear, the first organic link loses an average of 34.5% of its clicks. As users get their questions answered directly on the results page, traditional traffic metrics plummet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adapting to Survive&lt;/strong&gt;: Publishers are being forced to explore new strategies, such as building paywalls, forming direct partnerships with AI companies, or creating interactive experiences that an AI cannot easily replicate.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Advertising Models Under Pressure
&lt;/h3&gt;

&lt;p&gt;Digital advertising models are also facing fundamental changes, as AI agents may filter out display ads and reduce the "dwell time" that generates impressions. New opportunities may emerge in AI-mediated recommendations and intent-based targeting, but the transition will be challenging.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Web Design Will Adapt to Agentic Browsing
&lt;/h2&gt;

&lt;p&gt;The rise of agentic browsers will reshape web development. The focus will shift from traditional Search Engine Optimization (SEO) to &lt;strong&gt;Agent Optimization&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Structured Data and APIs&lt;/strong&gt;: Websites will need machine-readable markup (like Schema.org) and robust APIs to help AI agents understand content and context.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The UI Density Revolution&lt;/strong&gt;: Interfaces may become more information-dense. AI agents don't need large, finger-friendly buttons, allowing for more functionality to be packed into smaller spaces. We may see a rise in hybrid interfaces that can switch between human-friendly and agent-optimized views.&lt;/li&gt;
&lt;/ul&gt;
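&lt;p&gt;As a concrete sketch of that first point, here is what Schema.org markup can look like when generated programmatically. This is a minimal, hypothetical example (the product and price are invented); a real page would embed the resulting JSON-LD in a script tag of type &lt;code&gt;application/ld+json&lt;/code&gt;:&lt;/p&gt;

```python
import json

# Hypothetical product data -- the names and values here are illustrative only.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Mechanical Keyboard",
    "description": "Compact 75% keyboard with hot-swappable switches.",
    "offers": {
        "@type": "Offer",
        "price": "129.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Serialize to JSON-LD; embedded in a page, this gives an AI agent an
# unambiguous, machine-readable view of the offer.
json_ld = json.dumps(product, indent=2)
print(json_ld)
```

&lt;p&gt;An agent that finds this block doesn't have to scrape prices out of styled markup; the offer is already machine-readable.&lt;/p&gt;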

&lt;h2&gt;
  
  
  Limitations and Challenges
&lt;/h2&gt;

&lt;p&gt;Despite their promise, agentic browsers face significant hurdles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Technical&lt;/strong&gt;: Real-time AI assistance requires substantial processing power and network connectivity.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Privacy and Security&lt;/strong&gt;: Agentic browsers need access to browsing behavior to provide personalized assistance, raising concerns about data collection and security.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;User Experience&lt;/strong&gt;: Users must adapt to new interaction patterns and learn to trust AI recommendations without becoming over-reliant on them.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion: Preparing for the Agentic Web
&lt;/h2&gt;

&lt;p&gt;Agentic browsing is not a distant concept—it's happening now. With real products from Opera, The Browser Company, and Perplexity, the transformation of the web is accelerating.&lt;/p&gt;

&lt;p&gt;This transition requires new approaches to web design, a focus on structured data, and a search for sustainable monetization models. As we navigate this shift, the goal must be to enhance human agency, not replace it. The web is becoming more intelligent, and those who adapt will be best positioned for success.&lt;/p&gt;

&lt;p&gt;How are you preparing your web projects for an agentic future?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>productivity</category>
      <category>news</category>
    </item>
    <item>
      <title>Vibe Coding: Why Microservices Are Cool Again</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Thu, 26 Jun 2025 16:00:00 +0000</pubDate>
      <link>https://dev.to/pullflow/vibe-coding-why-microservices-are-cool-again-12p2</link>
      <guid>https://dev.to/pullflow/vibe-coding-why-microservices-are-cool-again-12p2</guid>
      <description>&lt;p&gt;&lt;em&gt;The surprising synergy between LLM code-generation and modular architecture&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Rise of Vibe Coding
&lt;/h2&gt;

&lt;p&gt;Somewhere between autocomplete and AGI, a new term entered the developer lexicon: &lt;strong&gt;vibe coding&lt;/strong&gt; — the act of building software by prompting an LLM and iterating in flow.&lt;/p&gt;

&lt;p&gt;Coined by Andrej Karpathy, it evokes that jazz-like rhythm of:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"You prompt, it codes, you tweak, it gets better — you vibe."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But not everyone's vibing.&lt;/p&gt;

&lt;p&gt;Andrew Ng recently called the term "unfortunate," warning it trivializes the deep, focused labor of AI-assisted engineering. Hacker News lit up with takes ranging from "this is the future" to "this is how the future explodes in prod."&lt;/p&gt;

&lt;p&gt;So… is vibe coding real? Yes. But it only works when the architecture supports it. And that's where &lt;strong&gt;microservices&lt;/strong&gt; make a surprise comeback.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Monoliths Kill the Vibe
&lt;/h2&gt;

&lt;p&gt;Large monolithic codebases — whether human-crafted or LLM-generated — are notoriously hard to work with. Not because they're morally wrong. Because they're cognitively dense.&lt;/p&gt;

&lt;p&gt;As monoliths grow, they create a degradation spiral. The cognitive overhead of understanding the entire system becomes untenable, so developers resort to localized fixes: "I'll just make this part work." These tactical shortcuts accumulate as technical debt, introducing inconsistent patterns, tighter coupling, and architectural drift. The codebase becomes progressively harder to reason about, encouraging more shortcuts — a feedback loop that compounds complexity exponentially.&lt;/p&gt;

&lt;p&gt;For humans, monoliths require tribal knowledge and cautious refactoring.&lt;br&gt;
For LLMs, they stretch the limits of &lt;strong&gt;context windows&lt;/strong&gt; and dilute the model's ability to make accurate predictions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why LLMs Struggle With Monoliths
&lt;/h3&gt;

&lt;p&gt;LLMs process input as &lt;strong&gt;tokens&lt;/strong&gt;, and use attention mechanisms to assign weight across these tokens. As your codebase grows, so does the number of tokens — often beyond what the model can "meaningfully attend to."&lt;/p&gt;

&lt;p&gt;What happens?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dependencies become too distant in token space.&lt;/li&gt;
&lt;li&gt;Signals get lost in noise.&lt;/li&gt;
&lt;li&gt;The model's attention gets diffused, weakening its ability to recognize relevant context.&lt;/li&gt;
&lt;li&gt;Outputs become blurrier, less confident, and more error-prone.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Long story short: even the smartest model struggles when asked to make changes across a 100k-line codebase full of deeply coupled logic. It's like trying to debug a complex system while only being able to see a small window of code at a time — critical dependencies and context get lost outside your field of view.&lt;/p&gt;
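&lt;p&gt;A back-of-the-envelope sketch makes the mismatch concrete. This assumes the rough four-characters-per-token heuristic and a hypothetical 128k-token limit; real tokenizers and limits vary by model:&lt;/p&gt;

```python
# Rough sketch: why a monolith overflows a context window.
# Assumes ~4 characters per token and a hypothetical 128k-token budget.

CONTEXT_BUDGET_TOKENS = 128_000  # hypothetical model limit

def estimate_tokens(source: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return len(source) // 4

def fits_in_context(files: dict) -> bool:
    """True when the estimated tokens for all files fit the budget together."""
    total = sum(estimate_tokens(src) for src in files.values())
    return CONTEXT_BUDGET_TOKENS > total

# A focused microservice easily fits...
service = {"notify.py": "x" * 40_000}       # ~10k tokens
# ...while a 100k-line monolith (~40 chars/line) does not.
monolith = {"app.py": "x" * 4_000_000}      # ~1M tokens

print(fits_in_context(service), fits_in_context(monolith))
```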

&lt;h2&gt;
  
  
  Why Microservices Make LLMs Shine
&lt;/h2&gt;

&lt;p&gt;Microservices break complex systems into small, purpose-built modules. For LLMs, that's gold.&lt;/p&gt;

&lt;p&gt;Each service becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A promptable unit&lt;/li&gt;
&lt;li&gt;A testable target&lt;/li&gt;
&lt;li&gt;A contained context window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can tell an LLM:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Build a notification service that sends Slack alerts on deploy failures."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;...and it can generate a working service with routes, tests, and infrastructure glue — all without dragging in your entire backend.&lt;/p&gt;
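&lt;p&gt;For illustration, here is a minimal sketch of the core of the service that prompt might produce. The webhook URL is a placeholder and the event fields are hypothetical; the payload shape follows Slack's incoming-webhook convention:&lt;/p&gt;

```python
import json

# Placeholder URL -- a real service would read this from configuration.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

def build_alert(event: dict) -> dict:
    """Turn a deploy-failure event into a Slack webhook payload."""
    text = (
        f":rotating_light: Deploy failed for {event['repo']} "
        f"({event['sha'][:7]}): {event['reason']}"
    )
    return {"text": text}

def handle_deploy_event(event: dict, post=None) -> dict:
    """Alert only on failures; the 'post' callable is injected for testability."""
    if event.get("status") == "failure":
        payload = build_alert(event)
        if post is not None:
            post(SLACK_WEBHOOK_URL, json.dumps(payload))
        return payload
    return {}
```

&lt;p&gt;In a real deployment this would sit behind a small HTTP route that receives CI webhooks, with the poster injected so the logic stays unit-testable.&lt;/p&gt;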

&lt;h3&gt;
  
  
  Microservices = Better Prompts
&lt;/h3&gt;

&lt;p&gt;The APIs between services act as semantic boundaries that make reasoning easier — for both humans and LLMs. Instead of fuzzy internal function calls, you get explicit interfaces and contracts.&lt;/p&gt;

&lt;p&gt;For LLMs, this clarity improves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Planning (less ambiguity)&lt;/li&gt;
&lt;li&gt;Generation (cleaner prompts)&lt;/li&gt;
&lt;li&gt;Debugging (smaller scope)&lt;/li&gt;
&lt;/ul&gt;
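&lt;p&gt;The same idea in miniature: a typed request/response pair is a contract an LLM can be pointed at directly. This is a hypothetical sketch (the field names are invented), not a prescribed shape:&lt;/p&gt;

```python
from dataclasses import dataclass

# Sketch of an explicit service contract. A typed request/response pair
# like this is the "semantic boundary" a model can reason about, instead
# of fuzzy internal function calls.

@dataclass(frozen=True)
class NotifyRequest:
    channel: str
    message: str

@dataclass(frozen=True)
class NotifyResponse:
    delivered: bool
    message_id: str

def notify(req: NotifyRequest) -> NotifyResponse:
    """Stub implementation honoring the contract."""
    # Real transport (Slack, email, ...) would go here.
    return NotifyResponse(delivered=True, message_id=f"msg-{len(req.message)}")
```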

&lt;h2&gt;
  
  
  But Didn't Microservices Burn Us Already?
&lt;/h2&gt;

&lt;p&gt;Yes. Microservices once promised engineering nirvana — and often delivered chaos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD pipelines everywhere&lt;/li&gt;
&lt;li&gt;Observability fatigue&lt;/li&gt;
&lt;li&gt;Three NPM packages to change a button color&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But with AI in the mix, microservices are cool again — not because they scale, but because they de-risk co-creation with machines.&lt;/p&gt;

&lt;p&gt;LLMs don't need the entire application — they need well-defined pieces. And microservices deliver exactly that.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vibe Stack: How We Make Microservices Not Suck
&lt;/h2&gt;

&lt;p&gt;At PullFlow, we build AI-augmented microservices daily — and we do it without a DevOps nightmare or Kubernetes in local dev. Here's our actual setup that keeps developer experience smooth and LLMs productive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Colima&lt;/strong&gt;: Lightweight Lima-based VM that provides Docker and containerd runtimes for fast, reliable local containers (especially friendly on macOS).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caddy&lt;/strong&gt;: Smart reverse proxy with automatic HTTPS and per-service routing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cloudflared&lt;/strong&gt;: Secure tunneling to expose local services — ideal for testing webhooks, LLM endpoints, and external integrations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NATS + JetStream&lt;/strong&gt;: High-performance pub/sub system that powers inter-service messaging and async workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TurboRepo + pnpm&lt;/strong&gt;: Monorepo tooling with shared packages managed through workspaces and fast, dependency-deduplicated builds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt;: Primary relational database for most service persistence needs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Valkey&lt;/strong&gt;: High-performance shared cache layer for cross-service data sharing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TimescaleDB&lt;/strong&gt;: Time-series database extension for PostgreSQL, handling metrics and event data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolated service scaffolding&lt;/strong&gt;: Each microservice lives in its own code path with dedicated persistent stores.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predictable, testable dev environments that scale with the team&lt;/li&gt;
&lt;li&gt;Clear interfaces that LLMs can reason about&lt;/li&gt;
&lt;li&gt;Service boundaries that preserve human sanity and AI promptability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Microservices can be fast, modular, and developer-first — if you design for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The LLM-Native Stack Is Coming
&lt;/h2&gt;

&lt;p&gt;We're already seeing a shift:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLMs spinning up CRUD APIs from OpenAPI schemas&lt;/li&gt;
&lt;li&gt;Agents orchestrating services via message buses&lt;/li&gt;
&lt;li&gt;Prompts as the new CLI&lt;/li&gt;
&lt;/ul&gt;
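&lt;p&gt;The first of these reduces to a simple idea: an OpenAPI document already enumerates every operation an agent needs to implement. A minimal, hypothetical sketch:&lt;/p&gt;

```python
# Sketch: "CRUD APIs from OpenAPI schemas" reduced to its core.
# The document below is a minimal, invented example.

openapi_doc = {
    "openapi": "3.0.0",
    "paths": {
        "/notes": {
            "get": {"operationId": "listNotes"},
            "post": {"operationId": "createNote"},
        },
        "/notes/{id}": {
            "get": {"operationId": "getNote"},
            "delete": {"operationId": "deleteNote"},
        },
    },
}

def list_operations(doc: dict) -> list:
    """Flatten an OpenAPI 'paths' object into (method, path, operationId) tuples."""
    ops = []
    for path, methods in doc.get("paths", {}).items():
        for method, spec in methods.items():
            ops.append((method.upper(), path, spec.get("operationId")))
    return sorted(ops)

for op in list_operations(openapi_doc):
    print(op)
```

&lt;p&gt;Each tuple is a self-contained work item: small enough to prompt for, test, and ship independently.&lt;/p&gt;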

&lt;p&gt;As AI takes a bigger role in software development, we need architectures that support modularity, autonomy, and safety. Microservices aren't just back — they might be foundational to the LLM-native dev stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  At PullFlow...
&lt;/h2&gt;

&lt;p&gt;We’re building for a future where humans and AI agents don’t just coexist — they &lt;strong&gt;collaborate&lt;/strong&gt; to ship better software, faster. That’s why we’ve embraced microservices — not for scale, but as a &lt;strong&gt;protocol for co-creation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Want to vibe-code with confidence? &lt;strong&gt;Start with strong boundaries.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Want Human + AI collaboration that actually works in code reviews? &lt;strong&gt;&lt;a href="https://pullflow.com" rel="noopener noreferrer"&gt;Try PullFlow&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>microservices</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>Code Review Agent Adoption in PullFlow</title>
      <dc:creator>Zak Mandhro</dc:creator>
      <pubDate>Tue, 03 Jun 2025 15:00:00 +0000</pubDate>
      <link>https://dev.to/pullflow/code-review-agent-adoption-in-pullflow-25f7</link>
      <guid>https://dev.to/pullflow/code-review-agent-adoption-in-pullflow-25f7</guid>
      <description>&lt;p&gt;As a leading code review collaboration platform, PullFlow has been at the forefront of the AI agent revolution in software development. Over the past year, we've integrated with popular AI agents like &lt;a href="https://pullflow.com/blog/copilot" rel="noopener noreferrer"&gt;GitHub Copilot&lt;/a&gt;, &lt;a href="https://pullflow.com/blog/coderabbit" rel="noopener noreferrer"&gt;CodeRabbit&lt;/a&gt;, and &lt;a href="https://pullflow.com/blog/greptile" rel="noopener noreferrer"&gt;Greptile&lt;/a&gt;, giving us unprecedented visibility into how development teams are adopting and using these tools.&lt;/p&gt;

&lt;p&gt;The insights we've gathered have been remarkable. Today, &lt;strong&gt;85% of our paid customers&lt;/strong&gt; actively use AI agents for code review, representing a fundamental shift in how development teams approach collaboration and quality assurance. But the real story isn't just in the adoption numbers—it's in what we've learned about how these tools are reshaping development workflows in ways we didn't anticipate.&lt;/p&gt;

&lt;p&gt;This isn't simply about automation replacing manual processes. What we're observing through our platform is a sophisticated evolution in human-AI collaboration that's transforming how teams work together.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Current Landscape
&lt;/h2&gt;

&lt;p&gt;The adoption patterns tell a compelling story. &lt;strong&gt;30% of our paid customers&lt;/strong&gt; now use multiple AI agents simultaneously, with GitHub Copilot leading overall adoption, followed by specialized tools like CodeRabbit and Greptile for targeted review tasks. Perhaps most striking is the &lt;strong&gt;near-universal adoption of automatic PR description generation&lt;/strong&gt;, which has become so integral to teams' workflows that many describe it as being as indispensable as syntax highlighting or version control.&lt;/p&gt;

&lt;p&gt;The integration of these tools represents more than convenience: it's enabling teams to scale their review processes without proportionally scaling their time investment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developer Experience Patterns
&lt;/h2&gt;

&lt;p&gt;One of the most interesting trends we've observed is how different experience levels approach AI agents. &lt;strong&gt;Junior developers&lt;/strong&gt; tend to embrace AI agents as comprehensive learning tools, using them for guidance on best practices, code patterns, and quality standards. The immediate feedback loop helps accelerate their learning curve significantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Senior developers&lt;/strong&gt; take a more strategic approach, leveraging AI agents for routine quality checks while reserving their expertise for architectural decisions, design patterns, and mentoring responsibilities. This division isn't a limitation—it's an optimization that allows teams to distribute cognitive load more effectively across both human and artificial intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Managing the Signal-to-Noise Challenge
&lt;/h2&gt;

&lt;p&gt;One reality every team faces: approximately &lt;strong&gt;70% of AI agent comments are resolved without action&lt;/strong&gt;, indicating they weren't actionable or relevant to the specific context. This signal-to-noise ratio can create notification fatigue and undermine trust in automated systems. However, successful teams have developed strategies to address this challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unified conversation management&lt;/strong&gt; through PullFlow's threading system allows teams to centralize AI feedback alongside human discussions. Senior developers can quickly validate useful suggestions with reactions while filtering out noise. &lt;strong&gt;Direct agent interaction&lt;/strong&gt; via Slack integration enables teams to clarify AI feedback contextually, allowing developers to ask &lt;code&gt;@coderabbit&lt;/code&gt; for clarification without leaving their workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customized agent settings&lt;/strong&gt; help teams tune their review focus through PullFlow's Agents page, emphasizing feedback types most relevant to their codebase and development standards.&lt;/p&gt;

&lt;h2&gt;
  
  
  Workflow Evolution
&lt;/h2&gt;

&lt;p&gt;The most effective implementations treat AI agents as specialized team members with distinct strengths. Teams are developing sophisticated workflows that leverage both human insight and AI capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI agents handle consistent quality checks: syntax errors, security patterns, style compliance&lt;/li&gt;
&lt;li&gt;Human reviewers focus on business logic, architectural decisions, and knowledge transfer&lt;/li&gt;
&lt;li&gt;Reactions and threading systems create feedback loops that help teams learn which AI suggestions provide value&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Emerging Trends
&lt;/h2&gt;

&lt;p&gt;Several developments are reshaping the code review landscape. &lt;strong&gt;Shift-left integration&lt;/strong&gt; is moving review capabilities directly into development environments, enabling real-time feedback before code reaches the PR stage. &lt;strong&gt;Role reversal scenarios&lt;/strong&gt; are becoming more common, where human reviewers evaluate AI-generated code against business requirements and architectural standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-agent orchestration&lt;/strong&gt; is emerging, with specialized agents handling different aspects of code review, testing, and documentation in coordinated workflows. &lt;strong&gt;Self-improving systems&lt;/strong&gt; are beginning to update their own instruction files based on team acceptance patterns, creating more targeted feedback over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Human Element
&lt;/h2&gt;

&lt;p&gt;Despite increasing automation, the most successful teams maintain strong human oversight and decision-making. AI agents excel at identifying technical issues and enforcing consistency, but human reviewers provide essential context around business requirements, user impact, and strategic technical decisions. The most effective implementations don't replace human judgment—they amplify it by handling routine tasks and highlighting areas that require human expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Looking Forward
&lt;/h2&gt;

&lt;p&gt;The 85% adoption rate reflects a broader shift toward co-intelligent development teams. Success isn't measured simply by speed improvements, but by the quality of collaboration between human expertise and AI capabilities. Teams achieving optimal results focus on orchestrating these tools thoughtfully, customizing their behavior to team-specific needs, and maintaining the collaborative learning aspects that make code review valuable beyond quality assurance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pullflow.com/blog/agents-launch-blogpost" rel="noopener noreferrer"&gt;PullFlow's Agent Experience&lt;/a&gt; continues evolving to support this transformation, providing centralized management, intelligent filtering, and seamless integration that adapts to how teams actually work. The future of code review lies in thoughtful human-AI collaboration: not replacement, but strategic partnership that enhances both efficiency and quality.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Learn more about optimizing your team's code review workflow with &lt;a href="https://pullflow.com" rel="noopener noreferrer"&gt;PullFlow's Agent Experience&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>softwaredevelopment</category>
      <category>devops</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
