<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David Britt</title>
    <description>The latest articles on DEV Community by David Britt (@djmbritt).</description>
    <link>https://dev.to/djmbritt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F990226%2F7df27c00-8b86-421a-9853-027f52254fb0.png</url>
      <title>DEV Community: David Britt</title>
      <link>https://dev.to/djmbritt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/djmbritt"/>
    <language>en</language>
    <item>
      <title>Builders' Challenge v3</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Fri, 10 Oct 2025 12:24:32 +0000</pubDate>
      <link>https://dev.to/nosana/builders-challenge-v3-2979</link>
      <guid>https://dev.to/nosana/builders-challenge-v3-2979</guid>
      <description>&lt;p&gt;The Nosana Builder Challenge is back! After the success of Agents 101, we're excited to announce &lt;strong&gt;Agents 102&lt;/strong&gt; — a developer challenge where you'll build intelligent AI agents with frontend interfaces and deploy them on the Nosana decentralized compute network.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Details
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prize Pool&lt;/strong&gt;: $3,000 USDC for top 10 submissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start Date&lt;/strong&gt;: October 10, 2025&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Submission Deadline&lt;/strong&gt;: October 24, 2025&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Winners Announced&lt;/strong&gt;: October 31, 2025&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Submission Platform&lt;/strong&gt;: &lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge-agents-102" rel="noopener noreferrer"&gt;SuperTeam Builders Challenge Page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Repository&lt;/strong&gt;: &lt;a href="https://github.com/nosana-ci/agent-challenge" rel="noopener noreferrer"&gt;Agent Challenge Starter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Your Mission
&lt;/h2&gt;

&lt;p&gt;Build an intelligent AI agent that performs real-world tasks using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mastra framework&lt;/strong&gt; for agent orchestration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool calling&lt;/strong&gt; to interact with external services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP (Model Context Protocol)&lt;/strong&gt; for enhanced capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom frontend&lt;/strong&gt; to showcase your agent's functionality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then deploy your complete stack (agent + frontend + LLM) on Nosana's decentralized network!&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent Ideas to Inspire You
&lt;/h2&gt;

&lt;p&gt;The possibilities are endless! Here are some ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🤖 &lt;strong&gt;Personal Assistant&lt;/strong&gt; - Schedule management, email drafting, task automation&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Data Analyst Agent&lt;/strong&gt; - Fetch financial data, generate insights, create visualizations&lt;/li&gt;
&lt;li&gt;🌐 &lt;strong&gt;Web Researcher&lt;/strong&gt; - Aggregate information from multiple sources, summarize findings&lt;/li&gt;
&lt;li&gt;🛠️ &lt;strong&gt;DevOps Helper&lt;/strong&gt; - Monitor services, automate deployments, manage infrastructure&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;Content Creator&lt;/strong&gt; - Generate social media posts, blog outlines, marketing copy&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;Smart Search&lt;/strong&gt; - Multi-source search with AI-powered result synthesis&lt;/li&gt;
&lt;li&gt;💬 &lt;strong&gt;Customer Support Bot&lt;/strong&gt; - Answer FAQs, ticket routing, knowledge base queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Be Creative!&lt;/strong&gt; The best agents solve real problems in innovative ways.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Framework: Mastra
&lt;/h2&gt;

&lt;p&gt;We're using &lt;a href="https://mastra.ai" rel="noopener noreferrer"&gt;Mastra&lt;/a&gt;, the powerful TypeScript framework that makes building AI applications intuitive and fast. Mastra provides all the primitives you need: workflows, agents, RAG, integrations, and evaluations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New to Mastra?&lt;/strong&gt; Check out these resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/docs/agents/overview" rel="noopener noreferrer"&gt;Mastra Agent Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/guides/guide/stock-agent" rel="noopener noreferrer"&gt;Build an AI Stock Agent Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/docs/agents/tools" rel="noopener noreferrer"&gt;Mastra Tool Calling Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Register
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Register at &lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge-agents-102" rel="noopener noreferrer"&gt;SuperTeam&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Register at the &lt;a href="https://luma.com/zkob1iae" rel="noopener noreferrer"&gt;Luma Event Page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Star the required repos: &lt;a href="https://github.com/nosana-ci/agent-challenge" rel="noopener noreferrer"&gt;Agent Challenge&lt;/a&gt;, &lt;a href="https://github.com/nosana-ci/nosana-cli" rel="noopener noreferrer"&gt;Nosana CLI&lt;/a&gt;, &lt;a href="https://github.com/nosana-ci/nosana-sdk" rel="noopener noreferrer"&gt;Nosana SDK&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Complete the &lt;a href="https://e86f0b9c.sibforms.com/serve/MUIFALaEjtsXB60SDmm1_DHdt9TOSRCFHOZUSvwK0ANbZDeJH-sBZry2_0YTNi1OjPt_ZNiwr4gGC1DPTji2zdKGJos1QEyVGBzTq_oLalKkeHx3tq2tQtzghyIhYoF4_sFmej1YL1WtnFQyH0y1epowKmDFpDz_EdGKH2cYKTleuTu97viowkIIMqoDgMqTD0uBaZNGwjjsM07T" rel="noopener noreferrer"&gt;registration form&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 2: Fork &amp;amp; Build
&lt;/h3&gt;

&lt;p&gt;Fork the &lt;a href="https://github.com/nosana-ci/agent-challenge" rel="noopener noreferrer"&gt;challenge repository&lt;/a&gt; and start building your agent using the provided starter template with Next.js, Mastra, and CopilotKit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Fork this repo on GitHub, then clone your fork&lt;/span&gt;
git clone https://github.com/YOUR-USERNAME/agent-challenge

&lt;span class="nb"&gt;cd &lt;/span&gt;agent-challenge

&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env

pnpm i

pnpm run dev:ui      &lt;span class="c"&gt;# Start UI server (port 3000)&lt;/span&gt;
pnpm run dev:agent   &lt;span class="c"&gt;# Start Mastra agent server (port 4111)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Deploy to Nosana
&lt;/h3&gt;

&lt;p&gt;Build your Docker container and deploy your complete stack to the Nosana network using either the &lt;a href="https://dashboard.nosana.com/deploy" rel="noopener noreferrer"&gt;Nosana Dashboard&lt;/a&gt; or the Nosana CLI.&lt;/p&gt;
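&lt;p&gt;For the CLI route, the starter repo ships a job definition that you point at your published Docker image. As a minimal sketch (only the &lt;code&gt;image&lt;/code&gt; property is shown here; the full schema lives in the starter repo's &lt;code&gt;nos_job_def&lt;/code&gt; folder), the edit looks like:&lt;/p&gt;

```json
{
  "image": "docker.io/yourusername/agent-challenge:latest"
}
```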

&lt;h3&gt;
  
  
  Step 4: Submit
&lt;/h3&gt;

&lt;p&gt;Commit your code to your forked GitHub repo and submit your project on the &lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge-agents-102" rel="noopener noreferrer"&gt;SuperTeam Challenge Page&lt;/a&gt; before the deadline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Minimum Requirements
&lt;/h2&gt;

&lt;p&gt;Your submission &lt;strong&gt;must&lt;/strong&gt; include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Agent with Tool Calling&lt;/strong&gt; - At least one custom tool/function&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Frontend Interface&lt;/strong&gt; - Working UI to interact with your agent&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Deployed on Nosana&lt;/strong&gt; - Complete stack running on Nosana network&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Docker Container&lt;/strong&gt; - Published to Docker Hub&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Video Demo&lt;/strong&gt; - 1-3 minute demonstration of your deployed agent&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Updated README&lt;/strong&gt; - Clear documentation in your forked repo&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Social Media Post&lt;/strong&gt; - Share on X/BlueSky/LinkedIn with #NosanaAgentChallenge and tag @nosana_ai&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prizes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Top 10 submissions will be rewarded:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🥇 1st Place: $1,000 USDC&lt;/li&gt;
&lt;li&gt;🥈 2nd Place: $750 USDC&lt;/li&gt;
&lt;li&gt;🥉 3rd Place: $450 USDC&lt;/li&gt;
&lt;li&gt;🏅 4th Place: $200 USDC&lt;/li&gt;
&lt;li&gt;🏅 5th-10th Place: $100 USDC each&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Judging Criteria
&lt;/h2&gt;

&lt;p&gt;Submissions are evaluated on four key areas, weighted 25% each:&lt;/p&gt;

&lt;h3&gt;
  
  
  Innovation 🎨
&lt;/h3&gt;

&lt;p&gt;Originality of agent concept, creative use of AI capabilities, unique problem-solving approach&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical Implementation 💻
&lt;/h3&gt;

&lt;p&gt;Code quality, proper use of Mastra framework, efficient tool implementation, error handling&lt;/p&gt;

&lt;h3&gt;
  
  
  Nosana Integration ⚡
&lt;/h3&gt;

&lt;p&gt;Successful deployment, resource efficiency, stability and performance, proper containerization&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Impact 🌍
&lt;/h3&gt;

&lt;p&gt;Practical use cases, potential for adoption, clear value proposition, demonstration quality&lt;/p&gt;

&lt;h2&gt;
  
  
  Support &amp;amp; Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Discord&lt;/strong&gt;: Join &lt;a href="https://nosana.com/discord" rel="noopener noreferrer"&gt;Nosana Discord&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dev Chat&lt;/strong&gt;: &lt;a href="https://discord.com/channels/236263424676331521/1354391113028337664" rel="noopener noreferrer"&gt;Builders Challenge Channel&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Twitter&lt;/strong&gt;: Follow &lt;a href="https://x.com/nosana_ai" rel="noopener noreferrer"&gt;@nosana_ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://docs.nosana.io" rel="noopener noreferrer"&gt;Nosana Documentation&lt;/a&gt; | &lt;a href="https://mastra.ai/docs" rel="noopener noreferrer"&gt;Mastra Docs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good luck, builders! We can't wait to see the innovative AI agents you create for the Nosana ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy Building!&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;Want access to exclusive builder perks, early challenges, and Nosana credits?&lt;br&gt;
Subscribe to our newsletter and never miss an update.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://e86f0b9c.sibforms.com/serve/MUIFALaEjtsXB60SDmm1_DHdt9TOSRCFHOZUSvwK0ANbZDeJH-sBZry2_0YTNi1OjPt_ZNiwr4gGC1DPTji2zdKGJos1QEyVGBzTq_oLalKkeHx3tq2tQtzghyIhYoF4_sFmej1YL1WtnFQyH0y1epowKmDFpDz_EdGKH2cYKTleuTu97viowkIIMqoDgMqTD0uBaZNGwjjsM07T" rel="noopener noreferrer"&gt; Join the Nosana Builders Newsletter &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Be the first to know about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧠 Upcoming Builders Challenges&lt;/li&gt;
&lt;li&gt;💸 New reward opportunities&lt;/li&gt;
&lt;li&gt;⚙ Product updates and feature drops&lt;/li&gt;
&lt;li&gt;🎁 Early-bird credits and partner perks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Join the Nosana builder community today — and build the future of decentralized AI.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>llm</category>
      <category>hackathon</category>
    </item>
    <item>
      <title>How We're Helping AI Startups Cut Costs by 67% With Open-Source Models</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Wed, 13 Aug 2025 20:24:36 +0000</pubDate>
      <link>https://dev.to/nosana/how-were-helping-ai-startups-cut-costs-by-67-with-open-source-models-1nl4</link>
      <guid>https://dev.to/nosana/how-were-helping-ai-startups-cut-costs-by-67-with-open-source-models-1nl4</guid>
      <description>&lt;h2&gt;
  
  
  The Hidden Cost of AI-Powered Products
&lt;/h2&gt;

&lt;p&gt;In today's AI-driven product landscape, impressive capabilities often come with significant cost challenges. One of our recent collaborations with an AI presentation tool startup illustrates this perfectly. Their sleek, intuitive platform generates professional slide decks in minutes—but behind the scenes, the economics were threatening their growth potential.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: When AI Costs Threaten Profitability
&lt;/h2&gt;

&lt;p&gt;This startup's AI presentation generator delivers impressive results. Users can go from a simple prompt to a complete, professional slide deck in just 10-15 minutes. The magic behind this capability? A powerful proprietary AI model—but that magic comes at a price: approximately $0.30 per slide.&lt;/p&gt;

&lt;p&gt;For a typical 20-slide presentation, that's $6 in AI costs alone—before accounting for hosting, development, support, or any other business expenses. At scale, these costs threatened to make their unit economics unsustainable, especially for a startup looking to offer competitive pricing.&lt;/p&gt;

&lt;p&gt;They approached us with a challenge: explore whether they could use an open-source model instead and cut their costs to around $0.05 to $0.10 per slide.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evaluating the Technical Requirements
&lt;/h2&gt;

&lt;p&gt;After testing their platform, we were impressed with the quality and interactivity of the AI-generated presentations. This level of sophistication meant we needed to find an open-source alternative that could deliver comparable results.&lt;/p&gt;

&lt;p&gt;The startup's application required:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-quality text generation for professional content&lt;/li&gt;
&lt;li&gt;Sufficient context window to process complex presentation requirements&lt;/li&gt;
&lt;li&gt;Tool-calling capabilities for integration with their platform&lt;/li&gt;
&lt;li&gt;Reasonable generation speed for a good user experience&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Technical Solution: Optimized Open-Source Models
&lt;/h2&gt;

&lt;p&gt;After evaluating several open-source models, our team identified Qwen3-32B as the optimal starting point for their needs. While not identical to proprietary models, it offers comparable capabilities at a fraction of the cost when deployed on optimized infrastructure.&lt;/p&gt;

&lt;p&gt;Key technical aspects of our solution included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Optimized deployment&lt;/strong&gt;: NVIDIA A100-80GB or H100 GPUs for maximum performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel processing&lt;/strong&gt;: Support for 40-50 concurrent users on a single GPU&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient resource utilization&lt;/strong&gt;: Careful memory management to maximize context window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable architecture&lt;/strong&gt;: Ability to grow with their user base&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our platform enables efficient deployment of these models with streamlined infrastructure management—crucial for a startup looking to minimize DevOps overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Business Impact: A 67% Cost Reduction
&lt;/h2&gt;

&lt;p&gt;The numbers tell a compelling story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Current cost with proprietary model&lt;/strong&gt;: $0.30 per slide&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Projected cost with open-source model&lt;/strong&gt;: $0.10 per slide&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost reduction&lt;/strong&gt;: 67%&lt;/li&gt;
&lt;/ul&gt;
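&lt;p&gt;The headline figure checks out with a quick back-of-the-envelope calculation (a minimal shell sketch using the per-slide prices quoted above):&lt;/p&gt;

```shell
# Per-slide prices quoted in the article
proprietary=0.30
open_source=0.10

# Percentage reduction: (0.30 - 0.10) / 0.30 * 100, rounded to a whole percent
reduction=$(awk -v p="$proprietary" -v o="$open_source" \
  'BEGIN { printf "%.0f", (p - o) / p * 100 }')
echo "Cost reduction: ${reduction}%"   # prints: Cost reduction: 67%
```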

&lt;p&gt;This dramatic cost reduction transforms the startup's business possibilities. With improved unit economics, they can now implement:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A viable freemium model&lt;/strong&gt;: Offer a free tier using open-source models to drive user acquisition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tiered pricing strategy&lt;/strong&gt;: Reserve premium models for paid tiers with higher performance needs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Competitive pricing&lt;/strong&gt;: Maintain margins while offering more attractive price points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sustainable scaling&lt;/strong&gt;: Grow their user base without proportional AI cost increases&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Implementation Strategy: "Model Discovery Phase"
&lt;/h2&gt;

&lt;p&gt;Rather than a one-size-fits-all approach, we proposed a "model discovery phase" to find the optimal balance between cost and performance:&lt;/p&gt;

&lt;p&gt;"We'll explore which model is the best for your use case. Even though the model is not as capable as proprietary alternatives, we can provide access at a significantly reduced price."&lt;/p&gt;

&lt;p&gt;The implementation plan included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploying a dedicated endpoint for testing&lt;/li&gt;
&lt;li&gt;Running performance benchmarks with real-world content&lt;/li&gt;
&lt;li&gt;Fine-tuning model parameters for presentation generation&lt;/li&gt;
&lt;li&gt;Gradually optimizing for the ideal cost/performance balance&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Future Possibilities: Grant Program Support
&lt;/h2&gt;

&lt;p&gt;Beyond the immediate cost benefits, this collaboration opens doors to additional opportunities through our grant program, which provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Financial support for implementation&lt;/li&gt;
&lt;li&gt;Technical resources for optimization&lt;/li&gt;
&lt;li&gt;A showcase example of effective open-source AI implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Technical Details: For the Curious
&lt;/h2&gt;

&lt;p&gt;For those interested in the technical aspects, our team conducted detailed calculations on the economics of running these models at scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enterprise GPU costs approximately $1.60/hour&lt;/li&gt;
&lt;li&gt;With parallel processing supporting 40+ users, effective cost drops to approximately $0.30 per million tokens&lt;/li&gt;
&lt;li&gt;For typical presentation workloads, this translates to roughly $0.10 per slide&lt;/li&gt;
&lt;/ul&gt;
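&lt;p&gt;Those figures imply a concrete throughput target: at $1.60/hour and $0.30 per million tokens, a GPU needs to serve roughly 5.3 million tokens per hour, about 1,500 tokens per second aggregated across the 40+ concurrent users. A quick shell sketch of the arithmetic, using only the numbers above:&lt;/p&gt;

```shell
gpu_cost_per_hour=1.60
cost_per_million_tokens=0.30

# Million tokens per hour the GPU must serve to break even at $0.30/Mtok
mtok_per_hour=$(awk -v g="$gpu_cost_per_hour" -v c="$cost_per_million_tokens" \
  'BEGIN { printf "%.2f", g / c }')

# Aggregate tokens per second across all concurrent users
tok_per_sec=$(awk -v m="$mtok_per_hour" \
  'BEGIN { printf "%.0f", m * 1000000 / 3600 }')

echo "${mtok_per_hour} Mtok/hour is about ${tok_per_sec} tokens/sec total"
```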

&lt;h2&gt;
  
  
  Making AI Sustainable for Startups
&lt;/h2&gt;

&lt;p&gt;This case demonstrates a critical reality in today's AI product landscape: proprietary models aren't always the most cost-effective solution. By leveraging open-source alternatives on optimized infrastructure, startups can dramatically improve unit economics while maintaining impressive capabilities.&lt;/p&gt;

&lt;p&gt;For this presentation tool startup, our solution represents the difference between a challenging cost structure and a sustainable, scalable business model. For us, it showcases the practical benefits of our infrastructure for AI-powered applications.&lt;/p&gt;

&lt;p&gt;This collaborative approach to AI implementation represents the future of sustainable AI-powered products—where technical innovation meets business reality to create truly viable solutions.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Interested in exploring how open-source AI models could reduce costs for your product? &lt;a href="https://docs.google.com/forms/d/e/1FAIpQLSdfh5RIw2hWa1vnXhRUA4QIGADhBMkAHnpjqoNCHbrdF283cg/viewform" rel="noopener noreferrer"&gt; Contact our team to discuss your specific use case. &lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.nosana.com" rel="noopener noreferrer"&gt;Nosana Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nosana.com/discord" rel="noopener noreferrer"&gt;Join the Discord&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nosana.com/x" rel="noopener noreferrer"&gt;Follow us on X&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nosana.com/github" rel="noopener noreferrer"&gt;Nosana on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>llm</category>
      <category>startup</category>
    </item>
    <item>
      <title>Nosana Builders Challenge: Agent-101</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Wed, 25 Jun 2025 09:52:48 +0000</pubDate>
      <link>https://dev.to/nosana/nosana-builders-challenge-agent-101-1po9</link>
      <guid>https://dev.to/nosana/nosana-builders-challenge-agent-101-1po9</guid>
      <description>&lt;p&gt;The main goal of this &lt;code&gt;Nosana Builders Challenge&lt;/code&gt; to teach participants to build and deploy agents. This first step will be in running a basic AI agent and giving it some basic functionality. Participants will add a tool, for the tool calling capabilities of the agent. These are basically some TypeScript functions, that will, for example, retrieve some data from a weather API, post a tweet via an API call, etc.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://github.com/mastra-ai/mastra" rel="noopener noreferrer"&gt;Mastra&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;For this challenge, we will be using Mastra to build our agent and its tools.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Mastra is an opinionated TypeScript framework that helps you build AI applications and features quickly. It gives you the set of primitives you need: workflows, agents, RAG, integrations, and evals. You can run Mastra on your local machine, or deploy to a serverless cloud.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Required Reading
&lt;/h3&gt;

&lt;p&gt;We recommend reading the following sections to get started with how to create an Agent and how to implement Tool Calling.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/docs/agents/overview" rel="noopener noreferrer"&gt;Mastra Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/guides/guide/stock-agent" rel="noopener noreferrer"&gt;Mastra Guide: Build an AI stock agent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;To get started, run the following commands to start developing:&lt;br&gt;
We recommend using &lt;a href="https://pnpm.io/installation" rel="noopener noreferrer"&gt;pnpm&lt;/a&gt;, but you can try npm, or bun if you prefer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm &lt;span class="nb"&gt;install
&lt;/span&gt;pnpm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Assignment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Challenge Overview
&lt;/h3&gt;

&lt;p&gt;Welcome to the Nosana AI Agent Hackathon! Your mission is to build and deploy an AI agent on Nosana.&lt;br&gt;
While we provide a weather agent as an example, your creativity is the limit. Build agents that:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Beginner Level:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple Calculator&lt;/strong&gt;: Perform basic math operations with explanations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Todo List Manager&lt;/strong&gt;: Help users track their daily tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Intermediate Level:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;News Summarizer&lt;/strong&gt;: Fetch and summarize latest news articles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crypto Price Checker&lt;/strong&gt;: Monitor cryptocurrency prices and changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Stats Reporter&lt;/strong&gt;: Fetch repository statistics and insights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Advanced Level:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Blockchain Monitor&lt;/strong&gt;: Track and alert on blockchain activities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trading Strategy Bot&lt;/strong&gt;: Automate simple trading strategies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy Manager&lt;/strong&gt;: Deploy and manage applications on Nosana&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Or any other innovative AI agent idea at your skill level!&lt;/p&gt;
&lt;h3&gt;
  
  
  Getting Started
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fork the &lt;a href="https://github.com/nosana-ai/agent-challenge" rel="noopener noreferrer"&gt;Nosana Agent Challenge&lt;/a&gt;&lt;/strong&gt; to your GitHub account&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clone your fork&lt;/strong&gt; locally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Install dependencies&lt;/strong&gt; with &lt;code&gt;pnpm install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run the development server&lt;/strong&gt; with &lt;code&gt;pnpm run dev&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build your agent&lt;/strong&gt; using the Mastra framework&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  How to build your Agent
&lt;/h3&gt;

&lt;p&gt;Here we will describe the steps needed to build an agent.&lt;/p&gt;
&lt;h4&gt;
  
  
  Folder Structure
&lt;/h4&gt;

&lt;p&gt;Provided in this repo is the &lt;code&gt;Weather Agent&lt;/code&gt;.&lt;br&gt;
This is a fully working agent that lets a user chat with an LLM and fetches real-time weather data for the provided location.&lt;/p&gt;

&lt;p&gt;There are two main folders we need to pay attention to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="//./src/mastra/agents/weather-agent/"&gt;src/mastra/agents/weather-agent/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="//./src/mastra/agents/your-agent/"&gt;src/mastra/agents/your-agents/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In &lt;code&gt;src/mastra/agents/weather-agent/&lt;/code&gt; you will find a complete example of a working agent: the agent definition, API calls, and interface definitions, everything needed to get a full-fledged agent up and running.&lt;br&gt;
In &lt;code&gt;src/mastra/agents/your-agents/&lt;/code&gt; you will find a bare-bones example of the components and imports needed to start building your own agent. We recommend you rename this folder and its files to get started.&lt;/p&gt;

&lt;p&gt;Rename these files to reflect the purpose of your agent and tools. You can use the Weather Agent example as a guide while you build, then delete its files before your final submission.&lt;/p&gt;

&lt;p&gt;As a bonus, for the ambitious ones, we have also provided &lt;code&gt;src/mastra/agents/weather-agent/weather-workflow.ts&lt;/code&gt; as an example. It shows how you can chain agents and tools into a workflow: the user provides their location, and the agent retrieves the weather for that location and suggests an itinerary.&lt;/p&gt;
&lt;h3&gt;
  
  
  LLM-Endpoint
&lt;/h3&gt;

&lt;p&gt;Agents depend on an LLM to be able to do their work.&lt;/p&gt;
&lt;h4&gt;
  
  
  Running Your Own LLM with Ollama
&lt;/h4&gt;

&lt;p&gt;The default configuration uses a local &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; LLM.&lt;br&gt;
For local development, or if you prefer to run your own LLM, you can use Ollama to serve the lightweight &lt;code&gt;qwen2.5:1.5b&lt;/code&gt; model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation &amp;amp; Setup:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://ollama.com/download" rel="noopener noreferrer"&gt; Install Ollama &lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start Ollama service&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Pull and run the &lt;code&gt;qwen2.5:1.5b&lt;/code&gt; model&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5:1.5b
ollama run qwen2.5:1.5b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol start="4"&gt;
&lt;li&gt;&lt;strong&gt;Update your &lt;code&gt;.env&lt;/code&gt; file&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are two predefined environments defined in the &lt;code&gt;.env&lt;/code&gt; file. One for local development and another, with a larger model, &lt;code&gt;qwen2.5:32b&lt;/code&gt;, for more complex use cases.&lt;/p&gt;
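&lt;p&gt;A hedged sketch of what such a &lt;code&gt;.env&lt;/code&gt; might contain (the variable names below are illustrative; mirror the keys in the repo's &lt;code&gt;.env.example&lt;/code&gt;):&lt;/p&gt;

```shell
# Illustrative names only -- use the keys from .env.example.
# Local development: lightweight model served by Ollama
MODEL_NAME_AT_ENDPOINT=qwen2.5:1.5b
API_BASE_URL=http://localhost:11434/api

# Larger model for more complex use cases (commented out by default)
# MODEL_NAME_AT_ENDPOINT=qwen2.5:32b
```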

&lt;p&gt;&lt;strong&gt;Why &lt;code&gt;qwen2.5:1.5b&lt;/code&gt;?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lightweight (only ~1GB)&lt;/li&gt;
&lt;li&gt;Fast inference on CPU&lt;/li&gt;
&lt;li&gt;Supports tool calling&lt;/li&gt;
&lt;li&gt;Great for development and testing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do note that &lt;code&gt;qwen2.5:1.5b&lt;/code&gt; is not suited for complex tasks.&lt;/p&gt;

&lt;p&gt;The Ollama server will run on &lt;code&gt;http://localhost:11434&lt;/code&gt; by default and is compatible with the OpenAI API format that Mastra expects.&lt;/p&gt;
&lt;h3&gt;
  
  
  Testing your Agent
&lt;/h3&gt;

&lt;p&gt;You can read the &lt;a href="https://mastra.ai/en/docs/local-dev/mastra-dev" rel="noopener noreferrer"&gt;Mastra Documentation: Playground&lt;/a&gt; to learn more on how to test your agent locally.&lt;br&gt;
Before deploying your agent to Nosana, it's crucial to thoroughly test it locally to ensure everything works as expected. Follow these steps to validate your agent:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local Testing:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start the development server&lt;/strong&gt; with &lt;code&gt;pnpm run dev&lt;/code&gt; and navigate to &lt;code&gt;http://localhost:8080&lt;/code&gt; in your browser&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test your agent's conversation flow&lt;/strong&gt; by interacting with it through the chat interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify tool functionality&lt;/strong&gt; by triggering scenarios that call your custom tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check error handling&lt;/strong&gt; by providing invalid inputs or testing edge cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor the console logs&lt;/strong&gt; to ensure there are no runtime errors or warnings&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Docker Testing:&lt;/strong&gt;&lt;br&gt;
After building your Docker container, test it locally before pushing to the registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build your container&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; yourusername/agent-challenge:latest &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Run it locally with environment variables&lt;/span&gt;
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 &lt;span class="nt"&gt;--env-file&lt;/span&gt; .env yourusername/agent-challenge:latest

&lt;span class="c"&gt;# Test the containerized agent at http://localhost:8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensure your agent responds correctly and all tools function properly within the containerized environment. This step is critical as the Nosana deployment will use this exact container.&lt;/p&gt;

&lt;h3&gt;
  
  
  Submission Requirements
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Code Development
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Fork this repository and develop your AI agent&lt;/li&gt;
&lt;li&gt;Your agent must include at least one custom tool (function)&lt;/li&gt;
&lt;li&gt;Code must be well-documented and include clear setup instructions&lt;/li&gt;
&lt;li&gt;Include environment variable examples in a &lt;code&gt;.env.example&lt;/code&gt; file&lt;/li&gt;
&lt;/ul&gt;
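&lt;p&gt;A minimal &lt;code&gt;.env.example&lt;/code&gt; might look like the following. The variable names here are only placeholders; use whichever variables your agent actually reads (the port matches the &lt;code&gt;docker run -p 8080:8080&lt;/code&gt; commands below):&lt;/p&gt;

```shell
# .env.example — placeholder values only; copy to .env and fill in real values.
# Variable names are illustrative; match the ones your agent reads.
PORT=8080
LLM_BASE_URL=http://localhost:11434
LLM_API_KEY=replace-with-your-key
```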

&lt;h4&gt;
  
  
  2. Docker Container
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Create a &lt;code&gt;Dockerfile&lt;/code&gt; for your agent&lt;/li&gt;
&lt;li&gt;Build and push your container to Docker Hub or GitHub Container Registry&lt;/li&gt;
&lt;li&gt;Container must be publicly accessible&lt;/li&gt;
&lt;li&gt;Include the container URL in your submission&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Build, Run, Publish
&lt;/h5&gt;

&lt;p&gt;Note: You'll need an account on &lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;Docker Hub&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="c"&gt;# Build and tag&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; yourusername/agent-challenge:latest &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Run the container locally&lt;/span&gt;
docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 yourusername/agent-challenge:latest

&lt;span class="c"&gt;# Login&lt;/span&gt;
docker login

&lt;span class="c"&gt;# Push&lt;/span&gt;
docker push yourusername/agent-challenge:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  3. Nosana Deployment
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Deploy your Docker container on Nosana&lt;/li&gt;
&lt;li&gt;Your agent must successfully run on the Nosana network&lt;/li&gt;
&lt;li&gt;Include the Nosana job ID or deployment link&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Nosana Job Definition
&lt;/h5&gt;

&lt;p&gt;We have included a Nosana job definition at &lt;code&gt;./nos_job_def/nosana_mastra.json&lt;/code&gt;, which you can use to publish your agent to the Nosana network.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A. Deploying using &lt;a href="https://github.com/nosana-ci/nosana-cli/" rel="noopener noreferrer"&gt;@nosana/cli&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Edit the file and set the &lt;code&gt;image&lt;/code&gt; property to your published Docker image: &lt;code&gt;"image": "docker.io/yourusername/agent-challenge:latest"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Download and install the &lt;a href="https://github.com/nosana-ci/nosana-cli/" rel="noopener noreferrer"&gt;@nosana/cli&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Load your wallet with some funds

&lt;ul&gt;
&lt;li&gt;Retrieve your address with: &lt;code&gt;nosana address&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Go to our &lt;a href="https://nosana.com/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; and ask for some NOS and SOL to publish your job.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Run: &lt;code&gt;nosana job post --file nosana_mastra.json --market nvidia-3060 --timeout 30&lt;/code&gt;
&lt;/li&gt;

&lt;li&gt;Go to the &lt;a href="https://dashboard.nosana.com/deploy" rel="noopener noreferrer"&gt;Nosana Dashboard&lt;/a&gt; to see your job&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;B. Deploying using the &lt;a href="https://dashboard.nosana.com/deploy" rel="noopener noreferrer"&gt;Nosana Dashboard&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make sure you have the &lt;a href="https://phantom.com/" rel="noopener noreferrer"&gt;Phantom&lt;/a&gt; wallet extension installed in your browser.&lt;/li&gt;
&lt;li&gt;Go to our &lt;a href="https://nosana.com/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; and ask for some NOS and SOL to publish your job.&lt;/li&gt;
&lt;li&gt;Click the &lt;code&gt;Expand&lt;/code&gt; button on the &lt;a href="https://dashboard.nosana.com/deploy" rel="noopener noreferrer"&gt;Nosana Dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Copy and paste your edited Nosana job definition into the text area&lt;/li&gt;
&lt;li&gt;Choose an appropriate GPU for the AI model that you are using&lt;/li&gt;
&lt;li&gt;Click &lt;code&gt;Deploy&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  4. Video Demo
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Record a 1-3 minute video demonstrating:

&lt;ul&gt;
&lt;li&gt;Your agent running on Nosana&lt;/li&gt;
&lt;li&gt;Key features and functionality&lt;/li&gt;
&lt;li&gt;Real-world use case demonstration&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Upload to YouTube, Loom, or similar platform&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  5. Documentation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Update this README with:

&lt;ul&gt;
&lt;li&gt;Agent description and purpose&lt;/li&gt;
&lt;li&gt;Setup instructions&lt;/li&gt;
&lt;li&gt;Environment variables required&lt;/li&gt;
&lt;li&gt;Docker build and run commands&lt;/li&gt;
&lt;li&gt;Example usage&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Submission Process
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Complete all requirements&lt;/strong&gt; listed above&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit all of your changes to the &lt;code&gt;main&lt;/code&gt; branch of your forked repository&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;All your code changes&lt;/li&gt;
&lt;li&gt;Updated README&lt;/li&gt;
&lt;li&gt;Link to your Docker container&lt;/li&gt;
&lt;li&gt;Link to your video demo&lt;/li&gt;
&lt;li&gt;Nosana deployment proof&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Social Media Post&lt;/strong&gt;: Share your submission on X (Twitter)

&lt;ul&gt;
&lt;li&gt;Tag @nosana_ai&lt;/li&gt;
&lt;li&gt;Include a brief description of your agent&lt;/li&gt;
&lt;li&gt;Add hashtag #NosanaAgentChallenge&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Finalize your submission on the &lt;a href="https://earn.superteam.fun/agent-challenge" rel="noopener noreferrer"&gt;SuperTeam challenge page&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Remember to add your forked GitHub repository link.&lt;/li&gt;
&lt;li&gt;Remember to add a link to your X post.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Judging Criteria
&lt;/h3&gt;

&lt;p&gt;Submissions will be evaluated based on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Innovation&lt;/strong&gt; (25%)

&lt;ul&gt;
&lt;li&gt;Originality of the agent concept&lt;/li&gt;
&lt;li&gt;Creative use of AI capabilities&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Implementation&lt;/strong&gt; (25%)

&lt;ul&gt;
&lt;li&gt;Code quality and organization&lt;/li&gt;
&lt;li&gt;Proper use of the Mastra framework&lt;/li&gt;
&lt;li&gt;Efficient tool implementation&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nosana Integration&lt;/strong&gt; (25%)

&lt;ul&gt;
&lt;li&gt;Successful deployment on Nosana&lt;/li&gt;
&lt;li&gt;Resource efficiency&lt;/li&gt;
&lt;li&gt;Stability and performance&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-World Impact&lt;/strong&gt; (25%)

&lt;ul&gt;
&lt;li&gt;Practical use cases&lt;/li&gt;
&lt;li&gt;Potential for adoption&lt;/li&gt;
&lt;li&gt;Value proposition&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Prizes
&lt;/h3&gt;

&lt;p&gt;We’re awarding the &lt;strong&gt;top 10 submissions&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🥇 1st: $1,000 USDC&lt;/li&gt;
&lt;li&gt;🥈 2nd: $750 USDC&lt;/li&gt;
&lt;li&gt;🥉 3rd: $450 USDC&lt;/li&gt;
&lt;li&gt;🏅 4th: $200 USDC&lt;/li&gt;
&lt;li&gt;🔟 5th–10th: $100 USDC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All prizes are paid out directly to participants on &lt;a href="https://superteam.fun" rel="noopener noreferrer"&gt;SuperTeam&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.nosana.io" rel="noopener noreferrer"&gt;Nosana Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/docs" rel="noopener noreferrer"&gt;Mastra Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mastra.ai/en/guides/guide/stock-agent" rel="noopener noreferrer"&gt;Mastra Guide: Build an AI stock agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/nosana-ci/nosana-cli" rel="noopener noreferrer"&gt;Nosana CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.docker.com" rel="noopener noreferrer"&gt;Docker Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Support
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Join the &lt;a href="https://discord.gg/nosana" rel="noopener noreferrer"&gt;Nosana Discord&lt;/a&gt; for technical support, where we have a dedicated &lt;a href="https://discord.com/channels/236263424676331521/1354391113028337664" rel="noopener noreferrer"&gt;Builders Challenge Dev chat&lt;/a&gt; channel.&lt;/li&gt;
&lt;li&gt;Follow &lt;a href="https://x.com/nosana_ai" rel="noopener noreferrer"&gt;@nosana_ai&lt;/a&gt; for updates.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Important Notes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Ensure your agent doesn't expose sensitive data&lt;/li&gt;
&lt;li&gt;Test thoroughly before submission&lt;/li&gt;
&lt;li&gt;Keep your Docker images lightweight&lt;/li&gt;
&lt;li&gt;Document all dependencies clearly&lt;/li&gt;
&lt;li&gt;Make your code reproducible&lt;/li&gt;
&lt;li&gt;You can vibe code it if you want 😉&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Only one submission per participant&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Submissions that do not compile or do not meet the specified requirements will not be considered&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deadline: October 24, 2025&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Winners will be announced on October 31, 2025; stay tuned to our socials for details&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Don’t Miss Nosana Builder Challenge Updates
&lt;/h3&gt;

&lt;p&gt;Good luck, builders! We can't wait to see the innovative AI agents you create for the Nosana ecosystem.&lt;br&gt;
&lt;strong&gt;Happy Building!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>llm</category>
      <category>datascience</category>
    </item>
    <item>
      <title>LLM Benchmarking: Cost-Efficient Performance</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Wed, 09 Apr 2025 05:08:00 +0000</pubDate>
      <link>https://dev.to/nosana/llm-benchmarking-cost-efficient-performance-5h69</link>
      <guid>https://dev.to/nosana/llm-benchmarking-cost-efficient-performance-5h69</guid>
      <description>&lt;p&gt;Economic viability is one of the most important factors in the success of new products and applications. No less so for Nosana. We show that the consumer-grade flagship RTX 4090 can provide LLM inference at a staggering 2.5X lower cost compared to the industry-standard enterprise A100 GPU.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://nosana.io/blog/llm_benchmarking_on_the_nosana_grid/" rel="noopener noreferrer"&gt;Our previous article&lt;/a&gt; showed how we implemented a uniform LLM benchmark that helps track individual node performance and configurations. With this information, we are able to design fairer GPU compute markets by lowering their performance variation. But although the initial benchmark data is valuable in terms of market design optimization, it does not give meaningful insights into the realistic performance we are interested in. This is because the benchmark was designed to be compatible with all nodes on the network but it wasn’t able to test the full capacity of each node.&lt;/p&gt;

&lt;p&gt;In this article, we address this limitation and zoom in on the performance comparison between consumer-grade and enterprise hardware. We implement benchmarks and use the results in a cost-adjusted performance analysis to highlight the competitive advantage of the Nosana Grid over traditional compute providers.&lt;/p&gt;

&lt;h3&gt;
  
  
  LLM Inference
&lt;/h3&gt;

&lt;p&gt;When we talk about performance measurements in the context of LLM inference, we are mostly interested in inference speed. To better understand the factors influencing this speed, let’s begin with a brief overview of how LLM inference works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://nosana.io/blog/llm_benchmarking_on_the_nosana_grid" rel="noopener noreferrer"&gt;&lt;em&gt;The previous blog post&lt;/em&gt;&lt;/a&gt; &lt;em&gt;went into more detail on this topic. If you have read it, you can skip ahead to this section. Readers who are interested in an in-depth explanation should refer to the previous blog post.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As far as computers are concerned, LLMs consist of two files: a large file containing the model parameters, and a smaller file that runs the model. The size of an LLM is determined by the number of parameters it has and the precision of those parameters, i.e. the accuracy with which they are represented, measured in bits. To calculate an example, let's take the popular LLM Llama 3.1 with 8 billion parameters and a commonly used 16-bit floating-point precision. At 16 bits, each parameter takes 2 bytes, and 2 bytes times 8 billion parameters gives a total model size of 16 GB. The model size is an important factor in the usability of LLMs because it determines which types of hardware are able to load the model.&lt;/p&gt;
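&lt;p&gt;The same arithmetic generalizes to any model; a quick sketch in Python:&lt;/p&gt;

```python
def model_size_gb(n_params: float, bits_per_param: int) -> float:
    """Weights size = number of parameters x bytes per parameter."""
    bytes_per_param = bits_per_param / 8
    return n_params * bytes_per_param / 1e9  # decimal gigabytes

# Llama 3.1 8B at 16-bit precision: 8e9 parameters x 2 bytes = 16 GB
print(model_size_gb(8e9, 16))   # → 16.0
# The 70B variant at the same precision: 140 GB
print(model_size_gb(70e9, 16))  # → 140.0
```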

&lt;p&gt;Once loaded onto hardware, LLMs perform next-token prediction. This means that LLMs iteratively predict and add single tokens to an input sequence that is provided as context. This process of generating tokens is called inference. To perform inference, an LLM goes through two stages, the &lt;strong&gt;prefill&lt;/strong&gt; phase and the &lt;strong&gt;decoding&lt;/strong&gt; phase. During the prefill phase, the model processes all input tokens simultaneously to compute all the necessary information for generating subsequent tokens. During the decoding phase, the model uses the cached information computed during the prefill phase to generate new tokens.&lt;/p&gt;

&lt;p&gt;In practice, the prefill phase corresponds to the time you have to wait until the LLM starts generating its response. It is a relatively short period that makes efficient use of available computing capacity through highly parallelized computations. We call the prefill phase &lt;strong&gt;compute-bound&lt;/strong&gt; because it is limited by the computational capacity of the hardware running the LLM.&lt;/p&gt;

&lt;p&gt;The decoding phase generally takes up the bulk of the inference time and corresponds to the period between the generation of the first token and the completion of the last. This process is not as computationally efficient as the prefill phase because it requires constantly moving cached computations between the processing units and memory. We call the decoding phase &lt;strong&gt;memory-bound&lt;/strong&gt; because its performance is limited by how fast data can be moved to and from memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPUs &amp;amp; Inference
&lt;/h2&gt;

&lt;p&gt;In large production use cases, LLM inference is predominantly performed on high-end graphics processing units, or GPUs. Three key specifications of GPUs are particularly relevant to LLM inference:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;VRAM (Video Random Access Memory): The amount of available memory on the GPU&lt;/li&gt;
&lt;li&gt;FLOPS (Floating Point Operations Per Second): A measure of the GPU’s computational capacity&lt;/li&gt;
&lt;li&gt;Memory bandwidth: The speed at which data can be transferred within the GPU&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The processing of single sequences as described in the previous section usually leaves the VRAM and computational capacity of GPUs underutilized. To make better use of these resources, we need to increase the number of tokens processed and computations performed. We can do this by processing a batch of multiple sequences at once. In production use cases this means that prompts from different users get bundled together and processed at the same time. Handling multiple requests, or &lt;strong&gt;concurrent users&lt;/strong&gt;, plays an important role in optimizing GPU usage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Current Research
&lt;/h2&gt;

&lt;p&gt;Alright, with the basics of LLM inference in mind, let's get more specific about the goal of the current research. Previously, we benchmarked the performance of all GPU types on the Nosana grid using Llama 3.1–8B with a single concurrent user. Running inference with a single concurrent user leads to GPU underutilization, limiting the insights gained when comparing performance with other compute providers. In this article, we set up benchmarks for accurate performance comparisons. We’ll focus our analysis on comparing Nosana’s performance against established cloud computing platforms. This comparison involves two key benchmarks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A baseline assessment measuring the performance of current market leaders&lt;/li&gt;
&lt;li&gt;An experimental evaluation of the Nosana grid’s performance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Baseline Benchmark
&lt;/h3&gt;

&lt;p&gt;Similar to running models on the Nosana grid, you can use a fully customized Docker image when renting a GPU from a compute provider. This means that we can keep important variables such as the model files and LLM serving framework constant for our experiment and only have to pick the &lt;em&gt;GPU type&lt;/em&gt; and the &lt;em&gt;price of usage&lt;/em&gt; for a fair comparison.&lt;/p&gt;

&lt;p&gt;Because running LLMs in a production setting requires high capacity in terms of computation and memory, there are two main types to consider when renting a GPU, the A100 and the H100. The H100 is a newer and more powerful GPU than the A100, but both cards are able to load in and effectively run most open-source models. Given its relative affordability and arguable cost-effectiveness, we opt for the A100 as our baseline GPU.&lt;/p&gt;

&lt;p&gt;For the price of usage variable there are more options to consider because there are various compute providers that offer a specific rental price per hour. To pick a competitive price we made use of the website &lt;a href="https://getdeploying.com/reference/cloud-gpu" rel="noopener noreferrer"&gt;https://getdeploying.com&lt;/a&gt;, which shows aggregated GPU rental prices for all cloud providers. At the time of writing the cheapest rental price for an A100–80GB is offered by &lt;a href="https://crusoe.ai/" rel="noopener noreferrer"&gt;Crusoe&lt;/a&gt; at $1.65 per hour, so we will use this price for our analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Experimental Benchmark
&lt;/h3&gt;

&lt;p&gt;To compare the Nosana grid with our baseline approach, we need to determine the GPU type and an accompanying price per hour for our experimental benchmark. We’ll leave the price per hour as a variable to allow comparisons across multiple hypothetical pricing scenarios. This means that we only have to choose the GPU type.&lt;/p&gt;

&lt;p&gt;The RTX 4090 is the most frequently encountered GPU on the Nosana grid, closely followed by the RTX 3090. The prevalence of the RTX 4090 and RTX 3090 GPUs on the Nosana grid highlights one of the network’s primary advantages over centralized compute providers: its ability to tap into a pool of underutilized consumer-grade hardware. Consequently, the most interesting comparison to make for Nosana is between popular enterprise hardware such as the A100 and underutilized consumer hardware such as the RTX 4090. Therefore, we pick the RTX 4090 for our experimental benchmark.&lt;/p&gt;

&lt;h3&gt;
  
  
  Research Setup
&lt;/h3&gt;

&lt;p&gt;Let's go over the rest of the research setup. Now that we have determined the fixed variables for the baseline and the experimental condition, we have to pick the shared variables. The model, the LLM serving framework, and the number of concurrent users.&lt;/p&gt;

&lt;p&gt;For the &lt;em&gt;model,&lt;/em&gt; we picked Llama 3.1–8B. Llama models are the most used open-source LLMs in the world, and the 8 billion variant makes it possible to easily load the model on both the A100 and the RTX 4090 GPUs.&lt;/p&gt;

&lt;p&gt;As an LLM &lt;em&gt;serving framework,&lt;/em&gt; we experimented with both &lt;a href="https://github.com/vllm-project/vllm" rel="noopener noreferrer"&gt;vLLM&lt;/a&gt; and &lt;a href="https://github.com/InternLM/lmdeploy" rel="noopener noreferrer"&gt;LMdeploy&lt;/a&gt;. vLLM is one of the most popular frameworks and is frequently mentioned by our prospective clients. LMdeploy is a highly optimized framework and has shown the highest inference speed in &lt;a href="https://www.bentoml.com/blog/benchmarking-llm-inference-backends" rel="noopener noreferrer"&gt;recent benchmarking research&lt;/a&gt;. For both the baseline and experimental benchmarks, we used each framework's out-of-the-box inference configuration.&lt;/p&gt;

&lt;p&gt;In our benchmarking script we implemented functionality to send &lt;em&gt;concurrent user&lt;/em&gt; requests. While our previous article demonstrated that the 4090 slightly outperforms the A100 for a single concurrent user, this scenario rarely reflects optimized production environments. Therefore, we tested performance using 1, 5, 10, 50, and 100 concurrent users to see how the comparison holds up under different workloads.&lt;/p&gt;
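&lt;p&gt;Our actual benchmarking script is not shown here, but the core idea of simulating concurrent users can be sketched in a few lines of async Python; the &lt;code&gt;send_request&lt;/code&gt; coroutine is a stand-in for a real HTTP call to the serving framework:&lt;/p&gt;

```python
import asyncio

async def send_request(user_id: int) -> int:
    """Stand-in for a single user's inference request; a real
    benchmark would POST a prompt to the model server here."""
    await asyncio.sleep(0.01)  # simulated network + inference latency
    return user_id

async def run_level(concurrent_users: int) -> list[int]:
    # Launch all requests at once and wait for every response,
    # which models a fixed level of user concurrency.
    return await asyncio.gather(
        *(send_request(u) for u in range(concurrent_users))
    )

# The concurrency levels used in our experiments:
for level in (1, 5, 10, 50, 100):
    responses = asyncio.run(run_level(level))
    print(level, len(responses))
```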

&lt;p&gt;As an evaluation metric, we used tokens produced per second, which directly measures inference speed. We evaluated both the A100 and RTX 4090 GPUs across all combinations of the variables mentioned above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxyzf6zxo0ty6nozempg4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxyzf6zxo0ty6nozempg4.png" alt="Image description" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;In the above graphs, we can see the performance of the RTX 4090 and the A100 with the LMdeploy and vLLM frameworks for different levels of concurrency. The graphs show that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;At a low number of concurrent users, the A100s outperform the 4090s. However, this advantage shrinks as the number of concurrent users increases.&lt;/li&gt;
&lt;li&gt;At a higher number of concurrent users, LMdeploy greatly outperforms vLLM with its standard settings. The RTX 4090 with LMdeploy even outperforms the A100 with vLLM at 50 and 100 concurrent users.&lt;/li&gt;
&lt;li&gt;You need 1.5–2 RTX 4090s to reproduce the performance of an A100.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Price Comparison
&lt;/h2&gt;

&lt;p&gt;Considering the respective purchase costs of the RTX 4090 and the A100, the performance results of the RTX 4090 are quite impressive. In this section, we analyze both GPUs’ performance while taking into account their purchase cost and operational expenses. For the cost-adjusted analysis we assume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The purchase cost of an RTX 4090 is $1,750.&lt;/li&gt;
&lt;li&gt;The purchase cost of an A100–80GB is $10,000.&lt;/li&gt;
&lt;li&gt;2 RTX 4090s are required to reproduce the performance of an A100.&lt;/li&gt;
&lt;li&gt;The price of energy is equal to the average American price of $0.16 per kWh.&lt;/li&gt;
&lt;li&gt;The energy consumption of an RTX 4090 is 300W.&lt;/li&gt;
&lt;li&gt;The energy consumption of an A100 is 250W.&lt;/li&gt;
&lt;li&gt;The price for renting an A100 is $1.65 per hour.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s start by calculating the return on investment (ROI) for the A100, which measures the amount of return relative to the investment cost. This helps us determine how quickly each GPU setup can earn its initial cost and start generating profit.&lt;/p&gt;

&lt;h4&gt;
  
  
  A100 ROI
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Initial Investment: $10,000&lt;/li&gt;
&lt;li&gt;Hourly Energy Cost: 0.25kW * 1 hour * $0.16/kWh = $0.04 per hour&lt;/li&gt;
&lt;li&gt;Hourly Rental Revenue: $1.65 per hour&lt;/li&gt;
&lt;li&gt;Hourly Net Profit: $1.65 - $0.04 = $1.61 per hour&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To find the break-even point, we divide the initial investment of $10,000 by the hourly net profit of $1.61, which gives us approximately 6,211 hours or 259 days. Therefore, it would take about 259 days of continuous operation and rental to earn back the initial investment on the A100 GPU.&lt;/p&gt;

&lt;h4&gt;
  
  
  RTX 4090 ROI
&lt;/h4&gt;

&lt;p&gt;Let’s perform a similar analysis for the RTX 4090 setup where we deliver the same performance as the A100 setup. Remember, we’re assuming that two RTX 4090s are required to match the performance of one A100.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Initial Investment: $1,750 * 2 = $3,500&lt;/li&gt;
&lt;li&gt; Hourly Energy Cost: (0.3kW * 2) * 1 hour * $0.16/kWh = $0.096 per hour&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s first calculate the ROI assuming we rent out the RTX 4090 setup at the same price as the A100:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Hourly Rental Revenue: $1.65 per hour&lt;/li&gt;
&lt;li&gt; Hourly Net Profit: $1.65 - $0.096 = $1.554 per hour&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To find the break-even point: $3,500 / $1.554 per hour ≈ 2,252 hours, or about 94 days.&lt;/p&gt;

&lt;p&gt;In this scenario, the RTX 4090 setup would break even much faster than the A100, in about 94 days compared to 259 days for the A100.&lt;/p&gt;

&lt;p&gt;Now, let’s determine the hourly rental price that would allow the RTX 4090 setup to break even in the same timeframe as the A100. Here’s the calculation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hourly rate to cover initial investment: $3,500 / 6,211 hours ≈ $0.563 per hour&lt;/li&gt;
&lt;li&gt;Total hourly rate including energy cost: $0.563 + $0.096 ≈ $0.66 per hour&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This means that if we set the hourly rental price for the RTX 4090 setup at $0.66, it would break even at the same point as the A100.&lt;/p&gt;
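&lt;p&gt;These break-even figures are easy to check in a few lines, plugging in the assumptions listed above:&lt;/p&gt;

```python
def break_even_hours(investment: float, rent_per_hour: float,
                     kw: float, price_per_kwh: float = 0.16) -> float:
    """Hours of continuous rental needed to earn back the purchase cost."""
    net_profit_per_hour = rent_per_hour - kw * price_per_kwh
    return investment / net_profit_per_hour

a100 = break_even_hours(10_000, 1.65, kw=0.25)    # one A100-80GB at 250W
rtx = break_even_hours(3_500, 1.65, kw=2 * 0.30)  # two RTX 4090s at 300W each

print(round(a100), round(a100 / 24))  # → 6211 259  (hours, days)
print(round(rtx), round(rtx / 24))    # → 2252 94

# Rental price at which the 4090 pair breaks even when the A100 does:
print(round(3_500 / a100 + 2 * 0.30 * 0.16, 2))  # → 0.66
```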

&lt;p&gt;Comparing this to the A100’s rental price of $1.65 per hour, we can see that the RTX 4090 setup could potentially be rented out 2.5X cheaper than the A100 while still achieving the same return on investment timeline. On top of that, the initial investment for the RTX 4090 setup is significantly lower than that of the A100, which reduces the barrier to entry for those looking to offer GPU rental services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Through our comparison of the A100 and RTX 4090, we have demonstrated the potential competitive advantage that consumer-grade hardware has over enterprise hardware. As production models currently seem to trend toward smaller sizes, this benefit will only grow as more consumer-grade hardware becomes capable of running AI models efficiently. This trend holds enormous potential benefits for the Nosana grid, which primarily consists of consumer-grade technology.&lt;/p&gt;

</description>
      <category>mlops</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>LLM Benchmarking on the Nosana grid</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Mon, 07 Apr 2025 11:12:20 +0000</pubDate>
      <link>https://dev.to/nosana/llm-benchmarking-on-the-nosana-grid-fmk</link>
      <guid>https://dev.to/nosana/llm-benchmarking-on-the-nosana-grid-fmk</guid>
      <description>&lt;h3&gt;
  
  
  Intro
&lt;/h3&gt;

&lt;p&gt;The Nosana grid contains about two thousand nodes with various hardware configurations, which are actively running AI models. At the start of the Nosana test phase, these nodes have mostly been running image generation or transcription jobs through Stable Diffusion and Whisper. Although these jobs are suitable to make sure nodes are functioning properly, they do not provide any additional benefit from an AI use case perspective.&lt;/p&gt;

&lt;p&gt;So to make the best use of the nodes until the launch of the mainnet, at the beginning of 2024 we started looking for opportunities to run jobs that would be useful to the Nosana community. As Large Language Models (LLMs) are projected to be the central pillar of Nosana’s AI demand in the foreseeable future, we decided to hire a dedicated AI specialist team to start working on a large-scale LLM benchmarking project on the Nosana grid. This project aims to provide information that will help clients make better-informed decisions, help the Nosana team implement a fairer market system, and contribute valuable information to the LLM research community. In this blog post, we will go over the fundamentals required to understand how benchmarking works, and then show how we can use the results of the benchmarks to create fair markets.&lt;/p&gt;

&lt;h3&gt;
  
  
  LLM Fundamentals
&lt;/h3&gt;

&lt;p&gt;Let’s start with the fundamentals of LLMs. What is an LLM? How does it work? And what do we need to run one?&lt;/p&gt;

&lt;h4&gt;
  
  
  Architecture
&lt;/h4&gt;

&lt;p&gt;To anyone reading up on them, LLMs might seem like complex neural networks used in artificial intelligence. While this is true to some extent, in practice LLMs essentially consist of two easy-to-understand files: the model weights file and the model code file. The model weights file contains the parameters of the model and determines the model size, which is measured in bytes. The model code file contains the instructions on how to load and run the model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngd8fkiks33vab60kurl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fngd8fkiks33vab60kurl.png" alt="Image description" width="508" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When looking at llama3.1–70B, the identifier 70B means the model contains 70 billion parameters. The parameters of the model are stored with 16-bit floating-point precision, which equals 2 bytes per parameter, making the model weights file 140 gigabytes in size.&lt;/p&gt;

&lt;p&gt;Each parameter in the model weights file corresponds to a neuron in the architecture described in the model code file. For most modern day LLMs this architecture is called a transformer. The image below shows a generalized transformer architecture used for producing text.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5yds1pr95xxtvj6wc01.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5yds1pr95xxtvj6wc01.png" alt="Image description" width="612" height="856"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A detailed explanation of transformers is beyond the scope of this article, so we will focus only on the part that is most important for this benchmarking research. For an LLM to output text, it needs to perform computations at every layer, and to do this, it needs its parameters and specific cached computations loaded into memory at the respective layers of the model.&lt;/p&gt;

&lt;h4&gt;
  
  
  Inference
&lt;/h4&gt;

&lt;p&gt;Now that we know what an LLM is, let’s see how we actually produce language with them. LLMs are trained on the task of next token prediction. Tokens are units of text that correspond to words or parts of words, and they are the vocabulary that is understood by LLMs. So as far as LLMs are concerned, producing language is nothing more than correctly predicting the next word or subword given the preceding ones. This process of producing tokens with LLMs is called inference. The speed of inference is an important factor in the usability of LLMs, and it is influenced by the model size, architecture, and the hardware &amp;amp; software configuration on which it is run.&lt;/p&gt;

&lt;p&gt;So how does inference work? LLM inference proceeds in two main stages: the &lt;strong&gt;prefill&lt;/strong&gt; phase and the &lt;strong&gt;decoding&lt;/strong&gt; phase.&lt;/p&gt;

&lt;p&gt;In the prefill phase, the model processes all input tokens simultaneously to compute all the necessary information for generating subsequent tokens. In practice, the duration of this phase corresponds to the time you wait until the LLM starts generating its response. The prefill phase is highly parallelized and makes efficient use of the computing capacity, passing through the model only once.&lt;/p&gt;

&lt;p&gt;At the start of the decoding phase, the model uses the cached information computed during the prefill phase to generate a token. From this point on, for every newly generated token, the previous token needs to pass through the network together with the cached computations. This process of repeatedly going through the network is not computationally intensive as computations only have to be performed for a single token. Instead, the decoding phase is memory intensive, because the cached information has to be moved around to perform the necessary computations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92slfazvnsf0nfu0wlhs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92slfazvnsf0nfu0wlhs.png" alt="Image description" width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The prefill phase is &lt;strong&gt;compute bound&lt;/strong&gt;, while the decoding phase is &lt;strong&gt;memory bound&lt;/strong&gt;. A process is considered compute-bound when it requires significant computation and its speed or performance is limited primarily by the amount of processing power of the hardware. A process is memory bound when its performance is limited by the rate at which data moves to and from memory. This rate is called the &lt;strong&gt;memory bandwidth&lt;/strong&gt;.&lt;/p&gt;
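
&lt;p&gt;The compute-bound versus memory-bound distinction can be made concrete with the roofline model: a workload whose arithmetic intensity (FLOPs per byte moved) exceeds the hardware's FLOPS-to-bandwidth ratio is compute bound, otherwise it is memory bound. A minimal sketch, using the RTX 4090 figures quoted later in this article; the single-sequence intensity of roughly 1 FLOP per byte is an illustrative approximation, not a measurement:&lt;/p&gt;

```python
# Roofline check: compare a workload's arithmetic intensity (FLOPs per
# byte moved) against the hardware's FLOPS-to-bandwidth ratio.

def is_compute_bound(flops_per_byte: float, peak_flops: float, bandwidth: float) -> bool:
    """True when the workload's intensity exceeds the hardware ridge point."""
    ridge_point = peak_flops / bandwidth  # FLOPs the GPU can perform per byte it moves
    return flops_per_byte > ridge_point

# RTX 4090: 82.58 TFLOPS peak, 1,008 GB/s memory bandwidth
PEAK_FLOPS = 82.58e12
BANDWIDTH = 1008e9
ridge = PEAK_FLOPS / BANDWIDTH  # roughly 82 FLOPs per byte

# Single-sequence decoding streams each fp16 weight (2 bytes) for about
# 2 FLOPs, i.e. roughly 1 FLOP per byte: far below the ridge point.
print(ridge, is_compute_bound(1.0, PEAK_FLOPS, BANDWIDTH))
```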

&lt;h4&gt;
  
  
  Hardware Requirements
&lt;/h4&gt;

&lt;p&gt;Alright, so we need compute and memory capacity to run LLMs. This is where our GPUs come in. Let's figure out what we would be able to run with the Nosana grid's most popular GPU, the RTX 4090.&lt;br&gt;
For GPUs, memory capacity is expressed in gigabytes of VRAM, memory bandwidth in bytes per second, and computational capacity in FLOPS, floating point operations per second. The RTX 4090 has 24 GB of VRAM, a memory bandwidth of 1,008 GB/s, and a peak throughput of 82.58 teraFLOPS.&lt;/p&gt;

&lt;p&gt;The hard requirement for running an LLM is having enough VRAM to store its parameters plus the cached computations. For llama3.1-8B at 16-bit floating point precision this comes to approximately 18 GB of VRAM: 16 GB of weights plus room for the cache. Our RTX 4090 can handle that.&lt;/p&gt;

&lt;p&gt;Figuring out &lt;em&gt;if&lt;/em&gt; a model can be run is fairly straightforward, but figuring out &lt;em&gt;how&lt;/em&gt; it will perform is harder, as performance depends on many variables. That said, for a memory-bound process we can make a rough theoretical estimate of the time per token by dividing the model size by the memory bandwidth.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;18GB ÷ 1,008 GB/s = 18ms&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This works out to 18 ms per token, or about 56 tokens per second. The estimate applies to an inference run with a single input sequence, which is a predominantly memory-bound process. If we increased the number of sequences processed at the same time, the workload could shift from memory bound to compute bound, and the GPU’s FLOPS would start to play a more important role.&lt;/p&gt;
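
&lt;p&gt;The back-of-the-envelope estimate above can be scripted. A minimal sketch using the figures from the text (18 GB effective footprint, 1,008 GB/s bandwidth); it gives an upper bound for a purely memory-bound decode, not a measured result:&lt;/p&gt;

```python
# Memory-bound decode estimate: each generated token requires streaming
# the full model footprint (weights plus cache) through the memory bus once.

def estimate_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper-bound decode speed for a purely memory-bound workload."""
    seconds_per_token = model_bytes / bandwidth_bytes_per_s
    return 1.0 / seconds_per_token

GB = 1e9
# Figures from the text: 18 GB footprint, RTX 4090 bandwidth of 1,008 GB/s
tps = estimate_tokens_per_second(18 * GB, 1008 * GB)
print(round(1000 * 18 / 1008, 1), "ms/token,", round(tps), "tokens/s")  # 17.9 ms/token, 56 tokens/s
```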

&lt;h4&gt;
  
  
  Optimization
&lt;/h4&gt;

&lt;p&gt;We have covered the fundamentals of LLMs and understand which variables matter for running them. Optimization techniques let us tweak these variables, yielding tradeoffs that allow us to run bigger models or increase inference speed. There are many knobs to turn, but the three most important are &lt;strong&gt;quantization, computation enhancement,&lt;/strong&gt; and &lt;strong&gt;caching strategies&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Quantization reduces the precision of the model’s weights to lower-bit representations. For example, using quantization to convert the 16-bit floating point weights of our llama3.1-8B model to 8-bit integers reduces the model size from 16 GB to 8 GB. While this can impact the accuracy of the model, the significant decrease in memory usage makes it a viable optimization technique.&lt;/p&gt;
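
&lt;p&gt;The effect of quantization on the weight footprint is simple arithmetic: parameter count times bytes per parameter. A small sketch reproducing the 16 GB and 8 GB figures from the text:&lt;/p&gt;

```python
# Weight footprint is parameters times bytes per parameter.

def model_size_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Model weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    return n_params_billion * 1e9 * bytes_per_param / 1e9

# llama3.1-8B from the text: fp16 versus int8 quantization
fp16_gb = model_size_gb(8, 16)  # 16.0
int8_gb = model_size_gb(8, 8)   # 8.0
print(fp16_gb, int8_gb)
```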

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr49bra0kqf7z77zoikag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr49bra0kqf7z77zoikag.png" alt="Image description" width="615" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Computation enhancement focuses on optimizing the operations performed within the model, such as the attention mechanism. This mechanism is central to the transformer’s success, but it is also computationally expensive. By reordering computations or by fusing certain model layers together, we can reduce the amount of data that needs to be read from and written to memory.&lt;/p&gt;

&lt;p&gt;Caching strategies involve the reduction of the cached computations that are kept in memory. By simplifying the structure of the cached computations it is possible to significantly reduce the memory footprint in exchange for a slight decrease in model accuracy.&lt;/p&gt;

&lt;h4&gt;
  
  
  Inference Frameworks
&lt;/h4&gt;

&lt;p&gt;We have briefly gone over the main optimization techniques, and it is already apparent that implementing any of them by hand can get complicated. Luckily, various LLM inference frameworks provide an interface to a wide range of models with built-in options for optimization. As the field of LLM inference is rapidly evolving, these frameworks roll out updates frequently, and there is no clear-cut best framework.&lt;/p&gt;

&lt;p&gt;One such framework is &lt;strong&gt;Ollama&lt;/strong&gt;. We will give it a special mention here because it is the framework that was used to gather the initial benchmarking results. Ollama originated as a user-friendly framework with the goal of democratizing the use of LLMs. The Ollama team has impressively succeeded in achieving this goal, as it is undoubtedly the easiest framework for anyone with minimal hardware requirements to spin up an LLM. It is especially fitting for running and testing consumer grade hardware as its optimization techniques seamlessly allow models of any size to be run on GPUs with any amount of VRAM.&lt;/p&gt;
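
&lt;p&gt;As an illustration of how decoding speed can be derived in practice, Ollama’s REST API returns per-request metadata with each generate response, including eval_count (the number of generated tokens) and eval_duration (the decode time in nanoseconds). The sample values below are made up for illustration, not measured results:&lt;/p&gt;

```python
# Deriving decode speed from the metadata Ollama returns with each
# /api/generate response: eval_count is the number of generated tokens,
# eval_duration is the decode time in nanoseconds.

def tokens_per_second(response: dict) -> float:
    """Decode speed implied by an Ollama generate response."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)

# Illustrative response fragment; the values are made up, not measured.
sample = {"eval_count": 280, "eval_duration": 5_000_000_000}  # 5 seconds
print(tokens_per_second(sample))  # 56.0
```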



&lt;h3&gt;
  
  
  Nosana Benchmarking
&lt;/h3&gt;

&lt;p&gt;Enough preliminaries. It is time to get into the actual benchmarking. Let’s start by going over the data we collected, and how we collected it.&lt;/p&gt;

&lt;h4&gt;
  
  
  Benchmarking Setup
&lt;/h4&gt;

&lt;p&gt;As mentioned, Ollama was picked as the initial framework for benchmarking due to its compatibility with consumer-grade hardware. With this framework in place, we implemented a custom benchmarking script to gather data in two distinct categories: &lt;strong&gt;model performance&lt;/strong&gt; and &lt;strong&gt;system specifications&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The performance data covers the total number of tokens produced and how long it took to produce them, as well as the clock speed and wattage of the GPU. The system specifications data covers an extensive set of system configurations that can have large or small effects on model performance. The tables below illustrate the kinds of variables and their potential values.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flwb7ixrfee2ix9nr0o8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flwb7ixrfee2ix9nr0o8m.png" alt="Image description" width="743" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwhrhong3ix5jneyn94hu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwhrhong3ix5jneyn94hu.png" alt="Image description" width="800" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the variables to collect defined, we had to pick a model for benchmarking. Given the notable performance of the newly launched llama models, and the wide variety of VRAM capacities across Nosana nodes, we picked llama3.1-8B, which fits on all GPUs.&lt;/p&gt;

&lt;p&gt;For the actual benchmarking procedure we had to create a method compatible with the current job posting structure of Nosana. A job has a maximum length of X hours, giving us plenty of time to load one or more models, prompt them, and measure their performance. During a job, every model got prompted with randomly sampled sequences such as &lt;em&gt;“Write a step-by-step guide on how to bake a chocolate cake from scratch”&lt;/em&gt;. The content of a prompt does not influence the performance of the model, but it does determine the length of the response, so to make sure the LLMs spent most of their time on actual inference, our pool of prompts encouraged longer answers. At the end of each job, the output contains the model performance and system specification variables, which we extracted and added to a large dataset.&lt;/p&gt;

&lt;h4&gt;
  
  
  Evaluation
&lt;/h4&gt;

&lt;p&gt;Before we get into the results, let's quickly look at the key evaluation metrics for LLM inference, &lt;strong&gt;inference speed&lt;/strong&gt; and &lt;strong&gt;time to first token (TTFT)&lt;/strong&gt;. The inference speed is measured in tokens produced per second during the decoding phase, and largely determines how long a user has to wait for a full response. The TTFT is a measurement of the time a user has to wait for the first response token, which is a crucial component in the usability and desirability of many LLM applications.&lt;/p&gt;

&lt;p&gt;In this first article, where we test consecutive single-user queries, we will mainly focus on inference speed as a measure of performance, not the TTFT. The TTFT depends on the prefill phase, which, as mentioned, is a compute-bound process. In our setting, where we process one query at a time, the number of computations needed during the prefill phase is low, resulting in uniformly low TTFTs for all GPUs. Inference speed, on the other hand, measures how fast the decoding process executes and is heavily dependent on memory bandwidth. Since memory bandwidth varies widely between the GPUs on the Nosana grid, focusing on inference speed will provide the most insightful observations.&lt;/p&gt;

&lt;p&gt;As a final note on our evaluation procedure, it is worth mentioning that most LLM inference benchmarking research focuses on the hardware’s capacity to handle &lt;strong&gt;concurrent users&lt;/strong&gt;. With concurrent users, the model has to handle multiple queries at the same time. This setting makes it possible to maximally utilize the GPU’s memory and computational capacities, especially for enterprise hardware with large amounts of VRAM. We have deliberately chosen to perform this initial benchmark with consecutive single queries, or 1 concurrent user, to define a setting in which we can evaluate all GPUs on the Nosana grid. Our results will therefore help us identify well-performing and underperforming nodes across all markets, which enables the implementation of a fair market structure. However, the current setup does not benchmark the actual maximum capabilities of the nodes, which will be a topic in one of our upcoming articles.&lt;/p&gt;



&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;p&gt;The dataset we used contains information on 10,596 jobs performed by 550 unique nodes in 15 Nosana Markets. Within these nodes there are 39 unique types of GPUs. The RTX 4090 and 3090 are the most common GPUs by far with 122 and 101 counts respectively. &lt;/p&gt;

&lt;h4&gt;
  
  
  Market Performance
&lt;/h4&gt;

&lt;p&gt;As an initial goal of our benchmarking research we set out to create fair markets. In the presented results we have aggregated the data on market level, so we can show the performance for each market and highlight opportunities for improvements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fweaqxjmmzztgm806dp7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fweaqxjmmzztgm806dp7s.png" alt="Image description" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above visual we see the average tokens per second for each market. All the way at the top we have the H100 with 111 tokens per second, and at the bottom we have the RTX 4060 with 42 tokens per second. At first glance, this graph does not indicate anything out of the ordinary. On average the general trend seems to be, the more expensive the GPU the better the performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzsizrnsrjlyvys54w90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzsizrnsrjlyvys54w90.png" alt="Image description" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When we look at the performance variation within markets, indicated by the black bars, we get some more interesting findings. The longer the bar, the more the performance of individual nodes within that market varies. Varied performance within markets is undesirable for both clients and node runners. When clients use a Nosana node for inference compute, they want reliable performance suitable for their application. When node runners provide compute, they want to be paid based on the quality of compute they provide. High variance within markets interferes with both of these objectives.&lt;/p&gt;

&lt;p&gt;Completely fair markets would mean zero variance within all of them. However, getting to zero variance would require drastic solutions that impede the functionality of the Nosana grid. What we can do is design the markets to minimize performance variance. For example, we can implement a minimum performance threshold based on the average of each market: every node performing worse than the average minus the threshold is removed from the market. This not only reduces the variance of the market, which is caused predominantly by underperforming nodes, but also increases the average tokens per second within the market.&lt;/p&gt;
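
&lt;p&gt;The threshold rule can be sketched as a simple filter over a market’s measured speeds. The node names and tokens-per-second values below are hypothetical:&lt;/p&gt;

```python
# Minimum performance threshold: drop every node whose speed falls more
# than the threshold fraction below the market's average tokens per second.

def apply_threshold(speeds: dict, threshold: float) -> dict:
    """Filter a market; threshold=0.20 implements the 20% rule."""
    average = sum(speeds.values()) / len(speeds)
    cutoff = average * (1.0 - threshold)
    return {node: tps for node, tps in speeds.items() if tps >= cutoff}

# Hypothetical RTX 4090 market (illustrative numbers, not real nodes)
market = {"node-a": 58.0, "node-b": 55.0, "node-c": 52.0, "node-d": 30.0}
kept = apply_threshold(market, 0.20)
print(sorted(kept))  # node-d sits below the cutoff and is removed
```

Removing the underperforming node also raises the surviving market's average, matching the effect described above.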

&lt;p&gt;The visual below illustrates what would happen to the market if a 20% or a stricter 10% threshold were to be implemented.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0x7fikocfjtfl9fcgm44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0x7fikocfjtfl9fcgm44.png" alt="Image description" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, for the markets with high variation, the threshold causes a significant increase in performance, because it removes a larger number of underperforming nodes in those markets. As a result, these markets become fairer: they provide more reliable compute to clients, and pay out similar amounts for similar-quality compute.&lt;/p&gt;

&lt;h4&gt;
  
  
  Performance Monitoring
&lt;/h4&gt;

&lt;p&gt;After observing the variance between nodes, we started analyzing which variables cause performance fluctuations. Even though we used a large set of hardware specs in our analysis, the results were unambiguous and pointed to two main factors: &lt;strong&gt;GPU type &amp;amp; wattage&lt;/strong&gt;. The GPU type determines the performance range of a node, but the wattage plays an arguably more crucial role by determining the location within this range. For example, a 3070 GPU running at full power can outperform a 4090 GPU that is not getting enough power, showing that proper wattage allocation can be just as important as the GPU model itself.&lt;/p&gt;

&lt;p&gt;With this knowledge we can categorize three types of node runners that deviate from the expected performance: &lt;strong&gt;spoofers&lt;/strong&gt;, malicious node runners that fake hardware configurations &amp;amp; performance; &lt;strong&gt;underclockers&lt;/strong&gt;, economically greedy node runners that do not provide enough power to their hardware setup; and a third category of node runners with unforeseen technical issues. As the Nosana team we want to &lt;strong&gt;remove&lt;/strong&gt; spoofers, keep underclockers in &lt;strong&gt;check&lt;/strong&gt;, and &lt;strong&gt;help&lt;/strong&gt; any node runner facing technical difficulties. Because model inference performance cannot be faked, monitoring this metric helps us identify which category of underperforming node runner we are dealing with, and take appropriate measures to balance the markets.&lt;/p&gt;

&lt;h4&gt;
  
  
  Node Leaderboard
&lt;/h4&gt;

&lt;p&gt;As a first practical step towards fair markets we introduce the &lt;a href="https://leaderboard.nosana.io/" rel="noopener noreferrer"&gt;Nosana Node Leaderboard&lt;/a&gt;. Here we track the performance of each node within the market and display relevant hardware configurations. This allows us, together with the community, to monitor the performance of Nosana nodes in a transparent way. Go check it out!&lt;/p&gt;



&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;p&gt;By doing research using the Nosana Grid we aim to accomplish two main goals. First, to create the optimal Nosana experience by incorporating data-driven insights into our decision making. Second, to contribute valuable research findings to the broader large language model community. In this article we mainly focused on the first goal, as the current benchmark results are practically useful for the Nosana Grid, but do not provide information about the maximum performance capacity of specific model-hardware combinations in realistic settings. In our next article, we’ll explore maximum performance in real-world conditions.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>llm</category>
      <category>datascience</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Thu, 03 Apr 2025 09:29:37 +0000</pubDate>
      <link>https://dev.to/djmbritt/-2lc3</link>
      <guid>https://dev.to/djmbritt/-2lc3</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca" class="crayons-story__hidden-navigation-link"&gt;Nosana Builders' Challenge - $3,000 USDC in prizes&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/nosana"&gt;
            &lt;img alt="Nosana logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F6410%2F2a3099c0-43c4-4fe0-a455-e4359f4b8fae.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/djmbritt" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F990226%2F7df27c00-8b86-421a-9853-027f52254fb0.png" alt="djmbritt profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/djmbritt" class="crayons-story__secondary fw-medium m:hidden"&gt;
              David Britt
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                David Britt
                
              
              &lt;div id="story-author-preview-content-2370872" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/djmbritt" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F990226%2F7df27c00-8b86-421a-9853-027f52254fb0.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;David Britt&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/nosana" class="crayons-story__secondary fw-medium"&gt;Nosana&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 1 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca" id="article-link-2370872"&gt;
          Nosana Builders' Challenge - $3,000 USDC in prizes
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/web3"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;web3&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/hackathon"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;hackathon&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/gpu"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;gpu&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;3&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>web3</category>
      <category>ai</category>
      <category>hackathon</category>
      <category>gpu</category>
    </item>
    <item>
      <title>Nosana Builders' Challenge - $3,000 USDC in prizes</title>
      <dc:creator>David Britt</dc:creator>
      <pubDate>Tue, 01 Apr 2025 12:00:00 +0000</pubDate>
      <link>https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca</link>
      <guid>https://dev.to/nosana/nosana-builder-challenge-create-a-nosana-template-2nca</guid>
      <description>&lt;p&gt;We’re thrilled to launch the &lt;strong&gt;Nosana Builder Challenge&lt;/strong&gt;, a developer-focused contest designed to push the boundaries of AI model deployment on the &lt;strong&gt;Nosana Network&lt;/strong&gt;. This is your chance to showcase your skills, gain visibility, learn new tools — and compete for over &lt;strong&gt;$3,000 USDC in prizes&lt;/strong&gt;!&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Create reusable Nosana Templates for deploying AI models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Submit via GitHub PR to win USDC token prizes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;$3,000+&lt;/strong&gt; total rewards for top 10 submissions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deadline is &lt;strong&gt;April 14, 12:00 UTC&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Submission details: &lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge" rel="noopener noreferrer"&gt;Builders Challenge Page&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is the Builder's Challenge?
&lt;/h2&gt;

&lt;p&gt;The Builder Challenge empowers developers to build powerful tools and features using the &lt;a href="https://github.com/nosana-ci/nosana-sdk" rel="noopener noreferrer"&gt;Nosana SDK&lt;/a&gt;, &lt;a href="https://github.com/nosana-ci/nosana-cli" rel="noopener noreferrer"&gt;CLI&lt;/a&gt;, and &lt;a href="https://dashboard.nosana.com/" rel="noopener noreferrer"&gt;Dashboard&lt;/a&gt;. In this first edition the focus is on &lt;a href="https://dashboard.nosana.com/jobs/templates/" rel="noopener noreferrer"&gt;Templates&lt;/a&gt;. It's all about growing a strong community of builders who can unlock the full potential of decentralized AI inference on Nosana.&lt;/p&gt;

&lt;h2&gt;
  
  
  First Challenge: Create Nosana Templates
&lt;/h2&gt;

&lt;p&gt;For our first edition, we’re zooming in on &lt;strong&gt;Nosana Templates&lt;/strong&gt; — reusable, pre-built job definition files that simplify AI model deployment on Nosana’s decentralized GPU network.&lt;/p&gt;

&lt;p&gt;Current templates include deploying &lt;a href="https://dashboard.nosana.com/jobs/create?templateId=qwen1.5b&amp;amp;randKey=3z707fh1chn" rel="noopener noreferrer"&gt;DeepSeek LLMs&lt;/a&gt; or running a &lt;a href="https://dashboard.nosana.com/jobs/create?templateId=vscode-server&amp;amp;randKey=akqekx4zh0n" rel="noopener noreferrer"&gt;VSCode instance&lt;/a&gt;. You can explore more examples in the &lt;a href="https://dashboard.nosana.com/jobs/templates" rel="noopener noreferrer"&gt;Templates section of the Dashboard&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Participate
&lt;/h2&gt;

&lt;p&gt;You can create a new template by crafting a &lt;a href="https://docs.nosana.com/inference/writing_a_job.html" rel="noopener noreferrer"&gt;Nosana Job Definition File&lt;/a&gt;. This can be done either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Directly through the Nosana Dashboard Interface, or&lt;/li&gt;
&lt;li&gt;By creating and editing a JSON file locally with your preferred text editor, then submitting it to the Nosana Network using the Nosana CLI Tool.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While we encourage AI models, feel free to get creative — analytics or dev tools are welcome too!&lt;/p&gt;

&lt;h3&gt;
  
  
  Submission Instructions
&lt;/h3&gt;

&lt;p&gt;Follow these steps to submit your template:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fork the &lt;a href="https://github.com/nosana-ci/pipeline-templates/tree/main" rel="noopener noreferrer"&gt;Nosana GitHub Template Repository&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Create your new template JSON file based on your chosen AI model or other innovative use-case.&lt;/li&gt;
&lt;li&gt;Submit a Pull Request clearly describing your template, its intended use-case, and implementation specifics. The PR should add a new folder for your Nosana Template containing the following files:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;job-definition.json&lt;/code&gt;: the standard Nosana Job Definition JSON file&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;info.json&lt;/code&gt;: a JSON file with display information for the Dashboard&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;README.md&lt;/code&gt;: a README describing the job definition, the models used, and any other relevant information about the job&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Ensure your template is functional and deployable directly from the Nosana Dashboard.&lt;/li&gt;
&lt;li&gt;Finally, submit your entry on the &lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge" rel="noopener noreferrer"&gt;Builder Challenge Page&lt;/a&gt;.
&lt;/li&gt;
&lt;/ol&gt;
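&lt;p&gt;Before opening the PR, it's worth sanity-checking that your folder matches the required layout. A small illustrative helper (not part of the official Nosana tooling) could look like this:&lt;/p&gt;

```python
import json
from pathlib import Path

REQUIRED = ["job-definition.json", "info.json", "README.md"]

def check_template(folder):
    """Sanity-check a template folder before opening the PR.

    Returns a list of problems; an empty list means the layout looks good.
    Illustrative helper only, not part of the official Nosana tooling.
    """
    folder = Path(folder)
    problems = []
    for name in REQUIRED:
        path = folder / name
        if not path.is_file():
            problems.append(f"missing {name}")
        elif name.endswith(".json"):
            # Both job-definition.json and info.json must parse as JSON.
            try:
                json.loads(path.read_text())
            except json.JSONDecodeError as err:
                problems.append(f"{name} is not valid JSON: {err}")
    return problems
```

Running `check_template("my-template/")` on your folder lists anything missing or malformed before the reviewers see it.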

&lt;h3&gt;
  
  
  Example Template
&lt;/h3&gt;

&lt;p&gt;Here’s an example for deploying a DeepSeek R1 LLM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"container"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"trigger"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cli"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ops"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"container/run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deepseek-r1-qwen-1.5b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"entrypoint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"/bin/sh"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"-c"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"python3 -m vllm.entrypoints.openai.api_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --served-model-name R1-Qwen-1.5B --port 9000 --max-model-len 130000"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docker.io/vllm/vllm-openai:latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"gpu"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"expose"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;9000&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we're using the &lt;a href="https://vllm.com/" rel="noopener noreferrer"&gt;vLLM&lt;/a&gt; Docker image, but feel free to choose any container image suitable for your needs.&lt;/p&gt;
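&lt;p&gt;Because this template runs vLLM's OpenAI-compatible API server, any OpenAI-style client can talk to it once the job is live. A minimal sketch using only the Python standard library; the base URL below is a placeholder for the service endpoint Nosana assigns to your running job:&lt;/p&gt;

```python
import json
import urllib.request

def build_chat_request(base_url, prompt, model="R1-Qwen-1.5B"):
    """Build an OpenAI-style chat completion request for the vLLM server.

    base_url is a placeholder: Nosana assigns the real service URL when
    the job is running (visible on the job page in the Dashboard).
    """
    url = base_url.rstrip("/") + "/v1/chat/completions"
    payload = {
        "model": model,  # must match --served-model-name in the template
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return url, payload

def send_chat(url, payload):
    """POST the request and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Once the job is running, `send_chat(*build_chat_request("https://your-job-url", "Hello!"))` returns the model's reply.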

&lt;h2&gt;
  
  
  Judging Criteria
&lt;/h2&gt;

&lt;p&gt;Submissions will be evaluated based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Creativity:&lt;/strong&gt; Original and innovative template ideas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Popularity of AI Model:&lt;/strong&gt; Implementation of widely adopted or cutting-edge AI models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Interest:&lt;/strong&gt; Efficient, scalable, or uniquely creative use of Nosana’s capabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diversity of Models:&lt;/strong&gt; Varied implementations including LLMs, GANs, Stable Diffusion, analytics, and other inferencing models.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prizes
&lt;/h2&gt;

&lt;p&gt;We’re awarding the &lt;strong&gt;top 10 submissions&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🥇 1st: $1,000 USDC&lt;/li&gt;
&lt;li&gt;🥈 2nd: $750 USDC&lt;/li&gt;
&lt;li&gt;🥉 3rd: $500 USDC&lt;/li&gt;
&lt;li&gt;🏅 4th: $250 USDC&lt;/li&gt;
&lt;li&gt;🔟 5th–10th: $100 USDC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All prizes are paid out directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tutorial &amp;amp; Resources
&lt;/h2&gt;

&lt;p&gt;For a comprehensive tutorial and additional insights into how Nosana works, how to run models, and best practices, visit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.nosana.io/tutorials/llm/deepseek.html" rel="noopener noreferrer"&gt;Nosana Full Tutorial&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://earn.superteam.fun/listing/nosana-builders-challenge" rel="noopener noreferrer"&gt;Builders Challenge Page&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Don’t Miss Nosana Builder Challenge Updates
&lt;/h3&gt;




&lt;p&gt;Join our &lt;a href="https://nosana.com/discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;, where we have a dedicated &lt;a href="https://discord.com/channels/236263424676331521/1354391113028337664" rel="noopener noreferrer"&gt;Builders Challenge dev chat&lt;/a&gt; for technical support and information.&lt;/p&gt;

&lt;p&gt;Join our &lt;a href="https://nosana.com/telegram" rel="noopener noreferrer"&gt;Telegram&lt;/a&gt; or follow us on &lt;a href="https://nosana.com/x" rel="noopener noreferrer"&gt;X&lt;/a&gt; for the latest Nosana and NOS announcements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy Building!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>web3</category>
      <category>ai</category>
      <category>hackathon</category>
      <category>gpu</category>
    </item>
  </channel>
</rss>
