<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joel Etse</title>
    <description>The latest articles on DEV Community by Joel Etse (@joelhuman).</description>
    <link>https://dev.to/joelhuman</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2544358%2Ff445b6c0-eb9d-4106-a7e3-015de442dab7.jpeg</url>
      <title>DEV Community: Joel Etse</title>
      <link>https://dev.to/joelhuman</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joelhuman"/>
    <language>en</language>
    <item>
      <title>AI Model Monitoring in Production</title>
      <dc:creator>Joel Etse</dc:creator>
      <pubDate>Thu, 12 Dec 2024 19:19:50 +0000</pubDate>
      <link>https://dev.to/joelhuman/ai-models-monitoring-in-production-3nbd</link>
      <guid>https://dev.to/joelhuman/ai-models-monitoring-in-production-3nbd</guid>
      <description>&lt;p&gt;In the rapidly evolving world of artificial intelligence, understanding and optimizing your AI model usage has never been more critical. Humiris is stepping up to the challenge with an innovative dashboard that provides comprehensive insights into AI model performance, costs, and environmental impact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost Transparency at Your Fingertips&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l2bu69505l1xegy0nw7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l2bu69505l1xegy0nw7.png" alt="Image description" width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Humiris Monitoring offers granular cost tracking for AI models. Users can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;View spending over custom time periods and savings with MoAI&lt;/li&gt;
&lt;li&gt;See detailed token usage across different AI models&lt;/li&gt;
&lt;li&gt;Analyze costs by model provider (OpenAI, Anthropic, Google, and more)&lt;/li&gt;
&lt;/ul&gt;
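
&lt;p&gt;As a rough sketch of how such per-model cost tracking works, multiply each model's token count by a per-token rate. The model names and rates below are hypothetical placeholders, not Humiris's dashboard logic or any provider's actual pricing:&lt;/p&gt;

```python
# Hypothetical per-1K-token rates in USD; real provider pricing differs.
RATES_PER_1K = {
    "openai/gpt-4o": 0.0050,
    "anthropic/claude-3-5-sonnet": 0.0030,
    "google/gemini-1.5-flash": 0.0003,
}

def usage_cost(usage):
    """Total cost of [(model, tokens), ...] usage entries."""
    return sum(RATES_PER_1K[model] * tokens / 1000 for model, tokens in usage)

# Spend for a sample period, broken down by provider-qualified model name.
total = usage_cost([("openai/gpt-4o", 12_000), ("google/gemini-1.5-flash", 50_000)])
```

&lt;p&gt;Grouping the same entries by provider prefix yields the per-provider breakdown shown in the dashboard.&lt;/p&gt;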

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffuwcja93spz00tqkgzjj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffuwcja93spz00tqkgzjj.png" alt="Image description" width="800" height="562"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A colorful visual breakdown helps users quickly understand their AI spending, with each model provider represented by a distinct color.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Environmental Impact Tracking&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnurd322sis6dkeoz0rsk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnurd322sis6dkeoz0rsk.png" alt="Image description" width="800" height="549"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Beyond financial metrics, the dashboard introduces a groundbreaking feature: carbon emission tracking.&lt;br&gt;
Users can now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitor total carbon emissions from AI model usage and carbon savings with MoAI&lt;/li&gt;
&lt;li&gt;Compare emissions across different AI models&lt;/li&gt;
&lt;li&gt;Understand the environmental savings achieved by using Humiris's model optimization&lt;/li&gt;
&lt;/ul&gt;
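
&lt;p&gt;The carbon-savings figure can be pictured as the emission difference between a baseline model and the model actually routed to. The emission factors below are illustrative placeholders, not measured or official figures:&lt;/p&gt;

```python
# Illustrative emission factors in grams CO2e per 1K tokens (placeholder values).
EMISSIONS_G_PER_1K = {"gpt-4o": 4.0, "llama-3.1-8b": 0.5}

def carbon_saved_g(baseline, routed, tokens):
    """Grams of CO2e saved by serving `tokens` with `routed` instead of `baseline`."""
    return (EMISSIONS_G_PER_1K[baseline] - EMISSIONS_G_PER_1K[routed]) * tokens / 1000

carbon_saved_g("gpt-4o", "llama-3.1-8b", 10_000)  # 35.0 g saved
```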

&lt;p&gt;&lt;strong&gt;Performance Insights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl424asmqym6lsttg46fs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl424asmqym6lsttg46fs.png" alt="Image description" width="800" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The performance dashboard categorizes models into three tiers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ultra-High Performance: top-tier models like GPT-4o and Claude 3.5 Sonnet&lt;/li&gt;
&lt;li&gt;High Performance: powerful models such as Llama 3.1 70B&lt;/li&gt;
&lt;li&gt;Normal Performance: smaller, more economical models like Gemma2-7B&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Speed Optimization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktn509kvgkbcrm0ep9vc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktn509kvgkbcrm0ep9vc.png" alt="Image description" width="800" height="239"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The speed dashboard provides another layer of insight, categorizing model performance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ultra-High Speed: typically small language models (SLMs) and fast inference servers&lt;/li&gt;
&lt;li&gt;High Speed: larger models with rapid processing capabilities&lt;/li&gt;
&lt;li&gt;Normal Speed: standard models like GPT-4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In an era where AI adoption is skyrocketing, tools like Humiris Monitoring are invaluable. They empower organizations to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make data-driven decisions about AI model selection&lt;/li&gt;
&lt;li&gt;Optimize costs and performance&lt;/li&gt;
&lt;li&gt;Reduce environmental impact&lt;/li&gt;
&lt;li&gt;Gain transparent insights into AI infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Humiris isn't just a monitoring tool; it represents a philosophy. As AI becomes more pervasive, understanding its nuanced implications becomes crucial.&lt;br&gt;
By providing granular insights into performance, cost, and environmental impact, Humiris is helping organizations move from passive AI consumers to strategic, responsible innovators.&lt;br&gt;
The era of black-box AI is over. Welcome to intelligent, transparent computing.&lt;/p&gt;

&lt;p&gt;Stay tuned for more updates from Humiris's launch week!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Introducing Humiris MoAI Basic: A New Way to Build Hybrid AI Models</title>
      <dc:creator>Joel Etse</dc:creator>
      <pubDate>Tue, 10 Dec 2024 06:12:01 +0000</pubDate>
      <link>https://dev.to/joelhuman/introducing-humiris-moai-basic-a-new-way-to-build-hybrid-ai-models-10hg</link>
      <guid>https://dev.to/joelhuman/introducing-humiris-moai-basic-a-new-way-to-build-hybrid-ai-models-10hg</guid>
      <description>&lt;p&gt;Today, we’re excited to introduce Humiris MoAI Basic, an AI infrastructure designed to help AI engineers and developers seamlessly mix multiple LLMs into tailored, high-performance AI solutions. With MoAI Basic, you’re not constrained to a single model’s strengths or weaknesses. &lt;br&gt;
Instead, you can tune your AI by mixing models that excel in speed, cost-efficiency, quality, sustainability, or data privacy, enabling you to create a uniquely optimized model for your organization’s needs.&lt;/p&gt;

&lt;p&gt;Modern AI applications often face complex and shifting requirements. Some projects demand near-instant responses at scale, while others need to adhere to strict data compliance laws or curb computational overhead for environmental responsibility. &lt;br&gt;
Traditional single-model approaches often force trade-offs, but MoAI Basic changes the equation. By blending and balancing multiple LLMs, you have the freedom to align your model configurations directly with your evolving objectives, all without getting locked into a single provider or architectural limitation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why MoAI Basic?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Existing LLMs are powerful but come with trade-offs. High-end models deliver remarkable depth but can be expensive and slower, while lightweight, open-source models offer speed and affordability at the expense of sophistication. MoAI Basic bridges these gaps by orchestrating a diverse set of models behind the scenes. &lt;br&gt;
It selects the right combination at the right moment, optimizing for your chosen criteria without locking you into a single model’s limitations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9g69d6d3m6jul9kxq2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9g69d6d3m6jul9kxq2w.png" alt="Image description" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At its core is a “gating model,” a specialized AI model trained to evaluate each incoming query and decide which LLMs to involve. For example, a complex research request might tap into a more advanced model, while a quick, routine query might lean on a cost-efficient one. Over time, this system refines its approach based on real-world performance data, making your AI experience progressively more aligned with your goals.&lt;/p&gt;

&lt;p&gt;When a query is received, the gating model begins by analyzing its characteristics to understand its requirements. This process involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intent Recognition: identifying the type of task (e.g., creative writing, technical analysis, summarization)&lt;/li&gt;
&lt;li&gt;Complexity Assessment: determining how complex the query is and whether it requires deep reasoning or factual precision&lt;/li&gt;
&lt;li&gt;Domain Identification: understanding the subject matter to ensure the query is routed to a model with expertise in that field&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;br&gt;
A query like “What is the capital of France?” is classified as simple factual retrieval.&lt;br&gt;
A query like “Analyze the economic implications of AI adoption on labor markets.” is marked as complex and multidisciplinary.&lt;/p&gt;
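
&lt;p&gt;A toy stand-in for this analysis step might use keyword heuristics where the real gating model uses a trained classifier. The rules below are invented purely for illustration:&lt;/p&gt;

```python
def classify_query(query):
    """Toy gating analysis: heuristics standing in for a trained gating model."""
    q = query.lower()
    if "analyze" in q or "implications" in q:
        intent = "analysis"
    elif "summarize" in q:
        intent = "summarization"
    else:
        intent = "factual"
    # Analysis tasks and long queries are treated as complex.
    complexity = "complex" if intent == "analysis" or len(q.split()) > 12 else "simple"
    return {"intent": intent, "complexity": complexity}

classify_query("What is the capital of France?")
# -> {'intent': 'factual', 'complexity': 'simple'}
classify_query("Analyze the economic implications of AI adoption on labor markets.")
# -> {'intent': 'analysis', 'complexity': 'complex'}
```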

&lt;p&gt;&lt;strong&gt;Mix-Tuning: Customizing Model Behavior with Mix-Instruction Parameters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9e5ky3lcgi021cjp0ph3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9e5ky3lcgi021cjp0ph3.png" alt="Image description" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Mix-Tuning (or mix instructions) in MoAI Basic allows users to define how the gating model selects and orchestrates models based on their specific goals. This feature empowers the gating model to prioritize and balance parameters such as cost, speed, quality, privacy, and environmental impact.&lt;/p&gt;

&lt;p&gt;Through mix instructions, users can fine-tune how queries are processed, ensuring that the system adapts to both the complexity of the task and the operational priorities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Parameters for Mix-Tuning&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost Optimization &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Objective: Minimize expenses while maintaining acceptable response quality.&lt;br&gt;
Use Case: Applications with budget constraints or large-scale deployments.&lt;br&gt;
Behavior: Simple queries are routed to lightweight, cost-efficient models; complex queries may involve higher-cost models, traded off against quality thresholds.&lt;/p&gt;

&lt;p&gt;Example Instruction:&lt;br&gt;
"Minimize cost by 50% while keeping 70% response quality."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Objective: Achieve the highest-quality, most accurate responses.&lt;br&gt;
Use Case: Research, critical decision-making, or high-stakes applications.&lt;br&gt;
Behavior: Prioritizes high-performance models, regardless of cost or speed, and aggregates responses from multiple models to ensure depth and precision.&lt;/p&gt;

&lt;p&gt;Example Instruction:&lt;br&gt;
"Optimize for 90% performance, regardless of cost."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Speed &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Objective: Minimize latency for time-sensitive tasks.&lt;br&gt;
Use Case: Real-time applications such as customer support or emergency systems.&lt;br&gt;
Behavior: Routes queries to the fastest models, even at the expense of quality or cost, and limits the involvement of high-latency models.&lt;/p&gt;

&lt;p&gt;Example Instruction:&lt;br&gt;
"Maximize speed to 80%, even if it sacrifices 20% performance."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Privacy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Objective: Ensure secure handling of sensitive data.&lt;br&gt;
Use Case: Healthcare, finance, and confidential data processing.&lt;br&gt;
Behavior: Utilizes secure, open-source models or private servers and excludes external APIs for privacy-critical queries.&lt;/p&gt;

&lt;p&gt;Example Instruction:&lt;br&gt;
"Guarantee 100% privacy, even if speed and cost are compromised."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environmental Impact &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Objective: Reduce energy consumption and carbon footprint.&lt;br&gt;
Use Case: Green AI initiatives or sustainability-focused organizations.&lt;br&gt;
Behavior: Prefers energy-efficient models and infrastructure and avoids models with a high computational load.&lt;/p&gt;

&lt;p&gt;Example Instruction:&lt;br&gt;
"Reduce carbon footprint by 70% while maintaining 60% performance."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customizable Mix-Instructions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Simple Mix-Instructions: single-parameter directives that focus on one priority.&lt;br&gt;
"Minimize cost by 50%."&lt;br&gt;
"Ensure responses within 100 milliseconds."&lt;br&gt;
"Optimize for performance at 85% quality."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Compound Mix-Instructions: Complex directives that balance multiple parameters.&lt;br&gt;
"Optimize for 60% speed and 70% privacy."&lt;br&gt;
"Minimize cost by 50% while maintaining 80% performance."&lt;br&gt;
"Ensure 90% privacy and 70% speed, even at increased costs."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
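
&lt;p&gt;One plausible way to represent such an instruction in code is as a dictionary of targets that the gating model checks candidate routing plans against. The schema and field names here are assumptions for illustration, not Humiris's actual API:&lt;/p&gt;

```python
# Hypothetical encoding of "Minimize cost by 50% while maintaining 80% performance."
mix_instruction = {"min_cost_reduction": 0.50, "min_performance": 0.80}

def satisfies(plan, instruction):
    """True if a candidate routing plan meets every target in the instruction."""
    return (plan["cost_reduction"] >= instruction["min_cost_reduction"]
            and plan["performance"] >= instruction["min_performance"])

satisfies({"cost_reduction": 0.55, "performance": 0.82}, mix_instruction)  # True
satisfies({"cost_reduction": 0.30, "performance": 0.95}, mix_instruction)  # False
```

&lt;p&gt;A compound instruction simply adds more targets to the dictionary; the gating model then searches for a plan that satisfies all of them.&lt;/p&gt;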

&lt;p&gt;&lt;strong&gt;Examples of Mix-Tuning in Action&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scenario 1: Speed-Centric Query&lt;br&gt;
Mix Instruction: "Maximize speed at 80%, allow up to 20% quality reduction."&lt;br&gt;
Gating System Action:&lt;br&gt;
Selects fast models like &lt;strong&gt;Llama 3.1 8B.&lt;/strong&gt;&lt;br&gt;
Avoids slower, high-quality models like &lt;strong&gt;Claude 3.5 Sonnet&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Scenario 2: Privacy-First Query&lt;br&gt;
Mix Instruction: "Ensure 100% privacy with 60% performance."&lt;br&gt;
Gating System Action:&lt;br&gt;
Routes queries to secure, open-source models like &lt;strong&gt;Gemma 2B&lt;/strong&gt; on &lt;strong&gt;private infrastructure&lt;/strong&gt;.&lt;br&gt;
Excludes external APIs or commercial closed models.&lt;/p&gt;
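
&lt;p&gt;The exclusion step in this scenario can be pictured as filtering a model registry by hosting type. The registry below is a made-up example, not an actual Humiris catalog:&lt;/p&gt;

```python
# Hypothetical registry recording where each model runs.
MODELS = {
    "gpt-4o": {"hosting": "external-api"},
    "claude-3.5-sonnet": {"hosting": "external-api"},
    "gemma-2b": {"hosting": "private"},
}

def eligible_models(privacy_critical):
    """For privacy-critical queries, keep only privately hosted models."""
    return [name for name, meta in MODELS.items()
            if not privacy_critical or meta["hosting"] == "private"]

eligible_models(privacy_critical=True)   # ['gemma-2b']
eligible_models(privacy_critical=False)  # all three models remain eligible
```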

&lt;p&gt;Scenario 3: Balanced Optimization&lt;br&gt;
Mix Instruction: "Reduce costs by 40%, improve speed by 60%, and maintain 70% quality."&lt;br&gt;
Gating System Action:&lt;br&gt;
Combines a lightweight proposer model (e.g., Llama 3.1 8B) with a high-quality aggregator (e.g., Claude 3.5 Sonnet).&lt;br&gt;
Dynamically adjusts resource allocation to achieve the balance.&lt;/p&gt;
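
&lt;p&gt;The proposer/aggregator combination in Scenario 3 can be sketched with a stubbed model call standing in for real provider APIs; the model names and the two-step flow are assumptions for illustration:&lt;/p&gt;

```python
def call_model(model, prompt):
    """Stub; a real system would call the provider's API here."""
    return f"[{model}] response to: {prompt}"

def answer(prompt):
    # A cheap, fast proposer drafts; a high-quality aggregator refines the draft.
    draft = call_model("llama-3.1-8b", prompt)
    return call_model("claude-3.5-sonnet", f"Refine this draft: {draft}")

answer("Compare the two pricing plans.")
```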

&lt;p&gt;&lt;strong&gt;Real-World Applications&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effective AI for Enterprises&lt;br&gt;
A customer support platform uses MoAI Basic to handle common queries with lightweight models, reducing operational costs while reserving powerful models for complex issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Real-Time Decision-Making&lt;br&gt;
In financial trading, MoAI Basic leverages fast models for instant responses, ensuring latency doesn’t impact profitability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Privacy-First Healthcare Solutions&lt;br&gt;
A telemedicine provider routes patient data exclusively to secure, open-source models, ensuring compliance with strict privacy regulations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Green AI Initiatives&lt;br&gt;
MoAI Basic powers applications that minimize energy usage, contributing to corporate sustainability goals.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8spgeodzb4tcbpbmna7f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8spgeodzb4tcbpbmna7f.png" alt="Image description" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Looking Ahead: MoAI Advanced&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For organizations with even more demanding needs, MoAI Advanced takes the concept further. It enables collaborative interactions between multiple LLMs for highly nuanced outputs. With features like parallel processing, sequential thought chains, and iterative refinement, MoAI Advanced opens new horizons in AI capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Join the Revolution&lt;/strong&gt;&lt;br&gt;
With MoAI Basic, Humiris is democratizing access to customizable, efficient, and sustainable AI. Whether you’re a startup looking to optimize costs or an enterprise aiming for cutting-edge performance, MoAI Basic is your gateway to the next generation of AI solutions.&lt;/p&gt;

&lt;p&gt;Learn more about how you can harness the power of MoAI Basic and redefine what’s possible with AI at &lt;a href="https://www.humiris.ai/routing" rel="noopener noreferrer"&gt;humiris.ai&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/CIkjijkQkcM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>llm</category>
      <category>routing</category>
      <category>aimodel</category>
      <category>humiris</category>
    </item>
  </channel>
</rss>
