<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: tiramisu-framework</title>
    <description>The latest articles on DEV Community by tiramisu-framework (@tiramisuframework).</description>
    <link>https://dev.to/tiramisuframework</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3586956%2Fb8052fa7-f84a-4bb9-8a20-fc629725f4ef.png</url>
      <title>DEV Community: tiramisu-framework</title>
      <link>https://dev.to/tiramisuframework</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tiramisuframework"/>
    <language>en</language>
    <item>
      <title>Tiramisu 3.0: From Response Generation to Decision Governance</title>
      <dc:creator>tiramisu-framework</dc:creator>
      <pubDate>Sat, 20 Dec 2025 06:04:34 +0000</pubDate>
      <link>https://dev.to/tiramisuframework/tiramisu-30-from-response-generation-to-decision-governance-2goo</link>
      <guid>https://dev.to/tiramisuframework/tiramisu-30-from-response-generation-to-decision-governance-2goo</guid>
<description>

&lt;p&gt;Two months ago, I published Tiramisu Framework v2.0 — a multi-agent RAO system with 100% routing accuracy.&lt;br&gt;
Today, I'm releasing v3.0 — and it changes everything about how we think about AI systems.&lt;br&gt;
The shift: We stopped improving how AI responds and started governing how AI decides.&lt;/p&gt;

&lt;p&gt;🎯 TL;DR&lt;/p&gt;

&lt;h1&gt;What Tiramisu 3.0 does differently&lt;/h1&gt;

&lt;p&gt;✓ Governs decisions BEFORE generating responses&lt;br&gt;
✓ 3 personas collaborate (not compete)&lt;br&gt;
✓ Validates sufficiency, not just capability&lt;br&gt;
✓ Output = traceable plan, not loose text&lt;/p&gt;

&lt;h1&gt;Architecture&lt;/h1&gt;

&lt;p&gt;Query → Validation → Analysis → Plan → Result&lt;br&gt;
         (RAO-4)      (RAO-5)   (RAO-6)&lt;/p&gt;

&lt;h1&gt;Install&lt;/h1&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install tiramisu-framework==3.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;📊 The Problem: Generation Without Governance&lt;br&gt;
Most AI frameworks focus on one thing: generating better responses.&lt;br&gt;
Better prompts. Better models. Better retrieval. Better context.&lt;br&gt;
But they skip a fundamental question:&lt;/p&gt;

&lt;p&gt;Should the system respond at all? And if yes, how should it decide what to say?&lt;/p&gt;

&lt;p&gt;Traditional frameworks:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Step&lt;/th&gt;&lt;th&gt;What Happens&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;Receive query&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;Retrieve context&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;Generate response&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;Return to user&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The problem? No governance. The system assumes it should always respond, with whatever data it has.&lt;br&gt;
This works for chatbots. It fails for systems where decisions matter.&lt;/p&gt;

&lt;p&gt;🏗️ Tiramisu 3.0: Governance First&lt;br&gt;
Tiramisu 3.0 introduces a different architecture:&lt;br&gt;
USER QUERY&lt;br&gt;
    │&lt;br&gt;
    ▼&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│    RAO-4: COLLABORATIVE         │&lt;br&gt;
│         VALIDATION              │&lt;br&gt;
│                                 │&lt;br&gt;
│  "Do we have enough data for    │&lt;br&gt;
│   THIS type of problem?"        │&lt;br&gt;
│                                 │&lt;br&gt;
│  Decision: PROCEED or BLOCK     │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
    │&lt;br&gt;
    ▼&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│    RAO-5: COLLABORATIVE         │&lt;br&gt;
│         ANALYSIS                │&lt;br&gt;
│                                 │&lt;br&gt;
│  Router selects LEADER          │&lt;br&gt;
│  Others provide SUPPORT         │&lt;br&gt;
│                                 │&lt;br&gt;
│  Output: Structured analysis    │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
    │&lt;br&gt;
    ▼&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│    RAO-6: COLLABORATIVE         │&lt;br&gt;
│           PLAN                  │&lt;br&gt;
│                                 │&lt;br&gt;
│  Each component → 1 action      │&lt;br&gt;
│  System prioritizes             │&lt;br&gt;
│                                 │&lt;br&gt;
│  Output: Traceable plan         │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
    │&lt;br&gt;
    ▼&lt;br&gt;
GOVERNED RESULT&lt;br&gt;
The key insight: Before any analysis or generation, the system validates sufficiency.&lt;br&gt;
Not "can I respond?" but "do I have the right data for this specific type of problem?"&lt;/p&gt;
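&lt;p&gt;In code, the flow above can be sketched as a pipeline that short-circuits when validation blocks. All names below are illustrative stand-ins, not Tiramisu's actual API.&lt;/p&gt;

```python
# Illustrative sketch of a governance-first pipeline: validate, then
# analyze, then plan. Every name here is hypothetical, not the real API.

def validate_sufficiency(query, context):
    # Toy RAO-4 rule: block when there is no context at all,
    # otherwise approve and report which checks are unmet.
    gaps = [k for k in ("competitors", "pricing", "metrics") if k not in context]
    decision = "BLOCKED" if not context else "APPROVED"
    return {"decision": decision, "gaps": gaps}

def collaborative_analysis(query, validation):
    # Toy RAO-5: a fixed leader with the other personas in support.
    return {"leader": "K", "support": ["M", "G"], "gaps": validation["gaps"]}

def build_plan(analysis):
    # Toy RAO-6: one prioritized action per persona.
    personas = [analysis["leader"]] + analysis["support"]
    return [{"owner": p, "priority": i + 1} for i, p in enumerate(personas)]

def govern(query, context):
    validation = validate_sufficiency(query, context)   # RAO-4
    if validation["decision"] == "BLOCKED":
        return {"status": "blocked", "gaps": validation["gaps"]}
    analysis = collaborative_analysis(query, validation)  # RAO-5
    plan = build_plan(analysis)                           # RAO-6
    return {"status": "governed", "plan": plan}
```

&lt;p&gt;The important property is the early return: nothing downstream of RAO-4 runs when validation blocks.&lt;/p&gt;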

&lt;p&gt;🔄 What Changed from v2.0 to v3.0&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Aspect&lt;/th&gt;&lt;th&gt;v2.0&lt;/th&gt;&lt;th&gt;v3.0&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Focus&lt;/td&gt;&lt;td&gt;Routing accuracy&lt;/td&gt;&lt;td&gt;Decision governance&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Architecture&lt;/td&gt;&lt;td&gt;Supervisor + Agents&lt;/td&gt;&lt;td&gt;Collaborative RAO levels&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Validation&lt;/td&gt;&lt;td&gt;After retrieval&lt;/td&gt;&lt;td&gt;Before analysis&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Output&lt;/td&gt;&lt;td&gt;Response&lt;/td&gt;&lt;td&gt;Structured plan&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Traceability&lt;/td&gt;&lt;td&gt;Partial&lt;/td&gt;&lt;td&gt;Complete&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;v2.0 asked: "Which agent should handle this?"&lt;br&gt;
v3.0 asks: "Should we proceed? With what confidence? Using which approach?"&lt;/p&gt;

&lt;p&gt;🎭 Collaborative Personas&lt;br&gt;
Tiramisu 3.0 uses three specialized personas that collaborate at each level:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Persona&lt;/th&gt;&lt;th&gt;Focus&lt;/th&gt;&lt;th&gt;Role&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;K&lt;/td&gt;&lt;td&gt;Strategy&lt;/td&gt;&lt;td&gt;Positioning, fundamentals, segmentation&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;M&lt;/td&gt;&lt;td&gt;Channels&lt;/td&gt;&lt;td&gt;Digital presence, metrics, technology&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;G&lt;/td&gt;&lt;td&gt;Execution&lt;/td&gt;&lt;td&gt;Content, speed, practical action&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Important: These personas don't debate freely. Each has a fixed role per RAO level:&lt;/p&gt;

&lt;p&gt;RAO-4: Each validates its own area&lt;br&gt;
RAO-5: One leads, others support&lt;br&gt;
RAO-6: Each contributes one action&lt;/p&gt;

&lt;p&gt;This structured collaboration prevents the chaos of open-ended multi-agent debates.&lt;/p&gt;
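&lt;p&gt;One way to encode "fixed role per level" is a static lookup the orchestrator consults. This is a sketch of the idea; the framework's internal representation may differ.&lt;/p&gt;

```python
# Hypothetical role table: each persona has one fixed duty per RAO level,
# so collaboration stays structured instead of turning into open debate.
ROLES = {
    "RAO-4": {"K": "validate_strategy", "M": "validate_channels", "G": "validate_execution"},
    "RAO-5": {"K": "lead_or_support", "M": "lead_or_support", "G": "lead_or_support"},
    "RAO-6": {"K": "contribute_action", "M": "contribute_action", "G": "contribute_action"},
}

def role_of(persona, level):
    # Raises KeyError for unknown personas/levels, which is the point:
    # there is no undefined behavior outside the table.
    return ROLES[level][persona]
```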

&lt;p&gt;🚦 RAO-4: Sufficiency Validation&lt;br&gt;
The first level doesn't ask "can we help?" — it asks "do we have enough?"&lt;br&gt;
Query: "My product isn't selling well"&lt;/p&gt;

&lt;p&gt;RAO-4 Validation:&lt;br&gt;
┌────────────────────────────────────────┐&lt;br&gt;
│ Persona K: Checking strategy data...   │&lt;br&gt;
│   ⚠️ Missing: competitor analysis      │&lt;br&gt;
│   ⚠️ Missing: price positioning        │&lt;br&gt;
│   ✓ Has: target market                 │&lt;br&gt;
│                                        │&lt;br&gt;
│ Persona M: Checking channel data...    │&lt;br&gt;
│   ⚠️ Missing: current metrics          │&lt;br&gt;
│   ✓ Has: channel preferences           │&lt;br&gt;
│                                        │&lt;br&gt;
│ Persona G: Checking execution data...  │&lt;br&gt;
│   ✓ Has: product description           │&lt;br&gt;
│   ✓ Has: brand voice                   │&lt;br&gt;
└────────────────────────────────────────┘&lt;/p&gt;

&lt;p&gt;Decision: APPROVED (medium confidence)&lt;br&gt;
Gaps identified: 13&lt;br&gt;
Recommendation: Proceed with caveats&lt;/p&gt;

&lt;p&gt;Confidence levels:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Level&lt;/th&gt;&lt;th&gt;Meaning&lt;/th&gt;&lt;th&gt;Action&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;HIGH&lt;/td&gt;&lt;td&gt;Sufficient data&lt;/td&gt;&lt;td&gt;Proceed normally&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;MEDIUM&lt;/td&gt;&lt;td&gt;Partial data&lt;/td&gt;&lt;td&gt;Proceed with caveats&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;VERIFY&lt;/td&gt;&lt;td&gt;Insufficient&lt;/td&gt;&lt;td&gt;Request more data&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;BLOCKED&lt;/td&gt;&lt;td&gt;Cannot proceed&lt;/td&gt;&lt;td&gt;Stop execution&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
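&lt;p&gt;The table maps naturally onto a small decision function. The gap-ratio thresholds below are my own assumption, chosen only to illustrate the four levels.&lt;/p&gt;

```python
# Sketch of the confidence table as code. The gap-ratio thresholds are
# assumptions; the article does not specify the framework's real heuristic.
ACTIONS = {
    "HIGH": "proceed normally",
    "MEDIUM": "proceed with caveats",
    "VERIFY": "request more data",
    "BLOCKED": "stop execution",
}

def confidence_from_gaps(gap_count, total_checks):
    # Fewer unmet checks means higher confidence.
    ratio = gap_count / total_checks
    if ratio == 0:
        return "HIGH"
    if ratio > 0.75:
        return "BLOCKED"
    if ratio > 0.5:
        return "VERIFY"
    return "MEDIUM"
```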

&lt;p&gt;🎯 RAO-5: Leader Selection&lt;br&gt;
After validation, the system selects a leader based on query type:&lt;br&gt;
Query analyzed: "My product isn't selling well"&lt;/p&gt;

&lt;p&gt;Routing:&lt;br&gt;
┌────────────────────────────────────────┐&lt;br&gt;
│ Method: keywords                       │&lt;br&gt;
│ Detected: sales, product, positioning  │&lt;br&gt;
│                                        │&lt;br&gt;
│ Decision:                              │&lt;br&gt;
│   LEADER: Persona K (strategy focus)   │&lt;br&gt;
│   SUPPORT: Persona M, Persona G        │&lt;br&gt;
└────────────────────────────────────────┘&lt;br&gt;
Cascading router:&lt;/p&gt;

&lt;p&gt;Keywords — Fast pattern matching&lt;br&gt;
Embeddings — Semantic similarity (if keywords fail)&lt;br&gt;
Fallback — Default assignment&lt;/p&gt;

&lt;p&gt;The leader drives the analysis. Supporters add perspective without taking over.&lt;/p&gt;
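&lt;p&gt;The cascade can be sketched as a chain that falls through to the next strategy when the previous one fails. The keyword lists and the default leader below are illustrative assumptions, not the framework's actual configuration.&lt;/p&gt;

```python
# Illustrative cascading router: keywords first, then an optional
# embedding-based router, then a default. Keyword lists are assumptions.
KEYWORDS = {
    "K": ["sales", "product", "positioning", "strategy"],
    "M": ["channel", "metrics", "digital", "seo"],
    "G": ["content", "copy", "launch", "speed"],
}

def route_by_keywords(query):
    # 1. Fast pattern matching over whitespace tokens.
    words = query.lower().split()
    hits = {p: sum(w in words for w in kws) for p, kws in KEYWORDS.items()}
    leader = max(hits, key=hits.get)
    return leader if hits[leader] > 0 else None

def route(query, embed_router=None):
    leader = route_by_keywords(query)
    if leader is None and embed_router is not None:
        leader = embed_router(query)       # 2. Semantic similarity fallback
    if leader is None:
        leader = "K"                       # 3. Default assignment
    return {"leader": leader, "support": [p for p in KEYWORDS if p != leader]}
```

&lt;p&gt;On the running example, "My product is not selling well" hits the strategy keyword list, so K leads and M and G support.&lt;/p&gt;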

&lt;p&gt;📋 RAO-6: Structured Plan&lt;br&gt;
The final level produces a traceable plan, not loose text:&lt;br&gt;
RAO-6 Output:&lt;br&gt;
┌────────────────────────────────────────┐&lt;br&gt;
│ PRIORITIZED ACTION PLAN                │&lt;br&gt;
│                                        │&lt;br&gt;
│ P1: Define strategic positioning       │&lt;br&gt;
│     Owner: K | Timeline: 30 days       │&lt;br&gt;
│                                        │&lt;br&gt;
│ P2: Activate priority channels         │&lt;br&gt;
│     Owner: M | Timeline: 14 days       │&lt;br&gt;
│                                        │&lt;br&gt;
│ P3: Create authentic content           │&lt;br&gt;
│     Owner: G | Timeline: 7 days        │&lt;br&gt;
│                                        │&lt;br&gt;
│ Quality Score: 100%                    │&lt;br&gt;
│ Actions: 3 | Priorities: [1, 2, 3]     │&lt;br&gt;
└────────────────────────────────────────┘&lt;br&gt;
Why this matters:&lt;/p&gt;

&lt;p&gt;Each action has an owner (traceable)&lt;br&gt;
Each action has a timeline (accountable)&lt;br&gt;
The plan has a quality score (measurable)&lt;/p&gt;
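&lt;p&gt;A contractual plan item is easy to model as a typed record. This sketch mirrors the shape shown above (owner, timeline, quality score) but is not the framework's actual schema.&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Action:
    priority: int
    description: str
    owner: str          # persona accountable for this action
    timeline_days: int  # deadline makes the action measurable

@dataclass
class Plan:
    actions: list

    def quality_score(self):
        # Assumed scoring rule: an action is "complete" when it has an
        # owner and a positive timeline; the score is the percentage complete.
        complete = sum(1 for a in self.actions if a.owner and a.timeline_days > 0)
        return 100 * complete // max(len(self.actions), 1)
```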

&lt;p&gt;💻 Quick Start&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install tiramisu-framework==3.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu import GovernanceOrchestrator

# Initialize
orchestrator = GovernanceOrchestrator()

# Provide context
context = {
    'product': 'artisan coffee',
    'target_market': 'urban professionals'
}

# Execute with governance
result = orchestrator.execute(
    'My product is not selling well',
    context
)

# View governance logs
print(orchestrator.display_logs(result))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TIRAMISU 3.0 - Decision Governance

&amp;gt;&amp;gt; RAO-4 Collaborative Validation
   Confidence: medium | Gaps: 13 | Decision: APPROVED

&amp;gt;&amp;gt; RAO-5 Collaborative Analysis
   Leader: K | Method: keywords | Support: [M, G]

&amp;gt;&amp;gt; RAO-6 Collaborative Plan
   Actions: 3 | Quality: 100% | Priorities: [1, 2, 3]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;📊 Metrics&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Metric&lt;/th&gt;&lt;th&gt;Value&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;New modules&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Lines of code&lt;/td&gt;&lt;td&gt;804&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Personas&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;RAO levels&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Test coverage&lt;/td&gt;&lt;td&gt;100%&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;🎯 Key Innovations&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Governance Before Generation
The system decides if, how, and with whom to respond before generating anything.&lt;/li&gt;
&lt;li&gt;Sufficiency-Based Validation
Doesn't ask "can I respond?" but "do I have enough data for this type of problem?"&lt;/li&gt;
&lt;li&gt;Structured Collaboration
Personas don't chat freely. Each has a fixed role per level. No chaos.&lt;/li&gt;
&lt;li&gt;Complete Traceability
Every decision generates a log. You know why the system decided that way.&lt;/li&gt;
&lt;li&gt;Contractual Output
Result isn't text. It's a structured plan with owners, timelines, and scores.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;✅ When to Use Tiramisu 3.0&lt;/p&gt;

&lt;p&gt;Systems that need to explain decisions&lt;br&gt;
Domains with multiple perspectives&lt;br&gt;
Applications requiring prior validation&lt;br&gt;
Projects that value traceability over speed&lt;/p&gt;

&lt;p&gt;❌ When NOT to Use&lt;/p&gt;

&lt;p&gt;Simple Q&amp;amp;A chatbots&lt;br&gt;
Systems that don't need auditing&lt;br&gt;
Applications where speed matters more than governance&lt;/p&gt;

&lt;p&gt;🚀 What's Next&lt;br&gt;
Short-term:&lt;/p&gt;

&lt;p&gt;Documentation expansion&lt;br&gt;
More routing strategies&lt;br&gt;
Community feedback integration&lt;/p&gt;

&lt;p&gt;Medium-term:&lt;/p&gt;

&lt;p&gt;Advanced routing strategies&lt;br&gt;
Enterprise governance features&lt;br&gt;
Audit trail exports&lt;/p&gt;

&lt;p&gt;Long-term:&lt;/p&gt;

&lt;p&gt;Domain-specific persona templates&lt;/p&gt;

&lt;p&gt;📚 Resources&lt;br&gt;
PyPI: &lt;a href="https://pypi.org/project/tiramisu-framework/" rel="noopener noreferrer"&gt;https://pypi.org/project/tiramisu-framework/&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/tiramisu-framework/tiramisu" rel="noopener noreferrer"&gt;https://github.com/tiramisu-framework/tiramisu&lt;/a&gt;&lt;br&gt;
Previous article (v2.0): &lt;a href="https://dev.to/tiramisuframework/from-rag-to-rao-level-6-how-i-evolved-tiramisu-framework-into-a-multi-agent-system-4ebh"&gt;https://dev.to/tiramisuframework/from-rag-to-rao-level-6-how-i-evolved-tiramisu-framework-into-a-multi-agent-system-4ebh&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🎯 Key Takeaway&lt;/p&gt;

&lt;p&gt;"We don't innovate in generation. We innovate in decision governance."&lt;/p&gt;

&lt;p&gt;The future of AI systems isn't about generating better responses.&lt;br&gt;
It's about governing better decisions.&lt;br&gt;
Tiramisu 3.0 is a step in that direction.&lt;/p&gt;

&lt;p&gt;Questions? Feedback? Drop a comment below.&lt;br&gt;
What's your biggest challenge with AI decision-making? 👇&lt;/p&gt;

&lt;p&gt;#AI #Python #OpenSource #MachineLearning #DecisionGovernance&lt;/p&gt;

</description>
      <category>agents</category>
      <category>architecture</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>Building a 95% Precision Offline RAG System: Multi-Query Rewriting and Named Entity Disambiguation</title>
      <dc:creator>tiramisu-framework</dc:creator>
      <pubDate>Tue, 02 Dec 2025 09:53:18 +0000</pubDate>
      <link>https://dev.to/tiramisuframework/building-a-95-precision-offline-2dbk</link>
      <guid>https://dev.to/tiramisuframework/building-a-95-precision-offline-2dbk</guid>
      <description>&lt;p&gt;RAG System: Multi-Query Rewriting and Named Entity Disambiguation&lt;/p&gt;

&lt;p&gt;Three weeks after publishing Tiramisu Framework v2.0 (a multi-agent RAO system), I built Efrat 2.0 — an offline RAG system that achieves 95% precision with advanced retrieval techniques.&lt;br&gt;
Real metrics from production tests:&lt;/p&gt;

&lt;p&gt;95% precision (near-perfect accuracy)&lt;br&gt;
+750% recall improvement (finds 7.5x more relevant results)&lt;br&gt;
+312% overall score improvement&lt;br&gt;
Zero false positives in person searches&lt;br&gt;
100% offline (no API costs, full data privacy)&lt;/p&gt;

&lt;p&gt;This article breaks down exactly how I did it.&lt;/p&gt;

&lt;p&gt;🎯 TL;DR&lt;/p&gt;

&lt;p&gt;Core innovations:&lt;br&gt;
✓ Multi-query rewriting (+750% recall)&lt;br&gt;
✓ 7-criteria re-ranking with named entity disambiguation&lt;br&gt;
✓ Adaptive hybrid search (dynamic FAISS/BM25 weighting)&lt;br&gt;
✓ Automatic confidence classification&lt;br&gt;
✓ 100% offline (FAISS + BM25 + Ollama)&lt;/p&gt;

&lt;h1&gt;Real metrics&lt;/h1&gt;

&lt;p&gt;✓ 95% precision&lt;br&gt;
✓ 85%+ recall on complex queries&lt;br&gt;
✓ +312% score improvement&lt;br&gt;
✓ Zero API costs&lt;br&gt;
Tech stack: Python, FAISS, Rank-BM25, Ollama, sentence-transformers&lt;br&gt;
GitHub: [coming soon]&lt;/p&gt;

&lt;p&gt;📊 The Problem: Precision vs Recall in RAG&lt;br&gt;
Traditional RAG systems face a fundamental tradeoff:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Approach&lt;/th&gt;&lt;th&gt;Precision&lt;/th&gt;&lt;th&gt;Recall&lt;/th&gt;&lt;th&gt;Problem&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Semantic only (FAISS)&lt;/td&gt;&lt;td&gt;70-80%&lt;/td&gt;&lt;td&gt;60-70%&lt;/td&gt;&lt;td&gt;Misses exact matches&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Keyword only (BM25)&lt;/td&gt;&lt;td&gt;60-70%&lt;/td&gt;&lt;td&gt;50-60%&lt;/td&gt;&lt;td&gt;Misses semantic similarity&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Simple hybrid (50/50)&lt;/td&gt;&lt;td&gt;75-85%&lt;/td&gt;&lt;td&gt;65-75%&lt;/td&gt;&lt;td&gt;Not adaptive to query type&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The challenge: How do you get both high precision AND high recall without manual tuning?&lt;/p&gt;

&lt;p&gt;🏗️ Efrat 2.0 Architecture&lt;br&gt;
USER QUERY: "person name"&lt;br&gt;
        ↓&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│  MULTI-QUERY REWRITING          │&lt;br&gt;
│  Input: "John Smith"            │&lt;br&gt;
│  Output: 6 variations           │&lt;br&gt;
│  • "John Smith"                 │&lt;br&gt;
│  • "J. Smith"                   │&lt;br&gt;
│  • "Smith"                      │&lt;br&gt;
│  • "partner John"               │&lt;br&gt;
│  • etc.                         │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
        ↓&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│  ADAPTIVE HYBRID SEARCH         │&lt;br&gt;
│  α = 0.5 (50% FAISS, 50% BM25)  │&lt;br&gt;
│  Searches ALL 6 queries         │&lt;br&gt;
│  Returns: 34 raw results        │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
        ↓&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│  7-CRITERIA RE-RANKING          │&lt;br&gt;
│  • full_name_bonus: +0.25       │&lt;br&gt;
│  • empty_penalty: -0.25         │&lt;br&gt;
│  • cooccurrence_bonus: +0.10    │&lt;br&gt;
│  • similarity_bonus: +0.15      │&lt;br&gt;
│  • repetition_penalty: -0.10    │&lt;br&gt;
│  • partial_match_bonus: +0.05   │&lt;br&gt;
│  • query_term_bonus: +0.20      │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
        ↓&lt;br&gt;
┌─────────────────────────────────┐&lt;br&gt;
│  CONFIDENCE CLASSIFICATION      │&lt;br&gt;
│  🟢 HIGH (≥0.70): 4 results     │&lt;br&gt;
│  🟡 MEDIUM (0.50-0.70): 2 results│&lt;br&gt;
│  🟠 VERIFY (0.30-0.50): 1       │&lt;br&gt;
│  🔴 DISCARD (&amp;lt;0.30): 27         │&lt;br&gt;
└─────────────────────────────────┘&lt;br&gt;
        ↓&lt;br&gt;
    FINAL RESULTS&lt;/p&gt;

&lt;p&gt;🔄 Innovation #1: Multi-Query Rewriting&lt;br&gt;
Problem: Single queries miss variations&lt;br&gt;
Example: Searching "John Smith" misses documents with:&lt;/p&gt;

&lt;p&gt;"J. Smith" (abbreviated first name)&lt;br&gt;
"Smith" (last name only)&lt;br&gt;
"partner John Smith" (with context)&lt;br&gt;
"son John Smith" (with relationship)&lt;/p&gt;

&lt;p&gt;Solution: Automatically generate query variations&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_query_variations(original_query: str) -&amp;gt; List[str]:
    variations = [original_query]

    if is_person_name(original_query):
        parts = original_query.split()

        if len(parts) == 2:
            first, last = parts
            variations.extend([
                f"{first[0]}. {last}",
                last,
                f"partner {original_query}",
                f"son {original_query}",
                f"president {original_query}"
            ])

    return list(set(variations))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;query = "John Smith"&lt;br&gt;
variations = generate_query_variations(query)&lt;br&gt;
Result:&lt;br&gt;
python[&lt;br&gt;
    "John Smith",&lt;br&gt;
    "J. Smith", &lt;br&gt;
    "Smith",&lt;br&gt;
    "partner John Smith",&lt;br&gt;
    "son John Smith",&lt;br&gt;
    "president John Smith"&lt;br&gt;
]&lt;br&gt;
Impact: +750% recall improvement&lt;/p&gt;
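&lt;p&gt;The snippet relies on an is_person_name helper that isn't shown. A crude heuristic version, purely an assumption on my part, could look like this:&lt;/p&gt;

```python
def is_person_name(query):
    # Crude heuristic (an assumption): two or three capitalized,
    # purely alphabetic words. Real name detection would use NER.
    parts = query.split()
    return len(parts) in (2, 3) and all(
        p.isalpha() and p[0].isupper() for p in parts
    )
```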

&lt;p&gt;⚖️ Innovation #2: Adaptive Hybrid Search&lt;br&gt;
Problem: Fixed FAISS/BM25 weights don't work for all queries&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Query Type&lt;/th&gt;&lt;th&gt;Best Approach&lt;/th&gt;&lt;th&gt;Why&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Person name&lt;/td&gt;&lt;td&gt;50% FAISS, 50% BM25&lt;/td&gt;&lt;td&gt;Need both semantic + exact&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Concept&lt;/td&gt;&lt;td&gt;70% FAISS, 30% BM25&lt;/td&gt;&lt;td&gt;Semantic similarity matters more&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Date/Number&lt;/td&gt;&lt;td&gt;20% FAISS, 80% BM25&lt;/td&gt;&lt;td&gt;Exact matching critical&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Solution: Dynamic α weighting based on query type&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def adaptive_hybrid_search(
    query: str,
    faiss_index,
    bm25_index,
    k: int = 10
) -&amp;gt; List[Document]:
    query_type = classify_query_type(query)

    if query_type == "person":
        alpha = 0.5
    elif query_type == "concept":
        alpha = 0.7
    elif query_type == "date_number":
        alpha = 0.2
    else:
        alpha = 0.6

    faiss_scores = faiss_index.search(query, k)
    bm25_scores = bm25_index.search(query, k)

    combined_scores = (
        alpha * normalize(faiss_scores) +
        (1 - alpha) * normalize(bm25_scores)
    )

    return rank_by_score(combined_scores)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Impact: Precision jumps from 75% → 90%+ across all query types&lt;/p&gt;
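&lt;p&gt;The normalize call in the adaptive search snippet is not defined. A common choice, and my assumption about the intent, is min-max scaling so FAISS and BM25 scores share a [0, 1] range before blending:&lt;/p&gt;

```python
import numpy as np

def normalize(scores):
    # Min-max scale to [0, 1] so heterogeneous score ranges can be blended.
    # Assumed behavior; the article does not show its actual implementation.
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    if span == 0:
        return np.zeros_like(s)  # constant input: no ranking information
    return (s - s.min()) / span
```

&lt;p&gt;Note that FAISS L2 scores are distances (lower is better), so in practice you would invert them, e.g. 1 - normalize(distances), before combining with BM25 relevance scores.&lt;/p&gt;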

&lt;p&gt;🎯 Innovation #3: 7-Criteria Re-Ranking&lt;br&gt;
Problem: Raw retrieval scores don't account for context&lt;br&gt;
Example: Search "John Smith" returns:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Doc 1: "John Doe is the CEO..."        (WRONG PERSON)
Doc 2: "John                   Smith"  (SPACING ISSUE)
Doc 3: "Smith family business..."      (PARTIAL MATCH)
Doc 4: "John Smith, partner..."        (PERFECT MATCH)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;All have similar FAISS/BM25 scores!&lt;br&gt;
Solution: Named Entity Disambiguation via 7-criteria scoring&lt;/p&gt;

&lt;p&gt;Criterion 1: Full Name Bonus (+0.25)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def full_name_bonus(text: str, query_terms: List[str]) -&amp;gt; float:
    if len(query_terms) &amp;lt; 2:
        return 0.0

    positions = []
    for term in query_terms:
        if term.lower() in text.lower():
            positions.append(text.lower().find(term.lower()))

    if len(positions) == len(query_terms):
        distance = max(positions) - min(positions)
        if distance &amp;lt; 50:
            return 0.25

    return 0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Differentiates:&lt;/p&gt;

&lt;p&gt;"John Smith" (distance: 5) → +0.25 ✅&lt;br&gt;
"John ... Doe" (distance: 200) → 0.0 ❌&lt;/p&gt;

&lt;p&gt;Criterion 2: Empty Field Penalty (-0.25)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def empty_penalty(text: str) -&amp;gt; float:
    empty_patterns = [
        r'\s{3,}',
        r'^[\s\t]*$',
        r'(null|none|n/a|—|–)',
    ]

    for pattern in empty_patterns:
        if re.search(pattern, text.lower()):
            return -0.25

    return 0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Penalizes:&lt;/p&gt;

&lt;p&gt;"John                   Smith" → -0.25 ❌&lt;br&gt;
"Name: null, ID: —" → -0.25 ❌&lt;/p&gt;

&lt;p&gt;Criterion 3: Co-occurrence Bonus (+0.10)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def cooccurrence_bonus(
    text: str,
    query_terms: List[str]
) -&amp;gt; float:
    # Context terms are lowercase, since we match against text.lower()
    context_terms = [
        "partner", "son", "president", "director",
        "id", "address", "birth"
    ]

    found_terms = sum(
        1 for term in context_terms
        if term in text.lower()
    )

    if found_terms &amp;gt;= 2:
        return 0.10

    return 0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Boosts:&lt;/p&gt;

&lt;p&gt;"John Smith, partner, ID..." → +0.10 ✅&lt;/p&gt;

&lt;p&gt;All 7 Criteria Combined:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def rerank_results(
    results: List[Dict],
    query: str
) -&amp;gt; List[Dict]:
    query_terms = query.lower().split()

    for result in results:
        text = result['text']
        base_score = result['score']

        adjustments = [
            full_name_bonus(text, query_terms),
            empty_penalty(text),
            cooccurrence_bonus(text, query_terms),
            similarity_bonus(text, query),
            repetition_penalty(text),
            partial_match_bonus(text, query_terms),
            query_term_bonus(text, query_terms)
        ]

        result['final_score'] = base_score + sum(adjustments)

    return sorted(results, key=lambda x: x['final_score'], reverse=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Impact: 95% precision in person searches&lt;/p&gt;

&lt;p&gt;🚦 Innovation #4: Confidence Classification&lt;br&gt;
Problem: Not all results have equal reliability&lt;br&gt;
Solution: Automatic confidence scoring&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def classify_confidence(score: float) -&amp;gt; str:
    if score &amp;gt;= 0.70:
        return "🟢 HIGH"
    elif score &amp;gt;= 0.50:
        return "🟡 MEDIUM"
    elif score &amp;gt;= 0.30:
        return "🟠 VERIFY"
    else:
        return "🔴 DISCARD"

results = [
    {"text": "John Smith, partner...", "score": 0.89},
    {"text": "John Smith was born...", "score": 0.73},
    {"text": "J. Smith participates...", "score": 0.58},
    {"text": "Smith family...", "score": 0.42},
    {"text": "John Doe...", "score": 0.15},
]

for result in results:
    confidence = classify_confidence(result['score'])
    print(f"{confidence}: {result['text'][:30]}...")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🟢 HIGH: John Smith, partner...
🟢 HIGH: John Smith was born...
🟡 MEDIUM: J. Smith participates...
🟠 VERIFY: Smith family...
🔴 DISCARD: John Doe...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Impact: Zero false positives in high-confidence results&lt;/p&gt;

&lt;p&gt;📊 Real Production Metrics&lt;br&gt;
Test Case: Person Search ("John Smith")&lt;br&gt;
Baseline RAG (single query, FAISS only):&lt;br&gt;
Recall: 11.8%&lt;br&gt;
Precision: 65%&lt;br&gt;
Score: 0.089&lt;br&gt;
False positives: 3/10&lt;br&gt;
Efrat 2.0 (multi-query + adaptive + re-ranking):&lt;br&gt;
Recall: 85.3% (+750% improvement)&lt;br&gt;
Precision: 95%&lt;br&gt;
Score: 0.367 (+312% improvement)&lt;br&gt;
False positives: 0/10&lt;br&gt;
Test Case: Complex Query ("company formation 2020-2023")&lt;br&gt;
Baseline:&lt;br&gt;
Recall: 23%&lt;br&gt;
Precision: 71%&lt;br&gt;
Relevant results: 7/30&lt;br&gt;
Efrat 2.0:&lt;br&gt;
Recall: 79%&lt;br&gt;
Precision: 94%&lt;br&gt;
Relevant results: 27/30&lt;br&gt;
Performance Benchmarks&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Operation&lt;/th&gt;&lt;th&gt;Time&lt;/th&gt;&lt;th&gt;Memory&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Index 10k docs&lt;/td&gt;&lt;td&gt;45s&lt;/td&gt;&lt;td&gt;890MB&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Single query&lt;/td&gt;&lt;td&gt;0.8s&lt;/td&gt;&lt;td&gt;+12MB&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Multi-query (6x)&lt;/td&gt;&lt;td&gt;2.1s&lt;/td&gt;&lt;td&gt;+45MB&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Re-ranking 30 results&lt;/td&gt;&lt;td&gt;0.3s&lt;/td&gt;&lt;td&gt;+8MB&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Total: ~2.5s per query, fully offline&lt;/p&gt;

&lt;p&gt;💻 Complete Implementation&lt;/p&gt;

&lt;p&gt;1. Setup&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sentence_transformers import SentenceTransformer
import faiss
from rank_bm25 import BM25Okapi
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

documents = load_documents("data/")
embeddings = model.encode([doc.text for doc in documents])

faiss_index = faiss.IndexFlatL2(384)
faiss_index.add(embeddings)

tokenized_docs = [doc.text.split() for doc in documents]
bm25_index = BM25Okapi(tokenized_docs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;2. Query Pipeline&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def search(query: str, k: int = 10) -&amp;gt; List[Dict]:
    variations = generate_query_variations(query)

    all_results = []
    for variant in variations:
        results = adaptive_hybrid_search(
            variant,
            faiss_index,
            bm25_index,
            k=k
        )
        all_results.extend(results)

    deduplicated = remove_duplicates(all_results)

    reranked = rerank_results(deduplicated, query)

    for result in reranked:
        result['confidence'] = classify_confidence(result['final_score'])

    return reranked[:k]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;3. Usage&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;results = search("John Smith", k=5)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
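&lt;p&gt;The pipeline calls remove_duplicates, which isn't shown. A minimal version, assuming the text/score result shape used throughout, keeps the best-scoring copy of each document:&lt;/p&gt;

```python
def remove_duplicates(results):
    # Dedupe on document text, keeping the highest-scoring copy.
    # The "text"/"score" keys are assumed from the snippets above.
    best = {}
    for r in results:
        key = r["text"]
        if key not in best or r["score"] > best[key]["score"]:
            best[key] = r
    return list(best.values())
```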

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for i, result in enumerate(results, 1):
    print(f"\n{i}. {result['confidence']}")
    print(f"   Score: {result['final_score']:.3f}")
    print(f"   Text: {result['text'][:100]}...")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. 🟢 HIGH
   Score: 0.893
   Text: John Smith is a founding partner of the company...

2. 🟢 HIGH
   Score: 0.761
   Text: Birth: John Smith, 03/15/1978...

3. 🟡 MEDIUM
   Score: 0.612
   Text: The Smith family, including John...

4. 🟠 VERIFY
   Score: 0.445
   Text: Meeting with J. Smith about...

5. 🔴 DISCARD
   Score: 0.187
   Text: John Doe and other partners...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;🎓 Lessons Learned&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Multi-Query Rewriting is a Game-Changer
Single biggest impact: +750% recall
Simple implementation, massive results.
Key insight: Users don't know how documents are written. Generate variations automatically.&lt;/li&gt;
&lt;li&gt;Don't Trust Raw Scores
FAISS and BM25 scores need heavy post-processing.
Named entity disambiguation via context is essential for person searches.&lt;/li&gt;
&lt;li&gt;Adaptive Weighting &amp;gt; Fixed Weighting
No single α value works for all queries.
Dynamic adjustment based on query type yields +20% precision.&lt;/li&gt;
&lt;li&gt;Confidence Classification Saves Time
Auto-triaging results into HIGH/MEDIUM/VERIFY/DISCARD means:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Users focus on high-confidence results first&lt;br&gt;
Manual review time cut by 60%&lt;br&gt;
Zero false positives in production&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Offline is Viable for Production
100% offline with Ollama + FAISS + BM25:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Zero API costs&lt;br&gt;
Full data privacy&lt;br&gt;
Predictable latency&lt;br&gt;
No vendor lock-in&lt;/p&gt;

&lt;p&gt;Trade-off: Slightly lower quality than GPT-4, but 95% precision is good enough.&lt;/p&gt;

&lt;p&gt;🚀 What's Next&lt;br&gt;
Short-term:&lt;/p&gt;

&lt;p&gt;Publish code on GitHub&lt;br&gt;
Write tutorial series on each technique&lt;br&gt;
Add support for multilingual queries&lt;/p&gt;

&lt;p&gt;Medium-term:&lt;/p&gt;

&lt;p&gt;Integrate with Tiramisu Framework v2.0&lt;br&gt;
Combine multi-agent orchestration (Tiramisu) with advanced retrieval (Efrat)&lt;br&gt;
This creates a complete RAG/RAO system with:&lt;/p&gt;

&lt;p&gt;100% routing accuracy (Tiramisu)&lt;br&gt;
95% retrieval precision (Efrat)&lt;br&gt;
Contextual memory (Tiramisu)&lt;br&gt;
Auto-correction (Tiramisu)&lt;/p&gt;

&lt;p&gt;Long-term:&lt;/p&gt;

&lt;p&gt;Agent-to-agent ecosystems via MCP protocol&lt;br&gt;
Distributed search across multiple Efrat instances&lt;br&gt;
Active learning for automatic re-ranking optimization&lt;/p&gt;

&lt;p&gt;📚 Resources&lt;br&gt;
Related Articles:&lt;/p&gt;

&lt;p&gt;Tiramisu Framework v2.0 - Multi-Agent RAO System&lt;/p&gt;

&lt;p&gt;Tech Stack:&lt;/p&gt;

&lt;p&gt;sentence-transformers&lt;br&gt;
FAISS&lt;br&gt;
Rank-BM25&lt;br&gt;
Ollama&lt;/p&gt;

&lt;p&gt;Contact:&lt;/p&gt;

&lt;p&gt;LinkedIn: Tiramisu Framework&lt;br&gt;
PyPI: pip install tiramisu-framework==2.0.0&lt;br&gt;
Email: &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🎯 Key Takeaways&lt;/p&gt;

&lt;p&gt;Multi-query rewriting is the highest ROI technique (+750% recall)&lt;br&gt;
Adaptive hybrid search beats fixed weighting (+20% precision)&lt;br&gt;
Named entity disambiguation via 7-criteria re-ranking achieves 95% precision&lt;br&gt;
Confidence classification enables automatic result triage&lt;br&gt;
100% offline is viable for production with acceptable trade-offs&lt;/p&gt;

&lt;p&gt;Building advanced RAG systems isn't about using the latest LLM - it's about combining multiple techniques that each solve specific problems.&lt;br&gt;
Efrat 2.0 proves you can achieve research-grade results with open-source tools, zero API costs, and full data privacy.&lt;/p&gt;

&lt;p&gt;Questions? Comments? What's your biggest RAG challenge? 👇&lt;/p&gt;

&lt;p&gt;#AI #Python #RAG #MachineLearning #InformationRetrieval #OpenSource&lt;/p&gt;

</description>
      <category>rag</category>
      <category>ai</category>
      <category>python</category>
      <category>performance</category>
    </item>
    <item>
      <title>From RAG to RAO Level 6: How I Evolved Tiramisu Framework into a Multi-Agent System</title>
      <dc:creator>tiramisu-framework</dc:creator>
      <pubDate>Tue, 25 Nov 2025 11:50:17 +0000</pubDate>
      <link>https://dev.to/tiramisuframework/from-rag-to-rao-level-6-how-i-evolved-tiramisu-framework-into-a-multi-agent-system-4ebh</link>
      <guid>https://dev.to/tiramisuframework/from-rag-to-rao-level-6-how-i-evolved-tiramisu-framework-into-a-multi-agent-system-4ebh</guid>
      <description>&lt;p&gt;Three weeks ago, I published Tiramisu Framework v1.0 — a simple RAG system for marketing consultancy.&lt;br&gt;
Today, I'm releasing v2.0 — a complete RAO Level 6 multi-agent system with memory, auto-correction, and MCP protocol support.&lt;br&gt;
This is the story of how I evolved it in 2 days (and what I learned building a production-ready AI framework).&lt;/p&gt;

&lt;p&gt;🎯 TL;DR&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install tiramisu-framework==2.0.0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;What's new in v2.0:&lt;/p&gt;

&lt;p&gt;✅ Real multi-agent architecture (not simulated)&lt;br&gt;
✅ 100% accurate intelligent routing&lt;br&gt;
✅ Contextual memory (Redis + semantic)&lt;br&gt;
✅ Auto-correction &amp;amp; validation&lt;br&gt;
✅ MCP-ready (agent-discoverable)&lt;br&gt;
✅ RAO Level 6 complete&lt;/p&gt;

&lt;p&gt;🔗 GitHub&lt;br&gt;
🔗 PyPI&lt;br&gt;
📧 &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;⚠️ Important Legal Notice&lt;br&gt;
The consultant names (Philip Kotler, Gary Vaynerchuk, Martha Gabriel) used throughout this article are for illustrative and educational purposes only to demonstrate the multi-agent architecture concept.&lt;br&gt;
The actual Tiramisu Framework v2.0 distributed on PyPI is a generic, customizable system where you:&lt;/p&gt;

&lt;p&gt;✅ Add your own knowledge base and documents&lt;br&gt;
✅ Define your own expert personas and personalities&lt;br&gt;
✅ Configure your own agent behaviors and specializations&lt;br&gt;
✅ Use any domain experts relevant to your use case&lt;/p&gt;

&lt;p&gt;No proprietary content, copyrighted materials, or brand names are included in the distributed package.&lt;br&gt;
The framework provides the architecture and orchestration; you provide the content and expertise.&lt;br&gt;
Think of it as a template: we show "Strategy Expert + Social Media Expert + Tech Expert" as an example, but you could create "Legal Expert + Financial Expert + HR Expert" or any other combination for your domain.&lt;/p&gt;

&lt;p&gt;📊 The Evolution: v1.0 → v2.0&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Feature&lt;/th&gt;&lt;th&gt;v1.0 (RAG Basic)&lt;/th&gt;&lt;th&gt;v2.0 (RAO Level 6)&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Architecture&lt;/td&gt;&lt;td&gt;Single LLM&lt;/td&gt;&lt;td&gt;Multi-Agent System&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Experts&lt;/td&gt;&lt;td&gt;Simulated (prompts)&lt;/td&gt;&lt;td&gt;Real agents (independent code)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Routing&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;Hybrid Supervisor (keywords + LLM)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Routing Accuracy&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;td&gt;100% (tested 50+ queries)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Memory&lt;/td&gt;&lt;td&gt;SQLite only&lt;/td&gt;&lt;td&gt;Redis + Semantic patterns&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Validation&lt;/td&gt;&lt;td&gt;Manual&lt;/td&gt;&lt;td&gt;Auto-correction (Auditor + Gatekeeper)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Chunking&lt;/td&gt;&lt;td&gt;800 chars&lt;/td&gt;&lt;td&gt;1200 chars (40% better context)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Discoverability&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;MCP-ready&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Lines of Code&lt;/td&gt;&lt;td&gt;~600&lt;/td&gt;&lt;td&gt;~2,488&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;🧠 What is RAO? (And Why It Matters)&lt;br&gt;
RAG (Retrieval-Augmented Generation) = Search + Generate&lt;br&gt;
RAO (Reasoning + Acting + Orchestration) = Think + Do + Coordinate&lt;br&gt;
RAO Levels (0-6):&lt;br&gt;
Level 0-2: RAG (retrieval + generation)&lt;br&gt;
Level 3: Memory (context between interactions) ✅&lt;br&gt;
Level 4: Executor (real actions) ✅&lt;br&gt;
Level 5: Multi-Agent (coordinated specialists) ✅&lt;br&gt;
Level 6: MCP-ready (discoverable by other agents) ✅&lt;/p&gt;

&lt;p&gt;Tiramisu v2.0 = Level 6 complete&lt;br&gt;
Most RAG systems stop at Level 2. We went to 6.&lt;/p&gt;

&lt;p&gt;🏗️ The New Architecture&lt;br&gt;
v1.0 - Single LLM Approach:&lt;br&gt;
User Query → FAISS Search → GPT-4 (simulates 3 experts) → Mixed Response&lt;br&gt;
❌ Problem: All experts "spoke" at once. Generic, unfocused responses.&lt;br&gt;
v2.0 - Multi-Agent System:&lt;br&gt;
User Query &lt;br&gt;
   ↓&lt;br&gt;
Supervisor Agent (routes intelligently)&lt;br&gt;
   ↓&lt;br&gt;
Kotler Agent | Gary Vee Agent | Martha Agent&lt;br&gt;
   ↓&lt;br&gt;
Specialized FAISS search (filtered by expert)&lt;br&gt;
   ↓&lt;br&gt;
GPT-4 with expert personality&lt;br&gt;
   ↓&lt;br&gt;
Focused, expert response&lt;br&gt;
✅ Result: Each agent maintains unique voice, expertise, and context.&lt;/p&gt;

&lt;p&gt;💻 Code Comparison&lt;br&gt;
v1.0 - Everything Mixed:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu import TiramisuRAG

rag = TiramisuRAG()
response = rag.analyze("How to improve Instagram?")
# Returns: Mixed insights from all 3 experts
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;v2.0 - Intelligent Routing:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu.agents import TiramisuMultiAgent

system = TiramisuMultiAgent()
result = system.process("How to improve Instagram?")

print(result['consultant'])  # "Gary" (social media expert)
print(result['response'])    # 100% Gary Vee style!
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The difference? v1.0 mixed everyone's opinion. v2.0 routes to the RIGHT expert.&lt;/p&gt;

&lt;p&gt;🎯 Feature 1: Hybrid Supervisor (100% Accuracy)&lt;br&gt;
The Challenge:&lt;br&gt;
First attempt: pure LLM routing.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ❌ This failed - sent EVERYTHING to Kotler
def route(query):
    response = llm.invoke(f"Route this query: {query}")
    return response  # Always returned "Kotler"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Why? GPT-4 defaulted to the "strategic" expert for ambiguous queries.&lt;br&gt;
The Solution: Hybrid Approach&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class SupervisorAgent:
    def route(self, query: str):
        query_lower = query.lower()

        # Layer 1: Keywords (fast, 95% of cases)
        gary_keywords = ["instagram", "tiktok", "social", "content"]
        if any(kw in query_lower for kw in gary_keywords):
            return "Gary"

        martha_keywords = ["ai", "automation", "data", "tech"]
        if any(kw in query_lower for kw in martha_keywords):
            return "Martha"

        # Layer 2: LLM (complex cases)
        return self.llm_route(query)  # Fallback
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Result: 100% accuracy on 50+ test queries.&lt;br&gt;
Lesson: Hybrid beats pure LLM for classification tasks.&lt;/p&gt;

&lt;p&gt;🧩 Feature 2: Real Multi-Agent Architecture&lt;br&gt;
Each agent is independent code with:&lt;/p&gt;

&lt;p&gt;Specialized FAISS search (filtered by expert)&lt;br&gt;
Unique personality (temperature, tone, style)&lt;br&gt;
Expert prompting (deep character simulation)&lt;/p&gt;

&lt;p&gt;Example: Gary Vee Agent&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class GaryAgent:
    def __init__(self):
        self.llm = ChatOpenAI(
            model="gpt-4",
            temperature=0.7  # More creative
        )
        self.style_prompt = """
You are Gary Vaynerchuk.
- DIRECT and NO BS
- Focus on EXECUTION
- ENERGETIC language
- Real examples
- Authentic content obsession
"""

    def search(self, query):
        # Filter FAISS: only Gary Vee content
        results = []
        for doc in faiss_results:
            if "gary" in doc['source'].lower():
                results.append(doc)
        return results
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Compare with Kotler Agent:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class KotlerAgent:
    def __init__(self):
        self.llm = ChatOpenAI(
            model="gpt-4",
            temperature=0.3  # More conservative
        )
        self.style_prompt = """
You are Philip Kotler.
- ANALYTICAL and STRUCTURED
- Based on FRAMEWORKS (4Ps, SWOT)
- ACADEMIC but accessible
- Long-term strategy focus
"""
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Result: Each agent has distinct voice, expertise, and behavior.&lt;/p&gt;

&lt;p&gt;💾 Feature 3: Contextual Memory&lt;br&gt;
The Problem:&lt;br&gt;
User: "Tell me about Instagram strategy"&lt;br&gt;
Bot: [responds]&lt;br&gt;
User: "What about budget?"&lt;br&gt;
Bot: "Budget for what?" ❌ Lost context!&lt;br&gt;
The Solution: Dual Memory System&lt;/p&gt;

&lt;p&gt;1. Short-term (Redis):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class SessionMemory:
    def __init__(self):
        self.redis = redis.Redis()

    def add_interaction(self, session_id, query, response):
        key = f"session:{session_id}:history"
        self.redis.lpush(key, json.dumps({
            "query": query,
            "response": response,
            "timestamp": datetime.now().isoformat()  # datetime objects aren't JSON-serializable
        }))
        self.redis.expire(key, 3600)  # 1 hour TTL
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;2. Long-term (Semantic patterns):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class SemanticMemory:
    def detect_patterns(self, user_id):
        # Analyzes: frequent topics, preferences, style
        return {
            "preferred_consultant": "Gary",
            "topics": ["social media", "content"],
            "tone": "practical"
        }
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Result: Bot remembers context, adapts to user preferences.&lt;/p&gt;

&lt;p&gt;✅ Feature 4: Auto-Correction (Auditor + Gatekeeper)&lt;br&gt;
Input Validation (Gatekeeper):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Gatekeeper:
    def validate_query(self, query: str):
        score = self.llm.invoke(f"""
        Rate clarity (0-10): "{query}"
        Is it specific enough to answer?
        """)

        if score &amp;lt; 5:
            return {
                "valid": False,
                "clarification_needed": "Please specify..."
            }
        return {"valid": True}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Output Validation (Auditor):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class ResponseAuditor:
    def audit(self, response: str, query: str):
        scores = self.evaluate({
            "completeness": "Does it fully answer?",
            "accuracy": "Is it factually correct?",
            "relevance": "Stays on topic?",
            "actionability": "Provides clear actions?",
            "expertise": "Matches consultant's style?"
        })

        if scores['average'] &amp;lt; 7:
            return {"reprocess": True, "reason": "Low quality"}
        return {"approved": True}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Real example from tests:&lt;br&gt;
Query: "Marketing strategy"&lt;br&gt;
First response: Generic overview (score: 6.2)&lt;br&gt;
Auto-correction triggered ✅&lt;br&gt;
Second response: Specific 4Ps analysis (score: 8.7)&lt;br&gt;
Lesson: Auto-validation dramatically improves output quality.&lt;/p&gt;

&lt;p&gt;🔧 Feature 5: Optimized Chunking&lt;br&gt;
The VUCA Problem (from another project):&lt;br&gt;
Document: "VUCA means: Volatility, Uncertainty, &lt;br&gt;
Complexity, and Ambiguity"&lt;/p&gt;

&lt;p&gt;With chunk_size=800:&lt;br&gt;
Chunk 1: "VUCA means: Volatility, Uncertainty"&lt;br&gt;
Chunk 2: "Complexity, and Ambiguity"&lt;/p&gt;

&lt;p&gt;Query: "What is VUCA?"&lt;br&gt;
Result: Incomplete answer ❌&lt;br&gt;
The Solution:&lt;br&gt;
python# v1.0&lt;br&gt;
chunk_size = 800&lt;br&gt;
chunk_overlap = 150&lt;/p&gt;

&lt;h1&gt;
  
  
  v2.0
&lt;/h1&gt;

&lt;p&gt;chunk_size = 1200  # +50% context&lt;br&gt;
chunk_overlap = 200  # +33% safety margin&lt;br&gt;
Result: Concepts like "4Ps", "SWOT", "Customer Journey" preserved completely.&lt;br&gt;
Lesson: Larger chunks = better context preservation (within reason).&lt;/p&gt;
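To see why the overlap matters, here is a minimal sliding-window chunker with the v2.0 parameters. This is an illustration only, not the framework's actual splitter (which presumably splits on separators rather than raw character offsets):

```python
def chunk(text, size=1200, overlap=200):
    # Sliding window: each chunk re-includes the last `overlap` chars
    # of the previous one, so a concept spanning a boundary survives
    # intact in at least one chunk
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "VUCA means: Volatility, Uncertainty, Complexity, and Ambiguity. " * 40
chunks = chunk(doc, size=1200, overlap=200)
# Every chunk boundary is covered by the 200-char overlap
assert all(chunks[i][-200:] == chunks[i + 1][:200] for i in range(len(chunks) - 1))
```

With overlap set to 0, any definition straddling a cut point would be split across two chunks, which is exactly the VUCA failure described above.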

&lt;p&gt;🌐 Feature 6: MCP-Ready (Agent Discoverable)&lt;br&gt;
What if OTHER AI agents could discover and use Tiramisu?&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# MCP Protocol Support
@app.get("/agent/mcp/capabilities")
def get_capabilities():
    return {
        "framework": "Tiramisu",
        "version": "2.0.0",
        "capabilities": {
            "marketing_analysis": {
                "consultants": ["Strategy", "Digital", "Tech"],
                "methods": ["analyze", "consult", "plan"],
                "output_formats": ["json", "markdown", "structured"]
            }
        },
        "endpoints": {
            "analyze": "/agent/mcp/analyze",
            "consultants": "/agent/mcp/consultants"
        }
    }
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Result: Tiramisu is now discoverable by Claude, GPT, and other agents via the MCP protocol.&lt;/p&gt;

&lt;p&gt;📈 Performance Metrics&lt;br&gt;
Response Time:&lt;br&gt;
Simple query (1 agent):     ~15s&lt;br&gt;
Complex query (3 agents):   ~30-40s&lt;br&gt;
With auto-correction:       +5-10s&lt;br&gt;
Accuracy:&lt;br&gt;
Routing accuracy:           100% (50+ queries tested)&lt;br&gt;
Auto-correction triggers:   ~12% of queries&lt;br&gt;
Quality improvement:        40% (user feedback)&lt;br&gt;
Memory:&lt;br&gt;
Context retention:          5 interactions&lt;br&gt;
Session duration:           1 hour (configurable)&lt;br&gt;
Semantic patterns:          Learned over time&lt;/p&gt;

&lt;p&gt;🚧 Technical Challenges Solved&lt;br&gt;
Challenge 1: Python 3.13 Incompatibility&lt;br&gt;
Problem: FAISS doesn't support Python 3.13 yet.&lt;br&gt;
Solution:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Use Python 3.12
python3.12 -m venv venv
source venv/bin/activate
pip install tiramisu-framework
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Lesson: Always check the compatibility matrix for ML libraries.&lt;/p&gt;

&lt;p&gt;Challenge 2: Pydantic Pickle Incompatibility&lt;br&gt;
Problem: Metadata saved with Pydantic v1 couldn't load in v2.&lt;br&gt;
Solution:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Rebuild metadata with current Pydantic version
def rebuild_metadata(old_pkl_path, new_pkl_path):
    # Load raw data, reconstruct as plain dicts, re-save
    with open(old_pkl_path, 'rb') as f:
        raw = pickle.load(f, encoding='latin1')

    clean_data = [
        {"content": doc.content, "source": doc.source}
        for doc in raw if hasattr(doc, 'content')
    ]

    with open(new_pkl_path, 'wb') as f:
        pickle.dump(clean_data, f)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Lesson: Avoid pickling Pydantic models; use JSON instead.&lt;/p&gt;

&lt;p&gt;Challenge 3: FAISS Dimension Mismatch&lt;br&gt;
Problem:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FAISS index: 3072 dimensions (text-embedding-3-large)
Default OpenAI: 1536 dimensions (text-embedding-ada-002)
AssertionError: Dimension mismatch!
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Solution:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Always specify the embedding model explicitly
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-large"  # 3072 dims
)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Lesson: Document your embedding model choice in the README.&lt;/p&gt;

&lt;p&gt;📚 What I Learned Building This&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Multi-Agent ≠ Multiple Prompts&lt;br&gt;
Wrong approach:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# This is NOT multi-agent
prompt = "Think like Kotler, then Gary, then Martha"
response = llm(prompt)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Right approach:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Real multi-agent: separate code, memory, behavior
kotler = KotlerAgent()  # Independent
gary = GaryAgent()      # Independent
martha = MarthaAgent()  # Independent
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hybrid Systems Beat Pure LLM&lt;br&gt;
For routing, classification, validation:&lt;br&gt;
Keywords (fast, deterministic) + LLM (smart, flexible) = Best of both&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Auto-Validation is a Game-Changer&lt;br&gt;
Before: Manual quality checks.&lt;br&gt;
After: System self-corrects automatically.&lt;br&gt;
ROI: 40% quality improvement, zero human intervention.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Chunking is Critical&lt;br&gt;
Too small = fragmented concepts.&lt;br&gt;
Too large = irrelevant noise.&lt;br&gt;
Sweet spot: 1200 chars with 200 overlap (for most use cases).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Memory Makes AI Feel "Real"&lt;br&gt;
Without memory: Bot feels robotic.&lt;br&gt;
With memory: Bot feels like a real consultant who remembers you.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🔮 What's Next: v3.0 Roadmap&lt;/p&gt;

&lt;p&gt;GUI (Streamlit + Next.js dashboard)&lt;br&gt;
More Agents (SEO, Email, Analytics, Branding)&lt;br&gt;
Benchmarks (vs Perplexity, Claude, GPT)&lt;br&gt;
One-click Deploy (Railway, Render, AWS)&lt;br&gt;
CRM Integration (HubSpot, Salesforce)&lt;br&gt;
Multi-language (Spanish, Portuguese)&lt;br&gt;
A/B Testing (compare agent responses)&lt;/p&gt;

&lt;p&gt;🛠️ Try It Now&lt;br&gt;
Installation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install tiramisu-framework==2.0.0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Quick Test:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu.agents import TiramisuMultiAgent

system = TiramisuMultiAgent()

# Simple query
result = system.process("How to improve Instagram engagement?")
print(f"Consultant: {result['consultant']}")  # "Gary"
print(f"Response: {result['response']}")

# Complex query (multiple agents)
result = system.process_complex(
    "I need a complete digital marketing strategy for a B2B SaaS startup"
)

# Consults Kotler (strategy) + Gary (tactics) + Martha (tech)
print(result['response'])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Run API Server:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Clone repo
git clone https://github.com/tiramisu-framework/tiramisu
cd tiramisu

# Install
pip install -e .

# Set API key
export OPENAI_API_KEY="your-key"

# Run
uvicorn tiramisu.api.main:app --reload

# Test
curl http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"query": "Marketing strategy for SaaS"}'
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;📊 Real Example Output&lt;br&gt;
Input:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;B2B SaaS startup, $30k/month marketing budget,
need better lead quality from inbound channels
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Output (Kotler + Gary + Martha synthesis):&lt;/p&gt;

&lt;p&gt;🌱 ROOTS (Kotler - Strategic Analysis)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current ICP unclear (mixing SMB + Enterprise)&lt;/li&gt;
&lt;li&gt;Value prop not differentiated enough&lt;/li&gt;
&lt;li&gt;CAC too high ($450) vs LTV ($3.2k)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🌳 TRUNK (Core Strategy)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Refine ICP: Focus on 50-500 employee tech companies&lt;/li&gt;
&lt;li&gt;ABM approach: Target 100 high-fit accounts&lt;/li&gt;
&lt;li&gt;Content: Problem-aware → Solution-aware funnel&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🍃 BRANCHES (Gary - Tactical Execution)&lt;br&gt;
Week 1-2: LinkedIn thought leadership (3x/week)&lt;br&gt;
Week 3-4: Case study content + webinars&lt;br&gt;
Week 5-8: Retargeting + email nurture sequences&lt;br&gt;
Budget: 60% content, 30% ads, 10% tools&lt;/p&gt;

&lt;p&gt;🤖 TECH ENABLEMENT (Martha)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HubSpot + Clearbit for enrichment&lt;/li&gt;
&lt;li&gt;Drift for qualification&lt;/li&gt;
&lt;li&gt;Mixpanel for behavior tracking&lt;/li&gt;
&lt;li&gt;Zapier for automation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;KPIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MQL → SQL: 40% → 60%&lt;/li&gt;
&lt;li&gt;CAC: $450 → $280&lt;/li&gt;
&lt;li&gt;Sales cycle: 45 → 30 days&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🤝 Contributing&lt;br&gt;
We're looking for contributors in:&lt;/p&gt;

&lt;p&gt;Agent Development: New expert personalities&lt;br&gt;
Frontend: React/Next.js dashboard&lt;br&gt;
Testing: Automated test suites&lt;br&gt;
Documentation: Tutorials, guides, videos&lt;br&gt;
Integrations: CRMs, analytics tools&lt;/p&gt;

&lt;p&gt;How to contribute:&lt;/p&gt;

&lt;p&gt;Fork the repo&lt;br&gt;
Create feature branch&lt;br&gt;
Submit PR with tests&lt;br&gt;
Join our Discord (coming soon)&lt;/p&gt;

&lt;p&gt;📜 License &amp;amp; Business Model&lt;br&gt;
Framework: MIT License (free, open-source)&lt;br&gt;
Business Model:&lt;/p&gt;

&lt;p&gt;Free: Core framework&lt;br&gt;
Paid: Expanded knowledge bases, custom integrations, support, white-label&lt;/p&gt;

&lt;p&gt;Why open source?&lt;/p&gt;

&lt;p&gt;Transparency builds trust&lt;br&gt;
Community accelerates innovation&lt;br&gt;
Better product through feedback&lt;/p&gt;

&lt;p&gt;📚 Resources&lt;br&gt;
🔗 GitHub: tiramisu-framework/tiramisu&lt;br&gt;
🔗 PyPI: pypi.org/project/tiramisu-framework/2.0.0/&lt;br&gt;
📧 Email: &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;br&gt;
📖 Docs: [Coming soon]&lt;br&gt;
💬 Discord: [Coming soon]&lt;/p&gt;

&lt;p&gt;🙏 Acknowledgments&lt;br&gt;
Built with:&lt;/p&gt;

&lt;p&gt;LangChain (RAG orchestration)&lt;br&gt;
OpenAI GPT-4 (LLM)&lt;br&gt;
FAISS (vector search)&lt;br&gt;
Redis (memory)&lt;br&gt;
FastAPI (API)&lt;/p&gt;

&lt;p&gt;Inspired by:&lt;/p&gt;

&lt;p&gt;LlamaIndex (RAG patterns)&lt;br&gt;
DSPy (structured prompting)&lt;br&gt;
AutoGen (multi-agent concepts)&lt;/p&gt;

&lt;p&gt;💬 Let's Connect&lt;br&gt;
I'd love to hear:&lt;/p&gt;

&lt;p&gt;What you build with Tiramisu&lt;br&gt;
Feature requests&lt;br&gt;
Technical challenges you face&lt;br&gt;
Ideas for v3.0&lt;/p&gt;

&lt;p&gt;Comment below or reach out:&lt;br&gt;
📧 &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;br&gt;
🐙 &lt;a class="mentioned-user" href="https://dev.to/tiramisuframework"&gt;@tiramisuframework&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🎯 Final Thoughts&lt;br&gt;
Three weeks ago, Tiramisu was a simple RAG system.&lt;br&gt;
Today, it's a production-ready RAO Level 6 multi-agent framework with:&lt;/p&gt;

&lt;p&gt;Real specialized agents&lt;br&gt;
Intelligent routing (100% accuracy)&lt;br&gt;
Contextual memory&lt;br&gt;
Auto-correction&lt;br&gt;
MCP protocol support&lt;/p&gt;

&lt;p&gt;The journey from v1.0 to v2.0 taught me:&lt;/p&gt;

&lt;p&gt;Multi-agent systems require architectural thinking&lt;br&gt;
Hybrid approaches beat pure LLM&lt;br&gt;
Auto-validation is essential for production&lt;br&gt;
Memory transforms user experience&lt;br&gt;
Open source accelerates innovation&lt;/p&gt;

&lt;p&gt;What's your experience building RAG systems?&lt;br&gt;
Have you tried multi-agent architectures?&lt;br&gt;
Let's discuss in the comments! 👇&lt;/p&gt;

&lt;p&gt;If you found this helpful, please ⭐ the GitHub repo and share with your network!&lt;/p&gt;

&lt;p&gt;#ai #python #opensource #rag #multiagent #llm #gpt4 #langchain #machinelearning #artificialintelligence&lt;/p&gt;

</description>
      <category>agents</category>
      <category>rag</category>
      <category>ai</category>
      <category>python</category>
    </item>
    <item>
      <title>Building Tiramisu: An Open-Source Multi-Expert RAG Framework for Marketing Consultancy</title>
      <dc:creator>tiramisu-framework</dc:creator>
      <pubDate>Wed, 29 Oct 2025 09:36:26 +0000</pubDate>
      <link>https://dev.to/tiramisuframework/building-tiramisu-an-open-source-multi-expert-rag-framework-for-marketing-consultancy-lc7</link>
      <guid>https://dev.to/tiramisuframework/building-tiramisu-an-open-source-multi-expert-rag-framework-for-marketing-consultancy-lc7</guid>
      <description>&lt;p&gt;TL;DR&lt;/p&gt;

&lt;p&gt;I just published Tiramisu Framework — an open-source Python framework that provides AI-powered marketing consultancy by synthesizing insights from three complementary perspectives using RAG (Retrieval-Augmented Generation).&lt;br&gt;
pip install tiramisu-framework&lt;br&gt;
🔗 &lt;a href="https://github.com/tiramisu-framework/tiramisu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;br&gt;
🔗 &lt;a href="https://pypi.org/project/tiramisu-framework/" rel="noopener noreferrer"&gt;PyPI&lt;/a&gt;&lt;br&gt;
📧 &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;br&gt;
The Problem&lt;/p&gt;

&lt;p&gt;Traditional marketing consultancy is:&lt;br&gt;
    • Expensive ($10k–50k+ per engagement)&lt;br&gt;
    • Slow (weeks to months)&lt;br&gt;
    • Not scalable (limited expert availability)&lt;br&gt;
    • Single-perspective (one consultant = one viewpoint)&lt;/p&gt;

&lt;p&gt;Businesses need strategic guidance now, not weeks from now.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;The Solution: Multi-Perspective RAG&lt;/p&gt;

&lt;p&gt;What if you could get marketing analysis from three complementary perspectives — strategic fundamentals, digital tactics, and transformation strategy — instantly?&lt;br&gt;
That’s what Tiramisu Framework does.&lt;/p&gt;

&lt;p&gt;The Three Perspectives&lt;br&gt;
    1.  Strategic Marketing Fundamentals → positioning, competitive analysis, core principles&lt;br&gt;
    2.  Digital Marketing &amp;amp; Social Media → modern tactics, content strategy, engagement&lt;br&gt;
    3.  Digital Transformation &amp;amp; Innovation → tech integration, business model innovation&lt;/p&gt;

&lt;p&gt;Architecture&lt;br&gt;
User Query&lt;br&gt;
    ↓&lt;br&gt;
Query Expansion (synonyms, related terms)&lt;br&gt;
    ↓&lt;br&gt;
FAISS Vector Search (semantic retrieval)&lt;br&gt;
    ↓&lt;br&gt;
Context Assembly (relevant chunks from 3 perspectives)&lt;br&gt;
    ↓&lt;br&gt;
GPT-4 Synthesis (structured analysis)&lt;br&gt;
    ↓&lt;br&gt;
Parsed Response (Roots → Trunk → Branches)&lt;br&gt;
Tech Stack&lt;/p&gt;

&lt;p&gt;Core: Python 3.11+, FastAPI, LangChain, FAISS (Meta AI), OpenAI GPT-4&lt;br&gt;
Features: CLI (tiramisu init, build-index, run), REST API + conversation management, SQLite, Pydantic schemas&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Code Walkthrough&lt;/p&gt;

&lt;p&gt;RAG Initialization&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu import TiramisuRAG

rag = TiramisuRAG(
    faiss_index_path="data/faiss_index",
    openai_api_key="your-key"
)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Simple Analysis&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query = """
B2B SaaS startup, $50k/month marketing budget.
Need to improve inbound lead generation.
"""
result = rag.analyze(query)
print(result)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Conversational Mode&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tiramisu.core import ConversationManager

manager = ConversationManager()
conv_id = manager.create_conversation(title="Marketing Strategy Discussion")
response = manager.add_message(conversation_id=conv_id, user_message="How do I position against competitors?")
history = manager.get_conversation_history(conv_id)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The “Three Trees” Methodology&lt;/p&gt;

&lt;p&gt;🌱 ROOTS (Foundations)&lt;/p&gt;

&lt;p&gt;Deep context, root causes, resources/capabilities.&lt;/p&gt;

&lt;p&gt;🌳 TRUNK (Core Strategy)&lt;/p&gt;

&lt;p&gt;Positioning, value proposition, competitive differentiation.&lt;/p&gt;

&lt;p&gt;🍃 BRANCHES (Tactics)&lt;/p&gt;

&lt;p&gt;Action plan, KPIs, timeline.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;CLI in Action&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Initialize project
tiramisu init my-marketing-ai

# Add your own documents
tiramisu add-docs ./marketing-docs/

# Build FAISS index
tiramisu build-index

# Start API server
tiramisu run
# → http://127.0.0.1:8000
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;API Endpoints&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /analyze
POST /conversations
POST /conversations/{id}/messages
GET  /conversations/{id}/history
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Why Open Source?&lt;/p&gt;

&lt;p&gt;Transparency, credibility, community.&lt;br&gt;
Business model: framework free; paid services for expanded knowledge bases, custom integrations, support, white-label.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Challenges Solved&lt;/p&gt;

&lt;p&gt;Query expansion&lt;br&gt;
"improve marketing" →&lt;br&gt;
["enhance marketing","optimize campaigns","increase ROI","boost engagement"]&lt;/p&gt;
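As a rough sketch of the idea: the synonym map below is hand-built for illustration (the real expansion source could be an LLM or a semantic thesaurus, which is why the article's variants differ from these), but the mechanics of fanning one query out into several retrieval queries are the same.

```python
# Illustrative only: a hand-built synonym map standing in for
# whatever expansion source the framework actually uses
SYNONYMS = {
    "improve": ["enhance", "optimize", "boost"],
    "marketing": ["campaigns", "ROI", "engagement"],
}

def expand_query(query):
    # Keep the original query, then add one variant per synonym
    variants = [query]
    for term, alts in SYNONYMS.items():
        if term in query.lower():
            variants += [query.lower().replace(term, alt) for alt in alts]
    return variants

print(expand_query("improve marketing"))
```

Each variant is retrieved independently and the result sets are merged, which is where the recall gain comes from.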

&lt;p&gt;Multi-perspective synthesis&lt;br&gt;
Retrieve strategic + digital + transformation contexts → synthesize with perspective-aware prompting&lt;/p&gt;

&lt;p&gt;Context window management&lt;br&gt;
Smart chunking (800/150) + re-ranking + top-k&lt;/p&gt;

&lt;p&gt;Structured output&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ "roots": {...}, "trunk": {...}, "branches": {...},
  "perspective_insights": { "strategic": "...", "digital": "...", "transformation": "..." } }
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
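One way to consume that structured output is to validate it into a typed object on the client side. A minimal sketch, assuming the top-level keys shown above (the nested field contents here are hypothetical placeholders):

```python
import json
from dataclasses import dataclass

@dataclass
class Analysis:
    # Field names follow the JSON shape above; inner contents are illustrative
    roots: dict
    trunk: dict
    branches: dict
    perspective_insights: dict

raw = """{"roots": {"summary": "unclear ICP"},
          "trunk": {"summary": "ABM focus"},
          "branches": {"summary": "8-week plan"},
          "perspective_insights": {"strategic": "...", "digital": "...", "transformation": "..."}}"""

analysis = Analysis(**json.loads(raw))
print(analysis.trunk["summary"])  # ABM focus
```

Because `Analysis(**...)` raises on missing or extra keys, a malformed response fails loudly instead of silently propagating downstream.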

&lt;p&gt;Performance&lt;br&gt;
    • Retrieval: &amp;lt;100ms (FAISS)&lt;br&gt;
    • Generation: 3–8s (GPT-4)&lt;br&gt;
    • Total: &amp;lt;10s per analysis&lt;br&gt;
    • Async FastAPI for concurrency&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Installation &amp;amp; Quick Test&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install tiramisu-framework
python -c "from tiramisu import TiramisuRAG; print('✅ Ready')"
tiramisu init demo &amp;amp;&amp;amp; cd demo &amp;amp;&amp;amp; tiramisu run
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Real-World Example (simplified)&lt;/p&gt;

&lt;p&gt;Input&lt;br&gt;
B2B SaaS, low lead quality, $30k/month budget&lt;/p&gt;

&lt;p&gt;Output&lt;br&gt;
🌱 ROOTS — misaligned targeting; unclear value prop&lt;br&gt;
🌳 TRUNK — ABM with ICP refinement + personalized nurture&lt;br&gt;
🍃 BRANCHES — 8-week plan; KPIs: Lead→SQL, CAC, velocity&lt;/p&gt;

&lt;p&gt;What’s Next&lt;/p&gt;

&lt;p&gt;v1.1: more perspective domains, dashboard, multi-language, CRM integrations&lt;br&gt;
v2.0: multi-agent collab, predictive analytics, A/B testing&lt;/p&gt;

&lt;p&gt;Lessons Learned&lt;/p&gt;

&lt;p&gt;RAG ≠ only vector search • Structured prompts win • Synthesis &amp;gt; concatenation • Conversation state is hard • Good CLI matters • Open source builds trust&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Try It Now&lt;br&gt;
🔗 &lt;a href="https://github.com/tiramisu-framework/tiramisu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;br&gt;
🔗 &lt;a href="https://pypi.org/project/tiramisu-framework/" rel="noopener noreferrer"&gt;PyPI&lt;/a&gt;&lt;br&gt;
📧 &lt;a href="mailto:frameworktiramisu@gmail.com"&gt;frameworktiramisu@gmail.com&lt;/a&gt;&lt;br&gt;
Contributing&lt;/p&gt;

&lt;p&gt;PRs welcome! Areas: domain curation, React/Next dashboard, tests/CI, docs, alt embeddings.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
