<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: JOHN MWACHARO</title>
    <description>The latest articles on DEV Community by JOHN MWACHARO (@mwacharo).</description>
    <link>https://dev.to/mwacharo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1127005%2Fc859a417-96a9-4945-9e0e-c605713ed478.jpg</url>
      <title>DEV Community: JOHN MWACHARO</title>
      <link>https://dev.to/mwacharo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mwacharo"/>
    <language>en</language>
    <item>
      <title>How to Host Your Website on DigitalOcean and Use Google Workspace for Email</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Tue, 23 Sep 2025 06:14:55 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-to-host-your-website-on-digitalocean-and-use-google-workspace-for-email-54dn</link>
      <guid>https://dev.to/mwacharo/how-to-host-your-website-on-digitalocean-and-use-google-workspace-for-email-54dn</guid>
      <description>&lt;p&gt;Description:&lt;br&gt;
 Learn how to host your website on DigitalOcean while using Google Workspace for professional email. Step-by-step guide for startups and businesses that want reliable hosting and secure email.&lt;/p&gt;

&lt;p&gt;Why Choose DigitalOcean and Google Workspace?&lt;/p&gt;

&lt;p&gt;For growing businesses and startups, a reliable website and a professional email address are crucial. DigitalOcean offers fast, affordable cloud servers, while Google Workspace gives you Gmail on your own domain (e.g., &lt;a href="mailto:hello@mygreatstartup.com"&gt;hello@mygreatstartup.com&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;By combining the two, you get:&lt;/p&gt;

&lt;p&gt;A scalable website hosted on DigitalOcean.&lt;/p&gt;

&lt;p&gt;A trusted email solution via Google Workspace.&lt;/p&gt;

&lt;p&gt;Improved deliverability with SPF, DKIM, and DMARC.&lt;/p&gt;

&lt;p&gt;Step 1: Point Your Domain to DigitalOcean&lt;/p&gt;

&lt;p&gt;After creating your droplet (server) in DigitalOcean, connect your domain (e.g., mygreatstartup.com).&lt;/p&gt;

&lt;p&gt;Log into your registrar (GoDaddy, Namecheap, or others).&lt;/p&gt;

&lt;p&gt;Update the nameservers:&lt;/p&gt;

&lt;p&gt;ns1.digitalocean.com&lt;br&gt;
ns2.digitalocean.com&lt;br&gt;
ns3.digitalocean.com&lt;/p&gt;

&lt;p&gt;In DigitalOcean → Networking → Domains:&lt;/p&gt;

&lt;p&gt;Add an A record: @ → your droplet’s IP.&lt;/p&gt;

&lt;p&gt;Add a CNAME record: www → @.&lt;/p&gt;

&lt;p&gt;✅ Once the nameserver change has propagated (this can take up to 48 hours), your website will load from your DigitalOcean droplet.&lt;/p&gt;

&lt;p&gt;Step 2: Add Google Workspace MX Records&lt;/p&gt;

&lt;p&gt;To use Gmail with your custom domain, add Google Workspace MX records in your DNS:&lt;/p&gt;

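&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Host&lt;/th&gt;&lt;th&gt;Type&lt;/th&gt;&lt;th&gt;Priority&lt;/th&gt;&lt;th&gt;Value&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;@&lt;/td&gt;&lt;td&gt;MX&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;ASPMX.L.GOOGLE.COM.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;@&lt;/td&gt;&lt;td&gt;MX&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;ALT1.ASPMX.L.GOOGLE.COM.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;@&lt;/td&gt;&lt;td&gt;MX&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;ALT2.ASPMX.L.GOOGLE.COM.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;@&lt;/td&gt;&lt;td&gt;MX&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;ALT3.ASPMX.L.GOOGLE.COM.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;@&lt;/td&gt;&lt;td&gt;MX&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;ALT4.ASPMX.L.GOOGLE.COM.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;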

&lt;p&gt;📧 Now emails like &lt;a href="mailto:info@mygreatstartup.com"&gt;info@mygreatstartup.com&lt;/a&gt; will route directly to Gmail.&lt;/p&gt;
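
&lt;p&gt;As a rough illustration of how the priority column is used, the sketch below orders the records listed above the way a sending server would try them: lowest number first. The function name is hypothetical, and real mail servers also randomize among hosts that share a priority, which this sketch does not model.&lt;/p&gt;

```python
# MX records from Step 2 as (priority, host) pairs, deliberately unsorted.
MX_RECORDS = [
    (10, "ALT3.ASPMX.L.GOOGLE.COM."),
    (1, "ASPMX.L.GOOGLE.COM."),
    (5, "ALT1.ASPMX.L.GOOGLE.COM."),
    (5, "ALT2.ASPMX.L.GOOGLE.COM."),
    (10, "ALT4.ASPMX.L.GOOGLE.COM."),
]


def delivery_order(records):
    """Return hosts ordered by MX priority (the lowest value is tried first)."""
    return [host for _priority, host in sorted(records)]


if __name__ == "__main__":
    for host in delivery_order(MX_RECORDS):
        print(host)
```

&lt;p&gt;The primary server (priority 1) comes first; the ALT hosts act as fallbacks if it cannot be reached.&lt;/p&gt;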

&lt;p&gt;Step 3: Secure Your Email with SPF, DKIM, and DMARC&lt;/p&gt;

&lt;p&gt;To prevent emails from being marked as spam, configure these authentication methods:&lt;/p&gt;

&lt;p&gt;SPF Record (TXT):&lt;/p&gt;

&lt;p&gt;@ TXT "v=spf1 include:_spf.google.com ~all"&lt;/p&gt;

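&lt;p&gt;DKIM Record:&lt;br&gt;
Generate the key in the Google Admin Console (Apps → Google Workspace → Gmail → Authenticate email), then add the TXT record it gives you to your DNS. By default the record's host name is google._domainkey.&lt;/p&gt;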

&lt;p&gt;DMARC Record (TXT):&lt;/p&gt;

&lt;p&gt;_dmarc TXT "v=DMARC1; p=quarantine; rua=mailto:&lt;a href="mailto:admin@mygreatstartup.com"&gt;admin@mygreatstartup.com&lt;/a&gt;"&lt;/p&gt;

&lt;p&gt;🔒 These records let receiving mail servers verify that messages claiming to come from your domain were really sent by Google Workspace on your behalf.&lt;/p&gt;
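
&lt;p&gt;If you manage several domains, it can help to generate these TXT values rather than type them by hand. The following is a minimal sketch, not an official tool: the helper names are hypothetical, and it only covers the SPF and DMARC fields used in this guide.&lt;/p&gt;

```python
def build_spf(includes, policy="~all"):
    """Build an SPF TXT value from a list of include domains."""
    parts = ["v=spf1"] + ["include:" + domain for domain in includes] + [policy]
    return " ".join(parts)


def build_dmarc(policy, report_address):
    """Build a DMARC TXT value with an aggregate-report (rua) address."""
    if policy not in ("none", "quarantine", "reject"):
        raise ValueError("DMARC policy must be none, quarantine, or reject")
    return "v=DMARC1; p={}; rua=mailto:{}".format(policy, report_address)


if __name__ == "__main__":
    print("@      TXT", build_spf(["_spf.google.com"]))
    print("_dmarc TXT", build_dmarc("quarantine", "admin@mygreatstartup.com"))
```

&lt;p&gt;Starting with p=none and moving to quarantine once the reports look clean is a common way to roll DMARC out gradually.&lt;/p&gt;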

&lt;p&gt;Step 4: Verify Your Domain in Google Workspace&lt;/p&gt;

&lt;p&gt;Google will ask you to prove ownership of your domain.&lt;/p&gt;

&lt;p&gt;Add the TXT record provided (something like google-site-verification=xxxx).&lt;/p&gt;

&lt;p&gt;Confirm in the Google Admin Console.&lt;/p&gt;

&lt;p&gt;Step 5: Test Your Setup&lt;/p&gt;

&lt;p&gt;Visit mygreatstartup.com → it should display your website hosted on DigitalOcean.&lt;/p&gt;

&lt;p&gt;Send and receive emails from &lt;a href="mailto:you@mygreatstartup.com"&gt;you@mygreatstartup.com&lt;/a&gt; using Gmail.&lt;/p&gt;

&lt;p&gt;Use MXToolbox to check MX, SPF, DKIM, and DMARC records.&lt;/p&gt;

&lt;p&gt;Final Thoughts&lt;/p&gt;

&lt;p&gt;Setting up DigitalOcean hosting with Google Workspace email is one of the best business email configurations you can make.&lt;/p&gt;

&lt;p&gt;Your website stays fast, reliable, and scalable.&lt;/p&gt;

&lt;p&gt;Your email looks professional (no more @gmail.com for business!).&lt;/p&gt;

&lt;p&gt;Your messages are authenticated to avoid spam filters.&lt;/p&gt;

&lt;p&gt;Whether you’re launching a startup or upgrading your company’s digital presence, this setup gives you the best of both worlds: powerful cloud hosting + professional business email.&lt;/p&gt;

&lt;p&gt;👉 Ready to scale? Start with DigitalOcean hosting and secure your communication with Google Workspace email today.&lt;/p&gt;

</description>
      <category>google</category>
      <category>cloud</category>
      <category>tutorial</category>
      <category>devops</category>
    </item>
    <item>
      <title>Comprehensive Workflow for Integrating SQL Client Data, WhatsApp API, LangChain, and Courier Management Systems</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Fri, 05 Sep 2025 14:32:37 +0000</pubDate>
      <link>https://dev.to/mwacharo/comprehensive-workflow-for-integrating-sql-client-data-whatsapp-api-langchain-and-courier-3d5i</link>
      <guid>https://dev.to/mwacharo/comprehensive-workflow-for-integrating-sql-client-data-whatsapp-api-langchain-and-courier-3d5i</guid>
      <description>&lt;p&gt;1 System Architecture Overview&lt;br&gt;
The proposed integration system creates a seamless communication pipeline between clients, your AI-powered CRM, and courier management operations. This architecture enables efficient processing of client queries through WhatsApp, intelligent response generation via LangChain, and automated order updates through your courier management system. The workflow leverages multiple technologies in a coordinated manner to ensure smooth operation and real-time responsiveness to client needs.&lt;/p&gt;

&lt;p&gt;The core components include: SQL database storage for conversation history, WhatsApp Cloud API for client communication interface, LangChain framework for AI-powered conversation processing and categorization, and courier management system integration for order tracking and updates. These components work together to create a unified ecosystem that automates customer service interactions and streamlines courier management operations through intelligent automation.&lt;/p&gt;

&lt;p&gt;2 SQL Database Setup and Configuration&lt;br&gt;
2.1 Database Schema Design&lt;br&gt;
Conversations Table: Stores all client interactions with fields for message_id, client_number, message_content, timestamp, direction (inbound/outbound), and categorization_status. This table serves as the central repository for all client communications, enabling comprehensive conversation history tracking and analysis.&lt;/p&gt;

&lt;p&gt;Clients Table: Contains client information including client_id, name, phone_number, preferred_communication_channel, and conversation_history. This table maintains essential client details that help personalize interactions and maintain context across conversations.&lt;/p&gt;

&lt;p&gt;Orders Table: Stores order-related data with fields for order_id, client_id, order_status, tracking_number, delivery_address, and estimated_delivery_time. This table integrates with the courier management system to provide real-time order status updates and tracking information.&lt;/p&gt;

&lt;p&gt;Categories Table: Maintains conversation categories and subcategories identified through LangChain processing, including category_id, category_name, description, and relevant_api_endpoints. This table supports the categorization mechanism that enables appropriate routing and handling of different query types.&lt;/p&gt;
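
&lt;p&gt;The four tables above can be sketched as follows. This is an illustration under stated assumptions, not a production schema: it uses an in-memory SQLite database as a stand-in for SQL Server so that it runs anywhere, the column types, constraints, and the 'pending' default are guesses, and the conversation_history field of the Clients table is left out on the assumption that history is read from the Conversations table by phone number.&lt;/p&gt;

```python
import sqlite3

SCHEMA = """
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    phone_number TEXT NOT NULL UNIQUE,
    preferred_communication_channel TEXT
);
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL UNIQUE,
    description TEXT,
    relevant_api_endpoints TEXT
);
CREATE TABLE conversations (
    message_id TEXT PRIMARY KEY,
    client_number TEXT NOT NULL,
    message_content TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    direction TEXT NOT NULL CHECK (direction IN ('inbound', 'outbound')),
    categorization_status TEXT DEFAULT 'pending'
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    client_id INTEGER NOT NULL REFERENCES clients(client_id),
    order_status TEXT NOT NULL,
    tracking_number TEXT,
    delivery_address TEXT,
    estimated_delivery_time TEXT
);
"""


def create_database():
    """Create the four tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_database()
    conn.execute(
        "INSERT INTO conversations (message_id, client_number, message_content, timestamp, direction) "
        "VALUES (?, ?, ?, ?, ?)",
        ("msg-1", "254700000000", "Where is my parcel?", "2025-09-05T14:32:37Z", "inbound"),
    )
    print(conn.execute("SELECT message_id, categorization_status FROM conversations").fetchall())
```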

&lt;p&gt;2.2 Database Connectivity&lt;br&gt;
Use SQLAlchemy as the database abstraction layer between your application and Microsoft SQL Server. The connection configuration below uses the tracker_store layout found in Rasa's endpoints.yml; if your application is not built on Rasa, treat it as a template for your own configuration file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;tracker_store:
    type: SQL
    dialect: "mssql+pyodbc"
    url: "localhost"
    db: "conversation_db"
    username: "your_username"
    password: "your_password"
    query:
      driver: "SQL+Server+Native+Client+11.0"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This configuration connects your application to the SQL database so that conversation data can be stored and retrieved.&lt;/p&gt;

&lt;p&gt;3 WhatsApp Cloud API Integration&lt;br&gt;
3.1 Initial Setup and Configuration&lt;br&gt;
Meta Developer Account: Create a business app through the Meta Developer Portal and add the WhatsApp product to your application. This process creates a test WhatsApp Business Account (WABA) that allows you to send free test messages to up to 5 recipient numbers during development.&lt;/p&gt;

&lt;p&gt;Access Token Generation: Generate an access token through the WhatsApp &amp;gt; API Setup section in your App Dashboard. This token authenticates your API requests to the WhatsApp Cloud API, enabling secure communication between your system and WhatsApp's infrastructure.&lt;/p&gt;

&lt;p&gt;Recipient Number Management: Add and verify valid WhatsApp numbers that will receive messages from your system. The verification process involves sending a confirmation code through WhatsApp that recipients must provide to validate their numbers.&lt;/p&gt;

&lt;p&gt;3.2 Message Handling Implementation&lt;br&gt;
Receiving Messages: Configure webhooks to receive real-time HTTP notifications of incoming messages from clients. Implement the following endpoint to handle incoming messages:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from fastapi import FastAPI, Request

app = FastAPI()


@app.post("/webhook/whatsapp")
async def handle_whatsapp_message(request: Request):
    data = await request.json()
    value = data["entry"][0]["changes"][0]["value"]
    # Status callbacks (sent, delivered, read) carry no "messages" key, so skip them
    for message in value.get("messages", []):
        if message.get("type") != "text":
            continue  # only plain-text messages are handled here
        message_content = message["text"]["body"]
        client_number = message["from"]
        # Store message in database and process through LangChain
    return {"status": "success"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sending Messages: Use the WhatsApp Cloud API's POST endpoint to send messages to clients. The API supports multiple message types including text, image, document, and interactive messages, allowing rich communication with clients.&lt;/p&gt;
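
&lt;p&gt;To make the sending side concrete, here is a minimal sketch that builds the request body for a plain-text message and the URL it is posted to. The API version string, the placeholder phone number ID, and the sample recipient are assumptions; the request itself (an HTTP POST with an Authorization: Bearer header carrying your access token) is left out so the example runs without network access.&lt;/p&gt;

```python
import json

GRAPH_API_VERSION = "v19.0"  # assumption: use whichever version your app targets


def build_text_message(to_number, body):
    """Return the JSON payload for a plain-text WhatsApp message."""
    return {
        "messaging_product": "whatsapp",
        "to": to_number,
        "type": "text",
        "text": {"body": body},
    }


def messages_url(phone_number_id):
    """Return the messages endpoint for a given business phone number ID."""
    return "https://graph.facebook.com/{}/{}/messages".format(GRAPH_API_VERSION, phone_number_id)


if __name__ == "__main__":
    payload = build_text_message("254700000000", "Your parcel is out for delivery.")
    print(messages_url("PHONE_NUMBER_ID"))
    print(json.dumps(payload, indent=2))
```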

&lt;p&gt;Customer Service Windows: Track the 24-hour customer service window, which opens when a client messages you. Outside this window you can send only pre-approved template messages, so plan automated responses and notifications around that limit to comply with WhatsApp's policies.&lt;/p&gt;
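
&lt;p&gt;The window check itself is simple arithmetic on the timestamp of the client's most recent inbound message. The sketch below assumes you store that timestamp in UTC; the function name is hypothetical.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

SERVICE_WINDOW = timedelta(hours=24)


def requires_template(last_inbound_at, now=None):
    """Return True when 24 hours or more have passed since the client's last message."""
    now = now or datetime.now(timezone.utc)
    return now - last_inbound_at >= SERVICE_WINDOW


if __name__ == "__main__":
    last_message = datetime(2025, 9, 5, 12, 0, tzinfo=timezone.utc)
    check_time = datetime(2025, 9, 6, 13, 0, tzinfo=timezone.utc)
    print(requires_template(last_message, now=check_time))
```

&lt;p&gt;When the function returns True, the reply has to go out as an approved template instead of a free-form message.&lt;/p&gt;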

&lt;p&gt;4 LangChain Processing and Categorization&lt;br&gt;
4.1 Conversation Categorization Setup&lt;br&gt;
Implement LangChain's custom categorization functionality to classify client conversations into meaningful categories that determine appropriate responses and actions. This involves:&lt;/p&gt;

&lt;p&gt;Training Data Preparation: Create a structured dataset of sample client messages, each mapped to a relevant category and subcategory.&lt;/p&gt;

&lt;p&gt;Custom Category Model: Develop a LangChain model that understands your specific domain context and can accurately classify client queries into categories such as "Order Status Inquiry," "Delivery Problem," "New Order Placement," "Complaint," or "General Information Request."&lt;/p&gt;

&lt;p&gt;Context Preservation: Implement conversation memory within LangChain to maintain context across multiple messages, enabling the system to handle complex multi-turn conversations without losing track of the client's original query.&lt;/p&gt;

&lt;p&gt;4.2 LangChain Component Implementation&lt;br&gt;
Create a custom LangChain integration package that handles your specific categorization and response generation needs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from typing import Any, Dict, List, Optional

from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class CustomChatModel(BaseChatModel):
    """Custom chat model for client conversation categorization"""

    @property
    def _llm_type(self) -&amp;gt; str:
        # Required by BaseChatModel; identifies this model type
        return "custom-categorization-chat-model"

    def _generate(self, messages: List[BaseMessage], stop: Optional[List[str]] = None,
                  run_manager: Optional[Any] = None, **kwargs: Any) -&amp;gt; ChatResult:
        # Process messages through categorization model
        categorized_message = self.categorize_message(messages[-1].content)
        # Generate appropriate response based on category
        response_content = self.generate_response(categorized_message)
        # Return formatted response
        return ChatResult(generations=[ChatGeneration(message=AIMessage(content=response_content))])

    def categorize_message(self, message_content: str) -&amp;gt; Dict:
        """Categorize message content into predefined categories"""
        # Implementation of categorization logic
        raise NotImplementedError

    def generate_response(self, categorized_message: Dict) -&amp;gt; str:
        """Generate a reply for the categorized message"""
        raise NotImplementedError
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This custom implementation allows you to tailor the processing of client messages to your specific business needs and integration requirements.&lt;/p&gt;
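
&lt;p&gt;The categorization step is left as a placeholder above. While a trained model is being prepared, a keyword-based stand-in can be useful for testing the rest of the pipeline. The sketch below is that stand-in, not a LangChain feature: the keyword lists are invented for illustration, and simple substring matching will misfire on some messages.&lt;/p&gt;

```python
# Hypothetical keyword lists for the categories named in section 4.1.
CATEGORY_KEYWORDS = {
    "Order Status Inquiry": ("where is", "status", "track"),
    "Delivery Problem": ("late", "damaged", "missing", "wrong address"),
    "New Order Placement": ("place an order", "new order", "pickup"),
    "Complaint": ("complain", "unhappy", "refund"),
}
DEFAULT_CATEGORY = "General Information Request"


def categorize_message(message_content):
    """Return the first category whose keywords appear in the message."""
    text = message_content.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {"category": category, "matched": True}
    return {"category": DEFAULT_CATEGORY, "matched": False}


if __name__ == "__main__":
    for sample in ("Where is my parcel?", "My package arrived damaged", "What are your opening hours?"):
        print(sample, "=", categorize_message(sample)["category"])
```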

&lt;p&gt;5 Courier Management System Integration&lt;br&gt;
5.1 API Integration Design&lt;br&gt;
Order Status Retrieval: Implement functions to fetch real-time order status information from your courier management system. This enables accurate responses to client inquiries about their delivery progress, leveraging the real-time tracking capabilities of modern courier systems.&lt;/p&gt;

&lt;p&gt;Order Update Operations: Develop methods to update order records in the courier management system based on client interactions processed through LangChain. This includes functionality to modify delivery instructions, reschedule deliveries, or initiate returns based on client requests.&lt;/p&gt;

&lt;p&gt;Webhook Implementation: Create endpoints to receive notifications from the courier management system about delivery status changes, which can then be proactively communicated to clients through WhatsApp without requiring them to inquire first.&lt;/p&gt;

&lt;p&gt;5.2 Key Courier Management Features to Leverage&lt;br&gt;
Route Optimization: Utilize the courier system's route optimization capabilities to provide clients with accurate delivery estimates and efficiently plan delivery routes.&lt;/p&gt;

&lt;p&gt;Real-Time Tracking: Integrate with the GPS tracking features of your courier management system to provide clients with precise location information about their deliveries when requested.&lt;/p&gt;

&lt;p&gt;Proof of Delivery: Implement functionality to retrieve and send proof of delivery documents (signatures, photos) through WhatsApp when clients confirm receipt of their packages, enhancing transaction transparency and reducing disputes.&lt;/p&gt;

&lt;p&gt;6 Security and Compliance Considerations&lt;br&gt;
6.1 Data Privacy and Protection&lt;br&gt;
Encryption Implementation: Ensure all data transmitted between components (WhatsApp, your application, database, and courier system) is encrypted using TLS 1.2 or higher. Implement end-to-end encryption for sensitive customer information to prevent unauthorized access.&lt;/p&gt;

&lt;p&gt;Access Control Measures: Establish strict access control policies for your SQL database and APIs. Implement role-based access control (RBAC) to ensure that only authorized personnel and systems can retrieve or modify customer data and conversation history.&lt;/p&gt;

&lt;p&gt;Audit Logging: Maintain comprehensive logs of all system activities, including message processing, database access, and API calls to the courier management system. These logs support security monitoring and compliance auditing processes.&lt;/p&gt;

&lt;p&gt;6.2 WhatsApp Business Policy Compliance&lt;br&gt;
Opt-in Requirements: Ensure you have proper opt-in from customers before sending them messages via WhatsApp. Implement mechanisms to record and store consent evidence to comply with WhatsApp's business policies.&lt;/p&gt;

&lt;p&gt;Message Quality Maintenance: Monitor your message quality rating provided by WhatsApp and adjust your messaging practices accordingly. Avoid sending excessive messages or content that violates WhatsApp's policies to maintain good standing.&lt;/p&gt;

&lt;p&gt;Template Message Approval: Submit all template messages for approval before using them outside the 24-hour customer service window. Ensure templates comply with WhatsApp's content guidelines to prevent delivery issues.&lt;/p&gt;

&lt;p&gt;7 Implementation Plan and Testing Strategy&lt;br&gt;
7.1 Phased Implementation Approach&lt;br&gt;
Phase 1 - Core Infrastructure: Set up the SQL database structure, implement basic WhatsApp message sending/receiving capabilities, and develop the initial LangChain categorization model with basic functionality.&lt;/p&gt;

&lt;p&gt;Phase 2 - Integration Development: Build connectors between LangChain and your courier management system's APIs, implement advanced conversation categorization, and develop the order update functionality.&lt;/p&gt;

&lt;p&gt;Phase 3 - Optimization and Scaling: Refine the categorization accuracy based on real conversations, optimize system performance, and implement additional features such as proactive notifications and multimedia support.&lt;/p&gt;

&lt;p&gt;7.2 Testing Methodology&lt;br&gt;
Unit Testing: Develop comprehensive tests for individual components including database operations, message categorization, API integrations, and response generation.&lt;/p&gt;

&lt;p&gt;Integration Testing: Test the complete workflow from WhatsApp message reception through LangChain processing to courier system updates, verifying data integrity and system responsiveness at each step.&lt;/p&gt;

&lt;p&gt;Load Testing: Simulate high message volumes to ensure the system can handle peak loads without performance degradation, particularly important for businesses with large customer bases or seasonal peaks.&lt;/p&gt;

&lt;p&gt;8 Conclusion and Next Steps&lt;br&gt;
This comprehensive workflow provides a robust foundation for integrating client conversations from WhatsApp with your SQL database, LangChain processing system, and courier management infrastructure. By implementing this architecture, you'll create a seamless experience for clients who can get timely, accurate information about their orders while your team benefits from automated processing of common inquiries.&lt;/p&gt;

&lt;p&gt;The next steps involve setting up the development environment, creating detailed technical specifications for each component, and beginning implementation following the phased approach outlined above. Regular testing and iteration based on real-world usage will help refine the system's accuracy and performance over time, ultimately resulting in improved customer satisfaction and operational efficiency.&lt;/p&gt;


</description>
    </item>
    <item>
      <title>🚚 How Apache Kafka Can Transform Courier &amp; Logistics Operations</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Tue, 26 Aug 2025 12:14:32 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-apache-kafka-can-transform-courier-logistics-operations-29bh</link>
      <guid>https://dev.to/mwacharo/how-apache-kafka-can-transform-courier-logistics-operations-29bh</guid>
      <description>&lt;p&gt;In today’s fast-paced world, courier and logistics companies like Boxleo need to process huge amounts of data in real-time—from order placements, driver assignments, package tracking, customer notifications, to warehouse inventory updates.&lt;/p&gt;

&lt;p&gt;But here’s the challenge:&lt;br&gt;
Most traditional systems process data in batches or rely on slow APIs, which means updates don’t flow instantly. Customers and managers end up frustrated when data isn’t real-time.&lt;/p&gt;

&lt;p&gt;This is where Apache Kafka comes in.&lt;/p&gt;

&lt;p&gt;🔹 What is Apache Kafka?&lt;/p&gt;

&lt;p&gt;Apache Kafka is a real-time data streaming platform. Think of it as a digital highway that moves events (like “Order Created” or “Parcel Delivered”) instantly from one system to another.&lt;/p&gt;

&lt;p&gt;Instead of waiting for batch updates, Kafka lets information flow like a live conversation across different parts of your company.&lt;/p&gt;

&lt;p&gt;🔹 How Does It Work (Simplified)?&lt;/p&gt;

&lt;p&gt;Producer = The sender of events (e.g., when a customer places an order in the Boxleo system).&lt;/p&gt;

&lt;p&gt;Kafka = The highway that transports those events in real-time.&lt;/p&gt;

&lt;p&gt;Consumer = The receiver (e.g., warehouse system, rider app, customer notification service).&lt;/p&gt;

&lt;p&gt;👉 Example:&lt;/p&gt;

&lt;p&gt;Customer orders a package pickup.&lt;/p&gt;

&lt;p&gt;“Order Created” event is sent to Kafka.&lt;/p&gt;

&lt;p&gt;Kafka immediately makes it available to:&lt;/p&gt;

&lt;p&gt;The driver app (assigning a rider fast).&lt;/p&gt;

&lt;p&gt;The warehouse system (preparing space).&lt;/p&gt;

&lt;p&gt;The customer notification system (sending SMS/WhatsApp updates).&lt;/p&gt;
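
&lt;p&gt;The fan-out in this example can be illustrated with a toy, in-memory model. This is not Kafka and uses no Kafka client library; it only mimics one idea, that events are appended to a log and each consumer group keeps its own read position, so several systems can read the same event independently. All names are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


class InMemoryLog:
    """Toy stand-in for a topic: an append-only list with per-group offsets."""

    def __init__(self):
        self.events = []
        self.offsets = defaultdict(int)

    def produce(self, event):
        """Append an event to the end of the log."""
        self.events.append(event)

    def poll(self, group):
        """Return events this consumer group has not seen yet and advance its offset."""
        start = self.offsets[group]
        self.offsets[group] = len(self.events)
        return self.events[start:]


if __name__ == "__main__":
    orders = InMemoryLog()
    orders.produce({"type": "Order Created", "order_id": 1})
    for group in ("driver-app", "warehouse", "notifications"):
        print(group, orders.poll(group))
```

&lt;p&gt;Each group receives the same "Order Created" event, and polling again returns nothing until a new event is produced.&lt;/p&gt;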

&lt;p&gt;🔹 Why Is Kafka Important for Logistics?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Real-Time Tracking&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Customers can track their package instantly because every scan or movement is streamed to the system in real-time.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Better Decision Making&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Operations teams see live dashboards powered by Kafka + analytics tools (like Grafana), so they can quickly respond to delays or reroute drivers.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Faster Integrations&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Boxleo already integrates with multiple vendors, e-commerce platforms (like WooCommerce, Shopify), and warehouses. Kafka makes these integrations real-time and scalable.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Improved Customer Experience&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No more delays in sending SMS or WhatsApp updates. Customers receive delivery status as it happens, just like Uber ridesharing updates.&lt;/p&gt;

&lt;p&gt;🔹 Example Use Case at Boxleo&lt;/p&gt;

&lt;p&gt;Let’s imagine Boxleo wants to offer real-time delivery notifications:&lt;/p&gt;

&lt;p&gt;A driver scans a parcel at the warehouse → Kafka streams “Parcel Scanned” event → Customer gets an SMS in less than 2 seconds.&lt;/p&gt;

&lt;p&gt;If the driver changes routes due to traffic → Kafka streams “Route Updated” → Control center sees the live location instantly.&lt;/p&gt;

&lt;p&gt;When delivery is complete → Kafka streams “Parcel Delivered” → Automatically updates finance system for invoicing.&lt;/p&gt;

&lt;p&gt;🔹 The Business Impact&lt;/p&gt;

&lt;p&gt;✅ Efficiency: Faster order-to-delivery cycle.&lt;br&gt;
✅ Customer Trust: Transparency through real-time updates.&lt;br&gt;
✅ Scalability: Handles thousands of events per second as Boxleo grows.&lt;br&gt;
✅ Future-Proof: Easily integrates with AI/ML systems for smart predictions (e.g., “which routes are always late?”).&lt;/p&gt;

&lt;p&gt;🔹 Final Thoughts&lt;/p&gt;

&lt;p&gt;Apache Kafka may sound technical, but in reality, it’s about making courier operations run as smoothly and quickly as possible.&lt;/p&gt;

&lt;p&gt;For Boxleo, adopting Kafka means:&lt;/p&gt;

&lt;p&gt;Real-time tracking like global logistics leaders (DHL, UPS).&lt;/p&gt;

&lt;p&gt;A modern, scalable system ready for growth.&lt;/p&gt;

&lt;p&gt;Happier customers who stay loyal because they trust the updates.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Boosting System Performance at Boxleo Courier: Why Prometheus and Grafana Matter</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Tue, 26 Aug 2025 12:05:54 +0000</pubDate>
      <link>https://dev.to/mwacharo/boosting-system-performance-at-boxleo-courier-why-prometheus-and-grafana-matter-2lid</link>
      <guid>https://dev.to/mwacharo/boosting-system-performance-at-boxleo-courier-why-prometheus-and-grafana-matter-2lid</guid>
      <description>&lt;p&gt;At Boxleo Courier, technology plays a critical role in how we deliver services to our customers. From order tracking to warehouse operations, everything runs through our systems. But like any growing company, we face challenges—especially slow system performance during busy hours when hundreds of orders are being processed at once.&lt;/p&gt;

&lt;p&gt;So, how do we make sure our system stays fast, reliable, and ready for growth? The answer lies in monitoring tools—and two of the most powerful ones are Prometheus and Grafana.&lt;/p&gt;

&lt;p&gt;What is Prometheus?&lt;/p&gt;

&lt;p&gt;Prometheus is an open-source monitoring system designed for reliability and scalability. Think of it as the "data collector" for your servers, applications, and services. It records performance metrics like:&lt;/p&gt;

&lt;p&gt;CPU usage – How much processing power your droplet is using.&lt;/p&gt;

&lt;p&gt;Memory usage – Whether your system is running out of RAM.&lt;/p&gt;

&lt;p&gt;Network traffic – How much data is moving in and out of your system.&lt;/p&gt;

&lt;p&gt;Application response times – How quickly your system responds to requests.&lt;/p&gt;

&lt;p&gt;By collecting this data, Prometheus helps us see problems before they become disasters.&lt;/p&gt;

&lt;p&gt;What is Grafana?&lt;/p&gt;

&lt;p&gt;While Prometheus collects the data, Grafana makes it beautiful and understandable. It’s a data visualization and dashboard tool.&lt;/p&gt;

&lt;p&gt;With Grafana, we can:&lt;/p&gt;

&lt;p&gt;Create real-time dashboards that show system health at a glance.&lt;/p&gt;

&lt;p&gt;Set alerts (for example, if CPU usage stays above 80% for more than 5 minutes).&lt;/p&gt;

&lt;p&gt;Share dashboards with the team so everyone knows what’s happening.&lt;/p&gt;

&lt;p&gt;Imagine being able to see live charts showing when the system is slowing down. Instead of guessing, we’d know exactly what is wrong and where to fix it.&lt;/p&gt;
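
&lt;p&gt;The alert example above (CPU above 80% for 5 minutes) comes down to checking that a breach has been continuous for long enough. The sketch below shows that logic in plain Python on a list of samples; it is an illustration of the idea, not how Grafana or Prometheus evaluate rules internally, and the function name and sample format are assumptions.&lt;/p&gt;

```python
def should_alert(samples, threshold=80.0, hold_seconds=300):
    """Decide whether a sustained breach is in progress.

    samples is a list of (unix_seconds, cpu_percent) pairs, oldest first.
    Returns True when the most recent unbroken run of samples above the
    threshold has lasted at least hold_seconds.
    """
    breach_started = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_started is None:
                breach_started = timestamp
        else:
            breach_started = None  # any dip below the threshold resets the timer
    if breach_started is None:
        return False
    return samples[-1][0] - breach_started >= hold_seconds


if __name__ == "__main__":
    sustained = [(t, 90.0) for t in range(0, 301, 60)]
    print(should_alert(sustained))
```

&lt;p&gt;Requiring the condition to hold for several minutes keeps short spikes from paging the team.&lt;/p&gt;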

&lt;p&gt;Why These Tools Are Important for Boxleo Courier&lt;/p&gt;

&lt;p&gt;Right now, Boxleo runs all operations on a single droplet (server). During busy hours, this can easily get overloaded. Without monitoring, we’re flying blind—we don’t know whether the slowdown is caused by:&lt;/p&gt;

&lt;p&gt;A spike in traffic,&lt;/p&gt;

&lt;p&gt;A memory leak in the application, or&lt;/p&gt;

&lt;p&gt;An overloaded database.&lt;/p&gt;

&lt;p&gt;By using Prometheus and Grafana, we gain visibility and control. That means:&lt;/p&gt;

&lt;p&gt;✅ Faster response to problems (before customers notice)&lt;br&gt;
✅ Smarter scaling decisions (add more resources only when needed)&lt;br&gt;
✅ Better planning for growth (predict when we’ll outgrow a single droplet)&lt;br&gt;
✅ Improved customer satisfaction (fewer delays, smoother experience)&lt;/p&gt;

&lt;p&gt;The Future: Smarter Scaling with Kubernetes&lt;/p&gt;

&lt;p&gt;Monitoring is just the first step. Once we fully understand our performance patterns, we can use Kubernetes to scale automatically during busy hours. Prometheus and Grafana will still be part of this setup—providing insights and ensuring our cluster is always healthy.&lt;/p&gt;

&lt;p&gt;Final Thoughts&lt;/p&gt;

&lt;p&gt;Prometheus and Grafana aren’t just fancy tools for engineers. They are business enablers. By keeping our systems healthy and reliable, we make sure Boxleo can deliver faster, better, and smarter services to our customers.&lt;/p&gt;

&lt;p&gt;In short:&lt;/p&gt;

&lt;p&gt;Prometheus = data collection&lt;/p&gt;

&lt;p&gt;Grafana = data visualization&lt;/p&gt;

&lt;p&gt;Together = visibility + action&lt;/p&gt;

&lt;p&gt;At Boxleo, adopting these tools is an investment in better performance and customer experience.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How Boxleo is Using AI-Powered Analytics to Transform Courier &amp; Fulfillment in Africa</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Sun, 24 Aug 2025 17:30:51 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-boxleo-is-using-ai-powered-analytics-to-transform-courier-fulfillment-in-africa-ln4</link>
      <guid>https://dev.to/mwacharo/how-boxleo-is-using-ai-powered-analytics-to-transform-courier-fulfillment-in-africa-ln4</guid>
      <description>&lt;p&gt;In Africa’s fast-growing logistics and e-commerce landscape, success is no longer defined by how quickly a courier can deliver a package — but by how intelligently they use data to make decisions. At Boxleo Courier &amp;amp; Fulfillment Services, we are harnessing the power of artificial intelligence (AI) and advanced analytics to not only move parcels, but also move businesses forward.&lt;/p&gt;

&lt;p&gt;This blog explores how Boxleo is applying AI-driven courier analytics to tackle failed fulfillment, optimize logistics, and guide smarter decision-making — and why this is a blueprint for logistics companies across Africa.&lt;/p&gt;

&lt;p&gt;Why AI Matters for Courier &amp;amp; Fulfillment in Africa&lt;/p&gt;

&lt;p&gt;Courier and fulfillment companies across the continent face unique challenges:&lt;/p&gt;

&lt;p&gt;High rates of failed deliveries due to incomplete addresses, unavailability, or damaged items.&lt;/p&gt;

&lt;p&gt;Traffic congestion and infrastructure issues that affect routing and timelines.&lt;/p&gt;

&lt;p&gt;Limited visibility into vendor, driver, and warehouse performance.&lt;/p&gt;

&lt;p&gt;AI-driven logistics analytics helps solve these problems by:&lt;/p&gt;

&lt;p&gt;Predicting failures before they happen.&lt;/p&gt;

&lt;p&gt;Optimizing resources in real time.&lt;/p&gt;

&lt;p&gt;Providing leaders with actionable insights for growth.&lt;/p&gt;

&lt;p&gt;Boxleo’s AI Pipeline: Turning 1.5M Records into Insights&lt;/p&gt;

&lt;p&gt;To analyze fulfillment performance at scale, Boxleo applies an end-to-end AI pipeline that transforms raw operational data into strategic intelligence.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Data Ingestion&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We capture data from multiple sources: order systems, warehouse scans, delivery apps, customer interactions, and vendor reports.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Storage &amp;amp; Infrastructure&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Data is consolidated in a secure central warehouse (Postgres/S3), ensuring scalability and accessibility for analysis.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;ETL &amp;amp; Feature Engineering&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Normalize failure reasons (e.g., “cust nt home”, “customer not around” → Customer unavailable).&lt;/p&gt;

&lt;p&gt;Generate features like time-to-dispatch, route distance, vendor reliability, and customer delivery history.&lt;/p&gt;

&lt;p&gt;Process courier notes and complaints using natural language processing (NLP).&lt;/p&gt;
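
&lt;p&gt;The normalization step can be sketched with a few regular expressions. This is a simplified illustration, not Boxleo's actual pipeline: the patterns and the labels other than "Customer unavailable" are invented for the example, and a real mapping would be built from the courier notes themselves.&lt;/p&gt;

```python
import re

# Hypothetical patterns; each maps free-text variants to one canonical reason.
REASON_PATTERNS = [
    (re.compile(r"\b(cust|customer)\b.*\b(nt|not)\b.*\b(home|around|available)\b"), "Customer unavailable"),
    (re.compile(r"\b(wrong|incomplete|bad)\b.*\b(address|location)\b"), "Address problem"),
    (re.compile(r"\b(damaged|broken)\b"), "Damaged item"),
]


def normalize_reason(raw_note):
    """Map a courier's free-text failure note to a canonical failure reason."""
    text = re.sub(r"\s+", " ", raw_note.strip().lower())
    for pattern, label in REASON_PATTERNS:
        if pattern.search(text):
            return label
    return "Other"


if __name__ == "__main__":
    for note in ("cust nt home", "customer not around", "Incomplete address given"):
        print(note, "=", normalize_reason(note))
```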

&lt;ol start="4"&gt;
&lt;li&gt;Modeling (AI &amp;amp; ML)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We use a hybrid approach:&lt;/p&gt;

&lt;p&gt;XGBoost / LightGBM for tabular data (order, vendor, customer features).&lt;/p&gt;

&lt;p&gt;Transformers &amp;amp; NLP embeddings for analyzing courier free-text notes.&lt;/p&gt;

&lt;p&gt;SHAP explainability to show why the model predicts a failure risk for each order.&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Serving &amp;amp; Integration&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A FastAPI microservice hosts the model, while Laravel calls the API in real time to return:&lt;/p&gt;

&lt;p&gt;Failure risk score (0–1) for each order.&lt;/p&gt;

&lt;p&gt;Top likely failure reasons.&lt;/p&gt;

&lt;p&gt;Recommended prevention actions (e.g., confirm address before dispatch).&lt;/p&gt;
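
&lt;p&gt;The three outputs above could be assembled into one response payload roughly as follows. This is a sketch under stated assumptions: the field names, the action table, and the idea that per-reason contributions come from something like SHAP values are illustrative, not the service's real contract.&lt;/p&gt;

```python
# Hypothetical mapping from failure reason to a prevention action.
PREVENTION_ACTIONS = {
    "Incomplete address": "Confirm address before dispatch",
    "Customer unavailable": "Call customer to agree on a delivery window",
    "Damaged item": "Inspect packaging at the warehouse",
}


def build_risk_response(order_id, risk_score, reason_contributions, top_n=2):
    """Combine a model score and per-reason contributions into one payload."""
    # Rank reasons by contribution, largest first.
    ranked = sorted(reason_contributions, key=reason_contributions.get, reverse=True)
    top_reasons = ranked[:top_n]
    actions = [PREVENTION_ACTIONS[r] for r in top_reasons if r in PREVENTION_ACTIONS]
    return {
        "order_id": order_id,
        # Clamp to the documented 0 to 1 range.
        "failure_risk": min(max(float(risk_score), 0.0), 1.0),
        "top_reasons": top_reasons,
        "recommended_actions": actions,
    }


response = build_risk_response(
    "ORD-1",
    0.82,
    {"Incomplete address": 0.4, "Customer unavailable": 0.3, "Damaged item": 0.05},
)
print(response["recommended_actions"][0])  # Confirm address before dispatch
```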

&lt;ol start="6"&gt;
&lt;li&gt;Monitoring &amp;amp; Retraining&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Using Airflow orchestration, models are retrained periodically with fresh data. We monitor for data drift, track performance, and trigger alerts if failure rates spike.&lt;/p&gt;
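
&lt;p&gt;One common way to quantify data drift is the Population Stability Index (PSI), which compares a feature's histogram at training time with its histogram in recent data. The sketch below is an assumption about how such a check might look, not a description of Boxleo's monitoring; the bin counts and the 0.2 cutoff are illustrative.&lt;/p&gt;

```python
import math

DRIFT_THRESHOLD = 0.2  # assumed cutoff for this sketch; tune per feature


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """PSI between two histograms that share the same bins."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Convert counts to proportions; eps avoids log(0) for empty bins.
        e = max(expected / expected_total, eps)
        a = max(actual / actual_total, eps)
        psi += (a - e) * math.log(a / e)
    return psi


def drift_detected(psi, threshold=DRIFT_THRESHOLD):
    """True when the index reaches the configured threshold."""
    return psi >= threshold


training_bins = [50, 30, 20]  # e.g. route-distance buckets at training time
recent_bins = [20, 30, 50]    # the same buckets over the most recent period
print(drift_detected(population_stability_index(training_bins, recent_bins)))  # True
```

&lt;p&gt;A scheduled task could run a check like this per feature and trigger retraining or an alert when it returns True.&lt;/p&gt;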

&lt;p&gt;Real-World Impact&lt;/p&gt;

&lt;p&gt;With AI in place, Boxleo can:&lt;/p&gt;

&lt;p&gt;Reduce failed fulfillment by flagging high-risk deliveries before dispatch.&lt;/p&gt;

&lt;p&gt;Optimize courier routes in congested African cities for faster and greener delivery.&lt;/p&gt;

&lt;p&gt;Empower decision-makers with dashboards that highlight vendor performance, seasonal demand patterns, and customer behavior trends.&lt;/p&gt;

&lt;p&gt;This leads to lower costs, improved customer satisfaction, and scalable operations — key ingredients for thriving in Africa’s competitive logistics space.&lt;/p&gt;

&lt;p&gt;Boxleo as a Model for African Logistics&lt;/p&gt;

&lt;p&gt;While Boxleo is leading the way, this approach has continent-wide potential. Many courier and fulfillment companies in Africa still rely on manual processes and gut-feel decisions. By adopting AI-powered courier analytics, these companies can:&lt;/p&gt;

&lt;p&gt;Improve efficiency and reliability.&lt;/p&gt;

&lt;p&gt;Scale across new regions with confidence.&lt;/p&gt;

&lt;p&gt;Build trust with e-commerce partners and end customers.&lt;/p&gt;

&lt;p&gt;The future of logistics in Africa is predictive, data-driven, and customer-centric — and Boxleo is showing what’s possible.&lt;/p&gt;

&lt;p&gt;Conclusion&lt;/p&gt;

&lt;p&gt;At Boxleo Courier &amp;amp; Fulfillment, we are proving that AI is not just a buzzword — it’s a powerful tool for decision-making. By analyzing over 1.5 million fulfillment records, identifying the main drivers of failed deliveries, and creating risk-based alerts and recommendations, we are building smarter, faster, and more reliable logistics for Africa.&lt;/p&gt;

&lt;p&gt;For courier and fulfillment companies across the continent, the message is clear: AI is the key to turning data into decisions. Those who adopt it now will be the leaders of Africa’s next logistics revolution.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🚀 How Kubernetes Can Transform Courier Operations at Scale</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Sun, 24 Aug 2025 06:44:26 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-kubernetes-can-transform-courier-operations-at-scale-2kng</link>
      <guid>https://dev.to/mwacharo/how-kubernetes-can-transform-courier-operations-at-scale-2kng</guid>
      <description>&lt;p&gt;In the fast-paced world of logistics and courier services, speed and reliability are everything. Customers expect real-time tracking, merchants demand smooth order uploads, and call centers need systems that respond instantly. But what happens when the system slows down during peak hours?&lt;/p&gt;

&lt;p&gt;This is a challenge many courier companies face—including Boxleo Courier, which operates across multiple countries. With thousands of orders pouring in daily from Excel uploads, Google Sheets, Shopify, WooCommerce, and custom APIs, the workload on a single server can become overwhelming. Add to that call center scheduling, dispatch operations, barcode scanning, and mobile payment collections, and you can see why performance issues arise.&lt;/p&gt;

&lt;p&gt;So, how do you build a system that stays fast and reliable—even at peak traffic?&lt;/p&gt;

&lt;p&gt;The answer lies in Kubernetes.&lt;/p&gt;

&lt;p&gt;💡 What is Kubernetes?&lt;/p&gt;

&lt;p&gt;Kubernetes (K8s) is an open-source platform that automates the deployment, scaling, and management of applications inside containers (think Docker). Instead of relying on one server (droplet) to do everything, Kubernetes lets you spread workloads across a cluster of servers and scale automatically as demand increases.&lt;/p&gt;

&lt;p&gt;📦 Why Courier Operations Need Kubernetes&lt;/p&gt;

&lt;p&gt;Let’s break down the key pain points in courier operations and see how Kubernetes solves them:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Order Ingestion from Multiple Channels&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Orders come from Excel, Google Sheets, Shopify, WooCommerce, and APIs.&lt;/p&gt;

&lt;p&gt;During busy hours, massive uploads overwhelm the server.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;Each channel runs in its own microservice pod.&lt;/p&gt;

&lt;p&gt;If uploads spike, Kubernetes automatically scales up importer pods to handle the load.&lt;/p&gt;

&lt;p&gt;Once traffic calms, it scales back down—saving costs.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Call Center &amp;amp; Scheduling&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Agents use Africa’s Talking to confirm orders.&lt;/p&gt;

&lt;p&gt;High call volumes during promotions can overload APIs.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;Calls are queued in a messaging system (Redis, RabbitMQ, or Kafka).&lt;/p&gt;

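&lt;p&gt;Kubernetes can auto-scale call-handling pods based on queue depth, typically by exposing the queue length to the autoscaler as an external metric (for example via KEDA).&lt;/p&gt;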

&lt;p&gt;Agents experience no lag, and calls are evenly distributed.&lt;/p&gt;
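
&lt;p&gt;To make the queue-depth idea concrete: with an average-value target, the Horizontal Pod Autoscaler's replica calculation reduces to dividing the total metric by the per-pod target and rounding up. The Python sketch below mirrors that arithmetic; the target of 10 queued calls per pod and the replica bounds are assumptions chosen for illustration.&lt;/p&gt;

```python
import math


def desired_replicas(queue_depth, target_per_pod, min_replicas=1, max_replicas=20):
    """Pods needed so each one handles about target_per_pod queued calls."""
    wanted = math.ceil(queue_depth / target_per_pod)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(95, 10))    # 10
print(desired_replicas(0, 10))     # 1
print(desired_replicas(1000, 10))  # 20
```

&lt;p&gt;The lower bound keeps at least one pod warm when the queue is empty, and the upper bound caps cost during extreme spikes.&lt;/p&gt;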

&lt;ol start="3"&gt;
&lt;li&gt;Printing &amp;amp; Dispatch Lists&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Huge batches of orders need picking lists and dispatch reports.&lt;/p&gt;

&lt;p&gt;PDF generation slows down the dashboard.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;Separate worker pods generate PDFs asynchronously.&lt;/p&gt;

&lt;p&gt;Multiple pods work in parallel—finishing jobs faster.&lt;/p&gt;

&lt;p&gt;Results are stored in object storage (DO Spaces/S3) for instant download.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Dispatch &amp;amp; Barcode Scanning&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every parcel must be scanned and assigned to a rider.&lt;/p&gt;

&lt;p&gt;With thousands of scans per hour, latency can creep in.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;A lightweight scan ingestion service ensures scans are written instantly.&lt;/p&gt;

&lt;p&gt;Processing happens in background pods, which scale out when scan volume spikes.&lt;/p&gt;

&lt;p&gt;No more delays during rider assignments.&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Payments &amp;amp; M-Pesa STK Push&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Riders collect payments in the field.&lt;/p&gt;

&lt;p&gt;Payment confirmations must be real-time to avoid disputes.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;Payment APIs and webhook receivers run in separate pods.&lt;/p&gt;

&lt;p&gt;If M-Pesa pushes thousands of callbacks, Kubernetes spins up more webhook pods.&lt;/p&gt;

&lt;p&gt;Webhook handlers are written to be idempotent, so a callback that is delivered more than once is still processed only once, with no duplicates.&lt;/p&gt;
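
&lt;p&gt;Idempotency is an application-level property, so it has to live in the webhook handler itself. A minimal sketch, assuming each callback carries a unique transaction ID: let a database uniqueness constraint reject repeats. The table and column names are hypothetical, and SQLite is used only to keep the example self-contained.&lt;/p&gt;

```python
import sqlite3


def open_store():
    """Create an in-memory store; the primary key enforces one row per ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (transaction_id TEXT PRIMARY KEY, amount REAL NOT NULL)"
    )
    return conn


def handle_callback(conn, transaction_id, amount):
    """Record a payment once; return True only for the first delivery."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO payments (transaction_id, amount) VALUES (?, ?)",
            (transaction_id, amount),
        )
    # rowcount is 1 when the row was inserted and 0 when it was ignored.
    return cursor.rowcount == 1


conn = open_store()
print(handle_callback(conn, "TX123", 1500.0))  # True
print(handle_callback(conn, "TX123", 1500.0))  # False
```

&lt;p&gt;Because the check and the insert are a single statement, two webhook pods receiving the same callback at once cannot both record it.&lt;/p&gt;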

&lt;ol start="6"&gt;
&lt;li&gt;Reporting Across 5 Countries&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Managers need daily reports and dashboards.&lt;/p&gt;

&lt;p&gt;Heavy queries during business hours slow down live operations.&lt;/p&gt;

&lt;p&gt;👉 With Kubernetes:&lt;/p&gt;

&lt;p&gt;Reporting reads from read replicas or a data warehouse, not the primary DB.&lt;/p&gt;

&lt;p&gt;Analytics pods scale independently, ensuring the core system stays fast.&lt;/p&gt;

&lt;p&gt;🌍 Why This Matters for Boxleo&lt;/p&gt;

&lt;p&gt;Boxleo operates in five countries. Running everything on a single droplet is like trying to run a multinational courier business from one office desk. Kubernetes provides:&lt;/p&gt;

&lt;p&gt;Scalability → handle more vendors, riders, and orders without slowdown.&lt;/p&gt;

&lt;p&gt;Resilience → if one server fails, others keep the system alive.&lt;/p&gt;

&lt;p&gt;Efficiency → scale only when needed, keeping infrastructure costs optimized.&lt;/p&gt;

&lt;p&gt;Flexibility → split services (orders, payments, reports) so each can scale independently.&lt;/p&gt;

&lt;p&gt;🛠 Migration Path (High-Level)&lt;/p&gt;

&lt;p&gt;Containerize the apps (Laravel API, Vue frontend, workers).&lt;/p&gt;

&lt;p&gt;Set up Kubernetes cluster on DigitalOcean.&lt;/p&gt;

&lt;p&gt;Move DB to a managed service with replicas.&lt;/p&gt;

&lt;p&gt;Deploy services as pods with autoscaling.&lt;/p&gt;

&lt;p&gt;Use queues for heavy jobs (imports, PDFs, calls).&lt;/p&gt;

&lt;p&gt;Add monitoring &amp;amp; logging for full visibility.&lt;/p&gt;

&lt;p&gt;🚀 The Future of Courier Tech&lt;/p&gt;

&lt;p&gt;Courier companies that adopt Kubernetes gain a competitive edge:&lt;/p&gt;

&lt;p&gt;No more slow dashboards during busy hours.&lt;/p&gt;

&lt;p&gt;Riders, call centers, and clients experience smooth, real-time operations.&lt;/p&gt;

&lt;p&gt;The system grows effortlessly as the business expands into new countries.&lt;/p&gt;

&lt;p&gt;For Boxleo, moving to Kubernetes isn’t just about fixing performance—it’s about future-proofing logistics technology for the next stage of growth.&lt;/p&gt;

&lt;p&gt;🔑 Bottom line:&lt;br&gt;
If your courier operations are struggling with performance at scale, Kubernetes turns bottlenecks into building blocks.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🚀 How Boxleo Courier is Leveraging AI to Match Silicon Valley Standards</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Wed, 20 Aug 2025 12:20:06 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-boxleo-courier-is-leveraging-ai-to-match-silicon-valley-standards-396n</link>
      <guid>https://dev.to/mwacharo/how-boxleo-courier-is-leveraging-ai-to-match-silicon-valley-standards-396n</guid>
      <description>&lt;p&gt;In today’s fast-moving logistics world, Artificial Intelligence (AI) is no longer a luxury — it’s the backbone of innovation. At Boxleo Courier &amp;amp; Fulfillment Services Ltd., we are building an AI-first courier ecosystem designed to deliver smarter, faster, and more personalized services across Kenya and beyond.&lt;/p&gt;

&lt;p&gt;Much like Silicon Valley’s leading tech giants, we believe that AI should not just optimize operations, but transform the entire customer experience, business model, and logistics ecosystem.&lt;/p&gt;

&lt;p&gt;📦 AI-Powered Customer Experience&lt;/p&gt;

&lt;p&gt;Imagine being able to track your package in real time through WhatsApp, ask a chatbot in Swahili where your delivery is, or even see your rider’s progress in augmented reality (AR).&lt;/p&gt;

&lt;p&gt;At Boxleo, our AI systems are designed to:&lt;/p&gt;

&lt;p&gt;Deliver personalized communication to each customer.&lt;/p&gt;

&lt;p&gt;Use voice AI to replace traditional call centers.&lt;/p&gt;

&lt;p&gt;Introduce AR-powered package tracking for a futuristic delivery experience.&lt;/p&gt;

&lt;p&gt;This means every client enjoys frictionless support, 24/7.&lt;/p&gt;

&lt;p&gt;🚴 Smarter Logistics with AI&lt;/p&gt;

&lt;p&gt;One of the biggest challenges for couriers is ensuring packages arrive on time while minimizing costs. Silicon Valley-inspired AI allows us to:&lt;/p&gt;

&lt;p&gt;Optimize routes dynamically like Uber or Sendy, adjusting instantly for traffic and demand.&lt;/p&gt;

&lt;p&gt;Build digital twins of our delivery network to simulate and test improvements before deploying.&lt;/p&gt;

&lt;p&gt;Explore autonomous delivery pilots — from drones to sidewalk robots for last-mile delivery.&lt;/p&gt;

&lt;p&gt;The result? Faster deliveries, reduced costs, and happier customers.&lt;/p&gt;

&lt;p&gt;📊 Predictive &amp;amp; Prescriptive Analytics&lt;/p&gt;

&lt;p&gt;Our AI doesn’t just report what happened — it predicts what will happen and prescribes actions.&lt;/p&gt;

&lt;p&gt;Forecast demand surges during festive seasons or promotions.&lt;/p&gt;

&lt;p&gt;Recommend staffing and fleet changes in real time.&lt;/p&gt;

&lt;p&gt;Optimize warehouses with AI-driven product placement.&lt;/p&gt;

&lt;p&gt;This turns Boxleo into a proactive courier, not just a reactive one.&lt;/p&gt;

&lt;p&gt;💰 Financial Intelligence&lt;/p&gt;

&lt;p&gt;AI goes beyond logistics — it powers our financial systems too:&lt;/p&gt;

&lt;p&gt;Dynamic billing and pricing, much like airlines, adjusting rates based on distance, demand, and truck capacity.&lt;/p&gt;

&lt;p&gt;Cash-on-delivery risk prediction, reducing defaults and fraud.&lt;/p&gt;

&lt;p&gt;AI reconciliation of rider collections and vendor invoices for instant accuracy.&lt;/p&gt;

&lt;p&gt;This ensures both merchants and riders benefit from transparent and efficient transactions.&lt;/p&gt;

&lt;p&gt;🌍 Building an AI Ecosystem&lt;/p&gt;

&lt;p&gt;Our vision goes further than packages. We’re building an AI-driven logistics and commerce platform:&lt;/p&gt;

&lt;p&gt;A merchant dashboard with predictive insights.&lt;/p&gt;

&lt;p&gt;Open APIs for fintechs and e-commerce to plug into our delivery network.&lt;/p&gt;

&lt;p&gt;Blockchain + AI for transparency and fraud prevention.&lt;/p&gt;

&lt;p&gt;Sustainability AI to optimize fuel usage and reduce carbon footprint.&lt;/p&gt;

&lt;p&gt;This makes Boxleo not just a courier service, but a data-driven platform powering East Africa’s commerce future.&lt;/p&gt;

&lt;p&gt;⚡ Why This Matters&lt;/p&gt;

&lt;p&gt;Silicon Valley companies succeed because they:&lt;/p&gt;

&lt;p&gt;Put data at the core.&lt;/p&gt;

&lt;p&gt;Build AI-first systems that learn and adapt.&lt;/p&gt;

&lt;p&gt;Create ecosystems, not just products.&lt;/p&gt;

&lt;p&gt;At Boxleo, we are embracing this philosophy. Our mission is clear:&lt;br&gt;
👉 Deliver excellence today. Build the future of logistics tomorrow.&lt;/p&gt;

&lt;p&gt;✍️ Stay tuned as we roll out these AI-driven solutions — from real-time route optimization to predictive analytics and smart warehouses. Boxleo is setting the pace for what courier services in Africa can become.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>From Call Recordings to Business Insights: Automating Transcription, Sentiment Analysis, and Order Fulfillment Prediction</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Mon, 11 Aug 2025 15:40:00 +0000</pubDate>
      <link>https://dev.to/mwacharo/from-call-recordings-to-business-insights-automating-transcription-sentiment-analysis-and-order-387f</link>
      <guid>https://dev.to/mwacharo/from-call-recordings-to-business-insights-automating-transcription-sentiment-analysis-and-order-387f</guid>
      <description>&lt;p&gt;In today’s competitive logistics and e-commerce landscape, every customer interaction matters. A phone call with a client isn’t just a conversation — it’s a goldmine of information about customer intent, satisfaction, and the likelihood of an order being fulfilled.&lt;/p&gt;

&lt;p&gt;At Boxleo Courier Company, we recently built an automated pipeline that takes Africa’s Talking call recordings, transcribes them into text, analyzes customer sentiment and intent, and even predicts the probability of order fulfillment — all without human intervention.&lt;/p&gt;

&lt;p&gt;Here’s how we did it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Problem&lt;/em&gt;&lt;br&gt;
We wanted to answer three key questions after every customer service call:&lt;/p&gt;

&lt;p&gt;What exactly was said? (A reliable transcript for reference)&lt;/p&gt;

&lt;p&gt;How did the customer feel? (Sentiment and tone)&lt;/p&gt;

&lt;p&gt;What’s the likelihood that this order will actually be fulfilled? (Predictive scoring)&lt;/p&gt;

&lt;p&gt;Doing this manually would be time-consuming, error-prone, and expensive. So, automation was the way to go.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Workflow&lt;/em&gt;&lt;br&gt;
Our system runs as a queued job in Laravel and follows this pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Capture the Call Recording&lt;br&gt;
Africa’s Talking Voice API lets us fetch a call recording URL once the call ends. We set up a webhook that receives this URL along with call metadata.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Transcribe the Audio&lt;br&gt;
We pass the audio file to a Speech-to-Text engine — in our case, OpenAI Whisper API. This gives us a clean transcript of the conversation, ready for analysis. (Google Speech-to-Text or AWS Transcribe would also work here.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Analyze the Transcript&lt;br&gt;
We run two layers of analysis:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Heuristic rules — Keyword matching for sentiment (positive, neutral, negative) and for common intents like order confirmation, cancellation, or rescheduling.&lt;/p&gt;

&lt;p&gt;Optional GPT-powered analysis — For richer insights and context-aware scoring.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Predict Fulfillment Probability&lt;br&gt;
Using detected intent and sentiment, we compute a 0–100% probability score:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Confirmed delivery? High score.&lt;/p&gt;

&lt;p&gt;Uncertain or “call me later”? Medium.&lt;/p&gt;

&lt;p&gt;Cancellation? Low.&lt;/p&gt;

&lt;p&gt;We also look for keywords like “paid,” “payment,” or “sent money” to boost confidence.&lt;/p&gt;
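
&lt;p&gt;The scoring rules above can be sketched in a few lines of Python. Every keyword list and number here is a hypothetical placeholder rather than the production configuration; only the overall shape (intent sets a base score, payment keywords add a boost) follows the description.&lt;/p&gt;

```python
# Hypothetical keyword lists and scores; tune them against real call outcomes.
INTENT_KEYWORDS = {
    "cancel": ("cancel", "don't want", "not interested"),
    "confirm": ("confirm", "deliver it", "i will be there"),
    "uncertain": ("call me later", "not sure", "maybe"),
}
BASE_SCORE = {"confirm": 85, "uncertain": 50, "cancel": 10, "unknown": 40}
PAYMENT_KEYWORDS = ("paid", "payment", "sent money")
PAYMENT_BOOST = 10


def detect_intent(transcript):
    """Return the first matching intent, checking cancellation first."""
    text = transcript.lower()
    for intent in ("cancel", "confirm", "uncertain"):
        if any(keyword in text for keyword in INTENT_KEYWORDS[intent]):
            return intent
    return "unknown"


def fulfillment_probability(transcript):
    """Score from 0 to 100: base score by intent, boosted by payment mentions."""
    text = transcript.lower()
    score = BASE_SCORE[detect_intent(text)]
    if any(keyword in text for keyword in PAYMENT_KEYWORDS):
        score += PAYMENT_BOOST
    return max(0, min(100, score))


print(fulfillment_probability("Yes, please confirm, I already paid"))  # 95
print(fulfillment_probability("Please call me later"))                 # 50
print(fulfillment_probability("I want to cancel the order"))           # 10
```

&lt;p&gt;Plain substring matching is easy to audit but misses paraphrases, which is where the optional GPT-powered layer would add value.&lt;/p&gt;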

&lt;ol start="5"&gt;
&lt;li&gt;&lt;p&gt;Rate Customer Service Quality&lt;br&gt;
By scanning for politeness markers (“thank you,” “appreciate”) and negative experiences (“rude,” “waited too long”), we assign a 1–5 star CS rating.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Store &amp;amp; Report&lt;br&gt;
All data — transcript, sentiment, score, and CS rating — is stored in our CRM, where team leads can monitor performance, flag issues, and make training decisions.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;The Tech Stack&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Laravel 10 — Orchestrates the entire workflow&lt;/p&gt;

&lt;p&gt;Africa’s Talking Voice API — Captures call recordings&lt;/p&gt;

&lt;p&gt;OpenAI Whisper API — Transcribes audio to text&lt;/p&gt;

&lt;p&gt;Custom NLP logic + GPT — Sentiment and intent analysis&lt;/p&gt;

&lt;p&gt;Redis Queue + Horizon — Handles background processing&lt;/p&gt;

&lt;p&gt;MySQL — Stores transcripts and analysis results&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why This Matters&lt;/em&gt;&lt;br&gt;
This automation isn’t just about saving time — it’s about unlocking actionable insights:&lt;/p&gt;

&lt;p&gt;Agents can be coached based on real interactions.&lt;/p&gt;

&lt;p&gt;High-risk orders can be flagged for follow-up.&lt;/p&gt;

&lt;p&gt;Managers get instant dashboards of service quality.&lt;/p&gt;

&lt;p&gt;We can quantify how call handling affects business outcomes.&lt;/p&gt;

&lt;p&gt;Results &amp;amp; Next Steps&lt;br&gt;
Since implementing the system:&lt;/p&gt;

&lt;p&gt;Average follow-up time on high-risk orders dropped by 45%.&lt;/p&gt;

&lt;p&gt;CSAT scores improved because supervisors had real examples to train from.&lt;/p&gt;

&lt;p&gt;We started exploring speaker diarization (separating agent vs. customer speech) for more accurate scoring.&lt;/p&gt;

&lt;p&gt;Next, we plan to integrate real-time call monitoring so supervisors can intervene during a live call if risk is detected.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Conclusion&lt;/em&gt;&lt;br&gt;
By combining call recording, transcription, sentiment analysis, and predictive scoring, we’ve turned ordinary phone calls into a data-driven decision engine.&lt;/p&gt;

&lt;p&gt;This isn’t limited to logistics — the same approach works for sales, support, and any business that relies on phone conversations to drive revenue.&lt;/p&gt;

&lt;p&gt;In the age of AI, your calls are more than conversations — they’re opportunities.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🚀 Seamless MySQL Migration: How Mwacharo Exported a Remote Database from DigitalOcean and Imported It Locally</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Thu, 31 Jul 2025 14:21:07 +0000</pubDate>
      <link>https://dev.to/mwacharo/seamless-mysql-migration-how-mwacharo-exported-a-remote-database-from-digitalocean-and-imported-3pfj</link>
      <guid>https://dev.to/mwacharo/seamless-mysql-migration-how-mwacharo-exported-a-remote-database-from-digitalocean-and-imported-3pfj</guid>
      <description>&lt;p&gt;When managing cloud-based applications like Solssa, there are times you need to back up or clone your production database for local development. In this tutorial, we’ll walk through how Mwacharo, a developer working on Solssa, exported a MySQL database from a DigitalOcean droplet and imported it into his local machine using simple Linux terminal tools.&lt;/p&gt;

&lt;p&gt;We’ll use the placeholder IP 123.456.78.90 for illustration. It is not a valid address (456 exceeds 255), so substitute your droplet’s real IP.&lt;/p&gt;

&lt;p&gt;🧱 Step 1: Access the Remote Server&lt;br&gt;
Mwacharo first tried to connect to his DigitalOcean droplet over SSH from his laptop:&lt;/p&gt;

&lt;p&gt;ssh root@123.456.78.90&lt;br&gt;
# Output:&lt;br&gt;
# Permission denied (publickey).&lt;/p&gt;

&lt;p&gt;This error meant his local machine didn’t yet have SSH access configured, so he logged in through the web-based Droplet Console instead.&lt;/p&gt;

&lt;p&gt;📦 Step 2: Export the Remote MySQL Database&lt;br&gt;
Once inside the droplet (via web console), he listed the available databases:&lt;/p&gt;

&lt;p&gt;SHOW DATABASES;&lt;br&gt;
The database he wanted to export was solssaSystem.&lt;/p&gt;

&lt;p&gt;He then dumped the database using:&lt;/p&gt;

&lt;p&gt;mysqldump -u root -p solssaSystem &amp;gt; solssaSystem.sql&lt;br&gt;
After entering the MySQL root password, the .sql file was created in /root/.&lt;/p&gt;

&lt;p&gt;🗜️ Step 3: Compress the SQL File&lt;br&gt;
Instead of using zip, which wasn't installed, Mwacharo used gzip (a common Linux alternative):&lt;/p&gt;

&lt;p&gt;gzip solssaSystem.sql&lt;br&gt;
# Result: solssaSystem.sql.gz&lt;/p&gt;

&lt;p&gt;🖇️ Step 4: Fix SSH Access for File Download&lt;br&gt;
On his local machine, Mwacharo tried to download the file using scp:&lt;/p&gt;

&lt;p&gt;scp root@123.456.78.90:/root/solssaSystem.sql.gz ~/Downloads/&lt;br&gt;
But he got the error:&lt;/p&gt;

&lt;p&gt;Permission denied (publickey).&lt;br&gt;
This meant his local machine wasn’t authorized to connect via SSH using key authentication.&lt;/p&gt;

&lt;p&gt;🔑 Step 5: Enable SSH Access with Public Key&lt;br&gt;
To connect securely:&lt;/p&gt;

&lt;p&gt;On his local machine, Mwacharo generated an SSH key:&lt;/p&gt;

&lt;p&gt;ssh-keygen&lt;br&gt;
Then copied his public key to the remote droplet:&lt;/p&gt;

&lt;p&gt;ssh-copy-id &lt;a href="mailto:root@123.456.78.90"&gt;root@123.456.78.90&lt;/a&gt;&lt;br&gt;
Alternatively, he could manually add his public key from ~/.ssh/id_rsa.pub into the droplet’s /root/.ssh/authorized_keys.&lt;/p&gt;

&lt;p&gt;📥 Step 6: Download the Database File&lt;br&gt;
Once SSH access was fixed, he downloaded the file:&lt;/p&gt;

&lt;p&gt;scp root@123.456.78.90:/root/solssaSystem.sql.gz ~/Downloads/&lt;/p&gt;

&lt;p&gt;🧩 Step 7: Extract and Import the File Locally&lt;br&gt;
Back on his local Ubuntu system:&lt;/p&gt;

&lt;p&gt;gunzip ~/Downloads/solssaSystem.sql.gz&lt;br&gt;
mysql -u root -p solssaLocal &amp;lt; ~/Downloads/solssaSystem.sql&lt;br&gt;
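solssaLocal is the name of the database he created locally to import into. It must exist before the import (for example via CREATE DATABASE solssaLocal;), because the mysql client does not create it.&lt;/p&gt;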

&lt;p&gt;✅ Conclusion&lt;br&gt;
Mwacharo successfully cloned his Solssa production database into his local environment using only a few terminal commands and SSH configuration. This is a common workflow for developers working with cloud-hosted databases.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🔐 How to Disable User Registration in a Complete Laravel + Inertia System</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Tue, 29 Jul 2025 03:55:17 +0000</pubDate>
      <link>https://dev.to/mwacharo/how-to-disable-user-registration-in-a-complete-laravel-inertia-system-2d3o</link>
      <guid>https://dev.to/mwacharo/how-to-disable-user-registration-in-a-complete-laravel-inertia-system-2d3o</guid>
      <description>&lt;p&gt;In modern web applications—especially internal systems like courier management platforms, ERPs, and CRMs—user registration should not be open to the public. Instead, accounts should be created by system admins. This guide walks you through disabling user registration completely in a Laravel + Jetstream + Inertia + Vue stack.&lt;/p&gt;

&lt;p&gt;✅ Why Disable Registration?&lt;br&gt;
In enterprise systems like:&lt;/p&gt;

&lt;p&gt;Courier company dashboards&lt;/p&gt;

&lt;p&gt;Internal order management apps&lt;/p&gt;

&lt;p&gt;Logistics and HR platforms&lt;/p&gt;

&lt;p&gt;…it’s risky and unnecessary to allow public user self-registration. Allowing users to sign up from outside could lead to:&lt;/p&gt;

&lt;p&gt;Unauthorized access&lt;/p&gt;

&lt;p&gt;Security holes&lt;/p&gt;

&lt;p&gt;Data leaks&lt;/p&gt;

&lt;p&gt;Increased attack surface&lt;/p&gt;

&lt;p&gt;Instead, users should be created internally by authorized admins.&lt;/p&gt;

&lt;p&gt;⚙️ Tech Stack in Use&lt;br&gt;
This guide applies to Laravel projects using:&lt;/p&gt;

&lt;p&gt;Jetstream (with Inertia + Vue)&lt;/p&gt;

&lt;p&gt;Laravel Fortify for authentication&lt;/p&gt;

&lt;p&gt;Vue 3 + TailwindCSS frontend&lt;/p&gt;

&lt;p&gt;🔧 Step-by-Step: Disabling User Registration&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;🧱 Disable Fortify Registration Feature&lt;br&gt;
Open config/fortify.php and remove or comment out the registration feature:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;'features' =&amp;gt; [&lt;br&gt;
    // Features::registration(), ← Disable this&lt;br&gt;
    Features::resetPasswords(),&lt;br&gt;
    Features::emailVerification(),&lt;br&gt;
],&lt;br&gt;
This prevents Laravel Fortify from registering the /register route.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;🚫 Block Any Remaining /register Route&lt;br&gt;
Even with Fortify's registration feature disabled, it's good practice to explicitly block the route in case someone tries accessing it manually.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Add this to the bottom of your routes/web.php:&lt;/p&gt;

&lt;p&gt;Route::any('/register', function () {&lt;br&gt;
    abort(403, 'User registration is disabled.');&lt;br&gt;
});&lt;br&gt;
This will return a 403 Forbidden response to any HTTP method trying to access /register.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;🧼 Remove the Register Page from Vue Frontend&lt;br&gt;
In your Vue Inertia project, locate:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;resources/js/Pages/Auth/Register.vue&lt;br&gt;
Option A: Delete the File&lt;br&gt;
You can simply remove it:&lt;/p&gt;

&lt;p&gt;rm resources/js/Pages/Auth/Register.vue&lt;br&gt;
Option B: Redirect Inside the Component&lt;br&gt;
If you prefer to leave the file, use this instead:&lt;/p&gt;


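&lt;p&gt;&amp;lt;script setup&amp;gt;&lt;br&gt;
import { onMounted } from 'vue'&lt;br&gt;
import { router } from '@inertiajs/vue3'&lt;/p&gt;

&lt;p&gt;onMounted(() =&amp;gt; {&lt;br&gt;
  router.visit(route('home'))&lt;br&gt;
})&lt;br&gt;
&amp;lt;/script&amp;gt;&lt;/p&gt;

&lt;p&gt;&amp;lt;template&amp;gt;&lt;br&gt;
  &amp;lt;div&amp;gt;Redirecting...&amp;lt;/div&amp;gt;&lt;br&gt;
&amp;lt;/template&amp;gt;&lt;br&gt;
This redirects users to the welcome page if they somehow navigate to /register.&lt;/p&gt;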

&lt;ol start="4"&gt;
&lt;li&gt;✂️ Remove Registration Links in the UI&lt;br&gt;
Find every Register link in your layouts and pages.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Remove each link, or wrap it in a condition that always evaluates to false (for example, v-if="false" on the link element).&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;🧪 Test With Curl&lt;br&gt;
Try this to ensure registration is blocked:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;curl -X POST http://127.0.0.1:8000/register \&lt;br&gt;
  -H "Accept: application/json" \&lt;br&gt;
  -H "Content-Type: application/json" \&lt;br&gt;
  -d '{"name":"Test","email":"test@mail.com","password":"secret123","password_confirmation":"secret123"}'&lt;br&gt;
✅ Expected: You should receive a 403 Forbidden or 404 Not Found response. A 419 response is also possible, because POST requests in the web middleware group are CSRF-checked before the route closure runs. In every case, no user is created.&lt;/p&gt;

&lt;p&gt;✅ Bonus: How to Create Users Internally&lt;br&gt;
To manually create users in your system:&lt;/p&gt;

&lt;p&gt;use App\Models\User;&lt;br&gt;
use Illuminate\Support\Facades\Hash;&lt;/p&gt;

&lt;p&gt;User::create([&lt;br&gt;
    'name' =&amp;gt; 'New Staff',&lt;br&gt;
    'email' =&amp;gt; 'staff@example.com',&lt;br&gt;
    'password' =&amp;gt; Hash::make('securePassword123'),&lt;br&gt;
]);&lt;br&gt;
You can also create a simple admin-only user management panel using standard CRUD logic.&lt;/p&gt;

&lt;p&gt;🛡️ Final Thoughts&lt;br&gt;
Disabling user self-registration in enterprise applications is a best practice for security and control. Laravel + Jetstream + Inertia gives you the flexibility to tailor authentication exactly how you need it.&lt;/p&gt;

&lt;p&gt;With these steps, you ensure your system remains internal, secure, and professional.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>AI Service Architecture &amp; Deployment Guide</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Sat, 26 Jul 2025 04:30:10 +0000</pubDate>
      <link>https://dev.to/mwacharo/ai-service-architecture-deployment-guide-2404</link>
      <guid>https://dev.to/mwacharo/ai-service-architecture-deployment-guide-2404</guid>
      <description>&lt;h1&gt;
  
  
  AI Service Architecture &amp;amp; Deployment Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🏗️ &lt;strong&gt;System Architecture Overview&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   API Gateway   │    │   AI Services   │
│   Vue.js App    │◄──►│   (Kong/NGINX)  │◄──►│   Microservices │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                       │
                       ┌─────────────────┐    ┌─────────────────┐
                       │   Database      │    │   ML Models     │
                       │   (PostgreSQL)  │    │   (TensorFlow)  │
                       └─────────────────┘    └─────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 &lt;strong&gt;Performance Optimization Strategy&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Microservices Architecture&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Core AI Services:&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Route Optimization Service&lt;/strong&gt; - Handles path finding and optimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictive Analytics Service&lt;/strong&gt; - Delivery predictions and risk assessment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fraud Detection Service&lt;/strong&gt; - Real-time fraud scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NLP Communication Service&lt;/strong&gt; - Message generation and sentiment analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Computer Vision Service&lt;/strong&gt; - Package and document recognition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice Processing Service&lt;/strong&gt; - Speech-to-text and voice commands&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Service Structure:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Route Optimization Service (Python/FastAPI)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BackgroundTasks&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Route Optimization AI Service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Load pre-trained model
&lt;/span&gt;&lt;span class="n"&gt;route_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/models/route_optimizer_v2.h5&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Redis for caching
&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decode_responses&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RouteOptimizationRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;pickup_location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;
    &lt;span class="n"&gt;delivery_location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;
    &lt;span class="n"&gt;constraints&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;
    &lt;span class="n"&gt;external_factors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/optimize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;optimize_route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RouteOptimizationRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Check cache first
&lt;/span&gt;    &lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;route_opt:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;cached_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cached_result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached_result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# AI Model Processing
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;process_route_optimization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Cache result for 5 minutes
&lt;/span&gt;    &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_route_optimization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Prepare input features
&lt;/span&gt;    &lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;prepare_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Run AI model prediction
&lt;/span&gt;    &lt;span class="n"&gt;prediction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;route_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Post-process results
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;post_process_route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prediction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
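
&lt;p&gt;One caveat with the service above: &lt;code&gt;route_model.predict&lt;/code&gt; is a blocking call, so invoking it directly inside an &lt;code&gt;async def&lt;/code&gt; handler holds up the event loop while the model runs. A minimal sketch of one way around this, assuming Python 3.9 or newer, is to hand the call to a worker thread with &lt;code&gt;asyncio.to_thread&lt;/code&gt;. The &lt;code&gt;slow_predict&lt;/code&gt; and &lt;code&gt;predict_async&lt;/code&gt; functions are hypothetical stand-ins, not part of the original service.&lt;/p&gt;

```python
import asyncio
import time


def slow_predict(features):
    """Hypothetical stand-in for a blocking call such as route_model.predict."""
    time.sleep(0.05)  # simulate model latency
    return [value * 2 for value in features]


async def predict_async(features):
    # Run the blocking function in a worker thread so the event loop
    # stays free to serve other requests while the model runs.
    return await asyncio.to_thread(slow_predict, features)


async def main():
    # The two predictions overlap instead of running back to back.
    return await asyncio.gather(
        predict_async([1, 2, 3]),
        predict_async([10, 20]),
    )


results = asyncio.run(main())
print(results)
```

&lt;p&gt;In the FastAPI handler, the same idea would amount to awaiting &lt;code&gt;asyncio.to_thread(route_model.predict, features)&lt;/code&gt; rather than calling &lt;code&gt;predict&lt;/code&gt; inline.&lt;/p&gt;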



&lt;h3&gt;
  
  
  &lt;strong&gt;2. Database Optimization&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;PostgreSQL with AI-Specific Indexes:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Optimized indexes for AI queries&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;idx_orders_ai_features&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;ai_features&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;idx_orders_geolocation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pickup_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delivery_location&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;idx_delivery_patterns&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;delivery_history&lt;/span&gt; 
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delivery_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;success_status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Materialized view for AI analytics&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;ai_order_features&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extract_ai_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;delivery_success&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;delivery_time_actual&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;customer_rating&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;delivery_history&lt;/span&gt; &lt;span class="n"&gt;dh&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dh&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

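&lt;span class="c1"&gt;-- Refresh helper: call it hourly from a scheduler such as cron or pg_cron.&lt;/span&gt;
&lt;span class="c1"&gt;-- Note: REFRESH ... CONCURRENTLY requires a unique index on ai_order_features.&lt;/span&gt;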
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;refresh_ai_features&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="n"&gt;void&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="n"&gt;REFRESH&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;ai_order_features&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;3. Caching Strategy&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Redis Caching Layers:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
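&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AI Service Cache Manager (assumes the ioredis client)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ioredis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
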
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AICacheManager&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_PORT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;retryDelayOnFailover&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;maxRetriesPerRequest&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Route optimization cache (5 minutes)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;cacheRouteOptimization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`route_opt:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Fraud detection cache (1 hour)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;cacheFraudResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`fraud:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Predictive analytics cache (30 minutes)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;cachePredictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`predictions:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// AI insights cache (15 minutes)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;cacheInsights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;insights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`insights:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;insights&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
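
&lt;p&gt;The four methods in the cache manager differ only in key prefix and time-to-live, so the same policy can also be written as a small lookup table. The sketch below is illustrative rather than part of the original service: &lt;code&gt;CACHE_TTLS&lt;/code&gt; and &lt;code&gt;cache_entry&lt;/code&gt; are hypothetical names, while the prefixes and TTL values are the ones used above.&lt;/p&gt;

```python
import json

# Key prefix -> time-to-live in seconds, mirroring the cache manager above.
CACHE_TTLS = {
    "route_opt": 300,     # route optimization: 5 minutes
    "fraud": 3600,        # fraud detection: 1 hour
    "predictions": 1800,  # predictive analytics: 30 minutes
    "insights": 900,      # AI insights: 15 minutes
}


def cache_entry(kind, order_id, value):
    """Return the (key, ttl, payload) triple to pass to a Redis SETEX call."""
    if kind not in CACHE_TTLS:
        raise ValueError(f"unknown cache kind: {kind}")
    key = f"{kind}:{order_id}"
    return key, CACHE_TTLS[kind], json.dumps(value)


key, ttl, payload = cache_entry("fraud", "ORD-42", {"score": 0.12})
print(key, ttl, payload)
```

&lt;p&gt;With the redis-py client from the Python service, the triple would be passed along as &lt;code&gt;redis_client.setex(key, ttl, payload)&lt;/code&gt;. Keeping the TTLs in one place makes it harder for the Python and Node.js sides to drift apart.&lt;/p&gt;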



&lt;h3&gt;
  
  
  &lt;strong&gt;4. Model Optimization&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;TensorFlow Serving with GPU Acceleration:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.ai-services.yml&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.8'&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;route-optimizer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tensorflow/serving:latest-gpu&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MODEL_NAME=route_optimizer&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MODEL_BASE_PATH=/models&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./models/route_optimizer:/models/route_optimizer&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8501:8501"&lt;/span&gt;
    &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;reservations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;devices&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
              &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
              &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;gpu&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;fraud-detector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tensorflow/serving:latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MODEL_NAME=fraud_detector&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MODEL_BASE_PATH=/models&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./models/fraud_detector:/models/fraud_detector&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8502:8501"&lt;/span&gt;

  &lt;span class="na"&gt;nlp-service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./services/nlp&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;TRANSFORMERS_CACHE=/cache&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CUDA_VISIBLE_DEVICES=0&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./cache:/cache&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8503:8000"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
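
&lt;p&gt;Once the containers are running, TensorFlow Serving exposes each model over REST at &lt;code&gt;/v1/models/{model_name}:predict&lt;/code&gt; on port 8501 and accepts a JSON body with an &lt;code&gt;instances&lt;/code&gt; list. A minimal sketch of building such a request with the Python standard library follows. The host name, the &lt;code&gt;build_predict_request&lt;/code&gt; helper, and the feature values are assumptions for illustration, and the input shape has to match whatever the exported model expects.&lt;/p&gt;

```python
import json
import urllib.request


def build_predict_request(host, port, model_name, instances):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical feature vector; the real shape depends on the exported model.
request = build_predict_request("localhost", 8501, "route_optimizer", [[0.1, 0.2, 0.3]])
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually call the service (requires the container to be running):
# with urllib.request.urlopen(request, timeout=5) as response:
#     predictions = json.loads(response.read())["predictions"]
```

&lt;p&gt;For the fraud detector defined above, the same helper would target host port 8502 with the model name &lt;code&gt;fraud_detector&lt;/code&gt;.&lt;/p&gt;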



&lt;h2&gt;
  
  
  🔧 &lt;strong&gt;Implementation Details&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. API Service Layer (Node.js/Express)&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/aiGateway.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createProxyMiddleware&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http-proxy-middleware&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rateLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express-rate-limit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;redis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redisClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;// Rate limiting for AI endpoints&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;rateLimit&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;windowMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 1 minute&lt;/span&gt;
    &lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// limit each IP to 100 requests per windowMs&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Too many AI requests, please try again later&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Health check for AI services&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/health&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;services&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;route-optimizer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fraud-detector&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;nlp-service&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;health&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;services&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`http://&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:8000/health`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;healthy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unreachable&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

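    &lt;span class="c1"&gt;// Report 'degraded' unless every downstream service is healthy&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;allHealthy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;every&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;healthy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;allHealthy&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;health&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;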
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Proxy to route optimization service&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/ai/route&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;createProxyMiddleware&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://route-optimizer:8000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;changeOrigin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;pathRewrite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;^/api/v1/ai/route&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;onError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Route service error:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Route optimization service unavailable&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="c1"&gt;// Proxy to fraud detection service&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/ai/security&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;createProxyMiddleware&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://fraud-detector:8000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;changeOrigin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;pathRewrite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;^/api/v1/ai/security&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="c1"&gt;// Proxy to NLP service&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/ai/communications&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;createProxyMiddleware&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://nlp-service:8000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;changeOrigin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;pathRewrite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;^/api/v1/ai/communications&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AI Gateway running on port 3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
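
&lt;p&gt;The health endpoint above always reports &lt;code&gt;status: 'ok'&lt;/code&gt;, even when a downstream service is unreachable. A minimal sketch of deriving an overall status from the per-service map follows. The &lt;code&gt;summarizeHealth&lt;/code&gt; helper, the &lt;code&gt;degraded&lt;/code&gt; label, and the 503 mapping are assumptions for illustration, not part of the original gateway.&lt;/p&gt;

```javascript
// Hypothetical helper: collapse the per-service map built by the health
// endpoint into one overall status plus a matching HTTP status code.
function summarizeHealth(services) {
    const failing = Object.keys(services).filter(name => services[name] !== 'healthy')
    const allHealthy = failing.length === 0
    return {
        status: allHealthy ? 'ok' : 'degraded',
        httpStatus: allHealthy ? 200 : 503,
        failing
    }
}

// Example: one unreachable service marks the whole gateway as degraded.
const summary = summarizeHealth({
    'route-optimizer': 'healthy',
    'fraud-detector': 'unreachable',
    'nlp-service': 'healthy'
})
console.log(summary.status, summary.httpStatus, summary.failing)
```

&lt;p&gt;In the handler, the result could feed &lt;code&gt;res.status(summary.httpStatus).json({ status: summary.status, services: health })&lt;/code&gt;. Whether a degraded dependency should fail the gateway's own probe is a design choice that depends on how the probe is used.&lt;/p&gt;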



&lt;h3&gt;
  
  
  &lt;strong&gt;2. Frontend Performance Optimization&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Service Worker for AI Caching:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// public/ai-service-worker.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AI_CACHE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai-responses-v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AI_CACHE_DURATION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="c1"&gt;// 5 minutes&lt;/span&gt;

&lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fetch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

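    &lt;span class="c1"&gt;// Cache AI responses for performance (the Cache API only stores GET requests)&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pathname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/v1/ai/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;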
        &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;respondWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nx"&gt;caches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;AI_CACHE_NAME&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedResponse&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedResponse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cachedTime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;cachedResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cached-time&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

                        &lt;span class="c1"&gt;// Serve from cache while the entry is still fresh (a missing timestamp parses to NaN and falls through to a refetch)&lt;/span&gt;
                        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedTime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;AI_CACHE_DURATION&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;cachedResponse&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;

                    &lt;span class="c1"&gt;// Fetch new response and cache it&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
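                        &lt;span class="c1"&gt;// Only cache successful responses&lt;/span&gt;
                        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;

                        &lt;span class="c1"&gt;// Headers on a fetched response are immutable, so cache a copy stamped with the fetch time&lt;/span&gt;
                        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cached-time&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stamped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clone&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;statusText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
                        &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;stamped&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;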
                        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;
                    &lt;span class="p"&gt;})&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
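
&lt;p&gt;The freshness check in the service worker depends on how a missing or malformed &lt;code&gt;cached-time&lt;/code&gt; header is handled. As a sketch, the comparison can be pulled into a small pure function that is easy to test outside the browser. The name &lt;code&gt;isFresh&lt;/code&gt; is hypothetical and does not appear in the original code.&lt;/p&gt;

```javascript
const AI_CACHE_DURATION = 5 * 60 * 1000 // 5 minutes, as in the service worker

// Hypothetical helper: returns true only when the stored timestamp parses
// to a number and the entry is younger than maxAgeMs.
function isFresh(cachedTimeHeader, now, maxAgeMs = AI_CACHE_DURATION) {
    const cachedAt = Number.parseInt(cachedTimeHeader, 10)
    if (Number.isNaN(cachedAt)) {
        return false // header missing or malformed: treat the entry as stale
    }
    return maxAgeMs > now - cachedAt
}

const now = Date.now()
console.log(isFresh(String(now - 60 * 1000), now))      // entry is one minute old
console.log(isFresh(String(now - 10 * 60 * 1000), now)) // entry is ten minutes old
console.log(isFresh(null, now))                         // no header was stored
```

&lt;p&gt;Treating an unparseable timestamp as stale means a corrupted cache entry costs one extra network request instead of serving data of unknown age.&lt;/p&gt;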



&lt;h4&gt;
  
  
  &lt;strong&gt;Optimized AI Service Class:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Enhanced aiService.js with performance optimizations&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OptimizedAIService&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;AIService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestQueue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchRequests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Deduplicate identical in-flight requests and batch prediction calls&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;makeOptimizedRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Reuse the pending promise if an identical request (same endpoint and body) is already in flight&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;

        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requestKey&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requestKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// For prediction endpoints, batch requests&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/predictions/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;batchPredictionRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// For other requests, use the normal flow and track the in-flight promise&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestPromise&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makeRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requestKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;requestPromise&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;// Clear from queue after completion; the trailing catch keeps this cleanup&lt;/span&gt;
        &lt;span class="c1"&gt;// chain from raising an unhandled rejection (callers still see the original error)&lt;/span&gt;
        &lt;span class="nx"&gt;requestPromise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;requestQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requestKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;requestPromise&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Batch prediction requests for efficiency&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;batchPredictionRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchRequests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;

            &lt;span class="c1"&gt;// Process batch after 100ms or when we have 10 requests&lt;/span&gt;
            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchRequests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processBatch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processBatch&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;processBatch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchTimer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchRequests&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchRequests&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;batchRequest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;}))&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makeRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/batch/predictions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;batchRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

            &lt;span class="c1"&gt;// Resolve individual requests&lt;/span&gt;
            &lt;span class="nx"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Reject all requests in batch&lt;/span&gt;
            &lt;span class="nx"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;optimizedAIService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OptimizedAIService&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
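
&lt;p&gt;The in-flight deduplication in &lt;code&gt;makeOptimizedRequest&lt;/code&gt; can be seen in isolation with a stand-in for the network call. In this sketch, &lt;code&gt;dedupe&lt;/code&gt;, &lt;code&gt;fakeRequest&lt;/code&gt;, and the example key are hypothetical; the real class delegates to &lt;code&gt;AIService.makeRequest&lt;/code&gt;, which is not reproduced here.&lt;/p&gt;

```javascript
// Map of request key -> pending promise, mirroring requestQueue in the class.
const inFlight = new Map()

// Hypothetical helper: run `start` once per key while a request is pending,
// and hand every concurrent caller the same promise.
function dedupe(key, start) {
    if (inFlight.has(key)) {
        return inFlight.get(key)
    }
    const promise = start()
    inFlight.set(key, promise)
    // Remove the entry once settled; the trailing catch stops this cleanup
    // chain from surfacing as an unhandled rejection.
    promise.finally(() => inFlight.delete(key)).catch(() => {})
    return promise
}

// Stand-in for a network call; counts how often it actually runs.
let calls = 0
function fakeRequest() {
    calls += 1
    return new Promise(resolve => setTimeout(() => resolve({ eta: 42 }), 10))
}

const first = dedupe('/route/eta:{"id":1}', fakeRequest)
const second = dedupe('/route/eta:{"id":1}', fakeRequest)

Promise.all([first, second]).then(results => {
    console.log('network calls:', calls)
    console.log('same result object:', results[0] === results[1])
})
```

&lt;p&gt;Because the key combines the endpoint with the serialized body, only byte-identical requests share a promise. Two bodies with the same fields in a different order would be sent separately.&lt;/p&gt;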



&lt;h3&gt;
  
  
  &lt;strong&gt;3. Deployment Configuration&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Kubernetes Deployment:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# k8s/ai-services.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-gateway&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-gateway&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-gateway&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-gateway&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;courier-ai/gateway:latest&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;REDIS_HOST&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-service"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AI_SERVICE_TOKEN&lt;/span&gt;
          &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-secrets&lt;/span&gt;
              &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service-token&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;256Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;250m"&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m"&lt;/span&gt;
        &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/health&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
          &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
          &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;

&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;route-optimizer&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;route-optimizer&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;route-optimizer&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;route-optimizer&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;courier-ai/route-optimizer:latest&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8000&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1Gi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m"&lt;/span&gt;
            &lt;span class="na"&gt;nvidia.com/gpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2Gi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1000m"&lt;/span&gt;
            &lt;span class="na"&gt;nvidia.com/gpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MODEL_PATH&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/models/route_optimizer_v2.h5"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BATCH_SIZE&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;32"&lt;/span&gt;
        &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;model-storage&lt;/span&gt;
          &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/models&lt;/span&gt;
      &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;model-storage&lt;/span&gt;
        &lt;span class="na"&gt;persistentVolumeClaim&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;claimName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai-models-pvc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;strong&gt;Docker Compose for Development:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.dev.yml&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.8'&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis_data:/data&lt;/span&gt;

  &lt;span class="na"&gt;postgres&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:15&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;courier_ai&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;courier&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;secure_password&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;postgres_data:/var/lib/postgresql/data&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./sql/init.sql:/docker-entrypoint-initdb.d/init.sql&lt;/span&gt;

  &lt;span class="na"&gt;ai-gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./services/gateway&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REDIS_HOST=redis&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;DATABASE_URL=postgresql://courier:secure_password@postgres:5432/courier_ai&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;

  &lt;span class="na"&gt;route-optimizer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./services/route-optimizer&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8001:8000"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MODEL_PATH=/models/route_optimizer_v2.h5&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REDIS_HOST=redis&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./models:/models&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis&lt;/span&gt;

  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./frontend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;VUE_APP_AI_SERVICE_URL=http://localhost:3000/api/v1/ai&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ai-gateway&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;redis_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;postgres_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;4. Monitoring &amp;amp; Analytics&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Prometheus Metrics:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// metrics/aiMetrics.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;promClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;prom-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Create custom metrics&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiRequestDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;promClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Histogram&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai_request_duration_seconds&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;help&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Duration of AI requests in seconds&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;labelNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;service&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;endpoint&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiModelAccuracy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;promClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Gauge&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai_model_accuracy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;help&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Current accuracy of AI models&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;labelNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;model_name&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;version&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiCacheHitRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;promClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Gauge&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai_cache_hit_rate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;help&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cache hit rate for AI responses&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;labelNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cache_type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Middleware to track metrics&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;trackAIMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;startTime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;finish&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;startTime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
        &lt;span class="nx"&gt;aiRequestDuration&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;observe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;aiRequestDuration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiModelAccuracy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiCacheHitRate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;trackAIMetrics&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
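
&lt;p&gt;The &lt;code&gt;aiCacheHitRate&lt;/code&gt; gauge above is declared, but the snippet never gives it a value. As a minimal sketch, assuming the cache layer can report each lookup as a hit or a miss, the ratio could be tracked like this. The names &lt;code&gt;createHitRateTracker&lt;/code&gt;, &lt;code&gt;record&lt;/code&gt; and &lt;code&gt;rate&lt;/code&gt; are hypothetical helpers, not part of prom-client.&lt;/p&gt;

```javascript
// Hypothetical helper: counts cache hits and misses in memory.
function createHitRateTracker() {
    let hits = 0
    let misses = 0

    return {
        // Call once per cache lookup: true for a hit, false for a miss.
        record(wasHit) {
            if (wasHit) {
                hits += 1
            } else {
                misses += 1
            }
        },
        // Fraction of lookups served from cache, between 0 and 1.
        rate() {
            const total = hits + misses
            return total === 0 ? 0 : hits / total
        }
    }
}

// Three hits and one miss give a hit rate of 3 / 4.
const tracker = createHitRateTracker()
tracker.record(true)
tracker.record(true)
tracker.record(true)
tracker.record(false)
console.log(tracker.rate()) // 0.75
```

&lt;p&gt;The result could then be pushed to the gauge with &lt;code&gt;aiCacheHitRate.labels('redis').set(tracker.rate())&lt;/code&gt;, where &lt;code&gt;'redis'&lt;/code&gt; is only an assumed value for the &lt;code&gt;cache_type&lt;/code&gt; label.&lt;/p&gt;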



&lt;h3&gt;
  
  
  &lt;strong&gt;5. Security &amp;amp; Compliance&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;AI Service Authentication:&lt;/strong&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// middleware/aiAuth.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;jsonwebtoken&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rateLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express-rate-limit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// JWT verification for AI endpoints&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifyAIToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No token provided&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AI_JWT_SECRET&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;decoded&lt;/span&gt;

        &lt;span class="c1"&gt;// Check AI service permissions&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;decoded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;permissions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai_access&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Insufficient permissions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Invalid token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Rate limiting specific to AI services&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;rateLimit&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;windowMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 1 minute&lt;/span&gt;
    &lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Different limits based on user tier&lt;/span&gt;
        &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;premium&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
            &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;keyGenerator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ip&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;verifyAIToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;aiRateLimit&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
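
&lt;p&gt;To make the per-tier limits concrete, here is a simplified fixed-window counter that applies the same numbers as the &lt;code&gt;switch&lt;/code&gt; statement above (200, 100 and 50 requests per minute). It is an illustration only and does not reflect how express-rate-limit is implemented; &lt;code&gt;limitForTier&lt;/code&gt;, &lt;code&gt;createFixedWindowLimiter&lt;/code&gt; and the &lt;code&gt;user-42&lt;/code&gt; key are hypothetical names.&lt;/p&gt;

```javascript
// Per-minute limits by tier, mirroring the switch statement in aiRateLimit.
const TIER_LIMITS = new Map([['premium', 200], ['standard', 100]])
const DEFAULT_LIMIT = 50

function limitForTier(tier) {
    return TIER_LIMITS.has(tier) ? TIER_LIMITS.get(tier) : DEFAULT_LIMIT
}

// Simplified fixed-window limiter: one counter per key per window.
function createFixedWindowLimiter(windowMs) {
    const counters = new Map() // key maps to { windowIndex, count }

    return function isAllowed(key, tier, nowMs) {
        const windowIndex = Math.floor(nowMs / windowMs)
        let entry = counters.get(key)
        // Start a fresh counter for a new key or a new window.
        if (entry === undefined || entry.windowIndex !== windowIndex) {
            entry = { windowIndex, count: 0 }
            counters.set(key, entry)
        }
        const remaining = Math.max(0, limitForTier(tier) - entry.count)
        if (remaining === 0) {
            return false
        }
        entry.count += 1
        return true
    }
}

// A caller with no tier sends 60 requests in the same minute.
const isAllowed = createFixedWindowLimiter(60 * 1000)
let accepted = 0
for (let i = 0; i !== 60; i += 1) {
    if (isAllowed('user-42', undefined, 0)) {
        accepted += 1
    }
}
console.log(accepted) // 50
```

&lt;p&gt;The same fallback applies in &lt;code&gt;aiRateLimit&lt;/code&gt; whenever &lt;code&gt;req.user&lt;/code&gt; is missing, which is why the order of the two middlewares matters.&lt;/p&gt;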



&lt;h2&gt;
  
  
  📊 &lt;strong&gt;Performance Benchmarks&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Expected Performance Metrics:&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Response Time&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Route Optimization&lt;/td&gt;
&lt;td&gt;&amp;lt; 200ms&lt;/td&gt;
&lt;td&gt;1000 req/min&lt;/td&gt;
&lt;td&gt;98.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fraud Detection&lt;/td&gt;
&lt;td&gt;&amp;lt; 100ms&lt;/td&gt;
&lt;td&gt;2000 req/min&lt;/td&gt;
&lt;td&gt;94.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delivery Prediction&lt;/td&gt;
&lt;td&gt;&amp;lt; 150ms&lt;/td&gt;
&lt;td&gt;1500 req/min&lt;/td&gt;
&lt;td&gt;96.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Message Generation&lt;/td&gt;
&lt;td&gt;&amp;lt; 300ms&lt;/td&gt;
&lt;td&gt;800 req/min&lt;/td&gt;
&lt;td&gt;95.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voice Processing&lt;/td&gt;
&lt;td&gt;&amp;lt; 500ms&lt;/td&gt;
&lt;td&gt;400 req/min&lt;/td&gt;
&lt;td&gt;97.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
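
&lt;p&gt;To see what these targets imply for capacity, Little's law (in-flight requests = throughput × latency) gives a rough figure. The sketch below is only an illustration: it assumes steady load and treats each response-time bound as the average latency, and the helper name inflight is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
```shell
# Estimate concurrent in-flight requests with Little's law.
# Usage: inflight REQUESTS_PER_MINUTE LATENCY_MS
inflight() {
  awk -v rpm="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", rpm * ms / 60000 }'
}

inflight 1000 200   # Route Optimization targets, prints 3.3
inflight 800 300    # Message Generation targets, prints 4.0
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;By this estimate, each service would hold only a handful of requests in flight at the listed throughput, which is worth keeping in mind when sizing worker pools.&lt;/p&gt;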

&lt;h3&gt;
  
  
  &lt;strong&gt;Resource Requirements:&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;GPU&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI Gateway&lt;/td&gt;
&lt;td&gt;2 cores&lt;/td&gt;
&lt;td&gt;1GB&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;10GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route Optimizer&lt;/td&gt;
&lt;td&gt;4 cores&lt;/td&gt;
&lt;td&gt;4GB&lt;/td&gt;
&lt;td&gt;1x RTX 3080&lt;/td&gt;
&lt;td&gt;50GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fraud Detector&lt;/td&gt;
&lt;td&gt;2 cores&lt;/td&gt;
&lt;td&gt;2GB&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;20GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NLP Service&lt;/td&gt;
&lt;td&gt;4 cores&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;1x RTX 3080&lt;/td&gt;
&lt;td&gt;100GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cache&lt;/td&gt;
&lt;td&gt;2 cores&lt;/td&gt;
&lt;td&gt;4GB&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;20GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PostgreSQL&lt;/td&gt;
&lt;td&gt;4 cores&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;500GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  🚀 &lt;strong&gt;Deployment Steps&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Infrastructure Setup&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a namespace for the AI services (assumes a cluster already exists)&lt;/span&gt;
kubectl create namespace courier-ai

&lt;span class="c"&gt;# Deploy Redis&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/redis.yaml

&lt;span class="c"&gt;# Deploy PostgreSQL&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/postgres.yaml

&lt;span class="c"&gt;# Create secrets&lt;/span&gt;
kubectl create secret generic ai-secrets &lt;span class="nt"&gt;-n&lt;/span&gt; courier-ai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;service-token&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-secure-token"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--from-literal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;jwt-secret&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-jwt-secret"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
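
&lt;p&gt;The placeholder values in the secret should be replaced with random strings. Here is a minimal sketch, assuming a shell where head -c and /dev/urandom are available; gen_secret is a hypothetical helper, not part of the project.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
```shell
# Print N random bytes as a hex string (2N characters).
gen_secret() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

SERVICE_TOKEN=$(gen_secret 32)
JWT_SECRET=$(gen_secret 32)
echo "token length: ${#SERVICE_TOKEN}"
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The generated values can then be passed to kubectl create secret in place of the literals, for example --from-literal=service-token="$SERVICE_TOKEN".&lt;/p&gt;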



&lt;h3&gt;
  
  
  &lt;strong&gt;2. AI Services Deployment&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build and deploy AI services&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; courier-ai/gateway ./services/gateway
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; courier-ai/route-optimizer ./services/route-optimizer

&lt;span class="c"&gt;# Deploy to Kubernetes&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/ai-services.yaml

&lt;span class="c"&gt;# Verify deployment&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; courier-ai
kubectl logs &lt;span class="nt"&gt;-f&lt;/span&gt; deployment/ai-gateway &lt;span class="nt"&gt;-n&lt;/span&gt; courier-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
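
&lt;p&gt;Tailing logs confirms the gateway started, but a scripted check is easier to automate. The sketch below uses kubectl rollout status, which blocks until a deployment is ready or the timeout expires. The function name verify_rollouts is hypothetical, and ai-gateway is the only deployment named in this guide, so any other names passed in are assumptions about what k8s/ai-services.yaml defines.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
```shell
# Wait for each named deployment to finish rolling out.
# Usage: verify_rollouts NAMESPACE DEPLOYMENT...
verify_rollouts() {
  namespace=$1
  shift
  for name in "$@"; do
    kubectl rollout status "deployment/$name" -n "$namespace" --timeout=120s || return 1
  done
}

# Example (requires cluster access):
#   verify_rollouts courier-ai ai-gateway
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;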



&lt;h3&gt;
  
  
  &lt;strong&gt;3. Frontend Integration&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Write the AI service URL to .env (this overwrites any existing .env file)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"VUE_APP_AI_SERVICE_URL=https://ai.yourdomain.com/api/v1/ai"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .env

&lt;span class="c"&gt;# Build and deploy frontend&lt;/span&gt;
npm run build
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; courier-ai/frontend &lt;span class="nb"&gt;.&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; k8s/frontend.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;4. Monitoring Setup&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
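&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add the chart repository, then deploy Prometheus and Grafana&lt;/span&gt;
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm &lt;span class="nb"&gt;install &lt;/span&gt;prometheus prometheus-community/kube-prometheus-stack

&lt;span class="c"&gt;# Import AI dashboard (the file must be a Kubernetes manifest, such as a ConfigMap wrapping the Grafana dashboard JSON)&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; monitoring/ai-dashboard.json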
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This architecture provides a scalable AI service layer designed to meet the targets in the Performance Benchmarks table: hundreds to thousands of requests per minute per service, with response times under 500 ms. The separation of concerns allows each AI service to be scaled and optimized independently, based on its own demand patterns.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🚀 Automating Laravel Deployment to DigitalOcean with GitHub Actions</title>
      <dc:creator>JOHN MWACHARO</dc:creator>
      <pubDate>Tue, 22 Jul 2025 17:56:08 +0000</pubDate>
      <link>https://dev.to/mwacharo/automating-laravel-deployment-to-digitalocean-with-github-actions-1mmc</link>
      <guid>https://dev.to/mwacharo/automating-laravel-deployment-to-digitalocean-with-github-actions-1mmc</guid>
      <description>&lt;p&gt;When building with Laravel, deploying changes from your local machine to your production server shouldn’t be a chore. In this post, I’ll show you how to automate Laravel deployments to a DigitalOcean VPS using GitHub Actions, so that every push to your main branch updates your live app. Fast, clean, and fully hands-free.&lt;/p&gt;

&lt;p&gt;✅ Why Use GitHub Actions for Deployment?&lt;/p&gt;

&lt;p&gt;🔁 Automated: Every push triggers a deployment.&lt;/p&gt;

&lt;p&gt;🧠 Smart: Only deploys on specific branches.&lt;/p&gt;

&lt;p&gt;🔐 Secure: Uses SSH with no manual login.&lt;/p&gt;

&lt;p&gt;🚫 No third-party CI/CD platforms needed.&lt;/p&gt;

&lt;p&gt;🔧 Prerequisites&lt;/p&gt;

&lt;p&gt;Before diving in, ensure:&lt;/p&gt;

&lt;p&gt;You have a Laravel project hosted on GitHub.&lt;/p&gt;

&lt;p&gt;Your production server (DigitalOcean) is running Ubuntu with PHP, Composer, and Laravel’s dependencies installed.&lt;/p&gt;

&lt;p&gt;You have SSH access to the server (GitHub Actions will use this to deploy).&lt;/p&gt;

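&lt;p&gt;1️⃣ Generate SSH Key for GitHub&lt;/p&gt;

&lt;p&gt;From your local machine:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ssh-keygen -t ed25519 -C "github_actions_deploy"&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When prompted for a filename, use:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/home/engineer/.ssh/solssa_github_ci&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When prompted for a passphrase, press Enter to leave it empty, since the workflow in step 4 does not supply one.&lt;/p&gt;

&lt;p&gt;This creates:&lt;/p&gt;

&lt;p&gt;solssa_github_ci → Private key&lt;/p&gt;

&lt;p&gt;solssa_github_ci.pub → Public key&lt;/p&gt;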

&lt;p&gt;2️⃣ Add the Public Key to Your Server&lt;/p&gt;

&lt;p&gt;Upload the public key to your DigitalOcean server:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ssh-copy-id -i ~/.ssh/solssa_github_ci.pub root@159.89.41.188&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Test the connection:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ssh -i ~/.ssh/solssa_github_ci root@159.89.41.188&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You should log in without a password.&lt;/p&gt;

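&lt;p&gt;3️⃣ Add Private Key to GitHub Secrets&lt;/p&gt;

&lt;p&gt;On your GitHub repo (SolssaTemplate), go to Settings &amp;gt; Secrets and variables &amp;gt; Actions and create these 3 secrets:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DO_SSH_KEY&lt;/td&gt;
&lt;td&gt;Contents of solssa_github_ci (the private key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DO_HOST&lt;/td&gt;
&lt;td&gt;159.89.41.188&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DO_USER&lt;/td&gt;
&lt;td&gt;root&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;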
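
&lt;p&gt;The same three secrets can also be set from the terminal with the GitHub CLI. This is a sketch, assuming gh is installed, authenticated, and run from inside the repository's working copy; set_deploy_secrets is a hypothetical wrapper, not something the repo provides.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
```shell
# Store the deploy key, host, and user as Actions secrets.
# Usage: set_deploy_secrets KEY_FILE HOST USER
set_deploy_secrets() {
  # Refuse to continue if the private key file cannot be read
  [ -r "$1" ] || return 1
  # gh reads the secret value from standard input when --body is omitted
  cat "$1" | gh secret set DO_SSH_KEY || return 1
  gh secret set DO_HOST --body "$2" || return 1
  gh secret set DO_USER --body "$3" || return 1
}

# Example:
#   set_deploy_secrets ~/.ssh/solssa_github_ci 159.89.41.188 root
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;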

&lt;p&gt;4️⃣ Create the GitHub Actions Workflow&lt;/p&gt;

&lt;p&gt;In your Laravel project, create &lt;code&gt;.github/workflows/deploy.yml&lt;/code&gt; and paste this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Deploy to DigitalOcean

on:
  push:
    branches:
      - main  # or 'master'

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Deploy to DigitalOcean via SSH
        uses: appleboy/ssh-action@v1.0.0
        with:
          host: ${{ secrets.DO_HOST }}
          username: ${{ secrets.DO_USER }}
          key: ${{ secrets.DO_SSH_KEY }}
          script: |
            cd /var/www/solssa
            git pull origin main
            composer install --no-dev --optimize-autoloader
            php artisan migrate --force
            php artisan config:cache
            php artisan route:cache
            php artisan view:cache
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
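
&lt;p&gt;Depending on how the script is executed, a failed step (for example, a failed git pull) may not stop the later commands from running. One way to guard against that is to wrap the same commands in a function that exits at the first failure. This is only a sketch: the function name deploy and its arguments are hypothetical, and it assumes git, composer, and php are on the server's PATH.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
```shell
# Run the deploy steps in order, stopping at the first failure.
# The body runs in a subshell so the caller's directory is unchanged.
# Usage: deploy APP_DIR [BRANCH]
deploy() (
  cd "$1" || exit 1
  git pull origin "${2:-main}" || exit 1
  composer install --no-dev --optimize-autoloader || exit 1
  php artisan migrate --force || exit 1
  php artisan config:cache || exit 1
  php artisan route:cache || exit 1
  php artisan view:cache || exit 1
)

# Example (on the server):
#   deploy /var/www/solssa main
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;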

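&lt;p&gt;☝️ Update /var/www/solssa to your actual Laravel app directory on the server. That directory must already be a Git clone of the repository, because the script runs git pull inside it.&lt;/p&gt;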

&lt;p&gt;5️⃣ Trigger Your First Deploy&lt;/p&gt;

&lt;p&gt;Push any change to your GitHub main branch:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git add .&lt;/code&gt;&lt;br&gt;
&lt;code&gt;git commit -m "Automated deploy test"&lt;/code&gt;&lt;br&gt;
&lt;code&gt;git push origin main&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then visit the Actions tab on your repo to watch the magic happen. 🎉&lt;/p&gt;

&lt;p&gt;✨ Conclusion&lt;/p&gt;

&lt;p&gt;By setting up this simple GitHub Actions workflow, you’ve turned deployment into a single push operation: no FTP, no SSH copy-pasting, just clean DevOps right inside GitHub.&lt;/p&gt;

&lt;p&gt;This setup is perfect for small to mid-size Laravel apps and gives you a solid foundation to scale with more advanced CI/CD in the future.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
