<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dewald Hugo</title>
    <description>The latest articles on DEV Community by Dewald Hugo (@dewaldhugo).</description>
    <link>https://dev.to/dewaldhugo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3773709%2F196ae154-46de-4b83-adb1-087bb96959d7.jpg</url>
      <title>DEV Community: Dewald Hugo</title>
      <link>https://dev.to/dewaldhugo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dewaldhugo"/>
    <language>en</language>
    <item>
      <title>Laravel AI Integration: A Production-Ready Architecture Guide (OpenAI vs Gemini vs Claude)</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Mon, 27 Apr 2026 00:47:06 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/laravel-ai-integration-a-production-ready-architecture-guide-openai-vs-gemini-vs-claude-lcp</link>
      <guid>https://dev.to/dewaldhugo/laravel-ai-integration-a-production-ready-architecture-guide-openai-vs-gemini-vs-claude-lcp</guid>
      <description>&lt;p&gt;Every Laravel AI integration starts the same way. You install a package, copy an API key into &lt;code&gt;.env&lt;/code&gt;, write a controller method that fires a prompt, and it works. Then you ship it. Then you get your first production incident, a 429 rate limit silences a customer-facing feature at 2AM, a Gemini model deprecation breaks a workflow three sprints before anyone noticed the announcement, or a runaway token loop burns through your monthly budget in four hours.&lt;/p&gt;

&lt;p&gt;That is the moment you realise that &lt;strong&gt;Laravel AI integration&lt;/strong&gt; is not an API problem. It is an architecture problem. The API call is the easy part. The hard part is building the layer around it that makes AI behaviour predictable, cost-controlled, and provider-agnostic.&lt;/p&gt;

&lt;p&gt;This guide does not walk you through any provider’s authentication flow. It gives you the system-level map: the problem space, the architectural patterns that address it, a clear-eyed comparison of OpenAI, Gemini, and Claude, and a decision framework you can put in front of your team before the next kickoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Laravel AI Integration Problem Space
&lt;/h2&gt;

&lt;p&gt;Before picking a provider or drawing a service diagram, you need to be honest about what you are actually building. The problems that make AI difficult in production are not unique to any one provider. They are structural.&lt;/p&gt;

&lt;h3&gt;
  
  
  Provider Abstraction
&lt;/h3&gt;

&lt;p&gt;You will almost certainly swap or combine providers over the lifecycle of a product. The model that makes sense at launch is rarely the model that makes sense at scale, or after a pricing change, or when your use case evolves. If your application code couples directly to OpenAI’s SDK or Anthropic’s client, a provider change means rewriting feature logic, not just swapping a configuration value.&lt;/p&gt;

&lt;p&gt;The correct answer is a provider interface your application depends on, with concrete adapters behind it. We cover the interface contract below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Token Usage and Cost Control
&lt;/h3&gt;

&lt;p&gt;Language models charge by the token. That sounds simple. In practice it means a single misconfigured prompt, multiplied by your queue throughput, can produce a five-figure bill in an afternoon. We have seen this happen.&lt;/p&gt;

&lt;p&gt;Token management is not optional in production. You need budget enforcement before the API call, usage logging at the response level, and alerting when spend crosses threshold. This belongs in the service layer – not in a controller, not in a scheduled job that runs after the damage is done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Latency and Streaming
&lt;/h3&gt;

&lt;p&gt;A synchronous AI call in an HTTP handler is almost always wrong. Inference latency runs from 800ms on a lightweight model to 30 seconds on a long-form reasoning task. PHP-FPM holds a connection open for that entire window. Under any meaningful traffic, that collapses your worker pool.&lt;/p&gt;

&lt;p&gt;Streaming helps on the UX side, but it does not fix the connection pressure problem. Background jobs fix that. The streaming question is a separate decision, covered in its own section below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error Handling and Retries
&lt;/h3&gt;

&lt;p&gt;AI APIs fail. They return 429s during traffic spikes, 500s during model instability, and occasionally time out with no response body at all. None of these are exceptional conditions. They are routine.&lt;/p&gt;

&lt;p&gt;Your retry logic must distinguish between transient failures (worth retrying with backoff) and structural failures (wrong API key, malformed payload, context length exceeded). Treating all failures the same is one of the most common production bugs we see in AI-integrated Laravel applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vendor Lock-in
&lt;/h3&gt;

&lt;p&gt;Every provider has proprietary extensions – function calling formats, system prompt conventions, streaming event shapes, vision attachment schemas. The moment you hardcode those shapes into your application logic, you own a migration project the next time you need to switch.&lt;/p&gt;

&lt;p&gt;Abstraction is the mitigation. But abstraction has a cost. Build it at the right layer, and keep provider-specific behaviour in the adapter, not in your domain services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Provider Comparison: OpenAI vs Gemini vs Claude
&lt;/h2&gt;

&lt;p&gt;No provider is the right answer for every use case. Here is an honest comparison of the three dominant options for Laravel AI integration.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;OpenAI (GPT-4o)&lt;/th&gt;
&lt;th&gt;Google (Gemini 2.5 Pro)&lt;/th&gt;
&lt;th&gt;Anthropic (Claude Sonnet 4.5)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Context window&lt;/td&gt;
&lt;td&gt;128K tokens&lt;/td&gt;
&lt;td&gt;1M+ tokens&lt;/td&gt;
&lt;td&gt;200K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Streaming support&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Function / tool calling&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision / multimodal&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Native embeddings&lt;/td&gt;
&lt;td&gt;✓ (text-embedding-3)&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✗ (use OpenAI or Gemini)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Laravel SDK support&lt;/td&gt;
&lt;td&gt;First-party&lt;/td&gt;
&lt;td&gt;Via laravel/ai&lt;/td&gt;
&lt;td&gt;Via laravel/ai&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reasoning quality&lt;/td&gt;
&lt;td&gt;Strong, broad&lt;/td&gt;
&lt;td&gt;Strong, long-context&lt;/td&gt;
&lt;td&gt;Exceptional, safety-aware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rate limit tolerance&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Conservative&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pricing tier&lt;/td&gt;
&lt;td&gt;Mid&lt;/td&gt;
&lt;td&gt;Mid&lt;/td&gt;
&lt;td&gt;Mid-High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Honest notes on each:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI&lt;/strong&gt; has the most mature Laravel ecosystem. The first-party package is stable, the API surface is well-documented, and function calling is production-tested at scale. It is the right default for teams building general-purpose AI features with limited architectural budget.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemini’s&lt;/strong&gt; one-million-token context window is not a marketing number, it genuinely changes what is possible for document processing, long-session memory, and RAG-less retrieval patterns. The &lt;code&gt;laravel/ai&lt;/code&gt; SDK wraps it cleanly. The downside is that rate limits are inconsistently enforced and the model behaviour at the boundary of that context window is less predictable than Anthropic’s or OpenAI’s.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude&lt;/strong&gt; is the most consistent model for reasoning-heavy tasks, structured output extraction, code review pipelines, and anything that requires following complex multi-step instructions reliably. The safety constraints are real and occasionally frustrating for edge cases, but the instruction-following quality justifies the slightly higher cost for high-stakes features.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended Architecture for Laravel AI Systems
&lt;/h2&gt;

&lt;p&gt;The pattern that holds up across every provider, every scale, and every use case is the same: isolate AI behind a typed contract, implement concrete adapters per provider, and register the active provider via the Service Container.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Contracts\AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;AIProviderInterface&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="nc"&gt;\Generator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each provider - OpenAI, Gemini, Claude - gets its own adapter class that implements this interface. Your application services, queued jobs, and controllers depend on &lt;code&gt;AIProviderInterface&lt;/code&gt;. They never know which provider sits behind it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AppServiceProvider or bootstrap/app.php&lt;/span&gt;
&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;bind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;\App\Contracts\AI\AIProviderInterface&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;\App\AI\Adapters\ClaudeAdapter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Swapping providers is now a one-line configuration change. No domain logic moves.&lt;/p&gt;

&lt;p&gt;On top of this interface sits your governance layer – budget enforcement, telemetry, prompt versioning. &lt;a href="https://origin-main.com/laravel-architecture/production-grade-ai-architecture-in-laravel/" rel="noopener noreferrer"&gt;Our guide on production-grade AI architecture in Laravel&lt;/a&gt; covers that layer in full, including typed &lt;code&gt;AiResult&lt;/code&gt; data objects, decorator chains, and audit trail patterns. Read that guide before you build the adapter layer, the contract it defines will inform how you structure the interface above.&lt;/p&gt;

&lt;p&gt;[Architect’s Note] The three-method interface above is intentionally minimal. Resist the temptation to add provider-specific capabilities - vision input, citation formats, tool schemas - to the shared interface. Those belong in extended interfaces or provider-specific services consumed explicitly. The shared interface exists to make the 80% case portable. The 20% case is allowed to know which provider it is talking to.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration Management
&lt;/h3&gt;

&lt;p&gt;Provider credentials live in &lt;code&gt;config/ai.php&lt;/code&gt;, not scattered across service provider constructors. Each adapter reads from &lt;code&gt;config('ai.providers.claude.api_key')&lt;/code&gt; and so on. Environment variables map cleanly through the config layer, and you can define per-provider defaults - temperature, max tokens, retry budget - without touching application code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Token Management and Cost Control
&lt;/h2&gt;

&lt;p&gt;This is the section most architectural guides skip. It is also the section that determines whether your AI feature is financially viable.&lt;/p&gt;

&lt;p&gt;Token costs compound. A feature that runs 500 times per day at 2,000 tokens per call is one million tokens daily. Multiply by your model’s per-token rate, multiply by 30, and you have a monthly infrastructure line item that belongs in your budget conversations from day one.&lt;/p&gt;

&lt;p&gt;The enforcement point must be pre-dispatch. That means checking the estimated token count against a configured budget before the API call fires, not after the response arrives. You cannot claw back tokens you have already consumed.&lt;/p&gt;

&lt;p&gt;For systems that incorporate retrieval, the token problem gets harder. Injecting retrieved documents into a prompt is the right architectural move for accuracy, but each document chunk contributes to input token count. RAG systems need token-aware chunking at retrieval time, not just at prompt assembly time. If you are building retrieval into your Laravel application, the &lt;a href="https://origin-main.com/ai-foundations/laravel-vector-database-embeddings-pgvector-rag/" rel="noopener noreferrer"&gt;RAG and vector retrieval implementation guide&lt;/a&gt; covers the chunking and embedding strategy in detail.&lt;/p&gt;

&lt;p&gt;Telemetry is the other half of cost control. Log prompt tokens, completion tokens, model, latency, and user or tenant identity on every response. Without that data, you cannot attribute cost, identify expensive call patterns, or make informed decisions about model selection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Streaming vs Standard Responses
&lt;/h2&gt;

&lt;p&gt;The choice between streaming and synchronous responses is a UX decision that has infrastructure consequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synchronous calls&lt;/strong&gt; are appropriate for background jobs, scheduled tasks, and any context where the result is written to storage before the user sees it. They are simpler to implement, easier to test, and do not require any special transport infrastructure. If your AI call feeds a Queue Worker writing to Eloquent, synchronous is the right default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Streaming&lt;/strong&gt; is appropriate when a human is waiting and the output is long enough that a full-response delay degrades the experience. The threshold in practice is around 1.5–2 seconds. Below that, streaming adds complexity for no perceivable gain.&lt;/p&gt;

&lt;p&gt;The transport mechanism behind streaming is a separate decision. Server-Sent Events are the right default for Laravel, they require no additional infrastructure, work through standard HTTP, and integrate cleanly with Livewire. WebSockets are justified when you need bidirectional communication or multi-client broadcast. Livewire polling is a reasonable fallback for teams that want to avoid streaming infrastructure entirely, at the cost of perceived responsiveness.&lt;/p&gt;

&lt;p&gt;The tradeoffs between these three approaches are significant enough to warrant their own analysis. Our &lt;a href="https://origin-main.com/ai-foundations/laravel-ai-streaming-livewire-sse-websockets/" rel="noopener noreferrer"&gt;streaming transport decision guide&lt;/a&gt; covers all three with production code and explicit recommendations for each scenario.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the Right Provider: A Decision Framework
&lt;/h2&gt;

&lt;p&gt;Stop treating provider selection as a preference. Treat it as an engineering decision with clear inputs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use OpenAI when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  You need native embeddings alongside generation (text-embedding-3-small covers most retrieval use cases)&lt;/li&gt;
&lt;li&gt;  Your team needs the most widely documented integration path&lt;/li&gt;
&lt;li&gt;  You are building function-calling pipelines where tool reliability matters more than reasoning depth&lt;/li&gt;
&lt;li&gt;  You want the lowest overhead Laravel AI integration for an MVP that may evolve later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href="https://origin-main.com/guides/laravel-openai-integration-guide/" rel="noopener noreferrer"&gt;complete OpenAI integration guide&lt;/a&gt; covers transport, function calling, and retry patterns against the GPT-4o and GPT-4o-mini endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Gemini when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Your use case involves documents or sessions that exceed 128K tokens&lt;/li&gt;
&lt;li&gt;  You are processing multimodal inputs — documents, images, mixed media — at scale&lt;/li&gt;
&lt;li&gt;  You want to experiment with RAG-less patterns using Gemini’s long context as a retrieval proxy&lt;/li&gt;
&lt;li&gt;  Cost at volume is a primary concern (Gemini 2.0 Flash is competitive at high call rates)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;a href="https://origin-main.com/guides/laravel-gemini-integration/" rel="noopener noreferrer"&gt;Gemini setup via the Laravel AI SDK&lt;/a&gt;, including the &lt;code&gt;GeminiManager&lt;/code&gt; failover pattern and model string verification, that guide covers everything from provider registration to streaming response handling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Claude when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  The task requires following multi-step instructions precisely and reliably&lt;/li&gt;
&lt;li&gt;  You are extracting structured output from unstructured text (JSON extraction, data normalisation)&lt;/li&gt;
&lt;li&gt;  Safety and refusal behaviour matter — regulated industries, content moderation pipelines&lt;/li&gt;
&lt;li&gt;  Code generation or review is a significant part of the workload&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href="https://origin-main.com/guides/laravel-claude-api-integration-guide/" rel="noopener noreferrer"&gt;Laravel Claude API integration guide&lt;/a&gt; covers the full Anthropic client setup, system prompt patterns, and streaming implementation with Laravel’s HTTP client.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mixed-provider architectures
&lt;/h3&gt;

&lt;p&gt;The interface pattern described earlier makes multi-provider setups practical. Routing by task type, OpenAI for embeddings, Claude for structured extraction, Gemini for long-document summarisation, is a legitimate production architecture, not over-engineering. The Service Container resolves the correct adapter per task type, and your domain code stays clean.&lt;/p&gt;

&lt;p&gt;[Production Pitfall] Mixed-provider setups introduce a subtle budget tracking problem. If you log token costs per provider separately, you lose the aggregate view. Make sure your telemetry table records the provider name alongside every call, and your cost dashboards aggregate across providers by feature or tenant, not just by provider account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If you take one thing from this guide, make it this: the providers are interchangeable. The architecture is not.&lt;/p&gt;

&lt;p&gt;OpenAI, Gemini, and Claude will all improve, deprecate models, change pricing, and introduce capabilities over the next twelve months. Any of the three can be the right choice today and the wrong choice in two quarters. The teams that weather those changes without incident are the teams that built the abstraction layer before they needed it, not the teams that coupled their application to a specific SDK and are now rewriting feature logic to swap providers.&lt;/p&gt;

&lt;p&gt;Production Laravel AI integration is a system design problem. The interface contract, the adapter layer, the token budget enforcement, the streaming transport decision, the telemetry pipeline — those are the decisions that determine whether your AI feature is operationally sustainable, or just a demo that scaled badly.&lt;/p&gt;

&lt;p&gt;Build the layer. Then pick the provider.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>ai</category>
      <category>php</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Laravel OpenAI Integration: The Complete Production Guide</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Sun, 26 Apr 2026 20:59:38 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/laravel-openai-integration-the-complete-production-guide-la1</link>
      <guid>https://dev.to/dewaldhugo/laravel-openai-integration-the-complete-production-guide-la1</guid>
      <description>&lt;p&gt;Laravel OpenAI integration is not simpler than Claude integration. The APIs differ, the model families differ — but the architectural requirements are nearly identical once you care about production stability.&lt;/p&gt;

&lt;p&gt;Laravel works well here because it already solves the problems AI integrations create: dependency boundaries, environment isolation, background work, failure handling, and observability. You’re not fighting the framework. You’re leaning on it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;On model names:&lt;/strong&gt; This guide uses &lt;code&gt;gpt-4o&lt;/code&gt; and &lt;code&gt;gpt-4o-mini&lt;/code&gt; as examples. These are current as of early 2026. OpenAI’s model catalogue evolves quickly — always verify against &lt;a href="https://platform.openai.com/docs/models" rel="noopener noreferrer"&gt;OpenAI’s official models reference&lt;/a&gt; before deploying. &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; is deprecated; &lt;code&gt;gpt-4o-mini&lt;/code&gt; is its cost-effective replacement.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before committing to OpenAI as your sole provider, it is worth understanding where it sits in the broader Laravel AI provider landscape. Our &lt;a href="https://origin-main.com/laravel-architecture/laravel-ai-integration-architecture-guide/" rel="noopener noreferrer"&gt;provider comparison and architecture decision guide&lt;/a&gt; covers how OpenAI stacks up against Gemini and Claude for different production use cases — useful context before you build the service layer around any single provider.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the OpenAI API Surface
&lt;/h2&gt;

&lt;p&gt;OpenAI’s platform has evolved faster than its documentation. Many tutorials conflate multiple API generations or mix incompatible approaches.&lt;/p&gt;

&lt;p&gt;For Laravel applications, think in terms of three distinct capabilities: the &lt;strong&gt;Chat Completions API&lt;/strong&gt; for text generation and reasoning, the &lt;strong&gt;Embeddings API&lt;/strong&gt; for semantic search and retrieval, and the &lt;strong&gt;Audio/Vision APIs&lt;/strong&gt; for transcription and multimodal inputs.&lt;/p&gt;

&lt;p&gt;For image generation specifically – gpt-image-1, queued jobs, S3 storage, and per-request cost tracking — see the &lt;a href="https://origin-main.com/api-integration/laravel-openai-image-generation-gpt-image-1/" rel="noopener noreferrer"&gt;Laravel OpenAI image generation guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;OpenAI is not “one API”. Your Laravel architecture should reflect that separation early. Controllers that call the Completions and Embeddings endpoints directly will become unmaintainable fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Foundations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Authentication and Environment Configuration
&lt;/h3&gt;

&lt;p&gt;OpenAI uses API keys issued at the account or project level. Treat them as production secrets — same as database credentials or payment tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=sk-...
OPENAI_ORG_ID=optional
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Never read these values directly from &lt;code&gt;env()&lt;/code&gt; outside of configuration files. This is one of the most common structural mistakes we see in Laravel AI codebases. Expose them through a dedicated config file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config/openai.php&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'api_key'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'organization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_ORG_ID'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'base_url'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_BASE_URL'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'https://api.openai.com/v1'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This centralizes OpenAI configuration and allows overrides for testing, proxying, or self-hosted gateways. Skipping this step is how vendor logic leaks across the codebase and makes rotation painful.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Security:&lt;/strong&gt; Never commit API keys to git. Use Laravel’s encrypted environment handling in production, and rotate keys periodically. Treat a leaked OpenAI key with the same urgency as a leaked database password — it has a direct billing impact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Why You Should Not Use the OpenAI SDK Directly
&lt;/h3&gt;

&lt;p&gt;OpenAI provides official SDKs, including PHP community implementations. Injecting these directly into controllers or jobs is a mistake.&lt;/p&gt;

&lt;p&gt;SDKs change. Model names change. Response formats evolve. Your application should not absorb that churn.&lt;/p&gt;

&lt;p&gt;Access OpenAI through a service boundary you control:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAiClient&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="kt"&gt;HttpClient&lt;/span&gt; &lt;span class="nv"&gt;$http&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="kt"&gt;OpenAiClient&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents your controllers from becoming vendor adapters. It also makes the Laravel &lt;strong&gt;Service Container&lt;/strong&gt; work for you — you can bind a mock in tests without touching a single controller.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP Client Configuration and Timeouts
&lt;/h3&gt;

&lt;p&gt;AI requests can be long-running, bursty, expensive, and sensitive to retries. A default &lt;code&gt;Http::post()&lt;/code&gt; is insufficient. Configure a dedicated client with explicit behaviour:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'Authorization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bearer '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'openai.api_key'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;timeout(60)&lt;/code&gt; matters because OpenAI requests can legitimately take tens of seconds with large prompts or slow models. The &lt;code&gt;retry(0, 0)&lt;/code&gt; matters even more — retrying AI requests blindly can double your cost and produce inconsistent outputs. If you need retries, implement them at the application level with full context awareness, not at the transport layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error Handling as a First-Class Concern
&lt;/h3&gt;

&lt;p&gt;OpenAI’s Chat Completions API fails in ways standard REST APIs rarely do: partial responses, rate limit throttling, model overload, malformed outputs despite 200 responses.&lt;/p&gt;

&lt;p&gt;Your service layer must treat OpenAI responses as untrusted input, even when HTTP status codes indicate success.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;successful&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAiRequestFailed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But that’s only half the problem. You must also validate semantic correctness. Did the model return text when you expected JSON? Did it ignore system instructions? Did it hallucinate fields? That validation logic does not belong in a controller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Service Architecture: Designing for Change
&lt;/h2&gt;

&lt;p&gt;Your OpenAI service should not expose low-level API primitives unless absolutely necessary.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$openai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;chatCompletion&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Prefer intent-based methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$openai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;$openai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;$openai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;extractStructuredData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Internally, these methods may all use the same API endpoint. Externally, they express business intent. This gives you three architectural wins: prompts become versionable assets, you can A/B test models without touching callers, and you can migrate providers with minimal surface-area changes.&lt;/p&gt;

&lt;p&gt;If you later introduce Claude alongside OpenAI — a decision many teams make as they want provider redundancy — this design pays for itself immediately. The service boundary is what makes multi-model routing tractable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependency Injection and Testability
&lt;/h3&gt;

&lt;p&gt;Every OpenAI-facing class should be injectable and mockable. No static helpers, no facades hiding HTTP calls, no &lt;code&gt;env()&lt;/code&gt; reads at runtime.&lt;/p&gt;

&lt;p&gt;In tests, you should be able to swap the OpenAI client with a fake implementation that returns deterministic responses. If you cannot do that without modifying production code, you are not integrating OpenAI — you are coupling your application to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Integration Patterns
&lt;/h2&gt;

&lt;p&gt;Once your foundation is in place, OpenAI becomes a capability your application depends on. These patterns are not OpenAI-specific — they’re standard Laravel patterns applied to probabilistic systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Simple Request–Response (One-Shot Tasks)
&lt;/h3&gt;

&lt;p&gt;The simplest OpenAI interaction is still the most common: you send a prompt, you receive a response, the request lifecycle ends. Classification, extraction, summarization, content drafting — all one-shot.&lt;/p&gt;

&lt;p&gt;Most tutorials implement this directly in controllers. Don’t.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Designing the Service Method&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAiService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// implementation&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Implementing the API Call&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/chat/completions'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'role'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model selection is explicit. Never rely on defaults — model choice is an architectural decision with cost and quality implications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extracting the Result Safely&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nv"&gt;$content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;data_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'choices.0.message.content'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnexpectedOpenAiResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use &lt;code&gt;data_get()&lt;/code&gt; here. Future models may return multiple content blocks, safety filters may suppress output, and partial responses can occur under load. Your service method should either return valid output or fail loudly — silent nulls are debugging nightmares.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 2: Streaming Responses (Long-Running Output)
&lt;/h3&gt;

&lt;p&gt;Streaming is not about novelty. It’s about user experience under latency. Large outputs can take seconds. Streaming turns that wait into visible progress.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When Streaming Makes Sense&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use streaming when output length is unbounded and perceived latency matters. Avoid it when you need structured JSON output, must validate the full response before use, or results are consumed only by background jobs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Server-Side Streaming with OpenAI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OpenAI supports streaming via Server-Sent Events (SSE). On the Laravel side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"data: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;json_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nb"&gt;ob_flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nb"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'text/event-stream'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'Cache-Control'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'no-cache'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'X-Accel-Buffering'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'no'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Implementing the Stream Method&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;The &lt;code&gt;sink&lt;/code&gt; Option Defeats Streaming Entirely&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A common implementation mistake is using &lt;code&gt;-&amp;gt;withOptions(['stream' =&amp;gt; true, 'sink' =&amp;gt; fopen('php://temp', 'w+')])&lt;/code&gt; in Laravel’s HTTP client. The &lt;code&gt;sink&lt;/code&gt; option tells Guzzle to buffer the &lt;em&gt;entire&lt;/em&gt; response to that destination before returning. Your &lt;code&gt;while (!feof($handle))&lt;/code&gt; loop runs after the connection is already closed. You are not streaming — you are reading a completed buffer with extra steps. The UX benefits you expect don’t exist.&lt;/p&gt;

&lt;p&gt;For true streaming, use Guzzle directly with &lt;code&gt;stream =&amp;gt; true&lt;/code&gt; and read the PSR-7 body incrementally:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;callable&lt;/span&gt; &lt;span class="nv"&gt;$onChunk&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$client&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\GuzzleHttp\Client&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'https://api.openai.com/v1/chat/completions'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'headers'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'Authorization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bearer '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'openai.api_key'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'json'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
            &lt;span class="s1"&gt;'stream'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'stream'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nv"&gt;$body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getBody&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$resource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$body&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;detach&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;feof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$resource&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;rtrim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;stream_get_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'data: [DONE]'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str_starts_with&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'data: '&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'choices'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'delta'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nv"&gt;$onChunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'choices'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'delta'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Senior Dev Tip:&lt;/strong&gt; If you need to decide between Livewire polling, SSE, and WebSockets for your AI UI, that architectural choice has more impact than the streaming implementation itself. We compared all three approaches in &lt;a href="https://origin-main.com/ai-foundations/laravel-ai-streaming-livewire-sse-websockets/" rel="noopener noreferrer"&gt;Livewire vs SSE vs WebSockets for AI UIs&lt;/a&gt; — worth reading before you commit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Pattern 3: Conversation Memory (Multi-Turn Interactions)
&lt;/h3&gt;

&lt;p&gt;Chatbots are one example. Multi-step reasoning, guided forms, iterative content refinement — all of these are multi-turn workflows. The key challenge is state management, not prompt engineering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Representing Conversation State&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'conversations'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'context'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'conversation_messages'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;foreignId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'conversation_id'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// 'user' or 'assistant'&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rehydrating Context&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before sending a request, load the full message history via &lt;strong&gt;Eloquent&lt;/strong&gt; and build the message array:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$conversation&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'role'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$msg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$msg&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then append the new user input and send the full list to OpenAI. Large language models do not remember anything between API calls. Memory is entirely your responsibility. Persisting messages also lets you truncate context intelligently and summarize older turns when token counts grow large.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 4: Background Processing (Queued AI Work)
&lt;/h3&gt;

&lt;p&gt;Not every AI task belongs in the request–response cycle. Bulk processing, document analysis, scheduled generation — these belong in queued jobs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GenerateSummaryJob&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ShouldQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;OpenAiService&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'summary'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$summary&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Laravel’s queue system provides retry semantics, failure hooks, and concurrency control — all essential when working with a rate-limited, cost-based API. A background job that silently retries three times on a large document can triple your bill. Jobs must cap token usage, log cost metadata, and fail deterministically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connecting the Patterns
&lt;/h3&gt;

&lt;p&gt;These four patterns combine in real applications: a Livewire UI streams output, backed by a conversation record, which occasionally offloads heavy steps to queues, all of which reuse the same OpenAI service layer. That consistency is the real win — one abstraction, composable in every direction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a Document Q&amp;amp;A Feature
&lt;/h2&gt;

&lt;p&gt;Chatbots are not the most interesting application of LLMs in production. The real leverage comes from augmenting your existing data — documents, records, internal knowledge — with natural language access.&lt;/p&gt;

&lt;p&gt;We’ll implement a Document Q&amp;amp;A feature: a user uploads a document, the system extracts and stores its content, and the user can ask questions grounded strictly in that document’s content. This exercises nearly every core pattern: one-shot generation, multi-turn context, background processing, and defensive prompt design.&lt;/p&gt;

&lt;p&gt;We deliberately skip embeddings and vector databases here. That’s a scoped decision, not a shortcut. The simpler architecture works today and degrades predictably when it eventually outgrows itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem Definition and Constraints
&lt;/h3&gt;

&lt;p&gt;What we want: natural language questions over uploaded documents, answers limited to document content, reasonable performance for mid-sized files, and clear failure modes.&lt;/p&gt;

&lt;p&gt;What we’re not solving yet: semantic search across large corpora, cross-document reasoning, or high-recall retrieval at scale. Many “RAG tutorials” prematurely introduce vector databases when a simpler architecture would suffice.&lt;/p&gt;

&lt;h3&gt;
  
  
  High-Level Architecture
&lt;/h3&gt;

&lt;p&gt;Four stages: ingestion (document uploaded and stored), extraction (text extracted and normalized), context preparation (content transformed into prompt context), and question answering (user questions answered via OpenAI). Each maps cleanly to a Laravel responsibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: Document Upload and Persistence
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'original_name'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'path'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;longText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At upload time, do not call OpenAI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'document'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nv"&gt;$document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'original_name'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getClientOriginalName&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'path'&lt;/span&gt;          &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nc"&gt;ExtractDocumentContent&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Dispatch a job. Extraction may be slow. Upload requests should return quickly. Extraction failures should be isolated — not cascade into upload failures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 2: Text Extraction as a Background Job
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ExtractDocumentContent&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ShouldQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;extractText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;Storage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No AI calls, no tokens consumed. Failures are deterministic. You want extraction bugs to be boring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 3: Chunking Document Content
&lt;/h3&gt;

&lt;p&gt;Here’s where many implementations fail. The naive approach — stuffing the entire document into a prompt — works briefly, then collapses under token limits and cost pressure.&lt;/p&gt;

&lt;p&gt;Split the document into chunks based on size:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="nv"&gt;$length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;mb_strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$offset&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;$length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$chunks&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;mb_substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$offset&lt;/span&gt;  &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nv"&gt;$size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$chunks&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Gotcha — &lt;code&gt;str_split()&lt;/code&gt; Will Corrupt Multibyte Text&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The PHP standard library’s &lt;code&gt;str_split($text, $size)&lt;/code&gt; splits by &lt;em&gt;bytes&lt;/em&gt;, not characters. Any document with UTF-8 content — French accents, German umlauts, Arabic, CJK — will produce chunks with broken characters at boundaries. Always use &lt;code&gt;mb_substr()&lt;/code&gt; and &lt;code&gt;mb_strlen()&lt;/code&gt; for text chunking. This is a silent production bug that only surfaces with real-world content.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Stage 4: Answering Questions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;answerQuestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Document&lt;/span&gt; &lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Prompt Construction&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;implode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$chunks&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;&amp;lt;&amp;lt;&amp;lt;PROMPT
You are answering questions based ONLY on the following document content.
If the answer cannot be found in the document, respond with:
"I don't know based on the provided document."

Document:
{$context}

Question:
{$question}
PROMPT;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prompt constrains hallucination, defines a fallback response, and makes incorrect answers detectable. The goal is not perfection — it’s bounded failure. You’ll eventually outgrow this approach, but it fails gracefully, which makes it an excellent foundation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling Multi-Turn Follow-Ups
&lt;/h3&gt;

&lt;p&gt;Users rarely ask one question. Instead of re-sending the entire document every time, summarize prior context, append follow-ups as messages, and re-inject only relevant chunks. The conversation schema from Pattern 3 handles this directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability
&lt;/h3&gt;

&lt;p&gt;Every OpenAI call in this feature should log document ID, token count, model, and latency. Not for analytics — for debugging. When a user reports “the AI gave a wrong answer,” you need the exact inputs that produced it. Without that log, you’re guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Concerns
&lt;/h2&gt;

&lt;p&gt;By the time an OpenAI integration reaches production, the question shifts from “Can we generate an answer?” to “How often does this fail? How expensive is it over time? What happens under load?”&lt;/p&gt;

&lt;p&gt;AI integrations are probabilistic, rate-limited, and cost-based. Treating them like traditional REST APIs produces brittle systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rate Limiting as a System Behaviour
&lt;/h3&gt;

&lt;p&gt;OpenAI enforces rate limits at the account and model level. Most developers respond to sporadic 429s by adding retries. That’s the wrong mental model.&lt;/p&gt;

&lt;p&gt;Rate limits are not an error condition. They’re a signal that your system’s demand exceeds its contract.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Centralizing Rate Awareness&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your OpenAI service layer should be the only place where rate-limit responses are interpreted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAiRateLimited&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;retryAfter&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Retry-After'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This exception should not be caught silently. Its consumers — controllers, jobs — must decide what to do next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Different Strategies for Different Contexts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;User-facing requests should fail fast with a clear message or degraded experience. Background jobs can be delayed and rescheduled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// OpenAI call&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiRateLimited&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;release&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;retryAfter&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This aligns retries with system capacity instead of guessing at backoff timing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost as an Operational Metric
&lt;/h3&gt;

&lt;p&gt;Token usage is not an abstract billing detail. It’s a runtime characteristic of your application. Two requests that look identical in code can differ by orders of magnitude in cost depending on prompt length, document size, conversation history, and model choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recording Token Usage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every OpenAI response includes token metadata. Capture it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;data_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s1"&gt;'usage'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;OpenAiUsage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'model'&lt;/span&gt;             &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$usage&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;     &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$usage&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$usage&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;      &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This data is not for dashboards at first. It’s for answering questions like: Why did costs spike yesterday? Which feature is expensive? Which users generate the most load? You cannot optimize what you do not measure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost-Aware Design Decisions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once usage is visible, architectural decisions become concrete: summarizing older conversation turns to stay within token budgets, truncating document context for cheaper models, and routing simple tasks to &lt;code&gt;gpt-4o-mini&lt;/code&gt; instead of &lt;code&gt;gpt-4o&lt;/code&gt;. These are the difference between sustainable and runaway costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Caching AI Outputs (Carefully)
&lt;/h3&gt;

&lt;p&gt;Caching AI responses is tempting. It’s also dangerous if done naively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Is Safe to Cache&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Caching works well for deterministic prompts, non-user-specific inputs, and expensive but repeatable operations — document summaries, extracted metadata, classification labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;remember&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;"doc:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addDay&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$document&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What Should Not Be Cached&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Avoid caching conversational replies, user-personalized outputs, or anything derived from mutable state. If you cannot confidently answer “When should this be invalidated?”, don’t cache it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Senior Dev Tip:&lt;/strong&gt; Use &lt;strong&gt;Redis&lt;/strong&gt; for AI output caching, not the file driver. Redis gives you atomic expiry, explicit invalidation by key pattern, and the ability to inspect what’s cached without reading from disk. File-based caches become unmanageable quickly when you’re caching per-document, per-user outputs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Testing OpenAI Integrations Without Lying
&lt;/h3&gt;

&lt;p&gt;This is where many teams get stuck. Mocking the OpenAI client to return a fixed string proves nothing useful. You need two distinct kinds of tests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Contract Tests (Fast, Deterministic)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These verify that your service layer calls OpenAI correctly, that failures are handled, and that parsing logic works. Mocking is appropriate here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'api.openai.com/*'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'choices'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Test output'&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'usage'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]),&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’re testing plumbing, not intelligence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Behavioural Tests (Slow, Real)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These hit the real OpenAI API, use real prompts, and validate constraints — not specific wording:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$service&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;answerQuestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'What is the refund policy?'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStringContainsString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'refund'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;strtolower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These tests should run infrequently, be clearly labelled, use dedicated API keys, and have cost limits. They catch failures no mock ever will.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dealing With Non-Determinism&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI output is not stable across runs. Tests must reflect that. Assert structure, presence or absence, and refusal behaviour. If a model is instructed to say “I don’t know” under certain conditions, test for that branch explicitly. You’re testing behavioural boundaries, not string values.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment Considerations
&lt;/h3&gt;

&lt;p&gt;Increase PHP and proxy timeouts for streaming routes. Ensure queue workers can handle long-running jobs. If you haven’t locked down your production environment yet, &lt;a href="https://origin-main.com/guides/deploy-laravel-to-production/" rel="noopener noreferrer"&gt;deploy Laravel to production&lt;/a&gt; covers the infrastructure baseline these integrations depend on. Separate API keys across local, staging, and production — never reuse them. Wrap major prompt changes behind feature flags; a prompt change can alter behaviour as dramatically as a code change.&lt;/p&gt;

&lt;p&gt;At minimum, you should be able to answer: Which OpenAI calls failed today? Which features are most expensive? Which prompts changed recently? This doesn’t require a full observability stack. It requires discipline: structured logs, request IDs, and consistent metadata.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Patterns
&lt;/h2&gt;

&lt;p&gt;Your Laravel application now has a clean OpenAI service layer with request/response, streaming, memory, queues, cost tracking, rate limiting, testability, and observability. Let’s extend it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Livewire Integration: Real-Time AI Without a Frontend Framework
&lt;/h3&gt;

&lt;p&gt;Not every Laravel team wants a JavaScript SPA. Livewire allows server-driven interactivity that pairs naturally with AI workflows.&lt;/p&gt;

&lt;p&gt;Livewire is request-driven, not socket-driven. Each interaction is an HTTP round trip, and long-running calls block the request lifecycle unless handled intentionally. For a complete working implementation, see &lt;a href="https://origin-main.com/coding-tutorials/laravel-livewire-claude-api-real-time-chat/" rel="noopener noreferrer"&gt;Laravel Livewire + Claude Integration&lt;/a&gt; — the architectural patterns are identical for OpenAI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic Livewire AI Component&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Livewire\Component&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiAssistant&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Component&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;   &lt;span class="nv"&gt;$loading&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;OpenAiService&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;loading&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;loading&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'livewire.ai-assistant'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works immediately but it’s blocking. If generation takes 10 seconds, the request takes 10 seconds. For short responses, acceptable. For longer outputs, you have three realistic strategies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Polling:&lt;/strong&gt; Trigger generation in a queued job, persist partial output, and poll from Livewire until complete. Simple and resilient.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Hybrid Route for Streaming:&lt;/strong&gt; Livewire initiates the request, a separate streaming endpoint handles chunked output, and a thin JavaScript layer listens and updates the DOM.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Accept Blocking:&lt;/strong&gt; For short, predictable outputs, it may be entirely acceptable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The decision depends on latency tolerance, not preference.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inertia + Vue: SPA-Style AI Interfaces
&lt;/h3&gt;

&lt;p&gt;If your application already uses Inertia, you can leverage SSE natively on the client side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Laravel Route:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/ai/stream'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;OpenAiService&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$openAi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;streamText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'prompt'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"data: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;json_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nb"&gt;ob_flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="nb"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'text/event-stream'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'Cache-Control'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'no-cache'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'X-Accel-Buffering'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'no'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Vue Client:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eventSource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/ai/stream?prompt=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;encodeURIComponent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;eventSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onmessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Streaming is an architectural decision, not a cosmetic enhancement. It changes how users perceive your application’s responsiveness.&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval-Augmented Generation (RAG) Fundamentals
&lt;/h3&gt;

&lt;p&gt;Eventually, chunking entire documents into prompts stops scaling. You need selective context retrieval.&lt;/p&gt;

&lt;p&gt;RAG solves this by converting text into embeddings, storing them in a vector store, and injecting only semantically relevant chunks into the prompt at query time — rather than the full document.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Generating Embeddings&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/embeddings'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'model'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'text-embedding-3-small'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'input'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$textChunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="nv"&gt;$vector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;data_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s1"&gt;'data.0.embedding'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Storing Vectors&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At small scale, PostgreSQL with the &lt;code&gt;pgvector&lt;/code&gt; extension is sufficient. If you need a deeper introduction to how embeddings actually work before choosing your storage strategy, &lt;a href="https://origin-main.com/ai-foundations/laravel-vector-database-embeddings-pgvector-rag/" rel="noopener noreferrer"&gt;Embeddings and Vector Databases: The Foundation of Semantic AI Systems&lt;/a&gt; covers the concepts and storage trade-offs in detail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Retrieving Relevant Chunks&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At query time, convert the question to an embedding, perform a cosine similarity search, and retrieve the top N chunks. Only those chunks enter the OpenAI prompt. This reduces token usage, latency, and hallucination. RAG is not magic — it’s controlled context injection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Model Strategies
&lt;/h3&gt;

&lt;p&gt;Assuming one model should do everything is one of the most expensive architectural mistakes in LLM-backed applications. Large reasoning models are slow and costly. Smaller models are fast and cheap. Embedding models serve an entirely different purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intent-Based Model Routing&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$taskType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;'summarization'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o-mini'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'complex_reasoning'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'embedding'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'text-embedding-3-small'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;default&lt;/span&gt;            &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o-mini'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents overpaying for simple tasks. A document classification job does not need the same model as a multi-step reasoning pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fallback Strategies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can also implement primary → fallback model routing, high-accuracy → fast mode switching, and user-tier-based model selection. These are business decisions with direct cost and SLA implications, not technical hacks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Senior Dev Tip:&lt;/strong&gt; If you want to go further and treat your AI layer as a genuinely provider-agnostic abstraction — with tool calling, multi-model orchestration, and structured outputs built in — take a look at &lt;a href="https://origin-main.com/ai-agents/laravel-prism-php-agentic-apps/" rel="noopener noreferrer"&gt;Building Agentic Laravel Apps with Prism PHP&lt;/a&gt;. Prism handles the abstraction layer so you’re not maintaining it yourself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Where This Leaves You
&lt;/h2&gt;

&lt;p&gt;Your Laravel + OpenAI system now supports real-time interfaces, retrieval-based document reasoning, multi-model routing, and a structured architecture ready to scale.&lt;/p&gt;

&lt;p&gt;The service boundary protects you from API changes, model shifts, and provider swaps. OpenAI becomes infrastructure — replaceable, measurable, and contained. That’s the outcome worth building toward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Ready to build with real conversation memory and streaming?&lt;/strong&gt; &lt;a href="https://origin-main.com/coding-tutorials/laravel-claude-streaming-chatbot/" rel="noopener noreferrer"&gt;Building a Claude API Chatbot in Laravel&lt;/a&gt; demonstrates both patterns in a complete, working implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hit rate limits in production?&lt;/strong&gt; The queue-based throttling and backpressure strategies in &lt;a href="https://origin-main.com/guides/laravel-claude-api-integration-guide/" rel="noopener noreferrer"&gt;Laravel Claude API Integration Guide&lt;/a&gt; apply directly to OpenAI workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want a server-driven real-time UI without JavaScript frameworks?&lt;/strong&gt; &lt;a href="https://origin-main.com/coding-tutorials/laravel-livewire-claude-api-real-time-chat/" rel="noopener noreferrer"&gt;Laravel Livewire + Claude Integration&lt;/a&gt; shows how to build AI features that stay in PHP.&lt;/p&gt;

&lt;p&gt;Questions? &lt;a href="https://origin-main.com/questions/" rel="noopener noreferrer"&gt;Ask in our Developer Q&amp;amp;A&lt;/a&gt; or reach out directly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Official References:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://platform.openai.com/docs/guides/text-generation" rel="noopener noreferrer"&gt;OpenAI Chat Completions API&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://platform.openai.com/docs/models" rel="noopener noreferrer"&gt;OpenAI Models Reference&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://laravel.com/docs/http-client" rel="noopener noreferrer"&gt;Laravel HTTP Client Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://laravel.com/docs/queues" rel="noopener noreferrer"&gt;Laravel Queues Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>laravel</category>
      <category>ai</category>
      <category>openai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>RAG-less Architecture in Laravel: Long-Context Caching with Gemini</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:01:13 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/rag-less-architecture-in-laravel-long-context-caching-with-gemini-5h36</link>
      <guid>https://dev.to/dewaldhugo/rag-less-architecture-in-laravel-long-context-caching-with-gemini-5h36</guid>
      <description>&lt;p&gt;For the past few years, RAG has been presented as the only serious approach to grounding AI on large private datasets. Build a vector database, chunk your documents, embed the chunks, retrieve the closest matches, inject them into the prompt. Repeat for every query. It works, and for certain use cases it is genuinely the right call.&lt;/p&gt;

&lt;p&gt;But for others — particularly repeated, deep-analysis tasks on the same corpus — RAG is an expensive workaround for a context window problem that the models have largely outgrown.&lt;/p&gt;

&lt;p&gt;Google’s Gemini 2.5 Flash and 2.5 Pro both ship with a 1-million-token context window. That is approximately 750,000 words, or a mid-sized Laravel monolith with room to spare. More importantly, the Gemini API now supports &lt;strong&gt;Explicit Context Caching&lt;/strong&gt;: you upload that corpus once, pay full input price once, and reference the cached context on every subsequent query at up to 75% off the standard input rate. For workloads that involve many questions against the same codebase, document set, or knowledge base, this is a fundamentally different cost model — and a fundamentally different architecture.&lt;/p&gt;

&lt;p&gt;This guide walks you through building a &lt;code&gt;GeminiContextCacheService&lt;/code&gt; in Laravel 11/12, binding it to the Service Container, and building a production-ready audit endpoint around it. We will cover the full cache lifecycle, including the storage cost trap that most tutorials skip entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  RAG vs. Long-Context Caching with Gemini: The Architectural Trade-Off
&lt;/h2&gt;

&lt;p&gt;Before touching code, we need to be honest about what each approach actually is. Neither is universally superior.&lt;/p&gt;

&lt;p&gt;RAG trades context completeness for cost predictability. You pay only for the retrieved chunks, not the full corpus — which is excellent when your dataset is enormous (hundreds of GB) and any single query only needs a narrow slice of it. The problem is that retrieval is fundamentally lossy. A semantic search might pull the right controller method and miss the &lt;code&gt;BaseModel&lt;/code&gt; that overrides &lt;code&gt;save()&lt;/code&gt;. It will find the public interface and skip the private helper that silently transforms the input. That retrieval gap is not a bug you can patch; it is the architecture.&lt;/p&gt;

&lt;p&gt;Long-context caching with Gemini trades retrieval precision for corpus completeness. The model sees everything. There is no retrieval step, no chunking pipeline, no embedding model to maintain. The trade-off is cost structure: the first query is expensive (you pay to ingest the full context), and you carry a per-hour storage fee while the cache lives. Beyond the break-even point — which arrives faster than you expect — repeated queries cost significantly less than the RAG equivalent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost Curve: Why Caching Is the Financial Lever
&lt;/h2&gt;

&lt;p&gt;Let us put real numbers on this. Using Gemini 2.5 Flash, which offers the best price-to-context ratio for this workload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Standard input rate (long context &amp;gt;128K tokens):&lt;/strong&gt; $0.30/MTok&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cached input rate:&lt;/strong&gt; $0.03/MTok (90% reduction at Flash long-context tier)&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cache storage:&lt;/strong&gt; billed per hour per MTok while the cache lives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a 500K-token codebase audit with 20 questions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without caching:&lt;/strong&gt; Each query sends 500K tokens to the API. That is $0.15 per question × 20 = &lt;strong&gt;$3.00 total&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With caching:&lt;/strong&gt; Initial cache creation: $0.15 (one-time). Each subsequent query references the cache at $0.015. Twenty queries: $0.15 + ($0.015 × 19) = &lt;strong&gt;$0.435 total&lt;/strong&gt; — plus a small storage fee for however long the cache lives.&lt;/p&gt;

&lt;p&gt;The break-even is query 2. Every question after that is 90% cheaper than the RAG alternative that re-sends the same context.&lt;/p&gt;

&lt;h2&gt;
  
  
  When RAG Still Wins
&lt;/h2&gt;

&lt;p&gt;We should be direct about this before anyone rips out a working vector pipeline.&lt;/p&gt;

&lt;p&gt;If your corpus exceeds 1M tokens — a multi-year ticket archive, a large documentation set, a multi-repo monorepo — the long-context window is not a solution. You physically cannot fit the data. RAG is the architecture. Our existing guide on &lt;a href="https://origin-main.com/ai-foundations/laravel-vector-database-embeddings-pgvector-rag/" rel="noopener noreferrer"&gt;Laravel Embeddings, Vector Databases, and RAG&lt;/a&gt; covers pgvector and the full embedding pipeline if that is your situation.&lt;/p&gt;

&lt;p&gt;Similarly, if your corpus changes frequently — user-generated content, live database records — cache invalidation becomes expensive. Re-creating a 1M-token cache on every data change defeats the cost benefit.&lt;/p&gt;

&lt;p&gt;Long-context caching targets a specific sweet spot: a corpus that is large enough to matter (above RAG’s retrieval quality threshold), small enough to fit in 1M tokens, and stable enough that you will run many queries before the next update. B2B codebase auditing, legal document analysis, product catalog deep-dives — these are the use cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation: The Stateful Expert Pattern
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Add your Gemini API key to &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;your_key_here&lt;/span&gt;
&lt;span class="py"&gt;GEMINI_MODEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;
&lt;span class="py"&gt;GEMINI_CACHE_TTL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;7200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then register the configuration values in &lt;code&gt;config/services.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'gemini'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'api_key'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'base_url'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'https://generativelanguage.googleapis.com/v1beta'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'model'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_MODEL'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'cache_ttl'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_CACHE_TTL'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The minimum content size for Gemini context caching is 32,768 tokens — roughly 25,000 words. You cannot cache a handful of files. You need to be pushing a full module, a full codebase, or a substantial document set. Keep that constraint in mind when designing your corpus preparation step.&lt;/p&gt;

&lt;p&gt;If you have not yet set up your Gemini client or worked through model selection on your Laravel stack, our &lt;a href="https://origin-main.com/guides/laravel-gemini-integration/" rel="noopener noreferrer"&gt;Integrating Gemini into the Laravel AI SDK&lt;/a&gt; guide covers the foundational setup decisions before you start hitting the API directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: The Context Caching Service
&lt;/h3&gt;

&lt;p&gt;This service owns two responsibilities: cache lifecycle (create, retrieve, delete) and query dispatch. We keep them in one class because the cache name is required for query — splitting them would force unnecessary dependency injection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Services&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Http&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GeminiContextCacheService&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$apiKey&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$baseUrl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="nv"&gt;$ttl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.api_key'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;baseUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.base_url'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.model'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.cache_ttl'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * Return an existing live cache name from Redis, or create a new one.
     * The cache key is your application's identifier (e.g. "audit:project-42").
     * The Gemini cache name is what the API issues back — it looks like
     * "cachedContents/abc123xyz" and is what you reference in generation requests.
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getOrCreateCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;array&lt;/span&gt;  &lt;span class="nv"&gt;$documents&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"gemini_cache:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini cache hit'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$documents&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;createCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;array&lt;/span&gt;  &lt;span class="nv"&gt;$documents&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Each document becomes a separate part in the cached content&lt;/span&gt;
        &lt;span class="nv"&gt;$parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;array_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="nv"&gt;$documents&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'x-goog-api-key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;baseUrl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/cachedContents"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'model'&lt;/span&gt;            &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"models/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'displayName'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'systemInstruction'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'parts'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'contents'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'parts'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$parts&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'ttl'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini cache creation failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'status'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="s1"&gt;'body'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="s1"&gt;'key'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s1"&gt;'Failed to create Gemini context cache: '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$cacheName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Store with a 60-second buffer before the Gemini TTL expires&lt;/span&gt;
        &lt;span class="c1"&gt;// so we never reference a cache the API has already deleted&lt;/span&gt;
        &lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s2"&gt;"gemini_cache:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini context cache created'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'name'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'key'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * Run a user question against an existing cached context.
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'x-goog-api-key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;baseUrl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/models/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:generateContent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'cachedContent'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'contents'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'parts'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;]]],&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="s1"&gt;'Gemini rate limit exceeded. Implement exponential backoff before retrying.'&lt;/span&gt;
                &lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini query failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'status'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="s1"&gt;'cache'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini query failed: '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'candidates.0.content.parts.0.text'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s1"&gt;'Unexpected Gemini response structure: '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="cd"&gt;/**
     * Explicitly delete a cache when the session ends.
     * Do not leave caches running — storage costs accumulate per hour.
     */&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;deleteCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"gemini_cache:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'x-goog-api-key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;baseUrl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$geminiCacheName&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;forget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"gemini_cache:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini context cache deleted'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cacheKey&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Architect’s Note]&lt;/strong&gt; The 60-second TTL buffer on the Redis key is not optional. If your Laravel cache expiry aligns exactly with Gemini’s TTL, a query that arrives in the final seconds will find a valid Redis key but hit a deleted Gemini cache. The API returns a 404, your &lt;code&gt;query()&lt;/code&gt; method throws, and your user gets a 503. Build the buffer in. If you are concerned about token governance and want to log every token consumed by cached vs. standard queries separately, our &lt;a href="https://origin-main.com/laravel-architecture/laravel-ai-middleware-token-tracking/" rel="noopener noreferrer"&gt;Laravel AI Middleware: Token Tracking &amp;amp; Rate Limiting&lt;/a&gt; guide covers the instrumentation layer you would wrap around this service.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 2: Binding to the Service Container
&lt;/h3&gt;

&lt;p&gt;Register the service as a singleton in &lt;code&gt;AppServiceProvider&lt;/code&gt;. Because the service reads from config in its constructor, a singleton is correct here — we want one instance per request cycle, not one per injection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Providers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Services\GeminiContextCacheService&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\ServiceProvider&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AppServiceProvider&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;ServiceProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;register&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;singleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;GeminiContextCacheService&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Laravel’s Service Container handles the rest. Any controller, job, or command that type-hints &lt;code&gt;GeminiContextCacheService&lt;/code&gt; receives the same instance within a given request.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: The Audit Controller
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Http\Controllers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Services\GeminiContextCacheService&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\JsonResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CodeAuditController&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Controller&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;GeminiContextCacheService&lt;/span&gt; &lt;span class="nv"&gt;$gemini&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;audit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$validated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'project_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'string'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'max:100'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'question'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'string'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'max:2000'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;loadProjectDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'project_id'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'You are a senior Laravel architect reviewing a production codebase. '&lt;/span&gt;
                &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'When answering questions, cite specific files and line numbers. '&lt;/span&gt;
                &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'Flag security risks, legacy patterns, and architectural concerns explicitly.'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nv"&gt;$cacheName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;gemini&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getOrCreateCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="s2"&gt;"audit:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'project_id'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;systemPrompt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="nv"&gt;$documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="nv"&gt;$answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;gemini&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$cacheName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'question'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'answer'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$answer&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RuntimeException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Code audit failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'project_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'project_id'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="s1"&gt;'error'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Audit service temporarily unavailable.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="mi"&gt;503&lt;/span&gt;
            &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;loadProjectDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$projectId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;storage_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"app/projects/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$projectId&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/codebase.txt"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;file_exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s2"&gt;"Codebase not found for project: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$projectId&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Return as array — each element becomes a separate Part in the cached content.&lt;/span&gt;
        &lt;span class="c1"&gt;// Split large codebases into logical chunks (per-module) if approaching 1M tokens.&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;file_get_contents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="p"&gt;)];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Register the route with Sanctum auth and a conservative throttle. Twenty questions per minute is generous for a codebase audit endpoint — adjust to match your billing ceiling.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/api.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Http\Controllers\CodeAuditController&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'throttle:20,1'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/audit'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;CodeAuditController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'audit'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Securing this endpoint properly is not optional — an unauthenticated audit endpoint is a direct vector for API cost abuse. Our &lt;a href="https://origin-main.com/guides/laravel-sanctum-api-authentication/" rel="noopener noreferrer"&gt;Laravel Sanctum API Authentication&lt;/a&gt; guide covers token scoping and rate-limit configuration if you need to tighten this down further.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cache Lifecycle Management
&lt;/h2&gt;

&lt;p&gt;This is the section most tutorials omit. Skip it and you will find $800 in unexpected Gemini storage charges at the end of the month.&lt;/p&gt;

&lt;p&gt;Gemini charges per MTok per hour for every active cache. The pricing varies by model — check &lt;a href="https://ai.google.dev/pricing" rel="noopener noreferrer"&gt;Google’s official API pricing page&lt;/a&gt; for current rates. The pattern you want is: create the cache when the audit session starts, delete it when the session ends, and run a scheduled cleanup job to catch anything that leaked through.&lt;/p&gt;

&lt;p&gt;The cleanup command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Console\Commands&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Carbon\Carbon&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Console\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Http&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CleanExpiredGeminiCaches&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$signature&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'gemini:clean-caches'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Delete expired or stale Gemini context caches'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$apiKey&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.api_key'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$baseUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.base_url'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'x-goog-api-key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$apiKey&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$baseUrl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/cachedContents"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Failed to list caches: '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;FAILURE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$caches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cachedContents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;

        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$caches&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$cache&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$expireTime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Carbon&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'expireTime'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$expireTime&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;isPast&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Http&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withHeaders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'x-goog-api-key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$apiKey&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$baseUrl&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Deleted stale Gemini cache'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]]);&lt;/span&gt;
                &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Deleted: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SUCCESS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Schedule it in &lt;code&gt;bootstrap/app.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withSchedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Schedule&lt;/span&gt; &lt;span class="nv"&gt;$schedule&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$schedule&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gemini:clean-caches'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;hourly&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Edge Case Alert]&lt;/strong&gt; If a queue worker crashes mid-audit before &lt;code&gt;deleteCache()&lt;/code&gt; is called, the Gemini cache lives until its TTL expires and storage charges accumulate. The scheduled cleanup command is your safety net, not a replacement for explicit deletion. Consider also dispatching a &lt;code&gt;DeleteGeminiCacheJob&lt;/code&gt; via &lt;code&gt;defer()&lt;/code&gt; at the end of each audit session as a belt-and-suspenders measure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Use Case: The Persistent Auditor in Practice
&lt;/h2&gt;

&lt;p&gt;Here is where the architecture earns its keep. You are auditing a legacy Laravel 6 application for a client. The codebase is 400K tokens — controllers, models, migrations, the lot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The RAG approach:&lt;/strong&gt; You ask about &lt;code&gt;UserController::store()&lt;/code&gt;. The retrieval pulls three relevant chunks. The answer looks correct. But the client’s &lt;code&gt;BaseModel&lt;/code&gt; has an overridden &lt;code&gt;save()&lt;/code&gt; method that intercepts every Eloquent write and transforms certain fields. RAG did not retrieve &lt;code&gt;BaseModel&lt;/code&gt; — it was not semantically close enough to the query. The AI gives you an answer that describes the surface behaviour and misses the hack underneath. The audit is wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The cached-context approach:&lt;/strong&gt; You upload the full codebase. When you ask about &lt;code&gt;UserController::store()&lt;/code&gt;, Gemini has already read &lt;code&gt;BaseModel&lt;/code&gt;. It answers that the apparent save logic is intercepted downstream, identifies the override, and flags it as a maintenance risk. The context is complete. The answer is complete.&lt;/p&gt;

&lt;p&gt;This is what we mean by Architectural Integrity. RAG optimises for retrieval cost. Long-context caching optimises for answer correctness. For an audit product where a wrong answer has professional liability implications, correctness wins.&lt;/p&gt;

&lt;p&gt;The cost math, using our earlier figures: 20 audit questions on a 500K-token codebase costs $3.00 without caching and approximately $0.44 with caching — plus storage for however long the session runs. For a B2B audit priced at $200 per report, the infrastructure cost is negligible either way. But the quality difference is not.&lt;/p&gt;

&lt;p&gt;For production agentic pipelines — where the “questions” are not human prompts but tool calls from an orchestrated agent — the cost advantage compounds quickly. Our guide on &lt;a href="https://origin-main.com/laravel-architecture/laravel-agentic-workflow-schema-validation/" rel="noopener noreferrer"&gt;Hardening Laravel Agentic Workflows: Schema Validation Against LLM Hallucinations&lt;/a&gt; covers the validation layer you will want around those agent outputs, regardless of which AI architecture you choose.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing Your Corpus for the Cache
&lt;/h2&gt;

&lt;p&gt;The code examples above load a pre-processed &lt;code&gt;codebase.txt&lt;/code&gt; file. In production, you need a pipeline that generates this file from your actual source. A simple Artisan command handles it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Console\Commands&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Console\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\File&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Finder\Finder&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PrepareCodebaseCorpus&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$signature&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'corpus:prepare {project_id} {path}'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Concatenate a project codebase into a single corpus file'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$projectId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'project_id'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$path&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'path'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$finder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Finder&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;files&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;in&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'*.php'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'*.json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'*.yaml'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'*.env.example'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;notPath&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'vendor'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'node_modules'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'storage'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'.git'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortByName&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="nv"&gt;$corpus&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$counter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$finder&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$relativePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str_replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getRealPath&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
            &lt;span class="nv"&gt;$corpus&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;// FILE: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$relativePath&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nv"&gt;$corpus&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getContents&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nv"&gt;$counter&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$outputDir&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;storage_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"app/projects/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$projectId&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$outputPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$outputDir&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'/codebase.txt'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;ensureDirectoryExists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$outputDir&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$outputPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$corpus&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$sizeKb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$corpus&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Corpus prepared: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$counter&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; files, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$sizeKb&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; KB → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$outputPath&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SUCCESS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it before starting an audit session: &lt;code&gt;php artisan corpus:prepare project-42 /path/to/legacy-app&lt;/code&gt;. The file headers (&lt;code&gt;// FILE: app/Models/BaseModel.php&lt;/code&gt;) give Gemini the file path context it needs to cite locations in its answers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Efficiency Gain]&lt;/strong&gt; Strip comments and docblocks from PHP files before creating the corpus. A typical Laravel application loses 20–30% of its token count after comment stripping, which can push a borderline 1.2M-token codebase under the 1M limit and save meaningfully on cache creation costs. PHP’s &lt;code&gt;token_get_all()&lt;/code&gt; makes this straightforward; a simple token filter removes &lt;code&gt;T_COMMENT&lt;/code&gt; and &lt;code&gt;T_DOC_COMMENT&lt;/code&gt; tokens before concatenation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Conclusion: Context Is the New Database
&lt;/h2&gt;

&lt;p&gt;RAG will not disappear. For certain corpus shapes — enormous, dynamic, multi-tenant — it remains the correct architecture. But it has been treated as the default answer for too long, applied reflexively to problems that long-context models can now solve more cleanly and more accurately.&lt;/p&gt;

&lt;p&gt;Explicit context caching in Gemini changes the trade-off. The vector database, the chunking strategy, the embedding model, the retrieval tuning — for a well-bounded corpus, all of that overhead collapses into a single cached upload and a TTL you manage in Redis. The model sees the whole codebase. The answers reflect the whole codebase.&lt;/p&gt;

&lt;p&gt;The architecture we have built here — &lt;code&gt;GeminiContextCacheService&lt;/code&gt; bound as a singleton, TTL-buffered cache keys in Redis, explicit lifecycle management, a corpus preparation command — is production-ready. Wire up your Sanctum-protected audit endpoint, run the corpus preparation pipeline, and you have a stateful expert that knows your codebase better than any retrieval pipeline can.&lt;/p&gt;

&lt;p&gt;For official context caching documentation and current pricing, refer directly to &lt;a href="https://ai.google.dev/gemini-api/docs/caching" rel="noopener noreferrer"&gt;Google’s Gemini API Context Caching guide&lt;/a&gt; — pricing has shifted frequently in 2026 and the source of truth is Google’s pricing page, not any third-party aggregator.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>gemini</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Integrating Gemini into the Laravel AI SDK</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:45:05 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/beyond-the-api-key-integrating-gemini-into-the-laravel-ai-sdk-325o</link>
      <guid>https://dev.to/dewaldhugo/beyond-the-api-key-integrating-gemini-into-the-laravel-ai-sdk-325o</guid>
      <description>&lt;p&gt;If your Laravel application already speaks to OpenAI or Anthropic through the first-party &lt;code&gt;laravel/ai&lt;/code&gt; SDK, adding Gemini feels like it should be a five-minute job. Set a new API key. Point to a new provider. Deploy. Done. That assumption is mostly correct, but the gap between “mostly” and “done” is where production systems fall apart. This guide closes that gap.&lt;/p&gt;

&lt;p&gt;We will cover the full &lt;strong&gt;Laravel Gemini integration&lt;/strong&gt; path: the OpenAI-compatible soft entry that lets you validate the connection in under ten minutes, the native Gemini provider configuration that actually belongs in a long-lived codebase, and a &lt;code&gt;GeminiManager&lt;/code&gt; service pattern that gives you model failover, rate-limit awareness, and a response contract your Blade templates never have to think about.&lt;/p&gt;

&lt;p&gt;By the end, Gemini will be a first-class citizen in your AI stack, not an afterthought bolted to a controller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Add Gemini At All?
&lt;/h2&gt;

&lt;p&gt;Fair question. If you are already running &lt;code&gt;claude-sonnet-4-6&lt;/code&gt; or &lt;code&gt;gpt-4o&lt;/code&gt; in production, the case for a third provider needs to be more compelling than “Google exists.” Here is the honest pitch.&lt;/p&gt;

&lt;p&gt;Gemini 2.5 Flash is fast, cheap, and ships with a one-million-token context window as standard. For high-throughput tasks—document summarisation, classification pipelines, RAG over large corpora—it undercuts the competition on cost without giving up much capability. Gemini 2.5 Pro sits at the other end: serious reasoning, multimodal by default, and Vertex AI integration if you are already inside the Google Cloud ecosystem.&lt;/p&gt;

&lt;p&gt;The second reason is resilience. Single-provider AI architectures have a hard failure mode. When OpenAI rate-limits your 2 AM batch job, the entire feature goes dark. Gemini as a fallback provider changes that calculation. We are not talking about a hot standby you never test; we are talking about a properly designed failover that routes intelligently.&lt;/p&gt;

&lt;p&gt;The third reason is the &lt;code&gt;laravel/ai&lt;/code&gt; SDK itself. As covered in detail in &lt;a href="https://origin-main.com/laravel/laravel-13-ai-sdk-features-breakdown/" rel="noopener noreferrer"&gt;What Laravel 13 Actually Changes for AI Development&lt;/a&gt;, the SDK was designed from the start to be provider-agnostic. Gemini is a first-class supported provider. The architecture already expects you to run multiple models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Route 1: The OpenAI-Compatible Endpoint (The Soft Entry)
&lt;/h2&gt;

&lt;p&gt;Before committing to the native integration, you can validate that Gemini works for your use case using Google’s OpenAI-compatible endpoint. This is genuinely useful for a quick proof-of-concept, or when you are migrating from an existing &lt;code&gt;openai-php/client&lt;/code&gt; setup rather than the &lt;code&gt;laravel/ai&lt;/code&gt; SDK.&lt;/p&gt;

&lt;p&gt;Google maintains an endpoint at &lt;code&gt;https://generativelanguage.googleapis.com/v1beta/openai/&lt;/code&gt; that accepts Chat Completions API requests in the standard OpenAI format. The swap is mechanical: change the base URL, swap the API key, and change the model string.&lt;/p&gt;

&lt;p&gt;Start with your &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;your-google-ai-studio-key&lt;/span&gt;
&lt;span class="py"&gt;GEMINI_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;https://generativelanguage.googleapis.com/v1beta/openai/&lt;/span&gt;
&lt;span class="py"&gt;GEMINI_MODEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are on an existing OpenAI configuration in &lt;code&gt;config/ai.php&lt;/code&gt;, you can add a second connection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config/ai.php&lt;/span&gt;
&lt;span class="s1"&gt;'connections'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'openai'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'openai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'api_key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;

    &lt;span class="s1"&gt;'gemini_compat'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'openai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;// still using the OpenAI driver&lt;/span&gt;
        &lt;span class="s1"&gt;'api_key'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'base_url'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_BASE_URL'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now call Gemini through any code that already uses the &lt;code&gt;openai&lt;/code&gt; driver, just by swapping the connection name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="no"&gt;Illuminate\Support\Facades\AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gemini_compat'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works well enough for a smoke test. Resist the urge to ship it as your production integration.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Production Pitfall]&lt;/strong&gt; The OpenAI-compatible endpoint has real coverage gaps. Gemini-native features—grounding via Google Search, the thinking budget parameter, native multimodal tool use, and fine-grained safety settings—are not available through the compat layer. If your feature depends on any of those, the compat approach will silently drop them and return a degraded response with no error. You will not find out until QA, or worse, production.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Route 2: The Native Gemini Provider (The Right Way)
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;laravel/ai&lt;/code&gt; SDK ships with a first-party Gemini driver. Use it. The compat approach is a migration crutch; the native driver is what belongs in your &lt;code&gt;config/ai.php&lt;/code&gt; long-term.&lt;/p&gt;

&lt;p&gt;Install the SDK if you have not already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require laravel/ai
php artisan vendor:publish &lt;span class="nt"&gt;--provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Laravel&lt;/span&gt;&lt;span class="se"&gt;\A&lt;/span&gt;&lt;span class="s2"&gt;i&lt;/span&gt;&lt;span class="se"&gt;\A&lt;/span&gt;&lt;span class="s2"&gt;iServiceProvider"&lt;/span&gt;
php artisan migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then register the native Gemini connection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config/ai.php&lt;/span&gt;
&lt;span class="s1"&gt;'connections'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'openai'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'openai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'api_key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;

    &lt;span class="s1"&gt;'anthropic'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'anthropic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'api_key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ANTHROPIC_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;

    &lt;span class="s1"&gt;'gemini'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gemini'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'api_key'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;

&lt;span class="s1"&gt;'default'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'AI_DEFAULT_CONNECTION'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'openai'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your &lt;code&gt;.env&lt;/code&gt; stays clean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;your-google-ai-studio-key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A basic request through the native driver looks identical to your existing OpenAI calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="no"&gt;Illuminate\Support\Facades\AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gemini'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s the surface. Now let’s build something that actually runs in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Selection: Which Gemini Model?
&lt;/h2&gt;

&lt;p&gt;Not all Gemini models are created equal, and Google’s naming conventions have become genuinely confusing over the last year. Here is the practical breakdown for production use.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;gemini-2.5-flash&lt;/code&gt; is your default. It hit general availability in June 2025, it is fast, it is cost-efficient, and it handles up to one million tokens of context. For most text generation, summarisation, and classification tasks, it is the correct choice.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;gemini-2.5-pro&lt;/code&gt; is for heavy lifting. Complex reasoning chains, code generation over large codebases, long-form synthesis tasks. It is meaningfully more expensive than Flash and slower under load, so do not reach for it by default.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Edge Case Alert]&lt;/strong&gt; Do not use &lt;code&gt;gemini-2.0-flash&lt;/code&gt; in new integrations. Google has announced that Gemini 2.0 Flash and Flash-Lite models will be shut down on June 1, 2026. Any production system pointing at those model strings will start throwing &lt;code&gt;404&lt;/code&gt; errors at that date with no warning beyond the deprecation announcement. If you have any existing references to &lt;code&gt;gemini-2.0-flash&lt;/code&gt; in your codebase, change them now.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For preview models in the Gemini 3.x family, treat them exactly as you would treat &lt;code&gt;preview&lt;/code&gt; Claude models: useful for internal testing, not for customer-facing production paths until they have a stable designation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The GeminiManager Pattern
&lt;/h2&gt;

&lt;p&gt;A single AI connection configured in &lt;code&gt;config/ai.php&lt;/code&gt; is fine for prototyping. Production systems need something more deliberate: model selection logic, rate-limit handling, failover between providers, and a response contract that the rest of your application can depend on regardless of which model actually ran.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;GeminiManager&lt;/code&gt; service class is that layer. It sits between the &lt;code&gt;laravel/ai&lt;/code&gt; facade and your application code, owns the failover logic, and returns a typed response object your Blade templates or API controllers can render without knowing which provider answered the request.&lt;/p&gt;

&lt;p&gt;Start by creating the service in &lt;code&gt;app/Services/AI/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Services\AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="no"&gt;Illuminate\Support\Facades\AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\DTOs\AiResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GeminiManager&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$modelPriority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'gemini-2.5-pro'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$fallbackConnection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'openai'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$fallbackModel&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o-mini'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="kt"&gt;AiResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;modelPriority&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;attemptGemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AiResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="s1"&gt;'gemini'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;inputTokens&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;inputTokens&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;outputTokens&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;outputTokens&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Exception&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini model failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="s1"&gt;'model'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s1"&gt;'error'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                    &lt;span class="s1"&gt;'prompt_length'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;strlen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="p"&gt;]);&lt;/span&gt;

                &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// All Gemini models exhausted. Fall back to configured provider.&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;attemptGemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;mixed&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gemini'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
            &lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="nv"&gt;$options&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;fallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;AiResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'All Gemini models exhausted. Falling back.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'fallback_connection'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackConnection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'fallback_model'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackConnection&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AiResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;       &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;fallbackConnection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;inputTokens&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;inputTokens&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;outputTokens&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;outputTokens&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;AiResponse&lt;/code&gt; DTO keeps your response contract stable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\DTOs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="nv"&gt;$inputTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="nv"&gt;$outputTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bind it in your &lt;code&gt;bootstrap/app.php&lt;/code&gt;, using Laravel 11/12 syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Services\AI\GeminiManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withProviders&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;app&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;singleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;GeminiManager&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Architect’s Note]&lt;/strong&gt; This pattern mirrors exactly what we described in &lt;a href="https://origin-main.com/laravel-architecture/production-grade-ai-architecture-in-laravel/" rel="noopener noreferrer"&gt;Production-Grade AI Architecture in Laravel: Contracts, Governance &amp;amp; Telemetry&lt;/a&gt;—the response DTO is your contract, and nothing outside the service layer should care which provider produced the response. The moment your Blade template starts conditionally rendering based on &lt;code&gt;$response-&amp;gt;provider&lt;/code&gt;, you have an architecture problem, not a template problem.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Rate Limit Handling
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;GeminiManager&lt;/code&gt; above catches all exceptions and continues to the next model. That is fine for a first pass but too broad for production. You want to distinguish between rate-limit errors, which should trigger immediate failover, and authentication or configuration errors, which should not silently retry.&lt;/p&gt;

&lt;p&gt;Gemini rate-limit responses return HTTP &lt;code&gt;429&lt;/code&gt;. A tighter catch block:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Client\RequestException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;attemptGemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;mixed&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gemini'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$prompt&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
            &lt;span class="mf"&gt;...&lt;/span&gt;&lt;span class="nv"&gt;$options&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RequestException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Rate limited. Try next model.&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nv"&gt;$status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Auth failure. Do not retry silently.&lt;/span&gt;
            &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;critical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini authentication failure'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'status'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$status&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Gemini authentication failed. Check GEMINI_API_KEY.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Anything else: surface it.&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For consistent rate-limit and token tracking across all your AI providers, the middleware pattern we detailed in &lt;a href="https://origin-main.com/laravel-architecture/laravel-ai-middleware-token-tracking/" rel="noopener noreferrer"&gt;Laravel AI Middleware: Token Tracking &amp;amp; Rate Limiting&lt;/a&gt; plugs directly into this architecture. Wire it as HTTP middleware on your AI routes and your &lt;code&gt;AiResponse&lt;/code&gt; DTO’s token counts feed it automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keeping Your Blade Templates Clean
&lt;/h2&gt;

&lt;p&gt;This is the part most tutorials ignore. It is also where the design falls apart in real applications.&lt;/p&gt;

&lt;p&gt;Your Blade templates should receive an &lt;code&gt;AiResponse&lt;/code&gt; object and render &lt;code&gt;$response-&amp;gt;content&lt;/code&gt;. That is it. They should not know the provider is Gemini, they should not branch on &lt;code&gt;$response-&amp;gt;model&lt;/code&gt;, and they should never contain try/catch logic. If your failover logic is working correctly, the view never knows a failover happened.&lt;/p&gt;

&lt;p&gt;In a controller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Services\AI\GeminiManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ArticleSummaryController&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Controller&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;GeminiManager&lt;/span&gt; &lt;span class="nv"&gt;$ai&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Article&lt;/span&gt; &lt;span class="nv"&gt;$article&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;View&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;ai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"Summarise this article in three sentences: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$article&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'articles.show'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'article'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'summary'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Blade template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"summary"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    {{ $summary-&amp;gt;content }}
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Done. No provider logic. No model names. No error handling. The controller injects the service, the service handles everything messy, and the view renders a string.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Word to the Wise]&lt;/strong&gt; The discipline to keep AI provider logic out of your views pays off when you swap models six months from now. And you will swap models. Gemini 2.5 Flash will eventually give way to whatever Google releases next. If your template has &lt;code&gt;@if ($summary-&amp;gt;model === 'gemini-2.5-flash')&lt;/code&gt; buried inside it, that migration becomes a grep exercise across your entire view layer. Design the abstraction correctly now.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Registering Gemini as a Queued Job Provider
&lt;/h2&gt;

&lt;p&gt;Synchronous AI calls in web requests are a bad idea at scale. Wrap your &lt;code&gt;GeminiManager&lt;/code&gt; calls in a queued job for anything that is not a real-time user interaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Jobs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Services\AI\GeminiManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\Article&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Bus\Queueable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Contracts\Queue\ShouldQueue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Foundation\Bus\Dispatchable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Queue\InteractsWithQueue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Queue\SerializesModels&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GenerateArticleSummary&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ShouldQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Dispatchable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;InteractsWithQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Queueable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SerializesModels&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$tries&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$backoff&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// seconds&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$articleId&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;GeminiManager&lt;/span&gt; &lt;span class="nv"&gt;$ai&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Article&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;findOrFail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$ai&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"Summarise this article in three sentences: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$article&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$article&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'summary'&lt;/span&gt;          &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$summary&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'summary_model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$summary&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'summary_provider'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$summary&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Throwable&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Article summary generation failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'article_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'error'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;$tries = 3&lt;/code&gt; and &lt;code&gt;$backoff = 30&lt;/code&gt; give you three attempts with a 30-second gap between them, which is usually enough to ride out a transient Gemini rate limit without wasting queue resources. The failed hook gives you an audit trail.&lt;/p&gt;

&lt;h2&gt;
  
  
  Official Documentation
&lt;/h2&gt;

&lt;p&gt;Two references worth bookmarking as you build this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://ai.google.dev/gemini-api/docs/openai" rel="noopener noreferrer"&gt;Google Gemini OpenAI Compatibility Documentation&lt;/a&gt; — the canonical reference for what is and is not supported through the compat layer, including current model strings and endpoint coverage.&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://laravel.com/blog/introducing-the-laravel-ai-sdk" rel="noopener noreferrer"&gt;Laravel AI SDK Documentation&lt;/a&gt; — the official introduction covering provider setup, agent workflows, and the provider-agnostic design philosophy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Production Deployment Checklist
&lt;/h2&gt;

&lt;p&gt;A few things that catch teams off-guard when they move this from staging to production.&lt;/p&gt;

&lt;p&gt;Google AI Studio API keys are rate-limited by default. For production workloads, you need to request increased quota through Google Cloud, or switch to Vertex AI. The Vertex AI endpoint requires a different authentication mechanism (service account credentials rather than a plain API key) and a slightly different base URL, so plan for that migration if your volume grows.&lt;/p&gt;

&lt;p&gt;Environment-specific model configuration matters. You might run &lt;code&gt;gemini-2.5-flash&lt;/code&gt; in production for cost reasons and &lt;code&gt;gemini-2.5-pro&lt;/code&gt; in staging for testing purposes. Keep that in your &lt;code&gt;.env&lt;/code&gt; and out of your &lt;code&gt;GeminiManager&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="c"&gt;# .env.production
&lt;/span&gt;&lt;span class="py"&gt;GEMINI_MODEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;

&lt;span class="c"&gt;# .env.staging
&lt;/span&gt;&lt;span class="py"&gt;GEMINI_MODEL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-pro&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read it in the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$modelPriority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;modelPriority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'services.gemini.model'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'gemini-2.5-pro'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// fallback within Gemini&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And add to &lt;code&gt;config/services.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'gemini'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'model'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GEMINI_MODEL'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'gemini-2.5-flash'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are deploying to Laravel Forge or a similar managed environment, confirm your queue workers restart after deployment. A worker that caches the old &lt;code&gt;GeminiManager&lt;/code&gt; configuration will ignore model changes until it restarts. This is not Gemini-specific, but it catches teams every time they change provider configuration without restarting workers.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Efficiency Gain]&lt;/strong&gt; Gemini 2.5 Flash’s one-million-token context window is genuinely useful for RAG pipelines. If you are building document search or retrieval-augmented generation, the ability to stuff a much larger context window means fewer chunking edge cases and simpler retrieval logic. If you are already running embeddings and vector search in Laravel, the integration pattern described in &lt;a href="https://origin-main.com/ai-foundations/laravel-vector-database-embeddings-pgvector-rag/" rel="noopener noreferrer"&gt;Laravel Embeddings, Vector Databases, and RAG: A Production Implementation Guide&lt;/a&gt; maps directly to Gemini’s native multimodal context handling.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Go From Here
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;GeminiManager&lt;/code&gt; pattern above is a solid foundation. The logical next step is versioning your prompts so that model upgrades do not introduce silent regressions—something that becomes critical the moment you start comparing output quality between &lt;code&gt;gemini-2.5-flash&lt;/code&gt; and whatever Gemini 3.x stable lands in your production window. The &lt;a href="https://origin-main.com/laravel-architecture/laravel-prompt-migrations-version-control-ai-prompts/" rel="noopener noreferrer"&gt;Prompt Migrations: Bringing Determinism to AI in Laravel&lt;/a&gt; tutorial covers exactly that workflow.&lt;/p&gt;

&lt;p&gt;The Google AI ecosystem moves fast. &lt;code&gt;gemini-3.1-flash-lite-preview&lt;/code&gt; is already in the preview tier as of this writing, and Google is actively iterating. The architecture we have built here isolates model selection to a single config value and a single service class. When the next stable Gemini model lands, your migration is a one-line change.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>gemini</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>Laravel Filament Admin Dashboard for AI Applications: Token Costs, Prompt Management, and Agent Audit Trails</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Thu, 16 Apr 2026 03:34:17 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/laravel-filament-admin-dashboard-for-ai-applications-token-costs-prompt-management-and-agent-10g9</link>
      <guid>https://dev.to/dewaldhugo/laravel-filament-admin-dashboard-for-ai-applications-token-costs-prompt-management-and-agent-10g9</guid>
      <description>&lt;p&gt;You have the AI pipeline. You have the service classes, the middleware, the queued jobs. The backend is instrumented. What you do not have is any way to &lt;em&gt;look&lt;/em&gt; at it without writing a SQL query at 2am because costs just spiked.&lt;/p&gt;

&lt;p&gt;That is the gap this article closes.&lt;/p&gt;

&lt;p&gt;A Laravel Filament admin dashboard is not the glamorous part of AI development. But the teams that skip it are the ones who discover they have been over-spending on tokens for six weeks, or that an agent ran a destructive tool call nobody authorised, or that a prompt change from last Tuesday quietly broke output quality across the board. You cannot govern what you cannot see.&lt;/p&gt;

&lt;p&gt;The Laravel Filament admin dashboard built in this guide surfaces four things: real-time token cost monitoring, usage analytics broken down by user and model, a prompt version management panel, and a full agent audit trail. Each section is a Filament v3 Resource or Widget, built on top of the telemetry data your existing middleware and service layer already generate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Filament, and Why Now
&lt;/h2&gt;

&lt;p&gt;There is no serious competition in the Laravel ecosystem for building admin interfaces right now. Filament v3 ships with a full panel system, Livewire-powered tables, stats widgets, chart widgets, custom pages, and a role-based authorization layer. It works inside your existing Laravel application. It does not require a separate process or a Node frontend.&lt;/p&gt;

&lt;p&gt;The Livewire-driven rendering keeps things reactive without JavaScript complexity. The Eloquent-backed Resources mean your existing models map directly to admin panels with almost no boilerplate. And because it is all Laravel under the hood, your Service Container, policies, events, and observers all still apply — you are not learning a new paradigm, just adding a UI layer on top of architecture you have already built.&lt;/p&gt;

&lt;p&gt;Filament v3 requires PHP 8.1+ and Laravel 10+. On Laravel 11 or 12 you will be fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and Panel Setup
&lt;/h2&gt;

&lt;p&gt;Install Filament with the panel scaffolding command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require filament/filament:&lt;span class="s2"&gt;"^3.0"&lt;/span&gt; &lt;span class="nt"&gt;-W&lt;/span&gt;
php artisan filament:install &lt;span class="nt"&gt;--panels&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This generates &lt;code&gt;app/Providers/Filament/AdminPanelProvider.php&lt;/code&gt; and registers it via &lt;code&gt;bootstrap/app.php&lt;/code&gt;. The panel provider is where you wire discovery, colours, middleware, and authentication. Here is a clean starting configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Providers/Filament/AdminPanelProvider.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Providers\Filament&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Http\Middleware\Authenticate&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Http\Middleware\DisableBladeIconComponents&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Http\Middleware\DispatchServingFilamentEvent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Panel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\PanelProvider&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Support\Colors\Color&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Cookie\Middleware\AddQueuedCookiesToResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Cookie\Middleware\EncryptCookies&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Foundation\Http\Middleware\VerifyCsrfToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Routing\Middleware\SubstituteBindings&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Session\Middleware\AuthenticateSession&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Session\Middleware\StartSession&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\View\Middleware\ShareErrorsFromSession&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AdminPanelProvider&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;PanelProvider&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;panel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Panel&lt;/span&gt; &lt;span class="nv"&gt;$panel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Panel&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$panel&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;colors&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'primary'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Color&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nc"&gt;Violet&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;discoverResources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;app_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Filament/Resources'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'App\\Filament\\Resources'&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;discoverPages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;app_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Filament/Pages'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'App\\Filament\\Pages'&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;discoverWidgets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;app_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Filament/Widgets'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'App\\Filament\\Widgets'&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;EncryptCookies&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;AddQueuedCookiesToResponse&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;StartSession&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;AuthenticateSession&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;ShareErrorsFromSession&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;VerifyCsrfToken&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;SubstituteBindings&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;DisableBladeIconComponents&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;DispatchServingFilamentEvent&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;authMiddleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;Authenticate&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_admin&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last &lt;code&gt;-&amp;gt;authorization()&lt;/code&gt; call is the line most tutorials omit. Without it, any authenticated user can reach your admin panel. We will come back to access control later — but set that guard from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Modeling the Data Your Dashboard Will Surface
&lt;/h2&gt;

&lt;p&gt;Before you build a single Filament Resource, you need to decide what data you are collecting. On a production AI application, there are two primary log types worth persisting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI Usage Logs&lt;/strong&gt; capture every inference call: which provider, which model, how many tokens, what it cost, which feature triggered it, and whether it succeeded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Audit Logs&lt;/strong&gt; capture agentic actions: which agent class ran, what action it performed, which tool it called, what the input and output were, and how long it took.&lt;/p&gt;

&lt;p&gt;Both should be Eloquent models backed by indexed tables. Start with the usage log migration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// database/migrations/2026_04_15_000001_create_ai_usage_logs_table.php&lt;/span&gt;

&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai_usage_logs'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;foreignId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;constrained&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullOnDelete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'provider'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;           &lt;span class="c1"&gt;// 'anthropic', 'openai'&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'model'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'feature'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// 'chat', 'summarize', 'agent'&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cost_usd'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'prompt_version'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'metadata'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'error_message'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'provider'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'model'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'feature'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compound indexes on &lt;code&gt;created_at&lt;/code&gt; are non-negotiable. Dashboard widgets run aggregation queries constantly, and without them you will feel the difference under real traffic.&lt;/p&gt;

&lt;p&gt;The agent audit log has a slightly different shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// database/migrations/2026_04_15_000002_create_agent_audit_logs_table.php&lt;/span&gt;

&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_audit_logs'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;foreignId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;constrained&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullOnDelete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'action'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tool_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'input_payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'output_payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tokens_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'duration_ms'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both models should cast their JSON columns and use standard Eloquent relationships back to &lt;code&gt;User&lt;/code&gt;. Nothing exotic required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Token Cost Monitoring: The Stats Overview Widget
&lt;/h2&gt;

&lt;p&gt;The first thing an operator needs to see when they open the dashboard is a top-level summary of what the AI layer has cost in the last 24 hours. Filament’s &lt;code&gt;StatsOverviewWidget&lt;/code&gt; handles this cleanly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan make:filament-widget AiStatsOverview &lt;span class="nt"&gt;--stats-overview&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Filament/Widgets/AiStatsOverview.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Filament\Widgets&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Widgets\StatsOverviewWidget&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;BaseWidget&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Widgets\StatsOverviewWidget\Stat&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiStatsOverview&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;BaseWidget&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?int&lt;/span&gt; &lt;span class="nv"&gt;$sort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getStats&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$today&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;whereDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="nv"&gt;$month&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;whereMonth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;whereYear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="nc"&gt;Stat&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Today's Tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;number_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$today&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Across all providers and models'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'primary'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

            &lt;span class="nc"&gt;Stat&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Today's Cost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$'&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;number_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$today&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cost_usd'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'USD — live tally'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'warning'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

            &lt;span class="nc"&gt;Stat&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Monthly Spend'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$'&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;number_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$month&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cost_usd'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'F Y'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'danger'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

            &lt;span class="nc"&gt;Stat&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Failures Today'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;number_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$today&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Inference errors across all features'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'gray'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Register this widget on your dashboard in the panel provider’s &lt;code&gt;-&amp;gt;widgets()&lt;/code&gt; array. It polls on a configurable interval using Livewire’s polling mechanism — add &lt;code&gt;protected static ?string $pollingInterval = '30s';&lt;/code&gt; if you want it to refresh automatically.&lt;/p&gt;

&lt;p&gt;If you have not yet built the middleware layer that feeds these logs, the &lt;a href="https://origin-main.com/laravel-architecture/laravel-ai-middleware-token-tracking/" rel="noopener noreferrer"&gt;Laravel AI Middleware guide on token tracking and rate limiting&lt;/a&gt; covers the exact pattern for intercepting inference calls and persisting usage data before the response reaches the controller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Daily Token Usage: The Chart Widget
&lt;/h2&gt;

&lt;p&gt;Raw totals are useful. Trend lines are what actually surface problems — a cost spike on Tuesday, a failure rate that spiked after a deployment. Add a chart widget to give operators a 30-day view:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan make:filament-widget TokenUsageChart &lt;span class="nt"&gt;--chart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Filament/Widgets/TokenUsageChart.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Filament\Widgets&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Widgets\ChartWidget&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TokenUsageChart&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;ChartWidget&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$heading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Daily Token Usage — Last 30 Days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?int&lt;/span&gt; &lt;span class="nv"&gt;$sort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'tokens'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getFilters&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;?array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Total Tokens'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'cost'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Cost (USD)'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getData&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'cost'&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;'cost_usd'&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$label&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'cost'&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;'Cost (USD)'&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'Total Tokens'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;selectRaw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DATE(created_at) as date, SUM(&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$column&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;) as value"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&amp;gt;='&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;subDays&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'datasets'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="s1"&gt;'label'&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s1"&gt;'data'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;pluck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'value'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                    &lt;span class="s1"&gt;'borderColor'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'#8b5cf6'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s1"&gt;'tension'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s1"&gt;'fill'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'labels'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;pluck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getType&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s1"&gt;'line'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;$filter&lt;/code&gt; toggle lets operators switch between token volume and cost in a single widget, which keeps the dashboard clean without multiplying panels.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Usage Log Resource: Per-User and Per-Model Breakdown
&lt;/h2&gt;

&lt;p&gt;Stats and charts give you the summary. When you need to investigate — which user ran 400,000 tokens in one session, which model is responsible for the cost spike — you need a sortable, filterable table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan make:filament-resource AiUsageLog &lt;span class="nt"&gt;--generate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then replace the generated table definition with something that has real operator value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Filament/Resources/AiUsageLogResource.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Filament\Resources&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Filament\Resources\AiUsageLogResource\Pages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Resources\Resource&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\IconColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\TextColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Filters\SelectFilter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Filters\Filter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Database\Eloquent\Builder&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLogResource&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Resource&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationIcon&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'heroicon-o-chart-bar'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationGroup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AI Governance'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationLabel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Usage Logs'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Table&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Table&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'user.name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'User'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'—'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'provider'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'model'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;copyable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'feature'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'info'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'—'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Tables\Columns\Summarizers\Sum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Total'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cost_usd'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;money&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'usd'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Tables\Columns\Summarizers\Sum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Total'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'prompt_version'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'warning'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'—'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

                &lt;span class="nc"&gt;IconColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;

                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;since&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toggleable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;SelectFilter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'provider'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;options&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="s1"&gt;'anthropic'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Anthropic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="s1"&gt;'openai'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'OpenAI'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;]),&lt;/span&gt;

                &lt;span class="nc"&gt;SelectFilter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;options&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="s1"&gt;'1'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Successful'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="s1"&gt;'0'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;]),&lt;/span&gt;

                &lt;span class="nc"&gt;Filter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'today'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Today only'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Builder&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;whereDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;())),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;defaultSort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'desc'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;paginated&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getPages&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'index'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Pages\ListAiUsageLogs&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canCreate&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things to note here. First, &lt;code&gt;-&amp;gt;summarize()&lt;/code&gt; on the token and cost columns gives you running totals in the table footer — incredibly useful when you have applied filters to isolate a user or a date range. Second, &lt;code&gt;canCreate(): false&lt;/code&gt; disables the “New record” button. Usage logs are system-generated. Nobody should be hand-entering them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://origin-main.com/laravel-architecture/production-grade-ai-architecture-in-laravel/" rel="noopener noreferrer"&gt;Production-Grade AI Architecture in Laravel: Contracts, Governance &amp;amp; Telemetry&lt;/a&gt; details the telemetry layer that populates this data. If your AI services are not already emitting structured usage events to an observable logging interface, that article is the right place to start before wiring up this dashboard.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt Version Management
&lt;/h2&gt;

&lt;p&gt;Operators need to know which prompt is active. They also need to see when it changed, what the previous version said, and whether cost or quality shifted after a change. This is where the dashboard and your version control system connect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://origin-main.com/laravel-architecture/laravel-prompt-migrations-version-control-ai-prompts/" rel="noopener noreferrer"&gt;Prompt Migrations: Bringing Determinism to AI in Laravel&lt;/a&gt; covers building the underlying prompt migration system. Assuming you have a &lt;code&gt;prompt_versions&lt;/code&gt; table from that guide, the Filament Resource surfaces it with minimal effort:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan make:filament-resource PromptVersion &lt;span class="nt"&gt;--generate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Filament/Resources/PromptVersionResource.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Filament\Resources&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Filament\Resources\PromptVersionResource\Pages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\PromptVersion&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Forms\Components\Textarea&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Forms\Components\TextInput&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Forms\Components\Toggle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Forms\Form&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Resources\Resource&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\IconColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\TextColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PromptVersionResource&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Resource&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromptVersion&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationIcon&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'heroicon-o-document-text'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationGroup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AI Governance'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationLabel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Prompt Versions'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;form&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Form&lt;/span&gt; &lt;span class="nv"&gt;$form&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Form&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$form&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;TextInput&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'key'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;required&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextInput&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'version'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;required&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;Textarea&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'system_prompt'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columnSpanFull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;Textarea&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columnSpanFull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;Toggle&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'is_active'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Active'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Table&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Table&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'key'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'version'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'primary'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'—'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;IconColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'is_active'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Active'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;defaultSort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'desc'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getPages&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'index'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Pages\ListPromptVersions&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s1"&gt;'view'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Pages\ViewPromptVersion&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/{record}'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canCreate&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canEdit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canDelete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All three &lt;code&gt;can*&lt;/code&gt; methods return false. Prompt changes belong in migrations under version control. The admin panel gives you read access only. This is intentional. The moment you allow operators to edit prompt content via a UI form, you have created an unversioned, un-reviewable, un-rollbackable change path that will cause you pain.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Architect’s Note]&lt;/strong&gt; Keep your admin panel read-only for prompt content. The dashboard is a &lt;em&gt;governance tool&lt;/em&gt;, not a content editor. Every prompt mutation should flow through a migration, be committed to version control, and be deployable like code. If your team is asking for a UI to edit prompts live, that is a process problem masquerading as a tooling request. Fix the process.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Agent Audit Trails
&lt;/h2&gt;

&lt;p&gt;This is the section most teams skip. It is also the one that saves them when something goes wrong.&lt;/p&gt;

&lt;p&gt;Agentic workflows make decisions. They call tools. They read data and write data. When a user complains that the agent deleted something it should not have, or an integration breaks because the agent started formatting its output differently, you need a record of exactly what happened.&lt;/p&gt;

&lt;p&gt;If you are building agentic pipelines, the &lt;a href="https://origin-main.com/laravel-architecture/laravel-agentic-workflow-schema-validation/" rel="noopener noreferrer"&gt;Hardening Laravel Agentic Workflows guide&lt;/a&gt; covers schema validation and output hardening. The audit trail built here is the operational companion to that — it answers “what did it do?” as opposed to “did it do it correctly?”&lt;/p&gt;

&lt;p&gt;Emit an audit log from your base agent class on every action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/AI/Agents/BaseAgent.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\AI\Agents&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AgentAuditLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;abstract&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BaseAgent&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;logAction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$toolUsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$tokensUsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="nv"&gt;$durationMs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nv"&gt;$success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;AgentAuditLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'user_id'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="s1"&gt;'agent_class'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'action'&lt;/span&gt;         &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'tool_used'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$toolUsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'input_payload'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'output_payload'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'tokens_used'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$tokensUsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'duration_ms'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$durationMs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'success'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$success&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'error'&lt;/span&gt;          &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Filament Resource for the audit log is where the detail view earns its keep. Operators need to be able to click into a single agent run and see the full input/output payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Filament/Resources/AgentAuditLogResource.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Filament\Resources&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Filament\Resources\AgentAuditLogResource\Pages&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AgentAuditLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Infolists\Components\KeyValueEntry&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Infolists\Components\Section&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Infolists\Components\TextEntry&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Infolists\Infolist&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Resources\Resource&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\IconColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Columns\TextColumn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Filament\Tables\Filters\SelectFilter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentAuditLogResource&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Resource&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentAuditLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationIcon&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'heroicon-o-cpu-chip'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationGroup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'AI Governance'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt; &lt;span class="nv"&gt;$navigationLabel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Agent Audit Trail'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Table&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Table&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'user.name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'User'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'System'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'action'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;searchable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tool_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'info'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'—'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tokens_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'duration_ms'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ms'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;IconColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="nc"&gt;TextColumn&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dateTime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;since&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;SelectFilter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;options&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'1'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Successful'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'0'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Failed'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
                &lt;span class="nc"&gt;SelectFilter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;relationship&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agentClass'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;defaultSort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'desc'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;recordUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;getUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'view'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'record'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;]));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;infolist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Infolist&lt;/span&gt; &lt;span class="nv"&gt;$infolist&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Infolist&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$infolist&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="nc"&gt;Section&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Run Details'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_class'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'action'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tool_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'info'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'None'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tokens_used'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'duration_ms'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Duration (ms)'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;numeric&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'success'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;badge&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getStateUsing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                            &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;'Success'&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'Failed'&lt;/span&gt;
                        &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$state&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'Success'&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;'success'&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'danger'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="p"&gt;]),&lt;/span&gt;

                &lt;span class="nc"&gt;Section&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Input Payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="nc"&gt;KeyValueEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'input_payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="p"&gt;]),&lt;/span&gt;

                &lt;span class="nc"&gt;Section&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Output Payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="nc"&gt;KeyValueEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'output_payload'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="p"&gt;]),&lt;/span&gt;

                &lt;span class="nc"&gt;Section&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nc"&gt;TextEntry&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'None'&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
                    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getPages&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'index'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Pages\ListAgentAuditLogs&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s1"&gt;'view'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;Pages\ViewAgentAuditLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/{record}'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canCreate&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;canEdit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$record&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Infolist&lt;/code&gt; view is the right choice here over a form, because you are displaying data, not editing it. The &lt;code&gt;KeyValueEntry&lt;/code&gt; component renders JSON payloads cleanly without any custom Blade work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Locking Down the Admin Panel
&lt;/h2&gt;

&lt;p&gt;Your Filament admin is a high-value target. It contains cost data, user data, prompt intellectual property, and agent decision history. Basic access control is not optional.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;-&amp;gt;authorization()&lt;/code&gt; callback in your panel provider is your first line of defence. The example earlier used &lt;code&gt;is_admin&lt;/code&gt; on the User model. In practice, most teams reach for Spatie’s Laravel Permission package for this. Your panel provider call then becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;hasRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For more granular control over individual Resources, use Laravel Policies. Filament automatically discovers and respects a policy for any model that has one. A &lt;code&gt;AiUsageLogPolicy&lt;/code&gt; with &lt;code&gt;viewAny&lt;/code&gt; returning &lt;code&gt;$user-&amp;gt;hasRole('admin')&lt;/code&gt; gives you fine-grained control without putting access logic inside the Resource class itself.&lt;/p&gt;

&lt;p&gt;One pattern worth enforcing: scope your AI admin panel to a separate &lt;code&gt;admin&lt;/code&gt; guard, and ensure it sits behind your Sanctum token authentication if your application issues API tokens to admin users. The &lt;a href="https://origin-main.com/guides/laravel-sanctum-api-authentication/" rel="noopener noreferrer"&gt;Laravel Sanctum API Authentication guide&lt;/a&gt; covers token scoping and ability verification, and the same principles apply when you are protecting admin panel access programmatically.&lt;/p&gt;

&lt;p&gt;Do not expose the Filament panel under the default &lt;code&gt;/admin&lt;/code&gt; path in production. Change the path in your panel provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ops-internal'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// or any non-obvious string&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not security through obscurity as a replacement for authentication. It is a minor but free win against automated scanning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying Filament Alongside Your AI Stack
&lt;/h2&gt;

&lt;p&gt;Filament adds Livewire, Alpine.js, and its own asset pipeline to your application. Running &lt;code&gt;php artisan filament:assets&lt;/code&gt; publishes vendor assets to your &lt;code&gt;public/&lt;/code&gt; directory. Make sure your deployment script runs this command after every Filament update — if you skip it, operators will see unstyled panels or JavaScript errors after a version bump.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan filament:assets
php artisan filament:optimize  &lt;span class="c"&gt;# Filament v3.1+&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;filament:optimize&lt;/code&gt; caches component discovery. On applications with many Resources and Widgets, it makes a measurable difference to panel load time.&lt;/p&gt;

&lt;p&gt;If your application uses Redis for caching, you may want to cache the stats queries powering your overview widgets. Filament’s widgets do not cache their data by default. For a stats widget that runs four &lt;code&gt;SUM()&lt;/code&gt; aggregations on every page load, wrapping them in &lt;code&gt;cache()-&amp;gt;remember()&lt;/code&gt; with a 60-second TTL is worth doing before you go to production.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://origin-main.com/guides/deploy-laravel-to-production/" rel="noopener noreferrer"&gt;complete Laravel production deployment guide&lt;/a&gt; covers the full deployment pipeline in detail. When you add Filament, extend the deploy script to include &lt;code&gt;filament:assets&lt;/code&gt; and &lt;code&gt;filament:optimize&lt;/code&gt; alongside your standard &lt;code&gt;migrate&lt;/code&gt;, &lt;code&gt;config:cache&lt;/code&gt;, and &lt;code&gt;queue:restart&lt;/code&gt; steps. Missing any of these on a Filament upgrade is a common source of post-deploy surprises.&lt;/p&gt;




&lt;h2&gt;
  
  
  Organising the Navigation: The AI Governance Group
&lt;/h2&gt;

&lt;p&gt;With three or more Filament Resources all related to AI observability, you want them grouped in the sidebar. Every Resource in this article already carries &lt;code&gt;protected static ?string $navigationGroup = 'AI Governance';&lt;/code&gt;. That single property is enough to group them under a collapsible sidebar section.&lt;/p&gt;

&lt;p&gt;You can control group ordering in the panel provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;navigationGroups&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s1"&gt;'AI Governance'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'User Management'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'System'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Groups listed here appear in that order in the sidebar. Groups not listed appear at the bottom. Keep AI Governance at the top — it is why operators are opening this panel.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Cache, What to Queue
&lt;/h2&gt;

&lt;p&gt;Two performance considerations you will hit once real traffic arrives.&lt;/p&gt;

&lt;p&gt;The stats overview widget runs database queries on every page render. If 10 admins have this dashboard open, that is 10 concurrent aggregation queries on your &lt;code&gt;ai_usage_logs&lt;/code&gt; table. Cache the results. Override &lt;code&gt;getStats()&lt;/code&gt; to use &lt;code&gt;cache()-&amp;gt;remember('ai_stats_today', 60, fn() =&amp;gt; ...)&lt;/code&gt; for each query group.&lt;/p&gt;

&lt;p&gt;Bulk insert your usage logs via a queued job rather than synchronously inside the middleware response cycle. A single queued insert dispatched after the AI response returns adds zero latency to the user-facing request and prevents your &lt;code&gt;ai_usage_logs&lt;/code&gt; table from becoming a write bottleneck under load. This is especially relevant if you are logging agent audit trails — agent runs that call multiple tools can generate a burst of log writes in a short window.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Production Pitfall]&lt;/strong&gt; If you are writing &lt;code&gt;AiUsageLog::create()&lt;/code&gt; synchronously inside an AI middleware or a service class, you will see that call become a bottleneck under sustained load. A single slow database write can add 20–50ms to every AI response. Dispatch a queued job instead. Your usage logs do not need to be written before the response returns — they need to be written &lt;em&gt;eventually and reliably&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Extending the Dashboard Over Time
&lt;/h2&gt;

&lt;p&gt;This guide builds the foundation. Here is what comes next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Budget alerts.&lt;/strong&gt; Add an Eloquent Observer on &lt;code&gt;AiUsageLog&lt;/code&gt; that triggers a notification when daily spend crosses a configurable threshold. Filament’s database notifications work well here — the panel will show the alert in the notification bell without any additional UI work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Per-feature cost attribution.&lt;/strong&gt; The &lt;code&gt;feature&lt;/code&gt; column on &lt;code&gt;ai_usage_logs&lt;/code&gt; lets you break cost down by product feature. A second chart widget comparing cost by feature over time gives product teams the data they need to make informed decisions about which AI features are worth their current token spend.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt performance correlation.&lt;/strong&gt; Once you have both usage logs (with &lt;code&gt;prompt_version&lt;/code&gt;) and an output quality signal — whether that is a user rating, a downstream validation pass/fail, or a schema validation score from your agentic pipeline — you can join them in a custom Filament page and show quality versus cost per prompt version. That is a genuinely differentiated internal tool that most teams never build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model comparison.&lt;/strong&gt; If you are testing multiple models for the same feature (Claude versus GPT-4o, for example), the &lt;code&gt;model&lt;/code&gt; column combined with a quality signal gives you a cost-per-quality-point comparison you can act on. That decision gets much easier when you can see it in a table rather than inferring it from CloudWatch logs.&lt;/p&gt;




&lt;p&gt;Refer to the &lt;a href="https://filamentphp.com/docs/3.x/panels/getting-started" rel="noopener noreferrer"&gt;official Filament v3 documentation&lt;/a&gt; for the full API surface on widgets, infolists, and panel configuration. The patterns in this guide are production-tested starting points, not exhaustive implementations.&lt;/p&gt;




</description>
      <category>laravel</category>
      <category>php</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Laravel Sanctum API Authentication: The Complete Production Guide</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Thu, 09 Apr 2026 22:06:40 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/laravel-sanctum-api-authentication-the-complete-production-guide-58ac</link>
      <guid>https://dev.to/dewaldhugo/laravel-sanctum-api-authentication-the-complete-production-guide-58ac</guid>
      <description>&lt;p&gt;There's a quiet assumption baked into almost every Laravel AI integration tutorial: authentication exists. Routes are protected. Tokens are issued. The API is locked down.&lt;/p&gt;

&lt;p&gt;That assumption breaks the moment you sit down to build something real.&lt;/p&gt;

&lt;p&gt;Laravel Sanctum is the framework's answer to lightweight API token authentication. It ships with Laravel, it integrates cleanly with Eloquent, and it handles the two most common authentication patterns - SPA cookie-based sessions and mobile/external API token issuance, without pulling in a full OAuth server. This guide covers both patterns, but it leans hard into the personal access token model, because that's what you need when you're building an API that your own frontend, mobile app, or third-party client will consume.&lt;/p&gt;

&lt;p&gt;By the end, you'll have a production-ready authentication layer: token issuance with ability scoping, protected routes, revocation endpoints, rate limiting via Redis, and a multi-tenant token pattern that holds up under real load. We're also covering the Laravel 11/12 &lt;code&gt;bootstrap/app.php&lt;/code&gt; configuration style throughout, no legacy &lt;code&gt;Kernel.php&lt;/code&gt; references.&lt;/p&gt;

&lt;p&gt;Let's get into it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Sanctum Actually Does
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of code, you need to understand where Sanctum fits in the ecosystem. Sanctum is not OAuth. It doesn't issue refresh tokens. It doesn't support third-party authorization flows. If you need those things, reach for &lt;a href="https://laravel.com/docs/passport" rel="noopener noreferrer"&gt;Laravel Passport&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;What Sanctum does exceptionally well is issue hashed personal access tokens tied to your &lt;code&gt;users&lt;/code&gt; table via a polymorphic &lt;code&gt;personal_access_tokens&lt;/code&gt; table. Each token can carry a set of &lt;em&gt;abilities&lt;/em&gt; - scoped permissions that your application checks at runtime. The token itself is a random string; only the SHA-256 hash lives in the database. That's a sensible default.&lt;/p&gt;

&lt;p&gt;For SPAs on the same domain, Sanctum piggybacks on Laravel's existing session authentication via a cookie. This is the &lt;code&gt;EnsureFrontendRequestsAreStateful&lt;/code&gt; middleware doing its job. We'll touch on this briefly, but the majority of this guide focuses on token-based auth for API clients.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and Setup (Laravel 11 / 12)
&lt;/h2&gt;

&lt;p&gt;Sanctum ships with Laravel 11 and 12. If you're starting a fresh project, it's already in your &lt;code&gt;composer.json&lt;/code&gt;. Confirm it's there:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer show laravel/sanctum
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're on an older install that upgraded to Laravel 11, you may need to install it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require laravel/sanctum
php artisan vendor:publish &lt;span class="nt"&gt;--provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Laravel&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s2"&gt;anctum&lt;/span&gt;&lt;span class="se"&gt;\S&lt;/span&gt;&lt;span class="s2"&gt;anctumServiceProvider"&lt;/span&gt;
php artisan migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That publish step drops a &lt;code&gt;config/sanctum.php&lt;/code&gt; file and a migration for the &lt;code&gt;personal_access_tokens&lt;/code&gt; table. Run the migration before anything else.&lt;/p&gt;

&lt;h3&gt;
  
  
  bootstrap/app.php - The Laravel 11/12 Way
&lt;/h3&gt;

&lt;p&gt;In Laravel 11, middleware registration moved out of &lt;code&gt;Kernel.php&lt;/code&gt; and into &lt;code&gt;bootstrap/app.php&lt;/code&gt;. This is where you register the Sanctum stateful middleware for SPA authentication:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// bootstrap/app.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Sanctum\Http\Middleware\EnsureFrontendRequestsAreStateful&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Middleware&lt;/span&gt; &lt;span class="nv"&gt;$middleware&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$middleware&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;statefulApi&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Calling &lt;code&gt;$middleware-&amp;gt;statefulApi()&lt;/code&gt; is the clean Laravel 11 way to register Sanctum's SPA middleware. Avoid manually listing the middleware class inline, &lt;code&gt;statefulApi()&lt;/code&gt; handles the correct ordering.&lt;/p&gt;

&lt;p&gt;For token-only APIs (no SPA), you can skip this entirely. Your token-protected routes will use the &lt;code&gt;auth:sanctum&lt;/code&gt; guard, which handles everything through the &lt;code&gt;Authorization: Bearer&lt;/code&gt; header.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring the Guard
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;config/auth.php&lt;/code&gt;, make sure your &lt;code&gt;api&lt;/code&gt; guard points to Sanctum:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'guards'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'web'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'session'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'provider'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'users'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;

    &lt;span class="s1"&gt;'api'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'driver'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'provider'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'users'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means &lt;code&gt;auth:api&lt;/code&gt; and &lt;code&gt;auth:sanctum&lt;/code&gt; both resolve to the same thing. Pick one and be consistent across your routes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing the User Model
&lt;/h2&gt;

&lt;p&gt;Add the &lt;code&gt;HasApiTokens&lt;/code&gt; trait to your &lt;code&gt;User&lt;/code&gt; model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Sanctum\HasApiTokens&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Authenticatable&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;HasApiTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HasFactory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Notifiable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That trait gives Eloquent's &lt;code&gt;User&lt;/code&gt; model the &lt;code&gt;createToken()&lt;/code&gt;, &lt;code&gt;tokens()&lt;/code&gt;, and &lt;code&gt;currentAccessToken()&lt;/code&gt; methods. Everything downstream depends on this being in place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Issuing Personal Access Tokens
&lt;/h2&gt;

&lt;p&gt;Token issuance happens through a dedicated endpoint. Keep it clean: one controller, one responsibility.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Http/Controllers/Auth/TokenController.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Http\Controllers\Auth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Http\Controllers\Controller&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Hash&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Validation\ValidationException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\User&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TokenController&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Controller&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;\Illuminate\Http\JsonResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'email'&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'email'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'password'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'device_name'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'string'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'max:255'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'email'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nc"&gt;Hash&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nc"&gt;ValidationException&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;withMessages&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="s1"&gt;'email'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'The provided credentials are incorrect.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'api:write'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'token'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;plainTextToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;revoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;\Illuminate\Http\JsonResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;currentAccessToken&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Token revoked.'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;device_name&lt;/code&gt; field is part of Sanctum's design, it lets users see which device or client issued each token. The &lt;code&gt;ValidationException&lt;/code&gt; approach returns a clean 422 with field-level errors rather than a 401, which is what most frontend clients expect.&lt;/p&gt;

&lt;p&gt;Route registration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/api.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Http\Controllers\Auth\TokenController&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/auth/token'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TokenController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'issue'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/auth/token'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TokenController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'revoke'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Architect's Note:&lt;/strong&gt; Never issue tokens from inside a route closure. The moment you need to add logging, rate limiting, or ability scoping, you're refactoring. Always use a dedicated controller from day one, the Service Container will thank you when you need to inject dependencies later.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Token Abilities: Scoped Permissions
&lt;/h2&gt;

&lt;p&gt;Token abilities are Sanctum's lightweight alternative to OAuth scopes. You define them as strings at issuance time, then check them at the route or controller level.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Read-only token&lt;/span&gt;
&lt;span class="nv"&gt;$token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'mobile-client'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="c1"&gt;// Full-access token&lt;/span&gt;
&lt;span class="nv"&gt;$token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin-panel'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'api:write'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'api:delete'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="c1"&gt;// AI feature token&lt;/span&gt;
&lt;span class="nv"&gt;$token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai-assistant'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'ai:query'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check abilities in your controllers using &lt;code&gt;tokenCan()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tokenCan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'api:write'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Insufficient token abilities.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// proceed with write operation&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or apply the &lt;code&gt;abilities&lt;/code&gt; middleware directly on route groups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'abilities:api:read,api:write'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;DocumentController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'store'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'ability:api:read'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;DocumentController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'index'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the difference: &lt;code&gt;abilities&lt;/code&gt; (plural) requires the token to have &lt;em&gt;all&lt;/em&gt; listed abilities. &lt;code&gt;ability&lt;/code&gt; (singular) requires &lt;em&gt;at least one&lt;/em&gt;. This distinction catches people off guard in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Protecting Routes
&lt;/h2&gt;

&lt;p&gt;Standard protected route group for a JSON API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/api.php&lt;/span&gt;

&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;apiResource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;DocumentController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/query'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AiQueryController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'query'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ability:ai:query'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/history'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AiQueryController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'history'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ability:api:read'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;auth:sanctum&lt;/code&gt; guard resolves the authenticated user from the Bearer token, making &lt;code&gt;$request-&amp;gt;user()&lt;/code&gt; available throughout the request lifecycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Token Revocation
&lt;/h2&gt;

&lt;p&gt;You need at minimum three revocation endpoints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Revoke current token (logout this device)&lt;/span&gt;
&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/auth/token'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TokenController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'revoke'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Revoke a specific token by ID&lt;/span&gt;
&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/auth/tokens/{tokenId}'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TokenController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'revokeById'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Revoke all tokens (logout everywhere)&lt;/span&gt;
&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/auth/tokens'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TokenController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'revokeAll'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The controller methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;revokeById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$tokenId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$deleted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$tokenId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$deleted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Token not found.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Token revoked.'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;revokeAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'All tokens revoked.'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-&amp;gt;where('id', $tokenId)&lt;/code&gt; scoping on the tokens relationship is critical. Without it, a user could delete another user's token by guessing IDs — an IDOR vulnerability. The relationship scope ensures only that user's tokens are in play before the delete fires.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Production Pitfall:&lt;/strong&gt; Token revocation is only as fast as your database query. Under heavy concurrent traffic, a "revoke all" operation that hits an unindexed &lt;code&gt;tokenable_id&lt;/code&gt; column will cause measurable latency spikes. Add an index to &lt;code&gt;personal_access_tokens.tokenable_id&lt;/code&gt;, Laravel's migration doesn't include it by default in all versions.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In a new migration&lt;/span&gt;
&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'personal_access_tokens'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'tokenable_type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'tokenable_id'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Token Expiry
&lt;/h2&gt;

&lt;p&gt;By default, Sanctum tokens don't expire. Set an expiry in &lt;code&gt;config/sanctum.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'expiration'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 30 days, in minutes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pair this with a scheduled prune command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/console.php (Laravel 11)&lt;/span&gt;
&lt;span class="nc"&gt;Schedule&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'sanctum:prune-expired --hours=24'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;daily&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Left unchecked, the &lt;code&gt;personal_access_tokens&lt;/code&gt; table grows indefinitely and starts dragging down every authenticated request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rate Limiting with Redis
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;auth:sanctum&lt;/code&gt; middleware doesn't include rate limiting. You're responsible for adding it. This is not optional for any API that calls an external AI provider - without it, a single misbehaving client can burn through your budget in minutes.&lt;/p&gt;

&lt;p&gt;Define rate limiters in &lt;code&gt;bootstrap/app.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Cache\RateLimiting\Limit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\RateLimiter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'api'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply them to route groups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'throttle:api'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// standard API routes&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'auth:sanctum'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'throttle:ai'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// AI query routes&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Switch your cache driver to Redis. The default &lt;code&gt;file&lt;/code&gt; driver won't share rate limit counts across multiple server instances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;CACHE_DRIVER&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;redis&lt;/span&gt;
&lt;span class="py"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1&lt;/span&gt;
&lt;span class="py"&gt;REDIS_PORT&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;6379&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Efficiency Gain:&lt;/strong&gt; Redis handles both rate limiting and Sanctum's token lookup cache on the same connection. At scale, Sanctum resolves the user on every request by hashing the incoming token and hitting the database. If you're seeing measurable auth overhead in production profiling, an application-level cache keyed to the token hash eliminates that DB hit - but profile first, optimize second.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Multi-Tenant Token Scoping
&lt;/h2&gt;

&lt;p&gt;A user who belongs to multiple organizations shouldn't be able to use a token issued for Organization A to access Organization B's data. The cleanest fix: store a &lt;code&gt;team_id&lt;/code&gt; on the token itself using a custom token model.&lt;/p&gt;

&lt;p&gt;Extend the default token model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Models/PersonalAccessToken.php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Models&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Sanctum\PersonalAccessToken&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;SanctumToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PersonalAccessToken&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;SanctumToken&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nv"&gt;$fillable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'token'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'abilities'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'expires_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'team_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Register it in &lt;code&gt;AppServiceProvider&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Sanctum\Sanctum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\PersonalAccessToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;boot&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Sanctum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;usePersonalAccessTokenModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PersonalAccessToken&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the column:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'personal_access_tokens'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;unsignedBigInteger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'team_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Issue tokens with team context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;createToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;device_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'api:write'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;forceFill&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'team_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$currentTeam&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Authorize against both user and team on every request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$tokenTeamId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;currentAccessToken&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;team_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$tokenTeamId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nv"&gt;$resource&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;team_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Token not authorized for this team.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Listing Tokens for the Authenticated User
&lt;/h2&gt;

&lt;p&gt;Give users visibility into their active tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'abilities'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'last_used_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'expires_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;latest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'id'&lt;/span&gt;           &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'name'&lt;/span&gt;         &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'abilities'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;abilities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'last_used_at'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;last_used_at&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toDateTimeString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="s1"&gt;'expires_at'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toDateTimeString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="s1"&gt;'created_at'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$token&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toDateTimeString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$tokens&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The raw token value is never returned here, that's only available at issuance via &lt;code&gt;plainTextToken&lt;/code&gt;. After that, only the hash is stored.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Sanctum Authentication
&lt;/h2&gt;

&lt;p&gt;Sanctum ships with a clean testing helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Sanctum\Sanctum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DocumentApiTest&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;TestCase&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;RefreshDatabase&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;test_authenticated_user_can_list_documents&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;Sanctum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;actingAs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/documents'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;test_token_without_write_ability_cannot_create_document&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;Sanctum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;actingAs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'api:read'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt; &lt;span class="c1"&gt;// no api:write&lt;/span&gt;

        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;postJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'title'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Test'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Test content'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;test_unauthenticated_request_is_rejected&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/documents'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Sanctum::actingAs()&lt;/code&gt; bypasses the actual token lookup and tells the guard to treat the request as authenticated with those abilities. Your tests stay fast; no real tokens are issued.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Word to the Wise:&lt;/strong&gt; Test the ability checks explicitly. Developers regularly ship applications where the happy path tests pass but the authorization boundaries are never verified. A token with &lt;code&gt;api:read&lt;/code&gt; silently passing a &lt;code&gt;api:write&lt;/code&gt; endpoint is a data integrity problem, not just a security one. Write the negative cases, they catch the middleware misconfiguration you didn't know you'd made.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Returning Consistent Auth Error Responses
&lt;/h2&gt;

&lt;p&gt;By default, an unauthenticated request to a Sanctum-protected route returns a redirect to the login page. That's wrong for a JSON API. Fix it in &lt;code&gt;bootstrap/app.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withExceptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Exceptions&lt;/span&gt; &lt;span class="nv"&gt;$exceptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$exceptions&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;AuthenticationException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;expectsJson&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Unauthenticated.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nv"&gt;$exceptions&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;AuthorizationException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;expectsJson&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Forbidden.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A 302 redirect from an API client that expected a 401 is a confusing failure mode that causes hours of frontend debugging.&lt;/p&gt;

&lt;h2&gt;
  
  
  SPA Authentication (Cookie-Based)
&lt;/h2&gt;

&lt;p&gt;For a Vue or React SPA on the same top-level domain, Sanctum's stateful middleware handles auth through the session cookie — no tokens needed.&lt;/p&gt;

&lt;p&gt;The flow: the SPA GETs &lt;code&gt;/sanctum/csrf-cookie&lt;/code&gt; to initialize the CSRF cookie, POSTs credentials to your login endpoint, then includes the &lt;code&gt;X-XSRF-TOKEN&lt;/code&gt; header on subsequent requests.&lt;/p&gt;

&lt;p&gt;Configure your stateful domains in &lt;code&gt;config/sanctum.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="s1"&gt;'stateful'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'SANCTUM_STATEFUL_DOMAINS'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'%s%s'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'localhost,localhost:3000,127.0.0.1,127.0.0.1:8000,::1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;Sanctum&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;currentApplicationUrlWithPort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;))),&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set &lt;code&gt;SANCTUM_STATEFUL_DOMAINS&lt;/code&gt; explicitly in production. Wildcard domains are not supported.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Edge Case Alert:&lt;/strong&gt; If your SPA and API are on different subdomains (&lt;code&gt;app.example.com&lt;/code&gt; and &lt;code&gt;api.example.com&lt;/code&gt;), the session cookie won't work across domains due to browser SameSite restrictions. Use token-based auth even for your SPA in this case. The cookie approach only works when both are served from the same domain.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Production Deployment Checklist
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Database&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;personal_access_tokens&lt;/code&gt; migration applied&lt;/li&gt;
&lt;li&gt;Index on &lt;code&gt;tokenable_type&lt;/code&gt; and &lt;code&gt;tokenable_id&lt;/code&gt; exists&lt;/li&gt;
&lt;li&gt;Token expiry configured and &lt;code&gt;sanctum:prune-expired&lt;/code&gt; scheduled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;APP_KEY&lt;/code&gt; is set and rotated from the default&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SESSION_DRIVER&lt;/code&gt; is &lt;code&gt;redis&lt;/code&gt; or &lt;code&gt;database&lt;/code&gt; in multi-server setups&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CACHE_DRIVER&lt;/code&gt; is &lt;code&gt;redis&lt;/code&gt; for rate limiting across instances&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SANCTUM_STATEFUL_DOMAINS&lt;/code&gt; explicitly set if using SPA auth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Rate Limiting&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Named rate limiters defined for all API route groups&lt;/li&gt;
&lt;li&gt;Stricter limiters applied to AI query endpoints&lt;/li&gt;
&lt;li&gt;Rate limit responses return &lt;code&gt;Retry-After&lt;/code&gt; headers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Authorization&lt;/code&gt; header stripped at the CDN for non-API routes&lt;/li&gt;
&lt;li&gt;HTTPS enforced — tokens over HTTP is a non-starter&lt;/li&gt;
&lt;li&gt;Tokens are never logged — audit your logging configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Monitoring&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Track &lt;code&gt;personal_access_tokens&lt;/code&gt; table row count as a metric&lt;/li&gt;
&lt;li&gt;Alert on unusual token issuance spikes&lt;/li&gt;
&lt;li&gt;Log token revocation events for audit trails&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Sanctum Doesn't Cover
&lt;/h2&gt;

&lt;p&gt;Sanctum doesn't do: OAuth2 authorization flows, refresh tokens, third-party client authorization, dynamic scope negotiation, or JWT issuance. If any of those are requirements, use Passport.&lt;/p&gt;

&lt;p&gt;For the overwhelming majority of Laravel APIs — including every AI integration pattern — Sanctum's personal access tokens, SPA sessions, ability scoping, and revocation model cover the full authentication lifecycle. Don't reach for Passport's complexity if Sanctum's model fits your use case. The operational overhead isn't trivial.&lt;/p&gt;

&lt;h2&gt;
  
  
  Official Documentation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://laravel.com/docs/sanctum" rel="noopener noreferrer"&gt;Laravel Sanctum Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://laravel.com/docs/http-tests" rel="noopener noreferrer"&gt;Laravel HTTP Tests Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://origin-main.com/guides/laravel-sanctum-api-authentication" rel="noopener noreferrer"&gt;origin-main.com&lt;/a&gt; — Laravel + AI integration guides for production developers.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>php</category>
      <category>authentication</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Laravel AI Middleware: Token Tracking &amp; Rate Limiting</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Sun, 05 Apr 2026 00:10:17 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/laravel-ai-middleware-token-tracking-rate-limiting-1k6c</link>
      <guid>https://dev.to/dewaldhugo/laravel-ai-middleware-token-tracking-rate-limiting-1k6c</guid>
      <description>&lt;p&gt;Integrating Claude or GPT-4o into a Laravel app is deceptively easy. One Http::post() and you feel like a genius. Then a single user loops your endpoint, or a bot scripts it, and your OpenAI bill climbs $300 before you wake up. The answer is a dedicated Laravel AI middleware layer, and most tutorials skip it entirely.&lt;/p&gt;

&lt;p&gt;That’s not a hypothetical. It happens.&lt;/p&gt;

&lt;p&gt;The fix isn’t smarter prompts, it’s infrastructure. In this guide, we build a proper AI management layer: tiered rate limiting, per-user token tracking with async logging, and a Pest test suite that actually validates the guards are working. We’re not demoing a toy. We’re building something you can ship. For &lt;a href="https://origin-main.com/laravel-architecture/laravel-prompt-migrations-version-control-ai-prompts/" rel="noopener noreferrer"&gt;prompt versioning&lt;/a&gt; and deployment discipline on the prompt side of that infrastructure, see the prompt migrations guide.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Strategy: Why a Laravel AI Middleware Layer?
&lt;/h2&gt;

&lt;p&gt;Middleware in the Laravel lifecycle is the right place for this because it separates &lt;strong&gt;infrastructure concerns&lt;/strong&gt; (cost, rate limits) from &lt;strong&gt;business logic&lt;/strong&gt; (the prompt itself). This decoupling means you can swap &lt;code&gt;gpt-4o&lt;/code&gt; for a self-hosted Llama instance without touching a single Controller.&lt;/p&gt;

&lt;p&gt;For the output side of that separation, validating what the model returns before your application acts on it, see the guide on &lt;a href="https://origin-main.com/laravel-architecture/laravel-agentic-workflow-schema-validation/" rel="noopener noreferrer"&gt;schema validation against LLM hallucinations&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We’re tracking cost using the standard token pricing formula:&lt;/p&gt;

&lt;p&gt;$$Total Cost = \sum_{i=1}^{n} (Tokens_{in} \cdot Price_{in} + Tokens_{out} \cdot Price_{out})$$&lt;/p&gt;

&lt;p&gt;Centralise that calculation once. Never scatter it across Controllers.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Database Schema: An Audit Trail, Not Just a Counter
&lt;/h2&gt;

&lt;p&gt;Don’t fall into the trap of incrementing a single &lt;code&gt;total_tokens&lt;/code&gt; integer on your &lt;code&gt;users&lt;/code&gt; table. You need a row per request so you can audit, refund, and debug.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// database/migrations/xxxx_xx_xx_create_ai_usage_logs_table.php&lt;/span&gt;
&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai_usage_logs'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;foreignId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'user_id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;constrained&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;onDelete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'cascade'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'model_used'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;        &lt;span class="c1"&gt;// e.g., 'gpt-4o', 'claude-sonnet-4-6'&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'feature_name'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;      &lt;span class="c1"&gt;// e.g., 'blog_generator', 'chat'&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;decimal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'estimated_cost'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Tiered Rate Limiting
&lt;/h2&gt;

&lt;p&gt;Define your tiers in &lt;code&gt;AppServiceProvider&lt;/code&gt;. Free users get 5 requests per minute. Premium gets 100. Adjust to whatever your unit economics demand.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Providers/AppServiceProvider.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Cache\RateLimiting\Limit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\RateLimiter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;boot&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai-api'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;?-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;is_premium&lt;/span&gt;
            &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;perMinute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Senior Dev Tip:&lt;/strong&gt; Back this with Redis, not your database. Set &lt;code&gt;CACHE_DRIVER=redis&lt;/code&gt; in &lt;code&gt;.env&lt;/code&gt;. If you’re running the default &lt;code&gt;file&lt;/code&gt; or &lt;code&gt;database&lt;/code&gt; driver, every AI request triggers a DB read-write cycle under the rate limiter — that’s a quiet bottleneck that won’t show up until you’re under load. See the &lt;a href="https://laravel.com/docs/cache" rel="noopener noreferrer"&gt;Laravel Cache documentation&lt;/a&gt; for driver configuration.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  4. Building the Middleware
&lt;/h2&gt;

&lt;p&gt;Generate the class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan make:middleware EnsureUserHasAiCredits
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This middleware enforces auth and your per-user spending ceiling before the request ever reaches the Controller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Http/Middleware/EnsureUserHasAiCredits.php&lt;/span&gt;
&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Http\Middleware&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Closure&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\HttpFoundation\Response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnsureUserHasAiCredits&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;Closure&lt;/span&gt; &lt;span class="nv"&gt;$next&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Response&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;user&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Unauthorized'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;monthly_spend&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;spending_limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="s1"&gt;'error'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Monthly spending limit reached.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'upgrade'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'billing.upgrade'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;402&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Registering the Middleware (Laravel 11+)
&lt;/h3&gt;

&lt;p&gt;Laravel 11 dropped &lt;code&gt;app/Http/Kernel.php&lt;/code&gt;. Register your middleware in &lt;code&gt;bootstrap/app.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// bootstrap/app.php&lt;/span&gt;
&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Middleware&lt;/span&gt; &lt;span class="nv"&gt;$middleware&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$middleware&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;appendToGroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'api'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nc"&gt;\App\Http\Middleware\EnsureUserHasAiCredits&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then apply the rate limiter on your route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/api.php&lt;/span&gt;
&lt;span class="nc"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/generate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;AiController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'generate'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'throttle:ai-api'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Externalising Model Pricing
&lt;/h2&gt;

&lt;p&gt;Hardcoding &lt;code&gt;0.000005&lt;/code&gt; per token inside your Job is a maintenance trap. One pricing change from OpenAI and your cost accounting silently breaks. Put rates in &lt;code&gt;config/ai.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config/ai.php&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'models'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_rate'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000005&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_rate'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000015&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'claude-sonnet-4-5'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_rate'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_rate'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000015&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  6. Async Usage Logging with a Queued Job
&lt;/h2&gt;

&lt;p&gt;Once the API responds, log asynchronously. Don’t make the user wait for a DB write.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Jobs/LogAiUsage.php&lt;/span&gt;
&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Jobs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Bus\Queueable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Contracts\Queue\ShouldQueue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Foundation\Bus\Dispatchable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Queue\InteractsWithQueue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Queue\SerializesModels&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LogAiUsage&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ShouldQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Dispatchable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;InteractsWithQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Queueable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SerializesModels&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;    &lt;span class="nv"&gt;$userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;  &lt;span class="nv"&gt;$usageData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$model&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'model'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
        &lt;span class="nv"&gt;$rates&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"ai.models.&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_rate'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000005&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_rate'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.000015&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;$rates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt_rate'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
              &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;$rates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'completion_rate'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'user_id'&lt;/span&gt;           &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'model_used'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'feature_name'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'feature_name'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'estimated_cost'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Notice &lt;code&gt;feature_name&lt;/code&gt; is included in both the Job payload and the &lt;code&gt;create()&lt;/code&gt; call. The original migration defines that column, if you omit it from the insert, Laravel will either throw a &lt;code&gt;NOT NULL&lt;/code&gt; constraint violation or silently ignore it if it’s nullable. Either way, you’ve lost attribution data. Always match your payload to your schema.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  7. The Controller: Gluing It Together
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Http/Controllers/AiController.php&lt;/span&gt;
&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Http\Controllers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Jobs\LogAiUsage&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Http\JsonResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;OpenAI\Laravel\Facades\OpenAI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Validation\ValidationException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiController&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Controller&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$validated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'string'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'max:4000'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'feature'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'required'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'string'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'in:blog_generator,chat,summariser'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt'&lt;/span&gt;&lt;span class="p"&gt;]]],&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\OpenAI\Exceptions\TransporterException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'AI service unavailable. Please try again.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\OpenAI\Exceptions\ErrorException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Catches 429s, quota exceeded, invalid requests from OpenAI&lt;/span&gt;
            &lt;span class="nf"&gt;report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getCode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;LogAiUsage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'model'&lt;/span&gt;             &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'feature_name'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$validated&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'feature'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$result&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;promptTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$result&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;completionTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$result&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re integrating Anthropic’s SDK instead, the pattern is identical — swap the client call and update the model string to &lt;code&gt;claude-sonnet-4-5&lt;/code&gt;. See the &lt;a href="https://docs.anthropic.com/en/api/getting-started" rel="noopener noreferrer"&gt;Anthropic API documentation&lt;/a&gt; for the current SDK reference.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Testing the Architecture with Pest
&lt;/h2&gt;

&lt;p&gt;A production system is only as solid as the test suite backing it. We need to verify three things independently:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; The rate limiter blocks users who’ve exhausted their quota.&lt;/li&gt;
&lt;li&gt; The middleware halts users who’ve hit their spending ceiling.&lt;/li&gt;
&lt;li&gt; The &lt;code&gt;LogAiUsage&lt;/code&gt; Job is dispatched with the correct payload — &lt;strong&gt;without hitting the real API&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point is where most AI testing tutorials fail you. &lt;code&gt;Bus::fake()&lt;/code&gt; queues the Job but does nothing about the &lt;code&gt;OpenAI&lt;/code&gt; facade. If you don’t mock it, your test suite is burning real API credits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// tests/Feature/AiGenerationTest.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Jobs\LogAiUsage&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\User&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Bus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\RateLimiter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;OpenAI\Laravel\Facades\OpenAI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;OpenAI\Responses\Chat\CreateResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;beforeEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Bus&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'blocks a free user who has hit their rate limit'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'is_premium'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="c1"&gt;// Simulate 5 prior requests — the free-tier ceiling&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;RateLimiter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ai-api:'&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;actingAs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;postJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/generate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Hello AI'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'feature'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'chat'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'blocks a user who has exceeded their monthly spending limit'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'monthly_spend'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;50.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'spending_limit'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;50.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;actingAs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;postJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/generate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Hello AI'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'feature'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'chat'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;402&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertJsonPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Monthly spending limit reached.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'dispatches a LogAiUsage job with correct token data on success'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;factory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'is_premium'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="c1"&gt;// Mock the OpenAI facade — never hit the real API in tests&lt;/span&gt;
    &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nc"&gt;CreateResponse&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'choices'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'assistant'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Mocked response'&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'usage'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'total_tokens'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;57&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;actingAs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;postJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'/api/generate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'prompt'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Test prompt'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'feature'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'chat'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertOk&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;assertJsonPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mocked response'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;Bus&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;assertDispatched&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LogAiUsage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;LogAiUsage&lt;/span&gt; &lt;span class="nv"&gt;$job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$job&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
            &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;$job&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'prompt_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;
            &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;$job&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'completion_tokens'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
            &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;$job&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;usageData&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'feature_name'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'chat'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re building out the Eloquent side of this and want to understand how to test model relationships and factory states at scale, the patterns covered in &lt;a href="https://origin-main.com/laravel/building-robust-test-factories-in-laravel/" rel="noopener noreferrer"&gt;building robust Laravel test factories&lt;/a&gt; translate directly to the &lt;code&gt;User&lt;/code&gt; and &lt;code&gt;AiUsageLog&lt;/code&gt; models here.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. What You Now Have
&lt;/h2&gt;

&lt;p&gt;Shipping this architecture gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Cost protection&lt;/strong&gt; — tiered rate limiting via the Laravel &lt;code&gt;RateLimiter&lt;/code&gt; facade backed by Redis, plus a hard monthly ceiling enforced in middleware.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Full audit trail&lt;/strong&gt; — every token consumed is recorded per-user, per-feature, per-model via an Eloquent model in the &lt;strong&gt;Service Container&lt;/strong&gt;-resolved &lt;code&gt;LogAiUsage&lt;/code&gt; Job.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Zero latency impact&lt;/strong&gt; — async logging means your users never wait for a DB write.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;A test suite that doesn’t cheat&lt;/strong&gt; — the OpenAI facade is mocked, so your CI pipeline stays free and your assertions are actually trustworthy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The gap between a weekend project and a production system isn’t the AI integration. It’s everything around it. This is that layer.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;References:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://laravel.com/docs/routing#rate-limiting" rel="noopener noreferrer"&gt;Laravel Rate Limiting&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.anthropic.com/en/api/getting-started" rel="noopener noreferrer"&gt;Anthropic API Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://github.com/openai-php/client" rel="noopener noreferrer"&gt;openai-php/client on GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>laravel</category>
      <category>ai</category>
      <category>webdev</category>
      <category>php</category>
    </item>
    <item>
      <title>Why I Still Choose Laravel in a World Full of Node and Python AI Stacks</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Tue, 31 Mar 2026 17:27:16 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/why-i-still-choose-laravel-in-a-world-full-of-node-and-python-ai-stacks-e33</link>
      <guid>https://dev.to/dewaldhugo/why-i-still-choose-laravel-in-a-world-full-of-node-and-python-ai-stacks-e33</guid>
      <description>&lt;p&gt;This article comes from a question I keep getting from developers working in JavaScript and Python ecosystems: “Why are you building AI systems with Laravel?” It usually comes up in Slack threads, code reviews, or on X, and more often than not there’s some skepticism behind it. With Node.js and Python dominating most AI conversations, Laravel isn’t the obvious choice. But after using it to build and run real systems, I’ve found that assumption doesn’t really hold up. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Noise Gets Louder Every Year
&lt;/h2&gt;

&lt;p&gt;Open any developer forum right now and the message is consistent. Python for AI. Node for APIs. Go for infrastructure. The JavaScript ecosystem has colonised the frontend and is making a serious run at the backend. Python owns the ML research pipeline and, by extension, a growing portion of production AI services.&lt;/p&gt;

&lt;p&gt;The noise is real. And it’s loud.&lt;/p&gt;

&lt;p&gt;When OpenAI shipped their Python SDK first, and their Node SDK second it sent a signal. When LangChain became the default mental model for AI agent architecture, it was written in Python. When the majority of AI tutorials you find online begin with &lt;code&gt;pip install&lt;/code&gt;, you start to wonder whether your technology choice has quietly become an act of stubbornness.&lt;/p&gt;

&lt;p&gt;It hasn’t. But you do need a clear-headed reason to stay. “PHP has come a long way” is not that reason. A genuine architectural argument is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let’s Be Honest About What Node and Python Actually Offer
&lt;/h2&gt;

&lt;p&gt;Node is fast. Non-blocking I/O was genuinely transformative when it arrived, and for high-concurrency, low-computation workloads it remains a defensible choice. The npm ecosystem is enormous. TypeScript has matured to the point where large Node codebases are actually maintainable. I’m not here to dismiss it.&lt;/p&gt;

&lt;p&gt;Python’s position in AI is structural, not accidental. The scientific computing ecosystem — NumPy, PyTorch, Hugging Face Transformers, was built in Python because that’s where ML researchers lived. That inertia doesn’t disappear. If you’re training models, fine-tuning, or doing anything below the inference layer, Python is the correct tool. Full stop.&lt;/p&gt;

&lt;p&gt;But here’s the thing most of these comparisons consistently miss. The majority of us are not building model-training pipelines. We’re building applications that consume AI APIs. We’re wiring GPT-4o or Claude Sonnet into business logic, user interfaces, queue workers, and database-backed workflows. That is a very different problem. The 2025 Stack Overflow Developer Survey, with responses from over 49,000 developers, puts some hard numbers against the landscape we’re all navigating:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14m5xxyocvbxnngydy7o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14m5xxyocvbxnngydy7o.png" alt=" " width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;small&gt;Source: Stack Overflow Developer Survey 2025 (49,000+ respondents). Usage = % of all respondents reporting active use in the past year. Ratings reflect relative strength within each category; all frameworks are production-capable. Node.js 2025 figure reflects SO methodology consolidation - direct comparison with 2024 (42.7%) is approximate.&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;The usage figures are worth sitting with for a moment. Node.js dominates at nearly half of all respondents, but look at the full-stack fit and AI ecosystem scores. High adoption does not mean best fit for the problem at hand. FastAPI is the fastest-growing framework in the survey, up five percentage points year-on-year, and it earns its AI ecosystem score. If you are building a dedicated inference microservice, FastAPI is a serious option. But Laravel’s full-stack cohesion score is the relevant column for the majority of us building complete, stateful applications around AI features. No other framework in this comparison comes close on that dimension.&lt;/p&gt;

&lt;h2&gt;
  
  
  Laravel for AI Development Is Not a Compromise
&lt;/h2&gt;

&lt;p&gt;The framing I keep pushing back on is that choosing Laravel for AI development means accepting a trade-off. That you’re surrendering capability in exchange for developer ergonomics. In 2026, that framing is simply wrong.&lt;/p&gt;

&lt;p&gt;Laravel ships with a queue system that handles background AI jobs cleanly and at scale. Redis integration is first-class - which matters the moment you start caching embeddings or enforcing per-user rate limits on API consumption. Horizon gives you real-time visibility into your queue workers without standing up a separate observability tool. Reverb brings WebSocket support in-house, which is directly relevant when you’re streaming LLM responses to a frontend in real time. Octane keeps your application warm between requests, reducing cold-start overhead that becomes noticeable in latency-sensitive AI features.&lt;/p&gt;

&lt;p&gt;None of this requires third-party assembly. It is cohesive, well-documented, and it behaves predictably under load.&lt;/p&gt;

&lt;p&gt;Compare that to a Node AI backend assembled from Express, BullMQ, Socket.io, and Prisma. Each of those is a capable library. Together, they are a distributed maintenance surface. The integration points between them are where your bugs live at 2am.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Service Container Is Still the Best Idea in Backend Development
&lt;/h2&gt;

&lt;p&gt;I’ve worked across enough stacks to have a view on this. Nothing in most ecosystems matches the elegance of Laravel’s Service Container for managing application complexity at the point where it actually gets complex.&lt;/p&gt;

&lt;p&gt;When your AI integration grows beyond a single API call you need somewhere to put the abstraction. Provider contracts, retry logic, fallback behaviour, token accounting, telemetry hooks. The Service Container lets you bind all of that behind an interface and swap implementations without touching the business logic that depends on them. It’s the kind of architectural freedom that sounds theoretical until the day you need to switch from OpenAI to Anthropic under a tight deadline - and you do it in one service provider file, not forty controller methods.&lt;/p&gt;

&lt;p&gt;If you’ve been following the production on my blog, you’ll recognise this exact approach from our piece on &lt;a href="https://origin-main.com/laravel-architecture/production-grade-ai-architecture-in-laravel/" rel="noopener noreferrer"&gt;Production-Grade AI Architecture in Laravel: Contracts, Governance &amp;amp; Telemetry&lt;/a&gt; — binding AI providers behind contracts isn’t just clean code, it’s the architectural difference between an AI feature and an AI-dependent liability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Eloquent Doesn’t Get Enough Credit in the AI Conversation
&lt;/h2&gt;

&lt;p&gt;Everyone talks about the inference layer. Nobody talks about the data layer. But AI applications are deeply stateful. Conversation history, user context, embedding vectors, usage logs, rate-limit counters, prompt version records - all of it needs to live somewhere reliable, queryable, and maintainable.&lt;/p&gt;

&lt;p&gt;Eloquent ORM, combined with PostgreSQL and pgvector for semantic search, handles this with a maturity that ORMs in younger ecosystems are still building toward. Eloquent scopes make filtering conversation history by user, session, or token budget trivially readable. Eloquent events let you hook telemetry into model persistence without polluting your service logic. The relationship system means your AI-related domain models compose naturally with the rest of your application data.&lt;/p&gt;

&lt;p&gt;This is the part of the stack that’s invisible when everything works and catastrophic when it doesn’t. Laravel’s data layer has fifteen years of production hardening behind it. That’s not a minor detail.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Ecosystem Has Quietly Caught Up
&lt;/h2&gt;

&lt;p&gt;The argument that Laravel lacks AI tooling was fair eighteen months ago. It is not fair now.&lt;/p&gt;

&lt;p&gt;Prism PHP brings a unified, multi-provider AI interface to Laravel - supporting tool calling, RAG pipelines, and streaming, without you writing SDK glue code by hand. We’ve covered what that looks like in practice in our guide to &lt;a href="https://origin-main.com/ai-agents/laravel-prism-php-agentic-apps/" rel="noopener noreferrer"&gt;building agentic Laravel apps with Prism PHP&lt;/a&gt;, and the developer experience is genuinely impressive. The abstraction is clean, provider support is broad, and it composes naturally with the Service Container patterns you’d already be using. Beyond Prism, the broader package ecosystem has responded: prompt versioning tools, queue-based AI pipeline packages, vector search integrations. The gaps are closing fast.&lt;/p&gt;

&lt;p&gt;Taylor Otwell and the core team have also signalled clearly that AI tooling is a first-party priority on Laravel’s roadmap. When the framework authors treat your use case as a core concern, that is a meaningful long-term signal. The trend data below adds useful context, and the story it tells is more nuanced than the noise in developer forums would have you believe:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3m2cd66voj4rm4mcus57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3m2cd66voj4rm4mcus57.png" alt=" " width="800" height="418"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Source: Stack Overflow Developer Survey 2020–2025. Usage = % of all respondents reporting active use in the past year. FastAPI was first tracked in 2021. The 2025 Node.js figure (48.7%, dashed) reflects a Stack Overflow methodology consolidation; 2024 figure was 42.7% — direct year-on-year comparison is approximate. Express and Django shown as dashed lines to indicate secondary/framework-layer status vs runtime-level Node.js. 2020–2023 intermediate values for Express, FastAPI, and Laravel are interpolated from published Stack Overflow survey trend commentary.&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;FastAPI’s trajectory is the most important line on that chart. It is rising sharply, and deservedly so for pure inference workloads. But look at where it is rising from, it has only just crossed Django’s share after four years of consistent growth, and it is still well below Laravel’s current position. Node.js, despite the headline numbers, has been in structural decline since its 2020 peak of 51.4%. Laravel’s decline is real and worth acknowledging honestly, but the rate is shallow, and the developer base that remains is building serious production applications. The community is consolidating around quality, not haemorrhaging. There is a difference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developer Velocity Is a Business Metric
&lt;/h2&gt;

&lt;p&gt;Here is the argument that tends to land hardest, and it’s also the most honest one.&lt;/p&gt;

&lt;p&gt;Speed of iteration matters more than theoretical capability, especially in the early and middle stages of an AI product. The developer who can ship a working, testable, observable AI feature in a day using Laravel is more valuable than the developer who spends three days assembling a Python FastAPI service with the same surface area of functionality. The gap between “it works locally” and “it works in production” is where Laravel’s ecosystem earns its keep - the test factories, the job batching, the queue introspection, the deployment ergonomics that come with a genuinely mature framework.&lt;/p&gt;

&lt;p&gt;This is not tribalism. It’s a productivity calculation. And in 2026, with AI features now expected in virtually every serious web product, the developer who moves fast without breaking things has a structural advantage. Laravel, applied properly, is that advantage.&lt;/p&gt;

&lt;h2&gt;
  
  
  So Why Do Developers Keep Reaching for Node and Python?
&lt;/h2&gt;

&lt;p&gt;Honestly? Familiarity and social proof. AI tutorials default to Python because that’s where the research originates. JavaScript developers building frontends reach for Node because the context switch is smaller. Neither instinct is wrong, learning inside your comfort zone is rational behaviour.&lt;/p&gt;

&lt;p&gt;But comfort zone and best tool are not synonyms.&lt;/p&gt;

&lt;p&gt;If you’re a Laravel developer who has been watching the AI conversation happen in other ecosystems, quietly wondering whether you’re missing something, you’re not. The framework you already know is more capable for this problem space than most of what you’ve been reading would suggest. The Service Container, Eloquent, Redis, Horizon, Reverb, Prism - these are not consolation prizes. They’re a coherent, production-tested AI application stack.&lt;/p&gt;

&lt;p&gt;You don’t need to leave the ecosystem to build great AI-powered software. You need to go deeper in it.&lt;/p&gt;

&lt;h2&gt;
  
  
  One More Thing Before You Go
&lt;/h2&gt;

&lt;p&gt;This has been an opinion piece, and I want to be transparent about that. I’ve tried to make the case honestly, acknowledging where Node and Python are genuinely strong, and being specific about where I think Laravel holds its own. Reasonable people disagree with parts of this, and I’d genuinely like to hear where you land.&lt;/p&gt;

&lt;p&gt;Have you migrated an AI backend away from Laravel? Did it go the way you expected? Are you running a polyglot stack and making it work? Drop your thoughts in the comments below. This is exactly the kind of conversation worth having in public rather than in private Slack threads where nobody learns anything. If this resonated with you, share it with a developer who’s currently weighing up this decision. The more honest voices in that conversation, the better.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>ai</category>
      <category>webdev</category>
      <category>php</category>
    </item>
    <item>
      <title>What Laravel 13 Actually Changes for AI Development</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Sat, 28 Mar 2026 13:39:56 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/what-laravel-13-actually-changes-for-ai-development-43ad</link>
      <guid>https://dev.to/dewaldhugo/what-laravel-13-actually-changes-for-ai-development-43ad</guid>
      <description>&lt;p&gt;Laravel 13 dropped a production-stable AI SDK on release day and nobody's talking about what it actually replaces. Here's the full breakdown.&lt;/p&gt;

&lt;p&gt;Laravel 13 launched on March 17, 2026 , and Taylor Otwell kept his Laracon EU promise: zero breaking changes, a clean upgrade from Laravel 12, and one headline shift that resets the default approach to AI in PHP. The Laravel 13 AI SDK is now production-stable, first-party, and catalogued inside the official Laravel documentation as a core concern — not a side-effect of a community package you bolt on after you’ve already made architectural decisions you’ll later regret.&lt;/p&gt;

&lt;p&gt;This is not a release you skim for changelog bullet points and move on. For anyone building AI-powered features on Laravel, this release fundamentally changes how you should be structuring that work. Let’s go through it properly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Laravel 13 Is an AI-First Release
&lt;/h2&gt;

&lt;p&gt;Laravel 13 continues Laravel’s annual release cadence with a focus on AI-native workflows, stronger defaults, and more expressive developer APIs. The official framing is “minimal breaking changes,” and that’s accurate — but undersells it. The infrastructure laid here directly shapes whether your AI features remain maintainable at scale or turn into the kind of controller-level spaghetti you spend the next two years paying down.&lt;/p&gt;

&lt;p&gt;The Laravel AI SDK moves from beta to production-stable on the same day as Laravel 13. It is included as a first-party package and gives you a single, provider-agnostic interface for text generation, tool-calling agents, image creation, audio synthesis, and embedding generation. The SDK handles retry logic, error normalisation, and queue integration behind the scenes. You get all of that without writing a custom abstraction layer, and without coupling your application to a specific provider’s SDK contract.&lt;/p&gt;

&lt;p&gt;That last point deserves more weight than the release notes give it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Laravel AI SDK: What It Actually Gives You
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Text Generation and Agents
&lt;/h3&gt;

&lt;p&gt;The simplest entry point looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Ai\Agents\SalesCoach&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SalesCoach&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Analyse this sales transcript...'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the AI SDK, you can build provider-agnostic AI features while keeping a consistent, Laravel-native developer experience. That means your &lt;code&gt;SalesCoach&lt;/code&gt; agent does not care whether it’s backed by OpenAI, Anthropic, or Google Gemini. You wire the provider in &lt;code&gt;config/ai.php&lt;/code&gt; and the agent contract stays unchanged.&lt;/p&gt;

&lt;p&gt;The default models used for text, images, audio, transcription, and embeddings are now configurable in your application’s &lt;code&gt;config/ai.php&lt;/code&gt; file. This gives you granular control over the exact models you’d like to use if you don’t want to rely on the package defaults. A minimal configuration targeting Anthropic looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config/ai.php&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="s1"&gt;'models'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'text'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'default'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;env&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'AI_TEXT_MODEL'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'claude-sonnet-4-6'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s1"&gt;'cheapest'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-haiku-4-5-20251001'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'smartest'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-opus-4-6'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You are not hardcoding a model string into a service class. You are not parsing a &lt;code&gt;.env&lt;/code&gt; file in three different controllers. One config file governs the whole application. If you need to roll back a model mid-incident, it’s one value and a &lt;code&gt;php artisan config:cache&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Images and Audio
&lt;/h3&gt;

&lt;p&gt;For visual generation use cases, the SDK offers a clean API for creating images from plain-language prompts. For voice experiences, you can synthesize natural-sounding audio from text for assistants, narrations, and accessibility features.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Image&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Audio&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// Image generation&lt;/span&gt;
&lt;span class="nv"&gt;$image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'A product shot of a minimalist desk lamp'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nv"&gt;$rawContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$image&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// Audio synthesis&lt;/span&gt;
&lt;span class="nv"&gt;$audio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Audio&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Your order has been confirmed.'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nv"&gt;$rawContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$audio&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fluent API is consistent across modalities. That consistency is not accidental. It means a developer who’s only worked with text generation can pick up image or audio synthesis in minutes — no mental context switch, no separate SDK documentation to parse.&lt;/p&gt;

&lt;h3&gt;
  
  
  Embeddings and the &lt;code&gt;Str&lt;/code&gt; Helper
&lt;/h3&gt;

&lt;p&gt;Here’s where it gets genuinely interesting. Embedding generation is wired directly into Laravel’s &lt;code&gt;Str&lt;/code&gt;helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$vector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Str&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Napa Valley has exceptional Cabernet Sauvignon.'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toEmbeddings&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s not a convenience method. That’s the framework signalling that embeddings are now a first-class data type. You’re not reaching for a one-off utility class — you’re calling a helper that sits alongside &lt;code&gt;Str::slug()&lt;/code&gt; and &lt;code&gt;Str::limit()&lt;/code&gt; as a standard part of the toolkit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[Architect’s Note]&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;The fact that &lt;code&gt;toEmbeddings()&lt;/code&gt; lives on the &lt;code&gt;Str&lt;/code&gt;helper rather than a dedicated &lt;code&gt;Embedding&lt;/code&gt;facade is a deliberate design choice. It keeps embedding generation composable with the rest of your string manipulation pipeline. Chain it after a &lt;code&gt;limit()&lt;/code&gt; call to trim tokens before you generate — or after &lt;code&gt;markdown()&lt;/code&gt; to clean formatting before vectorising. Think of it as the framework nudging you toward preprocessing being part of the same expression.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Native Vector Search: Semantic Queries Without a Search Engine Subscription
&lt;/h2&gt;

&lt;p&gt;Laravel 13 deepens its semantic search story with native vector query support, embedding workflows, and related APIs. These features make it straightforward to build AI-powered search experiences using PostgreSQL and pgvector, including similarity search against embeddings generated directly from strings.&lt;/p&gt;

&lt;p&gt;The query builder extension looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;DB&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;whereVectorSimilarTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Best restaurants in Cape Town'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, this generates a pgvector-compatible cosine similarity query against your Postgres database. No Elasticsearch cluster. No Typesense subscription. No Algolia bill. For a significant number of use cases — internal knowledge bases, product catalogue search, customer support retrieval — Postgres with pgvector is more than sufficient, and the operational overhead is zero if you’re already on Postgres.&lt;/p&gt;

&lt;p&gt;The migration for the vector column uses a new column type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Blueprint&lt;/span&gt; &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// dimension matches your model's output&lt;/span&gt;
    &lt;span class="nv"&gt;$table&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timestamps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A realistic ingestion pipeline that generates and stores embeddings looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Models\Document&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IngestDocumentJob&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ShouldQueue&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Dispatchable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;InteractsWithQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Queueable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SerializesModels&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Str&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toEmbeddings&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'embedding'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Dispatch it from a controller, and your embedding pipeline is queued, retried on failure, and observable via Horizon — exactly the same way you’d treat any other background job.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Production Pitfall]&lt;/strong&gt; Embedding API calls are slow — typically 300–800ms depending on payload size and provider latency. Do not generate embeddings synchronously in a request lifecycle. Every embedding generation call belongs in a queued job, period. Under load, synchronous embedding generation will saturate your PHP-FPM pool faster than almost anything else. We’ve seen this exact pattern cause cascading timeouts in staging under traffic replay. Don’t let it reach production.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Provider Agnosticism: What It Means Practically
&lt;/h2&gt;

&lt;p&gt;Switch providers by changing one config value. Teams that offer professional Laravel development services can now build AI-powered features inside a standard Laravel project — no custom abstraction layers, no third-party SDK gymnastics required.&lt;/p&gt;

&lt;p&gt;This matters more than it reads. Prior to the AI SDK going stable, the common pattern was to inject an OpenAI client through the Service Container, write a thin wrapper around it, and pray that the wrapper held when you inevitably needed to test responses or swap a model mid-sprint. Most teams ended up with one of three things: a god-service that knew too much, a leaky abstraction that only worked for their current provider, or no abstraction at all and raw &lt;code&gt;Http::post()&lt;/code&gt; calls in controllers.&lt;/p&gt;

&lt;p&gt;The AI SDK replaces all three anti-patterns with a single, testable, swappable interface that the framework itself maintains. If the underlying provider deprecates an API method, the fix lands in a framework update — not in your codebase.&lt;/p&gt;

&lt;p&gt;Here’s how you’d swap to a different provider for a specific agent without touching the rest of the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Facades\Ai&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Enums\Provider&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ai&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;using&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Provider&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;withModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'claude-opus-4-6'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Summarise this legal document...'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;using()&lt;/code&gt; call overrides the &lt;code&gt;config/ai.php&lt;/code&gt; default for this invocation only. You can do per-request provider overrides without touching global configuration. That’s useful in multi-tenant applications where different plans map to different model tiers.&lt;/p&gt;

&lt;p&gt;If you’re already running a custom abstraction over the OpenAI PHP SDK, we wrote about how to migrate that pattern properly — including how to handle cost visibility and telemetry — in &lt;a href="https://origin-main.com/laravel-architecture/production-grade-ai-architecture-in-laravel/" rel="noopener noreferrer"&gt;Production-Grade AI Architecture in Laravel: Contracts, Governance &amp;amp; Telemetry&lt;/a&gt;. The AI SDK doesn’t replace the need for those architectural decisions; it gives you a better foundation to implement them on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tool-Calling Agents: Building Agentic Workflows Natively
&lt;/h2&gt;

&lt;p&gt;The AI SDK’s agent support deserves its own section. Tool-calling — the mechanism by which an LLM decides to invoke a function in your application — is the primitive that makes agentic workflows possible.&lt;/p&gt;

&lt;p&gt;A basic agent with tools looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Ai\Agents&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Agent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Attributes\Tool&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OrderAssistant&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'claude-sonnet-4-6'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$instructions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'You are a helpful order management assistant.'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="na"&gt;#[Tool('Look up the status of an order by ID')]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getOrderStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$orderId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;findOrFail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$orderId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"Order #&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$orderId&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; is currently &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$order&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;#[Tool('List all open orders for a given customer')]&lt;/span&gt;
    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;listOpenOrders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$customerId&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'customer_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$customerId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'status'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'open'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;pluck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;#[Tool]&lt;/code&gt; attribute wires the method directly into the model’s tool-calling interface. The SDK handles the round-trip: the model sees the tool definition, decides to call it, the SDK invokes your method, and the result is injected back into the conversation automatically.&lt;/p&gt;

&lt;p&gt;Notice that &lt;code&gt;getOrderStatus&lt;/code&gt;and &lt;code&gt;listOpenOrders&lt;/code&gt;are standard Eloquent queries. There is nothing AI-specific about the business logic inside the tools. This is the correct separation. Your tools are Laravel code. The AI SDK manages the protocol layer between your code and the model.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Edge Case Alert]&lt;/strong&gt; Tool-calling agents can enter infinite loops if the model repeatedly decides to call the same tool with the same arguments — this happens when the tool’s return value doesn’t advance the conversation’s goal. Always set a &lt;code&gt;maxSteps&lt;/code&gt;limit on your agents and handle &lt;code&gt;MaxStepsExceededException&lt;/code&gt;explicitly. Defaulting to unlimited steps in production is asking for a runaway API bill.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrderAssistant&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;maxSteps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"What's the status of order 99?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re building more sophisticated multi-step agentic workflows and need schema validation on tool outputs — which you almost certainly will once you’re past demos — the &lt;a href="https://origin-main.com/laravel-architecture/laravel-agentic-workflow-schema-validation/" rel="noopener noreferrer"&gt;Hardening Laravel Agentic Workflows: Schema Validation Against LLM Hallucinations guide&lt;/a&gt; covers exactly that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Queue Integration and &lt;code&gt;config/ai.php&lt;/code&gt; as Operational Infrastructure
&lt;/h2&gt;

&lt;p&gt;One under-discussed aspect of the AI SDK is how it integrates with Laravel’s Queue system. Transcription now supports timeouts, giving you better control in production workloads and preventing long-running requests from tying up workers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$transcript&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Transcription&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;fromPath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'./podcast.mp3'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;240&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Lab&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nc"&gt;ElevenLabs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;timeout()&lt;/code&gt; call maps directly to the queue worker’s job timeout. If you’re already familiar with $timeout on job classes, this is the same mechanism — now surfaced at the SDK level so you don’t have to know to set it yourself.&lt;/p&gt;

&lt;p&gt;Pair this with Laravel 13’s new &lt;code&gt;Queue::route()&lt;/code&gt; method for clean queue topology:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Providers/AppServiceProvider.php&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Queue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nc"&gt;Queue&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="nc"&gt;IngestDocumentJob&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'embeddings'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;GenerateImageJob&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'ai-images'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;TranscribeAudioJob&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'transcription'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI jobs should never share a queue with your standard application jobs. Embedding generation and image synthesis calls have entirely different latency profiles and retry semantics than sending a welcome email. The new &lt;code&gt;Queue::route()&lt;/code&gt; method lets you define which queue and connection each job class uses from a single location in a service provider. Previously, teams either set queue properties on each job class or repeated the configuration at every dispatch site.&lt;/p&gt;

&lt;p&gt;Centralise that in a Service Provider and you’ve got one place to retune queue topology when your AI workload changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Error Handling: What the SDK Does, and What It Doesn’t
&lt;/h2&gt;

&lt;p&gt;The AI SDK handles retry logic and error normalisation internally, but that does not mean you write no error handling. It means you handle fewer low-level concerns. The contracts you still own:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Exceptions\AiProviderException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Exceptions\RateLimitException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Laravel\Ai\Exceptions\ContentFilterException&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SalesCoach&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;make&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$userInput&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RateLimitException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// The SDK exhausted its internal retry budget — back off and re-queue&lt;/span&gt;
    &lt;span class="nc"&gt;InvokeSalesCoachJob&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$userInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;addSeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ContentFilterException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// The provider flagged the input — log it, don't retry&lt;/span&gt;
    &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Content filter triggered'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'input_hash'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'sha256'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$userInput&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Input could not be processed.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;422&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AiProviderException&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Unknown provider error — log full context, fail gracefully&lt;/span&gt;
    &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'AI provider failure'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;()]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s1"&gt;'error'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Service temporarily unavailable.'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SDK’s internal retry logic handles transient 5xx errors and network timeouts. RateLimitException is thrown when the SDK’s retry budget is exhausted — at that point, you need application-level backpressure, not another retry. Handle these as distinct failure modes. They are.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;[Word to the Wise]&lt;/strong&gt; Logging &lt;code&gt;$e-&amp;gt;getMessage()&lt;/code&gt; on a provider exception often includes the raw prompt in the message string depending on the provider. Sanitise before logging in any context where the prompt may contain user PII. This is not a theoretical concern — it’s the kind of thing that shows up in a GDPR audit.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture: Laravel’s AI Roadmap Signal
&lt;/h2&gt;

&lt;p&gt;The Laravel AI SDK going stable on the same day as Laravel 13 is a deliberate signal: the framework’s roadmap is now AI-first. The docs now include a dedicated AI section with entries for the AI SDK, MCP integration, and Laravel Boost — Taylor’s AI-assisted development tooling. The installation docs now include a dedicated “Laravel and AI” section, and existing apps are pointed to Laravel Boost for AI-assisted development workflows.&lt;/p&gt;

&lt;p&gt;Read that as a directional commitment. The primitives shipped in Laravel 13 — provider-agnostic text generation, first-class embeddings, native vector queries, tool-calling agents — are the foundation. What gets built on top of them in 13.x point releases and Laravel 14 will assume they exist. If you’re planning AI features over the next 12 months and you’re not on Laravel 13, you’re doing that planning on a weaker foundation than the framework now offers.&lt;/p&gt;

&lt;p&gt;The upgrade path is genuinely low-friction for most apps. The official release notes emphasise minimal breaking changes, and the official upgrade guide estimates about 10 minutes for many applications. The real risk is in custom cache behaviour, request forgery edge cases, and hand-rolled framework integrations.&lt;/p&gt;

&lt;p&gt;If you’re running a custom AI integration today — your own OpenAI service class, your own retry decorator, your own token-counting middleware — the migration question is not “should I switch to the AI SDK?” It’s “how quickly can I make the switch before the ecosystem diverges enough that porting gets expensive?” For token tracking and rate limiting specifically, see &lt;a href="https://origin-main.com/laravel-architecture/laravel-ai-middleware-token-tracking/" rel="noopener noreferrer"&gt;Laravel AI Middleware: Token Tracking &amp;amp; Rate Limiting&lt;/a&gt; — that pattern ports cleanly onto the SDK’s event hooks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Upgrade Checklist: AI-Specific Steps for Laravel 13
&lt;/h2&gt;

&lt;p&gt;Before you run &lt;code&gt;composer require laravel/framework:^13.0&lt;/code&gt;, address the following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. PHP version&lt;/strong&gt;. Laravel 13 requires PHP 8.3 as a minimum. PHP 8.5, released November 2025, is also supported by Laravel 13, bringing further JIT improvements and native URI handling. Check your server runtime before anything else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Remove redundant provider SDKs&lt;/strong&gt;. If you’re injecting &lt;code&gt;openai-php/client&lt;/code&gt; or &lt;code&gt;anthropic/anthropic-sdk-php&lt;/code&gt; directly via the Service Container, evaluate whether the AI SDK replaces that dependency entirely. In most cases, it does. Keeping both creates competing abstraction layers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Audit your queue configuration&lt;/strong&gt;. With &lt;code&gt;Queue::route()&lt;/code&gt; now available, centralise your AI job routing in &lt;code&gt;AppServiceProvider&lt;/code&gt;. If you’re currently setting &lt;code&gt;public string $queue = 'ai'&lt;/code&gt; on individual job classes, that works — but &lt;code&gt;Queue::route()&lt;/code&gt; is cleaner and easier to update without touching job classes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Generate and store your &lt;code&gt;config/ai.php&lt;/code&gt;&lt;/strong&gt;. The SDK expects this file. Publish it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php artisan vendor:publish &lt;span class="nt"&gt;--tag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ai-config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then pin your model names explicitly rather than relying on package defaults. Model defaults change between SDK minor releases. You want to control when your application picks up a new default model — not discover that it happened because a patch updated the SDK.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Add pgvector to your Postgres instance&lt;/strong&gt; if you plan to use &lt;code&gt;whereVectorSimilarTo&lt;/code&gt;. The extension is available in RDS, Supabase, and Neon without additional configuration. On self-managed Postgres, install it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;//&lt;/span&gt;&lt;span class="k"&gt;sql&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;6. Test your request forgery protection&lt;/strong&gt;. Laravel 13 formalises &lt;code&gt;PreventRequestForgery&lt;/code&gt;with origin-aware verification on top of token-based CSRF. If you have custom CSRF handling or API routes with unusual origin configurations, test them explicitly before deploying.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s Not Yet There
&lt;/h2&gt;

&lt;p&gt;Let’s be direct about the gaps.&lt;/p&gt;

&lt;p&gt;Streaming responses — where the model outputs tokens progressively rather than in a single payload — are not yet a first-class concern in the AI SDK’s stable release. For chat interfaces that need token streaming, you’ll still be reaching for provider-specific solutions or custom SSE handling. Watch the 13.x changelog; this is an obvious next primitive.&lt;/p&gt;

&lt;p&gt;Multi-modal input — sending images or audio to the model rather than from it — is also not yet documented in the stable SDK surface area. It’s likely coming in a 13.x release, but don’t plan around it until it ships.&lt;/p&gt;

&lt;p&gt;The vector search integration is Postgres-only for now. If you’re on MySQL or MariaDB, &lt;code&gt;whereVectorSimilarTo&lt;/code&gt;is not available. For those stacks, external vector stores (Pinecone, Qdrant, Weaviate) remain the path, and you’ll need your own integration layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Laravel 13 is not a rewrite. It does not break your application. What it does is establish a new baseline for what “standard Laravel AI work” looks like — and that baseline is considerably higher than it was under Laravel 12. The teams that adopt this now and build their AI features against SDK contracts rather than raw provider clients will have significantly less migration debt when Laravel 14 arrives and extends these primitives further.&lt;/p&gt;

&lt;p&gt;Get on PHP 8.3, publish &lt;code&gt;config/ai.php&lt;/code&gt;, and start moving your AI layer onto the SDK. The upgrade cost is low. The cost of not doing it compounds.&lt;/p&gt;

&lt;p&gt;For the official release notes and full SDK documentation, see the &lt;a href="https://laravel.com/docs/13.x/releases" rel="noopener noreferrer"&gt;Laravel 13 Release Notes&lt;/a&gt; and the &lt;a href="https://laravel.com/docs/13.x/ai-sdk" rel="noopener noreferrer"&gt;Laravel AI SDK documentation&lt;/a&gt; directly.&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>php</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>The Complete Guide to Integrating Claude API with Laravel</title>
      <dc:creator>Dewald Hugo</dc:creator>
      <pubDate>Sun, 15 Feb 2026 08:18:16 +0000</pubDate>
      <link>https://dev.to/dewaldhugo/the-complete-guide-to-integrating-claude-api-with-laravel-5413</link>
      <guid>https://dev.to/dewaldhugo/the-complete-guide-to-integrating-claude-api-with-laravel-5413</guid>
      <description>&lt;p&gt;Most Laravel developers hit the same wall when integrating Claude: the official docs focus on JavaScript, most community guides assume Python, and Laravel tutorials stop at toy chatbots that break under real load.&lt;/p&gt;

&lt;p&gt;This guide shows how Claude fits into a production Laravel 11 codebase—how to structure the integration cleanly and why certain decisions matter once AI becomes a dependency instead of an experiment.&lt;/p&gt;

&lt;p&gt;We’ll build in layers. Each part expands the system, explains why a pattern exists, and walks through the code. By the end, you’ll understand not just how to call Claude, but how to reason about AI integration the same way you reason about databases, queues, and external services.&lt;/p&gt;

&lt;p&gt;Stack: Laravel 11, Claude Messages API, raw HTTP (no SDK).&lt;/p&gt;

&lt;p&gt;Why Integrate Claude with Laravel&lt;br&gt;
Claude is designed for longer reasoning chains, clearer structured output, and safer handling of ambiguous prompts. This makes it well-suited to applications where AI output is part of a workflow, not a novelty feature.&lt;/p&gt;

&lt;p&gt;Laravel provides exactly the tooling required to manage such workflows: service containers, background jobs, caching, and structured error handling. AI interactions become first-class application concerns.&lt;/p&gt;

&lt;p&gt;Rather than building a single-purpose chatbot, we’ll design a Claude integration that supports synchronous requests, streaming output, conversational memory, and background processing. Each pattern builds on the same foundation.&lt;/p&gt;
&lt;h2&gt;
  
  
  Part 1: Foundation
&lt;/h2&gt;

&lt;p&gt;Authentication and API Setup&lt;br&gt;
Claude uses straightforward API key authentication—no token exchange or refresh mechanism. This simplifies integration but puts responsibility on you to manage access carefully.&lt;/p&gt;

&lt;p&gt;Store the API key in environment configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLAUDE_API_KEY=sk-ant-…
CLAUDE_API_VERSION=2023-06-01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Laravel’s config system surfaces these cleanly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// config/claude.php
return [
    'api_key' =&amp;gt; env('CLAUDE_API_KEY'),
    'version' =&amp;gt; env('CLAUDE_API_VERSION', '2023-06-01'),
    'base_url' =&amp;gt; 'https://api.anthropic.com/v1/messages',
];
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This centralizes Claude-specific configuration and makes testing and environment overrides trivial without touching application code.&lt;/p&gt;

&lt;p&gt;Laravel Service Architecture&lt;br&gt;
Don’t call the Claude API directly from controllers. Create a dedicated service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;php artisan make:class Services/ClaudeService
namespace App\Services;
use Illuminate\Support\Facades\Http;
class ClaudeService
{
    public function send(array $messages, array $options = []): array
    {
        $response = Http::withHeaders([
            'x-api-key' =&amp;gt; config('claude.api_key'),
            'anthropic-version' =&amp;gt; config('claude.version'),
            'content-type' =&amp;gt; 'application/json',
        ])-&amp;gt;post(config('claude.base_url'), [
            'model' =&amp;gt; $options['model'] ?? 'claude-3-5-sonnet-latest',
            'messages' =&amp;gt; $messages,
            'max_tokens' =&amp;gt; $options['max_tokens'] ?? 1024,
        ]);
        if ($response-&amp;gt;failed()) {
            throw new \RuntimeException(
                'Claude API error: ' . $response-&amp;gt;body()
            );
        }
        return $response-&amp;gt;json();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service becomes the single point of truth for Claude communication.&lt;/p&gt;

&lt;p&gt;Error Handling Strategy&lt;br&gt;
AI APIs fail differently from traditional services. Errors may represent transient capacity issues rather than application bugs. Error handling should prioritize resilience and observability.&lt;/p&gt;

&lt;p&gt;Wrap calls in try/catch blocks and report exceptions. Log raw API responses for debugging, but never return them directly to users.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;try {
    $result = $claude-&amp;gt;send($messages);
} catch (\Throwable $e) {
    report($e);
    return response()-&amp;gt;json([
        'error' =&amp;gt; 'AI service temporarily unavailable'
    ], 503);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Environment Sensitivity&lt;br&gt;
Different environments require different tolerances. Development benefits from low token limits and aggressive timeouts. Production requires higher ceilings.&lt;/p&gt;

&lt;p&gt;Externalize these values to tune behavior without redeploying:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLAUDE_MAX_TOKENS=1024
CLAUDE_TIMEOUT=30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;'timeout' =&amp;gt; env('CLAUDE_TIMEOUT', 30),
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Part 2: Core Integration Patterns
&lt;/h2&gt;

&lt;p&gt;Once authentication and setup are complete, the challenge isn’t calling Claude—it’s deciding where that call should live, how it should fail, and how the rest of your application should interact with it.&lt;/p&gt;

&lt;p&gt;Designing a Claude Service Boundary&lt;br&gt;
The most common mistake is letting API calls leak into controllers, jobs, or Livewire components. That works for prototypes but quickly becomes untestable and fragile.&lt;/p&gt;

&lt;p&gt;Treat Claude like any other external dependency: Stripe, S3, or an internal microservice. Create a single, dedicated client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;namespace App\Services\Claude;
use Illuminate\Support\Facades\Http;
final class ClaudeClient
{
    public function message(array $payload): array
    {
        return Http::withHeaders([
                'x-api-key' =&amp;gt; config('services.claude.key'),
                'anthropic-version' =&amp;gt; '2023-06-01',
            ])
            -&amp;gt;post('https://api.anthropic.com/v1/messages', $payload)
            -&amp;gt;throw()
            -&amp;gt;json();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This class does one thing: speak Claude’s protocol. It doesn’t know about your application’s domain.&lt;/p&gt;

&lt;p&gt;You can swap models without touching business logic, mock the client cleanly in tests, and isolate vendor churn to one file.&lt;/p&gt;

&lt;p&gt;Pattern 1: Simple Request–Response Workflows&lt;br&gt;
Most Claude interactions are: generate text, summarize content, rewrite copy, extract data.&lt;/p&gt;

&lt;p&gt;Instead of calling the client directly, wrap each use case in a domain-specific service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): string
    {
        $response = $this-&amp;gt;claude-&amp;gt;message([
            'model' =&amp;gt; 'claude-3-5-sonnet-latest',
            'max_tokens' =&amp;gt; 400,
            'messages' =&amp;gt; [
                [
                    'role' =&amp;gt; 'user',
                    'content' =&amp;gt; "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return $response['content'][0]['text'];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the controller’s perspective:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public function store(Request $request, ArticleSummarizer $summarizer)
{
    $summary = $summarizer-&amp;gt;summarize($request-&amp;gt;input('content'));
    return response()-&amp;gt;json(['summary' =&amp;gt; $summary]);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Controllers orchestrate. Services decide how Claude is used. The Claude client handles transport.&lt;/p&gt;

&lt;p&gt;Pattern 2: Streaming Responses&lt;br&gt;
Once output grows beyond a few hundred tokens, synchronous calls start to feel slow—even if they’re technically fast.&lt;/p&gt;

&lt;p&gt;Claude supports streaming responses, and Laravel supports streamed HTTP responses. Your Claude client exposes a streaming method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public function stream(array $payload, callable $onChunk): void
{
    Http::withHeaders([
            'x-api-key' =&amp;gt; config('services.claude.key'),
            'anthropic-version' =&amp;gt; '2023-06-01',
        ])
        -&amp;gt;withOptions(['stream' =&amp;gt; true])
        -&amp;gt;post('https://api.anthropic.com/v1/messages', $payload)
        -&amp;gt;onChunk(function ($chunk) use ($onChunk) {
            $onChunk($chunk);
        });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in a controller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public function generate(Request $request)
{
    return response()-&amp;gt;stream(function () use ($request) {
        $this-&amp;gt;claude-&amp;gt;stream([
            'model' =&amp;gt; 'claude-3-5-sonnet-latest',
            'max_tokens' =&amp;gt; 2000,
            'messages' =&amp;gt; [
                [
                    'role' =&amp;gt; 'user',
                    'content' =&amp;gt; $request-&amp;gt;input('prompt'),
                ],
            ],
        ], function ($chunk) {
            echo $chunk;
            flush();
        });
    });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This improves perceived performance and unlocks live writing experiences, progressive document analysis, and token-by-token UI updates.&lt;/p&gt;

&lt;p&gt;Claude is still stateless. Streaming doesn’t change memory behavior—only delivery.&lt;/p&gt;

&lt;p&gt;Pattern 3: Managing Conversation State&lt;br&gt;
Claude doesn’t remember previous requests. Conversation memory is always an application concern.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final class ConversationResponder
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function respond(array $history, string $input): string
    {
        $messages = [
            ...$history,
            ['role' =&amp;gt; 'user', 'content' =&amp;gt; $input],
        ];
        $response = $this-&amp;gt;claude-&amp;gt;message([
            'model' =&amp;gt; 'claude-3-5-sonnet-latest',
            'messages' =&amp;gt; $messages,
            'max_tokens' =&amp;gt; 600,
        ]);
        return $response['content'][0]['text'];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where history comes from is a strategic decision: session storage for ephemeral chats, database for persistent conversations, Redis for fast transient workflows.&lt;/p&gt;

&lt;p&gt;Keeping memory outside the client lets you control token growth, privacy boundaries, and cost predictability.&lt;/p&gt;

&lt;p&gt;Pattern 4: Background Processing with Queues&lt;br&gt;
Not all Claude interactions belong in HTTP requests. Batch operations, document ingestion, and analysis jobs should run asynchronously.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final class AnalyzeDocument implements ShouldQueue
{
    use Dispatchable, Queueable;
    public function __construct(
        private readonly int $documentId
    ) {}
    public function handle(ClaudeClient $claude): void
    {
        $document = Document::findOrFail($this-&amp;gt;documentId);
        $response = $claude-&amp;gt;message([
            'model' =&amp;gt; 'claude-3-5-sonnet-latest',
            'messages' =&amp;gt; [
                [
                    'role' =&amp;gt; 'user',
                    'content' =&amp;gt; "Extract structured data:\n\n{$document-&amp;gt;content}",
                ],
            ],
            'max_tokens' =&amp;gt; 1000,
        ]);
        $document-&amp;gt;update([
            'analysis' =&amp;gt; $response['content'][0]['text'],
        ]);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Queues give you retries, failure isolation, and throughput control—all critical when AI latency is variable.&lt;/p&gt;

&lt;p&gt;These patterns keep Claude calls predictable, prevent controller bloat, make failures observable, and allow features to grow independently.&lt;/p&gt;

&lt;p&gt;Token Accounting: Making AI Costs Predictable&lt;br&gt;
Tokens are a direct function of input size (prompts, conversation history, documents), output length, model choice, and retry behavior. If you don’t account for tokens early, you lose control over cost and performance.&lt;/p&gt;

&lt;p&gt;Why Token Accounting Belongs in the Service Layer&lt;br&gt;
Claude’s API returns usage information with each response. That data should never be handled in controllers or UI components. Your Claude client is the correct place to extract and normalize it.&lt;/p&gt;

&lt;p&gt;Example response fragment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "usage": {
    "input_tokens": 812,
    "output_tokens": 436
  }
}
Modify your ClaudeClient to surface this explicitly:

final class ClaudeResponse
{
    public function __construct(
        public readonly string $text,
        public readonly int $inputTokens,
        public readonly int $outputTokens
    ) {}
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then adapt the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public function message(array $payload): ClaudeResponse
{
    $response = Http::withHeaders([
            'x-api-key' =&amp;gt; config('services.claude.key'),
            'anthropic-version' =&amp;gt; '2023-06-01',
        ])
        -&amp;gt;post('https://api.anthropic.com/v1/messages', $payload)
        -&amp;gt;throw()
        -&amp;gt;json();
    return new ClaudeResponse(
        text: $response['content'][0]['text'],
        inputTokens: $response['usage']['input_tokens'] ?? 0,
        outputTokens: $response['usage']['output_tokens'] ?? 0,
    );
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every Claude interaction becomes measurable.&lt;/p&gt;

&lt;p&gt;Propagating Token Data Without Polluting Business Logic&lt;br&gt;
Your domain services should use token data without being tightly coupled to Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): array
    {
        $response = $this-&amp;gt;claude-&amp;gt;message([
            'model' =&amp;gt; 'claude-3-5-sonnet-latest',
            'max_tokens' =&amp;gt; 400,
            'messages' =&amp;gt; [
                [
                    'role' =&amp;gt; 'user',
                    'content' =&amp;gt; "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return [
            'summary' =&amp;gt; $response-&amp;gt;text,
            'tokens' =&amp;gt; [
                'input' =&amp;gt; $response-&amp;gt;inputTokens,
                'output' =&amp;gt; $response-&amp;gt;outputTokens,
            ],
        ];
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows you to log usage, display cost hints in admin tools, and enforce limits per request or per user—without leaking Claude-specific details into controllers.&lt;/p&gt;

&lt;p&gt;Estimating Cost Before You Call Claude&lt;br&gt;
Pre-flight estimation is one of the most effective control mechanisms. While you can’t know the exact output token count ahead of time, input tokens are deterministic.&lt;/p&gt;

&lt;p&gt;A simple heuristic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;final class TokenEstimator
{
    public static function estimateInputTokens(string $text): int
    {
        // Rough heuristic: ~4 characters per token
        return (int) ceil(strlen($text) / 4);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Used defensively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$estimatedTokens = TokenEstimator::estimateInputTokens($document);
if ($estimatedTokens &amp;gt; 12_000) {
    throw new DomainException('Document too large for single Claude request.');
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents accidental runaway costs and forces intentional chunking strategies.&lt;/p&gt;

&lt;p&gt;Tracking Token Usage Over Time&lt;br&gt;
Once token data is available, persisting it is trivial—and invaluable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Schema::create('ai_usage_logs', function (Blueprint $table) {
    $table-&amp;gt;id();
    $table-&amp;gt;string('feature');
    $table-&amp;gt;integer('input_tokens');
    $table-&amp;gt;integer('output_tokens');
    $table-&amp;gt;timestamps();
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Logged at the service boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AiUsageLog::create([
    'feature' =&amp;gt; 'article_summary',
    'input_tokens' =&amp;gt; $response-&amp;gt;inputTokens,
    'output_tokens' =&amp;gt; $response-&amp;gt;outputTokens,
]);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you per-feature cost visibility, early warning signals for prompt regressions, and data to justify caching or refactoring decisions.&lt;/p&gt;

&lt;p&gt;Without token accounting, conversation memory grows unchecked, streaming hides true cost, and background jobs quietly explode your bill.&lt;/p&gt;

&lt;p&gt;Read full article here: &lt;a href="https://origin-main.com/guides/the-complete-guide-to-integrating-claude-api-with-laravel/" rel="noopener noreferrer"&gt;https://origin-main.com/guides/the-complete-guide-to-integrating-claude-api-with-laravel/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>api</category>
      <category>laravel</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
