Dewald Hugo
The Complete Guide to Integrating Claude API with Laravel

Most Laravel developers hit the same wall when integrating Claude: the official docs focus on JavaScript, most community guides assume Python, and Laravel tutorials stop at toy chatbots that break under real load.

This guide shows how Claude fits into a production Laravel 11 codebase—how to structure the integration cleanly and why certain decisions matter once AI becomes a dependency instead of an experiment.

We’ll build in layers. Each part expands the system, explains why a pattern exists, and walks through the code. By the end, you’ll understand not just how to call Claude, but how to reason about AI integration the same way you reason about databases, queues, and external services.

Stack: Laravel 11, Claude Messages API, raw HTTP (no SDK).

Why Integrate Claude with Laravel
Claude is designed for longer reasoning chains, clearer structured output, and safer handling of ambiguous prompts. This makes it well-suited to applications where AI output is part of a workflow, not a novelty feature.

Laravel provides exactly the tooling required to manage such workflows: service containers, background jobs, caching, and structured error handling. AI interactions become first-class application concerns.

Rather than building a single-purpose chatbot, we’ll design a Claude integration that supports synchronous requests, streaming output, conversational memory, and background processing. Each pattern builds on the same foundation.

Part 1: Foundation

Authentication and API Setup
Claude uses straightforward API key authentication—no token exchange or refresh mechanism. This simplifies integration but puts responsibility on you to manage access carefully.

Store the API key in environment configuration:

CLAUDE_API_KEY=sk-ant-…
CLAUDE_API_VERSION=2023-06-01

Laravel’s config system surfaces these cleanly:

// config/claude.php
return [
    'api_key' => env('CLAUDE_API_KEY'),
    'version' => env('CLAUDE_API_VERSION', '2023-06-01'),
    'base_url' => 'https://api.anthropic.com/v1/messages',
];

This centralizes Claude-specific configuration and makes testing and environment overrides trivial without touching application code.

Laravel Service Architecture
Don’t call the Claude API directly from controllers. Create a dedicated service:

php artisan make:class Services/ClaudeService

// app/Services/ClaudeService.php
namespace App\Services;

use Illuminate\Support\Facades\Http;

class ClaudeService
{
    public function send(array $messages, array $options = []): array
    {
        $response = Http::withHeaders([
            'x-api-key' => config('claude.api_key'),
            'anthropic-version' => config('claude.version'),
            'content-type' => 'application/json',
        ])->post(config('claude.base_url'), [
            'model' => $options['model'] ?? 'claude-3-5-sonnet-latest',
            'messages' => $messages,
            'max_tokens' => $options['max_tokens'] ?? 1024,
        ]);
        if ($response->failed()) {
            throw new \RuntimeException(
                'Claude API error: ' . $response->body()
            );
        }
        return $response->json();
    }
}

This service becomes the single point of truth for Claude communication.

Error Handling Strategy
AI APIs fail differently from traditional services. Errors may represent transient capacity issues rather than application bugs. Error handling should prioritize resilience and observability.

Wrap calls in try/catch blocks and report exceptions. Log raw API responses for debugging, but never return them directly to users.

try {
    $result = $claude->send($messages);
} catch (\Throwable $e) {
    report($e);
    return response()->json([
        'error' => 'AI service temporarily unavailable'
    ], 503);
}
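Beyond catch-and-report, transient failures (rate limits, overload responses) are often worth retrying before giving up. A sketch using Laravel's built-in Http::retry inside ClaudeService::send — the retry count, delay, and status codes here are assumptions to tune for your workload, not Anthropic-recommended values:

```php
// In ClaudeService::send(): retry up to 3 times with 500 ms between
// attempts, but only when the failure looks transient (429 or 5xx).
$response = Http::withHeaders([
        'x-api-key' => config('claude.api_key'),
        'anthropic-version' => config('claude.version'),
        'content-type' => 'application/json',
    ])
    ->retry(3, 500, function ($exception) {
        return $exception instanceof \Illuminate\Http\Client\RequestException
            && in_array($exception->response->status(), [429, 500, 529], true);
    }, throw: false)
    ->post(config('claude.base_url'), [
        'model' => $options['model'] ?? 'claude-3-5-sonnet-latest',
        'messages' => $messages,
        'max_tokens' => $options['max_tokens'] ?? 1024,
    ]);
```

With throw: false, the last failed response is returned rather than thrown, so the existing `$response->failed()` check still applies.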

Environment Sensitivity
Different environments require different tolerances. Development benefits from low token limits and aggressive timeouts. Production requires higher ceilings.

Externalize these values to tune behavior without redeploying:

CLAUDE_MAX_TOKENS=1024
CLAUDE_TIMEOUT=30

Then, in config/claude.php:

'timeout' => env('CLAUDE_TIMEOUT', 30),
'max_tokens' => env('CLAUDE_MAX_TOKENS', 1024),

Apply the timeout in the service with Http::timeout(config('claude.timeout')) so a slow request fails fast instead of tying up a worker.

Part 2: Core Integration Patterns

Once authentication and setup are complete, the challenge isn’t calling Claude—it’s deciding where that call should live, how it should fail, and how the rest of your application should interact with it.

Designing a Claude Service Boundary
The most common mistake is letting API calls leak into controllers, jobs, or Livewire components. That works for prototypes but quickly becomes untestable and fragile.

Treat Claude like any other external dependency: Stripe, S3, or an internal microservice. Create a single, dedicated client.
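The client below reads its key from config('services.claude.key'), which assumes a corresponding entry in config/services.php — shown here as a sketch (use whichever config file your foundation code settled on, as long as the client reads from one place):

```php
// config/services.php
return [
    // ...
    'claude' => [
        'key' => env('CLAUDE_API_KEY'),
    ],
];
```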

namespace App\Services\Claude;
use Illuminate\Support\Facades\Http;
final class ClaudeClient
{
    public function message(array $payload): array
    {
        return Http::withHeaders([
                'x-api-key' => config('services.claude.key'),
                'anthropic-version' => '2023-06-01',
            ])
            ->post('https://api.anthropic.com/v1/messages', $payload)
            ->throw()
            ->json();
    }
}

This class does one thing: speak Claude’s protocol. It doesn’t know about your application’s domain.

You can swap models without touching business logic, mock the client cleanly in tests, and isolate vendor churn to one file.
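Mocking can stay at the HTTP boundary rather than the class boundary. A sketch using Laravel's Http::fake — the response shape mirrors the Messages API, but the test name and faked summary text are illustrative:

```php
use Illuminate\Support\Facades\Http;

public function test_client_returns_claude_text(): void
{
    // Intercept every request to the Anthropic API with a canned response.
    Http::fake([
        'api.anthropic.com/*' => Http::response([
            'content' => [['type' => 'text', 'text' => 'A short summary.']],
            'usage' => ['input_tokens' => 10, 'output_tokens' => 5],
        ]),
    ]);

    $client = new \App\Services\Claude\ClaudeClient();
    $result = $client->message(['messages' => []]);

    $this->assertSame('A short summary.', $result['content'][0]['text']);
}
```

No real network call is made, and Http::fake lets you assert on the outgoing payload with Http::assertSent if needed.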

Pattern 1: Simple Request–Response Workflows
Most Claude interactions are: generate text, summarize content, rewrite copy, extract data.

Instead of calling the client directly, wrap each use case in a domain-specific service:

final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): string
    {
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 400,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return $response['content'][0]['text'];
    }
}

From the controller’s perspective:

public function store(Request $request, ArticleSummarizer $summarizer)
{
    $summary = $summarizer->summarize($request->input('content'));
    return response()->json(['summary' => $summary]);
}

Controllers orchestrate. Services decide how Claude is used. The Claude client handles transport.

Pattern 2: Streaming Responses
Once output grows beyond a few hundred tokens, synchronous calls start to feel slow—even if they’re technically fast.

Claude supports streaming responses, and Laravel supports streamed HTTP responses. Your Claude client exposes a streaming method:

public function stream(array $payload, callable $onChunk): void
{
    $response = Http::withHeaders([
            'x-api-key' => config('services.claude.key'),
            'anthropic-version' => '2023-06-01',
        ])
        ->withOptions(['stream' => true])
        ->post('https://api.anthropic.com/v1/messages', [
            ...$payload,
            'stream' => true, // ask Claude for a server-sent event stream
        ]);

    // Laravel's HTTP client has no chunk callback; read the
    // underlying PSR-7 body incrementally instead.
    $body = $response->toPsrResponse()->getBody();

    while (! $body->eof()) {
        $onChunk($body->read(1024));
    }
}

Then in a controller:

public function generate(Request $request)
{
    return response()->stream(function () use ($request) {
        $this->claude->stream([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 2000,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => $request->input('prompt'),
                ],
            ],
        ], function ($chunk) {
            echo $chunk;
            flush();
        });
    });
}

This improves perceived performance and unlocks live writing experiences, progressive document analysis, and token-by-token UI updates.

Claude is still stateless. Streaming doesn’t change memory behavior—only delivery.
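Note that Claude streams server-sent events, so echoing raw chunks forwards SSE frames to the browser. If the frontend consumes them with EventSource, the streamed response needs the right headers — a sketch, where $callback is the streaming closure from the controller above, and the X-Accel-Buffering header only matters behind nginx:

```php
return response()->stream($callback, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'X-Accel-Buffering' => 'no', // disable nginx proxy buffering
]);
```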

Pattern 3: Managing Conversation State
Claude doesn’t remember previous requests. Conversation memory is always an application concern.

final class ConversationResponder
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function respond(array $history, string $input): string
    {
        $messages = [
            ...$history,
            ['role' => 'user', 'content' => $input],
        ];
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'messages' => $messages,
            'max_tokens' => 600,
        ]);
        return $response['content'][0]['text'];
    }
}

Where history comes from is a strategic decision: session storage for ephemeral chats, database for persistent conversations, Redis for fast transient workflows.

Keeping memory outside the client lets you control token growth, privacy boundaries, and cost predictability.
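A sketch of the session-backed option for ephemeral chats — the session key and the trimming threshold are arbitrary choices here, not fixed limits:

```php
public function chat(Request $request, ConversationResponder $responder)
{
    // Load prior turns for this session (empty array on first message).
    $history = $request->session()->get('chat_history', []);

    $reply = $responder->respond($history, $request->input('message'));

    // Persist both turns, keeping only the last 20 messages
    // so token usage stays bounded as the conversation grows.
    $history[] = ['role' => 'user', 'content' => $request->input('message')];
    $history[] = ['role' => 'assistant', 'content' => $reply];
    $request->session()->put('chat_history', array_slice($history, -20));

    return response()->json(['reply' => $reply]);
}
```

Swapping the session calls for an Eloquent model or Redis keys changes persistence without touching the responder.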

Pattern 4: Background Processing with Queues
Not all Claude interactions belong in HTTP requests. Batch operations, document ingestion, and analysis jobs should run asynchronously.

use App\Models\Document;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

final class AnalyzeDocument implements ShouldQueue
{
    use Dispatchable, Queueable;
    public function __construct(
        private readonly int $documentId
    ) {}
    public function handle(ClaudeClient $claude): void
    {
        $document = Document::findOrFail($this->documentId);
        $response = $claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Extract structured data:\n\n{$document->content}",
                ],
            ],
            'max_tokens' => 1000,
        ]);
        $document->update([
            'analysis' => $response['content'][0]['text'],
        ]);
    }
}

Queues give you retries, failure isolation, and throughput control—all critical when AI latency is variable.
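Retry behavior can be declared on the job itself. A sketch of properties to add to the AnalyzeDocument job above — the attempt count, timeout, and backoff schedule are assumptions to tune for your workload:

```php
final class AnalyzeDocument implements ShouldQueue
{
    use Dispatchable, Queueable;

    public int $tries = 3;     // total attempts before the job is marked failed
    public int $timeout = 120; // seconds before the worker kills the attempt

    /** Wait 10s after the first failure, then 60s after the second. */
    public function backoff(): array
    {
        return [10, 60];
    }
}
```

Dispatching stays a one-liner: AnalyzeDocument::dispatch($document->id).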

These patterns keep Claude calls predictable, prevent controller bloat, make failures observable, and allow features to grow independently.

Token Accounting: Making AI Costs Predictable
Tokens are a direct function of input size (prompts, conversation history, documents), output length, model choice, and retry behavior. If you don’t account for tokens early, you lose control over cost and performance.

Why Token Accounting Belongs in the Service Layer
Claude’s API returns usage information with each response. That data should never be handled in controllers or UI components. Your Claude client is the correct place to extract and normalize it.

Example response fragment:

{
  "usage": {
    "input_tokens": 812,
    "output_tokens": 436
  }
}

Modify your ClaudeClient to surface this explicitly:

final class ClaudeResponse
{
    public function __construct(
        public readonly string $text,
        public readonly int $inputTokens,
        public readonly int $outputTokens
    ) {}
}

Then adapt the client:

public function message(array $payload): ClaudeResponse
{
    $response = Http::withHeaders([
            'x-api-key' => config('services.claude.key'),
            'anthropic-version' => '2023-06-01',
        ])
        ->post('https://api.anthropic.com/v1/messages', $payload)
        ->throw()
        ->json();
    return new ClaudeResponse(
        text: $response['content'][0]['text'],
        inputTokens: $response['usage']['input_tokens'] ?? 0,
        outputTokens: $response['usage']['output_tokens'] ?? 0,
    );
}

Every Claude interaction becomes measurable.

Propagating Token Data Without Polluting Business Logic
Your domain services should use token data without being tightly coupled to Claude:

final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): array
    {
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 400,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return [
            'summary' => $response->text,
            'tokens' => [
                'input' => $response->inputTokens,
                'output' => $response->outputTokens,
            ],
        ];
    }
}

This allows you to log usage, display cost hints in admin tools, and enforce limits per request or per user—without leaking Claude-specific details into controllers.

Estimating Cost Before You Call Claude
Pre-flight estimation is one of the most effective control mechanisms. While you can’t know the exact output token count ahead of time, input tokens are deterministic.

A simple heuristic:

final class TokenEstimator
{
    public static function estimateInputTokens(string $text): int
    {
        // Rough heuristic: ~4 characters per token
        return (int) ceil(strlen($text) / 4);
    }
}

Used defensively:

$estimatedTokens = TokenEstimator::estimateInputTokens($document);
if ($estimatedTokens > 12_000) {
    throw new DomainException('Document too large for single Claude request.');
}

This prevents accidental runaway costs and forces intentional chunking strategies.
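A minimal chunking sketch in plain PHP, splitting on paragraph boundaries so each chunk stays under a token budget (reusing the ~4-chars-per-token heuristic from TokenEstimator; the boundary choice and budget are assumptions):

```php
/**
 * Split $text into chunks of at most $maxTokens (estimated),
 * preferring to break at paragraph boundaries.
 *
 * @return string[]
 */
function chunkByTokens(string $text, int $maxTokens): array
{
    $maxChars = $maxTokens * 4; // ~4 characters per token heuristic
    $chunks = [];
    $current = '';

    foreach (preg_split("/\n\n+/", $text) as $paragraph) {
        $candidate = $current === '' ? $paragraph : $current . "\n\n" . $paragraph;

        if (strlen($candidate) > $maxChars && $current !== '') {
            // Current chunk is full; start a new one with this paragraph.
            $chunks[] = $current;
            $current = $paragraph;
        } else {
            $current = $candidate;
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}
```

Each chunk can then be sent as its own Claude request, with the summaries merged in a final pass.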

Tracking Token Usage Over Time
Once token data is available, persisting it is trivial—and invaluable:

Schema::create('ai_usage_logs', function (Blueprint $table) {
    $table->id();
    $table->string('feature');
    $table->integer('input_tokens');
    $table->integer('output_tokens');
    $table->timestamps();
});

Logged at the service boundary:

AiUsageLog::create([
    'feature' => 'article_summary',
    'input_tokens' => $response->inputTokens,
    'output_tokens' => $response->outputTokens,
]);

This gives you per-feature cost visibility, early warning signals for prompt regressions, and data to justify caching or refactoring decisions.
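With the log table in place, per-feature totals are one query away — a sketch against the AiUsageLog model assumed above (the 30-day window is an arbitrary choice):

```php
// Token totals per feature over the last 30 days.
$usage = AiUsageLog::query()
    ->selectRaw('feature, SUM(input_tokens) AS input, SUM(output_tokens) AS output')
    ->where('created_at', '>=', now()->subDays(30))
    ->groupBy('feature')
    ->get();
```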

Without token accounting, conversation memory grows unchecked, streaming hides true cost, and background jobs quietly explode your bill.

Read full article here: https://origin-main.com/guides/the-complete-guide-to-integrating-claude-api-with-laravel/
