Dewald Hugo
The Complete Guide to Integrating Claude API with Laravel

Most Laravel developers hit the same wall when integrating Claude: the official docs focus on JavaScript, most community guides assume Python, and Laravel tutorials stop at toy chatbots that break under real load.

This guide shows how Claude fits into a production Laravel 11 codebase—how to structure the integration cleanly and why certain decisions matter once AI becomes a dependency instead of an experiment.

We’ll build in layers. Each part expands the system, explains why a pattern exists, and walks through the code. By the end, you’ll understand not just how to call Claude, but how to reason about AI integration the same way you reason about databases, queues, and external services.

Stack: Laravel 11, Claude Messages API, raw HTTP (no SDK).

Why Integrate Claude with Laravel
Claude is designed for longer reasoning chains, clearer structured output, and safer handling of ambiguous prompts. This makes it well-suited to applications where AI output is part of a workflow, not a novelty feature.

Laravel provides exactly the tooling required to manage such workflows: service containers, background jobs, caching, and structured error handling. AI interactions become first-class application concerns.

Rather than building a single-purpose chatbot, we’ll design a Claude integration that supports synchronous requests, streaming output, conversational memory, and background processing. Each pattern builds on the same foundation.

Part 1: Foundation

Authentication and API Setup
Claude uses straightforward API key authentication—no token exchange or refresh mechanism. This simplifies integration but puts responsibility on you to manage access carefully.

Store the API key in environment configuration:

CLAUDE_API_KEY=sk-ant-…
CLAUDE_API_VERSION=2023-06-01

Laravel’s config system surfaces these cleanly:

// config/claude.php
return [
    'api_key' => env('CLAUDE_API_KEY'),
    'version' => env('CLAUDE_API_VERSION', '2023-06-01'),
    'base_url' => 'https://api.anthropic.com/v1/messages',
];

This centralizes Claude-specific configuration and makes testing and environment overrides trivial without touching application code.

Laravel Service Architecture
Don’t call the Claude API directly from controllers. Create a dedicated service:

php artisan make:class Services/ClaudeService

// app/Services/ClaudeService.php
namespace App\Services;

use Illuminate\Support\Facades\Http;

class ClaudeService
{
    public function send(array $messages, array $options = []): array
    {
        $response = Http::withHeaders([
            'x-api-key' => config('claude.api_key'),
            'anthropic-version' => config('claude.version'),
            'content-type' => 'application/json',
        ])->post(config('claude.base_url'), [
            'model' => $options['model'] ?? 'claude-3-5-sonnet-latest',
            'messages' => $messages,
            'max_tokens' => $options['max_tokens'] ?? 1024,
        ]);
        if ($response->failed()) {
            throw new \RuntimeException(
                'Claude API error: ' . $response->body()
            );
        }
        return $response->json();
    }
}

This service becomes the single point of truth for Claude communication.

Error Handling Strategy
AI APIs fail differently from traditional services. Errors may represent transient capacity issues rather than application bugs. Error handling should prioritize resilience and observability.

Wrap calls in try/catch blocks and report exceptions. Log raw API responses for debugging, but never return them directly to users.

try {
    $result = $claude->send($messages);
} catch (\Throwable $e) {
    report($e);
    return response()->json([
        'error' => 'AI service temporarily unavailable'
    ], 503);
}
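Beyond catch-and-report, transient failures (rate limits, overload responses) are often worth retrying before giving up. A sketch using Laravel's built-in Http::retry inside ClaudeService::send — the retry count, delay, and status codes here are assumptions to tune for your workload, not Anthropic-recommended values:

```php
// In ClaudeService::send(): retry up to 3 times with 500 ms between
// attempts, but only when the failure looks transient (429 or 5xx).
$response = Http::withHeaders([
        'x-api-key' => config('claude.api_key'),
        'anthropic-version' => config('claude.version'),
        'content-type' => 'application/json',
    ])
    ->retry(3, 500, function ($exception) {
        return $exception instanceof \Illuminate\Http\Client\RequestException
            && in_array($exception->response->status(), [429, 500, 529], true);
    }, throw: false)
    ->post(config('claude.base_url'), [
        'model' => $options['model'] ?? 'claude-3-5-sonnet-latest',
        'messages' => $messages,
        'max_tokens' => $options['max_tokens'] ?? 1024,
    ]);
```

With throw: false, the last failed response is returned rather than thrown, so the existing `$response->failed()` check still applies.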

Environment Sensitivity
Different environments require different tolerances. Development benefits from low token limits and aggressive timeouts. Production requires higher ceilings.

Externalize these values to tune behavior without redeploying:

CLAUDE_MAX_TOKENS=1024
CLAUDE_TIMEOUT=30

Then, in config/claude.php:

'timeout' => env('CLAUDE_TIMEOUT', 30),
'max_tokens' => env('CLAUDE_MAX_TOKENS', 1024),

Apply the timeout in the service with Http::timeout(config('claude.timeout')) so a slow request fails fast instead of tying up a worker.

Part 2: Core Integration Patterns

Once authentication and setup are complete, the challenge isn’t calling Claude—it’s deciding where that call should live, how it should fail, and how the rest of your application should interact with it.

Designing a Claude Service Boundary
The most common mistake is letting API calls leak into controllers, jobs, or Livewire components. That works for prototypes but quickly becomes untestable and fragile.

Treat Claude like any other external dependency: Stripe, S3, or an internal microservice. Create a single, dedicated client.
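The client below reads its key from config('services.claude.key'), which assumes a corresponding entry in config/services.php — shown here as a sketch (use whichever config file your foundation code settled on, as long as the client reads from one place):

```php
// config/services.php
return [
    // ...
    'claude' => [
        'key' => env('CLAUDE_API_KEY'),
    ],
];
```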

namespace App\Services\Claude;
use Illuminate\Support\Facades\Http;
final class ClaudeClient
{
    public function message(array $payload): array
    {
        return Http::withHeaders([
                'x-api-key' => config('services.claude.key'),
                'anthropic-version' => '2023-06-01',
            ])
            ->post('https://api.anthropic.com/v1/messages', $payload)
            ->throw()
            ->json();
    }
}

This class does one thing: speak Claude’s protocol. It doesn’t know about your application’s domain.

You can swap models without touching business logic, mock the client cleanly in tests, and isolate vendor churn to one file.
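Mocking can stay at the HTTP boundary rather than the class boundary. A sketch using Laravel's Http::fake — the response shape mirrors the Messages API, but the test name and faked summary text are illustrative:

```php
use Illuminate\Support\Facades\Http;

public function test_client_returns_claude_text(): void
{
    // Intercept every request to the Anthropic API with a canned response.
    Http::fake([
        'api.anthropic.com/*' => Http::response([
            'content' => [['type' => 'text', 'text' => 'A short summary.']],
            'usage' => ['input_tokens' => 10, 'output_tokens' => 5],
        ]),
    ]);

    $client = new \App\Services\Claude\ClaudeClient();
    $result = $client->message(['messages' => []]);

    $this->assertSame('A short summary.', $result['content'][0]['text']);
}
```

No real network call is made, and Http::fake lets you assert on the outgoing payload with Http::assertSent if needed.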

Pattern 1: Simple Request–Response Workflows
Most Claude interactions are: generate text, summarize content, rewrite copy, extract data.

Instead of calling the client directly, wrap each use case in a domain-specific service:

final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): string
    {
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 400,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return $response['content'][0]['text'];
    }
}

From the controller’s perspective:

public function store(Request $request, ArticleSummarizer $summarizer)
{
    $summary = $summarizer->summarize($request->input('content'));
    return response()->json(['summary' => $summary]);
}

Controllers orchestrate. Services decide how Claude is used. The Claude client handles transport.

Pattern 2: Streaming Responses
Once output grows beyond a few hundred tokens, synchronous calls start to feel slow—even if they’re technically fast.

Claude supports streaming responses, and Laravel supports streamed HTTP responses. Your Claude client exposes a streaming method:

public function stream(array $payload, callable $onChunk): void
{
    $response = Http::withHeaders([
            'x-api-key' => config('services.claude.key'),
            'anthropic-version' => '2023-06-01',
        ])
        ->withOptions(['stream' => true])
        ->post('https://api.anthropic.com/v1/messages', [
            ...$payload,
            'stream' => true, // ask Claude for a server-sent event stream
        ]);

    // Laravel's HTTP client has no chunk callback; read the
    // underlying PSR-7 body incrementally instead.
    $body = $response->toPsrResponse()->getBody();

    while (! $body->eof()) {
        $onChunk($body->read(1024));
    }
}

Then in a controller:

public function generate(Request $request)
{
    return response()->stream(function () use ($request) {
        $this->claude->stream([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 2000,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => $request->input('prompt'),
                ],
            ],
        ], function ($chunk) {
            echo $chunk;
            flush();
        });
    });
}

This improves perceived performance and unlocks live writing experiences, progressive document analysis, and token-by-token UI updates.

Claude is still stateless. Streaming doesn’t change memory behavior—only delivery.
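Note that Claude streams server-sent events, so echoing raw chunks forwards SSE frames to the browser. If the frontend consumes them with EventSource, the streamed response needs the right headers — a sketch, where $callback is the streaming closure from the controller above, and the X-Accel-Buffering header only matters behind nginx:

```php
return response()->stream($callback, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'X-Accel-Buffering' => 'no', // disable nginx proxy buffering
]);
```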

Pattern 3: Managing Conversation State
Claude doesn’t remember previous requests. Conversation memory is always an application concern.

final class ConversationResponder
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function respond(array $history, string $input): string
    {
        $messages = [
            ...$history,
            ['role' => 'user', 'content' => $input],
        ];
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'messages' => $messages,
            'max_tokens' => 600,
        ]);
        return $response['content'][0]['text'];
    }
}

Where history comes from is a strategic decision: session storage for ephemeral chats, database for persistent conversations, Redis for fast transient workflows.

Keeping memory outside the client lets you control token growth, privacy boundaries, and cost predictability.
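A sketch of the session-backed option for ephemeral chats — the session key and the trimming threshold are arbitrary choices here, not fixed limits:

```php
public function chat(Request $request, ConversationResponder $responder)
{
    // Load prior turns for this session (empty array on first message).
    $history = $request->session()->get('chat_history', []);

    $reply = $responder->respond($history, $request->input('message'));

    // Persist both turns, keeping only the last 20 messages
    // so token usage stays bounded as the conversation grows.
    $history[] = ['role' => 'user', 'content' => $request->input('message')];
    $history[] = ['role' => 'assistant', 'content' => $reply];
    $request->session()->put('chat_history', array_slice($history, -20));

    return response()->json(['reply' => $reply]);
}
```

Swapping the session calls for an Eloquent model or Redis keys changes persistence without touching the responder.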

Pattern 4: Background Processing with Queues
Not all Claude interactions belong in HTTP requests. Batch operations, document ingestion, and analysis jobs should run asynchronously.

use App\Models\Document;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

final class AnalyzeDocument implements ShouldQueue
{
    use Dispatchable, Queueable;
    public function __construct(
        private readonly int $documentId
    ) {}
    public function handle(ClaudeClient $claude): void
    {
        $document = Document::findOrFail($this->documentId);
        $response = $claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Extract structured data:\n\n{$document->content}",
                ],
            ],
            'max_tokens' => 1000,
        ]);
        $document->update([
            'analysis' => $response['content'][0]['text'],
        ]);
    }
}

Queues give you retries, failure isolation, and throughput control—all critical when AI latency is variable.
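Retry behavior can be declared on the job itself. A sketch of properties to add to the AnalyzeDocument job above — the attempt count, timeout, and backoff schedule are assumptions to tune for your workload:

```php
final class AnalyzeDocument implements ShouldQueue
{
    use Dispatchable, Queueable;

    public int $tries = 3;     // total attempts before the job is marked failed
    public int $timeout = 120; // seconds before the worker kills the attempt

    /** Wait 10s after the first failure, then 60s after the second. */
    public function backoff(): array
    {
        return [10, 60];
    }
}
```

Dispatching stays a one-liner: AnalyzeDocument::dispatch($document->id).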

These patterns keep Claude calls predictable, prevent controller bloat, make failures observable, and allow features to grow independently.

Token Accounting: Making AI Costs Predictable
Tokens are a direct function of input size (prompts, conversation history, documents), output length, model choice, and retry behavior. If you don’t account for tokens early, you lose control over cost and performance.

Why Token Accounting Belongs in the Service Layer
Claude’s API returns usage information with each response. That data should never be handled in controllers or UI components. Your Claude client is the correct place to extract and normalize it.

Example response fragment:

{
  "usage": {
    "input_tokens": 812,
    "output_tokens": 436
  }
}

Modify your ClaudeClient to surface this explicitly:

final class ClaudeResponse
{
    public function __construct(
        public readonly string $text,
        public readonly int $inputTokens,
        public readonly int $outputTokens
    ) {}
}

Then adapt the client:

public function message(array $payload): ClaudeResponse
{
    $response = Http::withHeaders([
            'x-api-key' => config('services.claude.key'),
            'anthropic-version' => '2023-06-01',
        ])
        ->post('https://api.anthropic.com/v1/messages', $payload)
        ->throw()
        ->json();
    return new ClaudeResponse(
        text: $response['content'][0]['text'],
        inputTokens: $response['usage']['input_tokens'] ?? 0,
        outputTokens: $response['usage']['output_tokens'] ?? 0,
    );
}

Every Claude interaction becomes measurable.

Propagating Token Data Without Polluting Business Logic
Your domain services should use token data without being tightly coupled to Claude:

final class ArticleSummarizer
{
    public function __construct(
        private readonly ClaudeClient $claude
    ) {}
    public function summarize(string $article): array
    {
        $response = $this->claude->message([
            'model' => 'claude-3-5-sonnet-latest',
            'max_tokens' => 400,
            'messages' => [
                [
                    'role' => 'user',
                    'content' => "Summarize the following article:\n\n{$article}",
                ],
            ],
        ]);
        return [
            'summary' => $response->text,
            'tokens' => [
                'input' => $response->inputTokens,
                'output' => $response->outputTokens,
            ],
        ];
    }
}

This allows you to log usage, display cost hints in admin tools, and enforce limits per request or per user—without leaking Claude-specific details into controllers.

Estimating Cost Before You Call Claude
Pre-flight estimation is one of the most effective control mechanisms. While you can’t know the exact output token count ahead of time, input tokens are deterministic.

A simple heuristic:

final class TokenEstimator
{
    public static function estimateInputTokens(string $text): int
    {
        // Rough heuristic: ~4 characters per token
        return (int) ceil(strlen($text) / 4);
    }
}

Used defensively:

$estimatedTokens = TokenEstimator::estimateInputTokens($document);
if ($estimatedTokens > 12_000) {
    throw new DomainException('Document too large for single Claude request.');
}

This prevents accidental runaway costs and forces intentional chunking strategies.
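A minimal chunking sketch in plain PHP, splitting on paragraph boundaries so each chunk stays under a token budget (reusing the ~4-chars-per-token heuristic from TokenEstimator; the boundary choice and budget are assumptions):

```php
/**
 * Split $text into chunks of at most $maxTokens (estimated),
 * preferring to break at paragraph boundaries.
 *
 * @return string[]
 */
function chunkByTokens(string $text, int $maxTokens): array
{
    $maxChars = $maxTokens * 4; // ~4 characters per token heuristic
    $chunks = [];
    $current = '';

    foreach (preg_split("/\n\n+/", $text) as $paragraph) {
        $candidate = $current === '' ? $paragraph : $current . "\n\n" . $paragraph;

        if (strlen($candidate) > $maxChars && $current !== '') {
            // Current chunk is full; start a new one with this paragraph.
            $chunks[] = $current;
            $current = $paragraph;
        } else {
            $current = $candidate;
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}
```

Each chunk can then be sent as its own Claude request, with the summaries merged in a final pass.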

Tracking Token Usage Over Time
Once token data is available, persisting it is trivial—and invaluable:

Schema::create('ai_usage_logs', function (Blueprint $table) {
    $table->id();
    $table->string('feature');
    $table->integer('input_tokens');
    $table->integer('output_tokens');
    $table->timestamps();
});

Logged at the service boundary:

AiUsageLog::create([
    'feature' => 'article_summary',
    'input_tokens' => $response->inputTokens,
    'output_tokens' => $response->outputTokens,
]);

This gives you per-feature cost visibility, early warning signals for prompt regressions, and data to justify caching or refactoring decisions.
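With the log table in place, per-feature totals are one query away — a sketch against the AiUsageLog model assumed above (the 30-day window is an arbitrary choice):

```php
// Token totals per feature over the last 30 days.
$usage = AiUsageLog::query()
    ->selectRaw('feature, SUM(input_tokens) AS input, SUM(output_tokens) AS output')
    ->where('created_at', '>=', now()->subDays(30))
    ->groupBy('feature')
    ->get();
```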

Without token accounting, conversation memory grows unchecked, streaming hides true cost, and background jobs quietly explode your bill.

Read full article here: https://origin-main.com/guides/the-complete-guide-to-integrating-claude-api-with-laravel/
