Hoang Manh Cam

Building a Laravel Package for Ollama - The full guide

A step‑by‑step guide to designing, coding, testing, documenting, and releasing a Laravel package that talks to a local Ollama server.


Why Ollama + Laravel?

Ollama makes it dead‑simple to run large language models (LLMs) locally. Laravel gives you an expressive toolkit for building PHP applications. A dedicated package is the cleanest way to:

  • Centralize HTTP calls to the Ollama API (chat, generate, embeddings, models, etc.).
  • Offer a fluent, framework‑native API via Facades and dependency injection.
  • Provide config, caching, logging, and test fakes out of the box.

This article uses camh/laravel-ollama as the concrete example, but the structure applies to any Laravel package talking to Ollama.

[Diagram] Laravel app → laravel-ollama package → Ollama server


Prerequisites

  • PHP 8.2+ and Composer
  • Laravel 10 or 11
  • Ollama installed locally and running (default: http://localhost:11434)
  • Basic familiarity with Laravel packages (service providers, facades, config)

Package Goals

We’ll build a package that:

  1. Wraps Ollama endpoints with a typed, ergonomic client.
  2. Supports both non‑streaming and streaming responses.
  3. Exposes a Facade (Ollama) and injectable interfaces.
  4. Adds config + environment variables for base URL, timeouts, and model defaults.
  5. Includes testing utilities (Http fakes and example fixtures).
  6. Ships with documentation and CI.

Project Skeleton

laravel-ollama/
├─ src/
│  ├─ Contracts/
│  │  └─ OllamaClient.php
│  ├─ DTOs/
│  │  ├─ ChatMessage.php
│  │  ├─ ChatResponse.php
│  │  └─ EmbeddingResponse.php
│  ├─ Http/
│  │  └─ Client.php
│  ├─ Facades/
│  │  └─ Ollama.php
│  ├─ OllamaServiceProvider.php
│  └─ Support/StreamIterator.php
├─ config/ollama.php
├─ tests/
│  ├─ Feature/
│  └─ Unit/
├─ composer.json
├─ README.md
└─ CHANGELOG.md

Composer Setup

composer.json

{
  "name": "camh/laravel-ollama",
  "description": "Laravel wrapper for the Ollama local LLM API (chat, generate, embeddings).",
  "type": "library",
  "license": "MIT",
  "require": {
    "php": ">=8.2",
    "illuminate/support": "^10.0|^11.0"
  },
  "autoload": {
    "psr-4": {
      "CamH\\LaravelOllama\\": "src/"
    }
  },
  "extra": {
    "laravel": {
      "providers": [
        "CamH\\LaravelOllama\\OllamaServiceProvider"
      ],
      "aliases": {
        "Ollama": "CamH\\LaravelOllama\\Facades\\Ollama"
      }
    }
  },
  "minimum-stability": "stable",
  "prefer-stable": true
}

Configuration

config/ollama.php

<?php

return [
    'base_url' => env('OLLAMA_BASE_URL', 'http://localhost:11434'),

    // default model to use if not provided explicitly
    'model' => env('OLLAMA_MODEL', 'llama3.1:8b'),

    // timeouts (in seconds)
    'timeout' => env('OLLAMA_TIMEOUT', 120),
    'connect_timeout' => env('OLLAMA_CONNECT_TIMEOUT', 5),
];

.env

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_TIMEOUT=120
OLLAMA_CONNECT_TIMEOUT=5

Add a publish group so users can copy the config into their app:

// in OllamaServiceProvider::boot()
$this->publishes([
    __DIR__.'/../config/ollama.php' => config_path('ollama.php'),
], 'ollama-config');

Service Provider & Container Bindings

src/OllamaServiceProvider.php

<?php

namespace CamH\LaravelOllama;

use CamH\LaravelOllama\Contracts\OllamaClient as OllamaClientContract;
use CamH\LaravelOllama\Http\Client as HttpClient;
use Illuminate\Support\ServiceProvider;

class OllamaServiceProvider extends ServiceProvider
{
    public function register(): void
    {
        $this->mergeConfigFrom(__DIR__.'/../config/ollama.php', 'ollama');

        $this->app->bind(OllamaClientContract::class, function ($app) {
            $config = $app['config']['ollama'];
            return new HttpClient(
                baseUrl: $config['base_url'],
                defaultModel: $config['model'],
                timeout: (int) $config['timeout'],
                connectTimeout: (int) $config['connect_timeout'],
            );
        });
    }

    public function boot(): void
    {
        $this->publishes([
            __DIR__.'/../config/ollama.php' => config_path('ollama.php'),
        ], 'ollama-config');
    }
}

Contracts (Interface‑first Design)

src/Contracts/OllamaClient.php

<?php

namespace CamH\LaravelOllama\Contracts;

use Generator;

interface OllamaClient
{
    /** Simple one‑shot completion (non‑streaming). */
    public function generate(string $prompt, ?string $model = null, array $options = []): string;

    /** Chat with role‑based messages (non‑streaming). */
    public function chat(array $messages, ?string $model = null, array $options = []): array;

    /** Token‑streaming chat. Yields partial text chunks as they arrive. */
    public function streamChat(array $messages, ?string $model = null, array $options = []): Generator;

    /** Create embeddings for given input text(s). */
    public function embeddings(string|array $input, ?string $model = null, array $options = []): array;

    /** List available local models. */
    public function models(): array;
}
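The project skeleton also lists DTOs (ChatMessage, ChatResponse, EmbeddingResponse). They are not required for the client to work, but a minimal sketch of what ChatMessage could look like follows; the exact shape is an implementation choice, not something the package prescribes.

<?php

namespace CamH\LaravelOllama\DTOs;

// Minimal sketch of the ChatMessage DTO listed in the project skeleton;
// the shipped package may model this differently.
final class ChatMessage
{
    public function __construct(
        public readonly string $role,    // 'system', 'user', or 'assistant'
        public readonly string $content,
    ) {}

    /** Convert to the array shape /api/chat expects for a single message. */
    public function toArray(): array
    {
        return ['role' => $this->role, 'content' => $this->content];
    }
}

You could then accept ChatMessage instances in chat() and map them through toArray() before sending the payload.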

The HTTP Client

We’ll use Laravel’s HTTP client (Illuminate\Support\Facades\Http) behind a thin adapter.

src/Http/Client.php

<?php

namespace CamH\LaravelOllama\Http;

use CamH\LaravelOllama\Contracts\OllamaClient;
use Generator;
use Illuminate\Support\Facades\Http;

class Client implements OllamaClient
{
    public function __construct(
        private readonly string $baseUrl,
        private readonly string $defaultModel,
        private readonly int $timeout = 120,
        private readonly int $connectTimeout = 5,
    ) {}

    protected function http()
    {
        return Http::baseUrl($this->baseUrl)
            ->timeout($this->timeout)
            ->connectTimeout($this->connectTimeout)
            ->acceptJson();
    }

    public function generate(string $prompt, ?string $model = null, array $options = []): string
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'prompt' => $prompt,
            'stream' => false,
        ], $options);

        $response = $this->http()->post('/api/generate', $payload)->throw();
        // Ollama returns { response: "...", ... }
        return (string) $response->json('response', '');
    }

    public function chat(array $messages, ?string $model = null, array $options = []): array
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'messages' => $messages,
            'stream' => false,
        ], $options);

        $response = $this->http()->post('/api/chat', $payload)->throw();
        return $response->json();
    }

    public function streamChat(array $messages, ?string $model = null, array $options = []): Generator
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'messages' => $messages,
            'stream' => true,
        ], $options);

        $response = $this->http()->withOptions(['stream' => true])->post('/api/chat', $payload)->throw();

        // Ollama streams newline-delimited JSON; buffer the body and decode each complete line.
        $body = $response->toPsrResponse()->getBody();
        $buffer = '';

        while (! $body->eof()) {
            $buffer .= $body->read(1024);

            while (($newline = strpos($buffer, "\n")) !== false) {
                $line = trim(substr($buffer, 0, $newline));
                $buffer = substr($buffer, $newline + 1);

                if ($line === '') {
                    continue;
                }

                $json = json_decode($line, true);

                if (isset($json['message']['content'])) {
                    yield $json['message']['content'];
                }
            }
        }
    }

    public function embeddings(string|array $input, ?string $model = null, array $options = []): array
    {
        $payload = array_merge([
            'model' => $model ?? $this->defaultModel,
            'input' => $input,
        ], $options);

        // /api/embed accepts 'input' as a string or an array of strings and returns 'embeddings'.
        $response = $this->http()->post('/api/embed', $payload)->throw();
        return $response->json();
    }

    public function models(): array
    {
        return $this->http()->get('/api/tags')->throw()->json();
    }
}

Note: Ollama’s streaming endpoints emit newline‑delimited JSON (NDJSON). We read the PSR‑7 body in chunks, split on newlines, and decode each complete line as it arrives.
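For reference, the chat stream looks roughly like this (abridged; extra fields such as timestamps are omitted, and the final object carries done: true):

{"model":"llama3.1:8b","message":{"role":"assistant","content":"Hel"},"done":false}
{"model":"llama3.1:8b","message":{"role":"assistant","content":"lo"},"done":false}
{"model":"llama3.1:8b","message":{"role":"assistant","content":""},"done":true}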


Facade for Ergonomics

src/Facades/Ollama.php

<?php

namespace CamH\LaravelOllama\Facades;

use CamH\LaravelOllama\Contracts\OllamaClient as OllamaClientContract;
use Illuminate\Support\Facades\Facade;

/**
 * @method static string generate(string $prompt, ?string $model = null, array $options = [])
 * @method static array chat(array $messages, ?string $model = null, array $options = [])
 * @method static \Generator streamChat(array $messages, ?string $model = null, array $options = [])
 * @method static array embeddings(string|array $input, ?string $model = null, array $options = [])
 * @method static array models()
 *
 * @see \CamH\LaravelOllama\Contracts\OllamaClient
 */
class Ollama extends Facade
{
    protected static function getFacadeAccessor()
    {
        return OllamaClientContract::class;
    }
}

Usage in a Laravel App

Install

composer require camh/laravel-ollama
php artisan vendor:publish --tag=ollama-config

Generate text (controller or job):

use CamH\LaravelOllama\Facades\Ollama;

$text = Ollama::generate('Write a haiku about monsoons.', model: 'llama3.1:8b');

Chat

$reply = Ollama::chat([
    ['role' => 'system', 'content' => 'You are a concise assistant.'],
    ['role' => 'user', 'content' => 'Summarize Laravel in 1 sentence.'],
]);

$assistant = data_get($reply, 'message.content');

Streaming chat (controller returning an SSE stream)

use Symfony\Component\HttpFoundation\StreamedResponse;
use CamH\LaravelOllama\Facades\Ollama;

return new StreamedResponse(function () {
    $messages = [
        ['role' => 'user', 'content' => 'Explain queues in Laravel concisely.']
    ];

    foreach (Ollama::streamChat($messages) as $delta) {
        echo "data: ".$delta."\n\n";
        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
    }
}, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'X-Accel-Buffering' => 'no',
]);

Embeddings

$response = Ollama::embeddings([
    'Laravel is a delightful PHP framework.',
    'Eloquent provides ActiveRecord-like models.'
]);

$firstVector = $response['embeddings'][0] ?? [];

List models

$models = Ollama::models();

Error Handling & Timeouts

  • Use $response->throw() so HTTP ≥ 400 raises exceptions.
  • Catch and convert RequestException to domain‑specific exceptions if you want (OllamaUnavailable, OllamaValidationError).
  • Let users override timeout and connect_timeout via config and options arguments.

Example domain exceptions:

class OllamaUnavailable extends \RuntimeException {}
class OllamaValidationError extends \InvalidArgumentException {}
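One way to wire the conversion inside the client (a sketch; the catch logic and messages are illustrative, not part of the package):

use Illuminate\Http\Client\ConnectionException;
use Illuminate\Http\Client\RequestException;

try {
    $response = $this->http()->post('/api/chat', $payload)->throw();
} catch (ConnectionException $e) {
    // Server not reachable at all (not running, wrong base URL, network issue).
    throw new OllamaUnavailable('Could not reach the Ollama server.', previous: $e);
} catch (RequestException $e) {
    // 4xx responses usually mean a bad payload or an unknown model.
    if ($e->response->clientError()) {
        throw new OllamaValidationError($e->response->json('error', 'Invalid request.'), previous: $e);
    }

    throw $e;
}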

Testing Strategy

  1. Unit tests for the client using Http::fake() to simulate Ollama responses.
  2. Feature tests for your routes / controllers consuming the facade.
  3. Provide fixtures (JSON lines for streaming) to exercise the stream parser.

The test examples below use Pest on top of Orchestra Testbench, so add orchestra/testbench and pestphp/pest to the package’s require-dev to get a Laravel container and the Http fake inside tests.

Example

use Illuminate\Support\Facades\Http;
use CamH\LaravelOllama\Http\Client;

it('generates text', function () {
    Http::fake([
        'http://localhost:11434/api/generate' => Http::response([
            'response' => 'Hello world',
        ], 200),
    ]);

    $client = new Client('http://localhost:11434', 'llama3.1:8b');

    $text = $client->generate('Say hello');

    expect($text)->toBe('Hello world');
});

Streaming test helper (fake JSONL):

$stream = "{\"message\":{\"content\":\"Hel\"}}\n{\"message\":{\"content\":\"lo\"}}\n";
Http::fake([
    '*' => Http::response($stream, 200, ['Content-Type' => 'application/x-ndjson'])
]);
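Putting that fixture to work, a full streaming test might look like this (a sketch that assumes the buffered line parser shown in Client::streamChat above):

use CamH\LaravelOllama\Http\Client;
use Illuminate\Support\Facades\Http;

it('streams chat chunks', function () {
    $stream = "{\"message\":{\"content\":\"Hel\"}}\n{\"message\":{\"content\":\"lo\"}}\n";

    Http::fake([
        '*' => Http::response($stream, 200, ['Content-Type' => 'application/x-ndjson']),
    ]);

    $client = new Client('http://localhost:11434', 'llama3.1:8b');

    $chunks = iterator_to_array($client->streamChat([
        ['role' => 'user', 'content' => 'Say hello'],
    ]));

    expect(implode('', $chunks))->toBe('Hello');
});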

Documentation & DX

  • Ship a README with install, configuration, and quickstart examples.
  • Add PHPDoc on public methods and return types.
  • Provide copy‑paste examples for SSE streaming and queue jobs.
  • Include a php artisan example command to prove the integration end‑to‑end.

Example command

// app/Console/Commands/OllamaAsk.php
namespace App\Console\Commands;

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Console\Command;

class OllamaAsk extends Command
{
    protected $signature = 'ollama:ask {prompt} {--model=}';

    public function handle(): int
    {
        $answer = Ollama::generate($this->argument('prompt'), $this->option('model'));
        $this->line($answer);

        return self::SUCCESS;
    }
}
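With the command auto-discovered from app/Console/Commands, you can verify the whole chain from the terminal:

php artisan ollama:ask "Explain queues in one sentence" --model=llama3.1:8b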

Releasing to Packagist

  1. Create a public Git repository.
  2. Ensure composer.json has correct name, autoload, and extra.laravel.
  3. Tag a release: git tag v1.0.0 && git push --tags.
  4. Submit the repo to packagist.org once (future tags auto‑sync).

Versioning tips

  • Follow SemVer.
  • Maintain a CHANGELOG.md.
  • Use GitHub Actions to run tests on PHP 8.2/8.3 and Laravel 10/11.

Example CI (GitHub Actions)

name: tests
on: [push, pull_request]
jobs:
  phpunit:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        php: ['8.2', '8.3']
        laravel: ['10.*', '11.*']
    steps:
      - uses: actions/checkout@v4
      - uses: shivammathur/setup-php@v2
        with:
          php-version: ${{ matrix.php }}
          tools: composer:v2
      - run: composer require "illuminate/support:${{ matrix.laravel }}" --no-interaction --no-update
      - run: composer update --prefer-dist --no-interaction --no-progress
      - run: vendor/bin/pest

Security, Performance & Ops Notes

  • Never trust prompts; validate and size‑limit user input.
  • Consider rate limiting endpoints that proxy to Ollama.
  • Add caching for models() (e.g., cache for 5–10 minutes) to avoid frequent calls.
  • Support retry/backoff on transient errors (see the sketch after the caching example below).
  • Log latency and token usage (if available) for observability.
For example, cache the model list for ten minutes:

$models = cache()->remember('ollama:models', 600, fn () => Ollama::models());
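Retry with backoff can be layered straight onto the HTTP client builder. A minimal sketch of Client::http() with retries; whether and when to retry is your call, and the counts and delays here are illustrative:

use Illuminate\Http\Client\ConnectionException;
use Illuminate\Support\Facades\Http;

// Retry twice, 200 ms apart, but only on connection-level failures.
return Http::baseUrl($this->baseUrl)
    ->timeout($this->timeout)
    ->connectTimeout($this->connectTimeout)
    ->retry(2, 200, fn ($exception, $request) => $exception instanceof ConnectionException)
    ->acceptJson();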

Advanced: Middleware & Pipelines

  • Add middleware to inject system prompts, sanitize user input, or enforce max tokens.
  • Provide a pipeline API for RAG: retrieve docs → build context → call chat() (see the sketch after this list).
  • Consider stream transformers to emit SSE, console updates, or WebSockets.
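As a sketch of the RAG pipeline idea, using Laravel's Illuminate\Pipeline\Pipeline; the stage classes here are hypothetical placeholders for your own retrieval and prompt-building logic:

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Pipeline\Pipeline;

// Each stage receives and returns the shared payload array.
$payload = app(Pipeline::class)
    ->send(['question' => $question, 'context' => null, 'messages' => []])
    ->through([
        RetrieveRelevantDocs::class,   // hypothetical: fills 'context' from your search index
        BuildChatMessages::class,      // hypothetical: turns question + context into 'messages'
    ])
    ->thenReturn();

$reply = Ollama::chat($payload['messages']);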

Troubleshooting

  • Connection refused → Verify Ollama is running and OLLAMA_BASE_URL is correct.
  • Model not found → Pull the model: ollama pull llama3.1:8b.
  • Timeouts → Increase OLLAMA_TIMEOUT or simplify prompts.
  • Large dist size → Exclude tests and fixtures from the distributed archive with .gitattributes export-ignore rules (see the snippet below).
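A typical export-ignore setup (adjust the paths to your repository layout):

# .gitattributes
/tests export-ignore
/.github export-ignore
/CHANGELOG.md export-ignore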

Example End‑to‑End Controller

namespace App\Http\Controllers;

use CamH\LaravelOllama\Facades\Ollama;
use Illuminate\Http\Request;

class AskController
{
    public function __invoke(Request $request)
    {
        $validated = $request->validate([
            'prompt' => ['required','string','max:4000'],
            'model' => ['nullable','string']
        ]);

        $answer = Ollama::generate($validated['prompt'], $validated['model'] ?? null);

        return response()->json([
            'answer' => $answer,
        ]);
    }
}

Conclusion

The camh/laravel-ollama package is only the first step. The current focus has been on wrapping Ollama’s core APIs and providing a clean Laravel interface, but there is a clear roadmap ahead:

  • Tool & function calling: add helpers for structured outputs and integrations with external services.
  • Conversation memory: support multi‑turn conversation stores (database, cache, or Redis) to persist chat history.
  • RAG workflows: deeper integration with Laravel Scout or custom pipelines to enable retrieval‑augmented generation.
  • Monitoring & Observability: built‑in hooks for logging, metrics, and tracing of requests.
  • Community feedback: open issues and PRs will shape features for better developer experience.

In an upcoming article, we will move beyond package design and demonstrate how to use camh/laravel-ollama in a real Laravel project. That walkthrough will cover building controllers, jobs, and even front‑end integrations that take advantage of local LLMs through this package.

Stay tuned—the best part of Laravel + Ollama is seeing it power actual products and workflows!
