DEV Community

Roberto B.

Building Real Agentic AI in PHP with Neuron AI, Gemini, and Tools

Build a real tool-using AI agent that understands YouTube videos and answers questions about them, in pure PHP.

For years, using AI in PHP meant little more than sending prompts to an API and displaying the result. Valuable, yes, but also limited, fragile, and prone to hallucinations.

Today, the industry is moving toward a new paradigm: Agentic AI.

Instead of a single prompt-response interaction, we now build agents: goal-oriented AI programs that can reason, decide when to use tools, call real code, and operate inside well-defined constraints.

And yes: PHP is absolutely ready for this.

Thanks to libraries like Neuron AI, Symfony AI, Prism, and soon Laravel AI, the PHP ecosystem is becoming a first-class platform for building production-grade AI systems.

To showcase the power and practicality of this approach, we will build a concrete example: an agent that acts as an assistant, capable of answering questions about a YouTube video. To achieve this, we will give our agent the ability to fetch the video’s transcript (and we will see how easy this is to do in PHP) and then use that transcript as its source of context and knowledge.

By the end of this article, you will have an agent that you can interrogate about any specific video. Whether you want to summarize a tutorial, extract key insights, or ask precise questions about a long talk or conference recording, you will be able to do it using an agent you built yourself.

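The PHP tools and libraries we will use to build the YouTube AI agent:

  • Neuron AI: the agent framework
  • mrmysql/youtube-transcript: fetches the video transcript
  • symfony/http-client and nyholm/psr7: the HTTP client and PSR-7 messages the transcript library needs
  • vlucas/phpdotenv: loads the Gemini API key from a .env file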

What is an Agent?

An agent is a goal-driven AI program that can reason, plan, and decide when to use tools to complete a task.

In practice, an agent:

  • Has clear instructions
  • Has rules and boundaries
  • Has tools it can use
  • Can call real code
  • Can chain multiple steps
  • And can operate reliably inside constraints

This is the foundation of modern AI applications.

What is a Tool?

A Tool is a function that the AI is allowed to call.

Instead of guessing or hallucinating data, the model can:

  • Query your database with your current data
  • Call external APIs
  • Read files
  • Execute business logic
  • Fetch external resources

The flow becomes:

  1. User asks a question
  2. Agent reasons (following predefined steps and instructions)
  3. Agent decides to call a tool
  4. Tool executes real PHP code
  5. Agent receives the result from the Tool
  6. Agent uses it to answer

This is how you build reliable AI systems.
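
The six steps above can be sketched in plain PHP without any framework. This is a minimal illustration, not how Neuron AI is implemented: `fakeModel()` is a hypothetical stand-in for the LLM, and the `lookup_price` tool and its data are invented for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical tool registry: tool name => PHP callable.
$tools = [
    'lookup_price' => function (string $sku): string {
        $prices = ['A-100' => '19.90 EUR']; // invented sample data
        return $prices[$sku] ?? 'unknown SKU';
    },
];

// Hypothetical stand-in for the LLM. A real model decides on its own;
// here the behavior is hardcoded: request the tool first, then answer.
function fakeModel(array $messages): array
{
    $last = end($messages);
    if ($last['role'] === 'tool') {
        return ['type' => 'answer', 'text' => 'The price is ' . $last['content'] . '.'];
    }
    return ['type' => 'tool_call', 'name' => 'lookup_price', 'args' => ['A-100']];
}

function runAgent(string $question, array $tools, int $maxSteps = 5): string
{
    $messages = [['role' => 'user', 'content' => $question]];      // step 1

    for ($i = 0; $i < $maxSteps; $i++) {
        $decision = fakeModel($messages);                           // steps 2 and 3

        if ($decision['type'] === 'answer') {
            return $decision['text'];                               // step 6
        }

        $result = $tools[$decision['name']](...$decision['args']);  // step 4
        $messages[] = ['role' => 'tool', 'content' => $result];     // step 5
    }

    return 'Step limit reached without an answer.';
}

echo runAgent('How much does A-100 cost?', $tools), PHP_EOL;
```

The step limit is a design choice worth keeping in any real loop: it stops the agent from calling tools forever if the model never settles on an answer.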

Agentic AI in the PHP Ecosystem

The PHP ecosystem has always been driven by strong, pragmatic, and highly active communities, and the current wave of AI adoption is no exception. Over the past few months, we’ve seen an increasing number of serious, well-designed libraries emerge, built by people who have a deep understanding of both PHP and real-world production needs.

Rather than treating AI as a gimmick or a thin API wrapper, these projects focus on architecture, maintainability, and integration into existing applications. As a result, PHP support for AI is improving quickly, and the number of available tools is growing almost every week, thanks largely to the energy of the Symfony, Laravel, and independent open-source communities.

Today, we already have excellent building blocks, such as Neuron AI, which provides a complete agent framework with tool support and orchestration; Symfony AI, which brings first-class AI integration into the Symfony ecosystem; and Prism, which offers a clean abstraction layer over different LLM providers. And this is only the beginning: a dedicated Laravel AI ecosystem is already on the way.

In this article, we’ll focus on Neuron AI, because it already offers a mature and complete Agent + Tool architecture, making it an ideal foundation for building real-world, production-grade Agentic AI systems in PHP.

Our goal: a YouTube transcript QA agent

We will build an agent that:

  • Accepts a YouTube video ID
  • Uses a dedicated tool to fetch the transcript (we will implement this tool ourselves)
  • Answers questions using the transcript

Installation and dependencies

Before we start building our agent, we need to install a few libraries. Each of them plays a very specific role in the architecture of our application, and together they form a clean, modular, and production-ready stack.

You can install everything using Composer:

composer require neuron-core/neuron-ai
composer require mrmysql/youtube-transcript
composer require symfony/http-client nyholm/psr7
composer require vlucas/phpdotenv

Let’s briefly look at why we need each of these packages.

Neuron AI

composer require neuron-core/neuron-ai

This is the core of our system. Neuron AI provides the Agent abstraction, the Tool system, the orchestration layer, and the provider integrations (such as Gemini, OpenAI, Ollama, etc.). It is the framework that allows us to define instructions, expose tools to the model, and let the agent reason about when and how to use them. Without it, we would just be sending prompts to an API; with it, we are building a real Agentic system.

YouTube Transcript fetcher

composer require mrmysql/youtube-transcript

This library is a small but very useful utility that allows us to fetch the transcript of a YouTube video programmatically. Instead of scraping HTML or dealing with undocumented APIs ourselves, we can rely on a clean, tested PHP library. This will be the “real-world data source” that our agent will access through a tool.

Symfony HTTP Client and PSR-7 Implementation

composer require symfony/http-client nyholm/psr7

The YouTube transcript library relies on PSR-18 (HTTP client) and PSR-7 (HTTP messages) interfaces. Here, we use Symfony’s HTTP client as a robust, production-ready implementation, together with Nyholm’s PSR-7 implementation for request and response objects. This keeps everything standards-based and framework-agnostic.

PHP dotenv

composer require vlucas/phpdotenv

Finally, we use phpdotenv to load our Gemini API key from a .env file instead of hardcoding it into the source code. This is a best practice for both security and portability, and it allows us to easily change credentials between environments without touching the code.
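
A missing key is easier to diagnose when the script fails early with a clear message. The sketch below assumes a `.env` file containing a line such as `GEMINI_API_KEY=your-key-here`; `requireEnv()` is a hypothetical helper written for this article, not part of phpdotenv.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: return a non-empty setting or fail with a clear error.
// $env defaults to $_ENV, which phpdotenv populates when load() is called.
function requireEnv(string $name, ?array $env = null): string
{
    $env ??= $_ENV;
    $value = $env[$name] ?? '';

    if (!is_string($value) || trim($value) === '') {
        throw new RuntimeException("Missing required environment variable: {$name}");
    }

    return $value;
}
```

With a helper like this, the provider could be given `requireEnv('GEMINI_API_KEY')` instead of reading `$_ENV` directly, so a misconfigured environment surfaces before any request is sent.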


This set of dependencies already shows an important aspect of modern PHP: we are building our AI system using small, composable, standards-based components, rather than a single monolithic library.

The complete Agent

<?php

use MrMySQL\YoutubeTranscript\TranscriptListFetcher;
use NeuronAI\Agent;
use NeuronAI\Chat\Messages\UserMessage;
use NeuronAI\Providers\AIProviderInterface;
use NeuronAI\Providers\Gemini\Gemini;
use NeuronAI\SystemPrompt;
use NeuronAI\Tools\PropertyType;
use NeuronAI\Tools\Tool;
use NeuronAI\Tools\ToolProperty;
use Nyholm\Psr7\Factory\Psr17Factory;
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\HttpClient\Psr18Client;

require __DIR__ . "/vendor/autoload.php";

class YouTubeTranscriptQAAgent extends Agent
{
    protected function provider(): AIProviderInterface
    {
        $dotenv = Dotenv\Dotenv::createImmutable(__DIR__);
        $dotenv->load();

        return new Gemini(
            key: $_ENV["GEMINI_API_KEY"],
            model: "gemini-2.5-flash",
        );
    }

    public function instructions(): string
    {
        return (string) new SystemPrompt(
            background: [
                "You are a helpful assistant that answers questions about a YouTube video using ONLY the provided transcript.",
                "You must use the tool to retrieve the transcript.",
                "If the transcript does not contain the answer, say so clearly and ask for another video id.",
                "Never use external knowledge.",
            ],
            steps: [
                "Ask the user for a YouTube video URL or ID if not provided.",
                "Call the get_transcription tool to retrieve the transcript.",
                "Read the transcript carefully.",
                "Answer the user's question strictly using the transcript.",
            ],
            output: [
                "Be concise and precise.",
                "If the transcript does not contain the answer, reply: 'The transcript does not contain this information.'",
            ],
        );
    }

    protected function tools(): array
    {
        return [
            Tool::make(
                "get_transcription",
                "Retrieve the transcription of a YouTube video.",
            )
                ->addProperty(
                    new ToolProperty(
                        name: "video_id",
                        type: PropertyType::STRING,
                        description: "The YouTube video ID (e.g. wT1lcJ_zn18).",
                        required: true,
                    ),
                )
                ->setCallable(function (?string $video_id): string {
                    if (is_null($video_id)) {
                        return "";
                    }
                    return $this->fetchTranscript($video_id);
                }),
        ];
    }

    private function fetchTranscript(string $videoId, string $language = "en"): string
    {
        $httpClient = HttpClient::create();
        $psr18Client = new Psr18Client($httpClient);
        $psr17Factory = new Psr17Factory();

        $fetcher = new TranscriptListFetcher(
            $psr18Client,
            $psr17Factory,
            $psr17Factory,
        );

        $transcriptList = $fetcher->fetch($videoId);
        $transcript = $transcriptList->findTranscript([$language]);

        $lines = [];

        foreach ($transcript->fetch() as $item) {
            $lines[] = html_entity_decode(
                $item["text"],
                ENT_QUOTES | ENT_HTML5,
                "UTF-8",
            );
        }

        return implode(PHP_EOL, $lines);
    }
}

Understanding the Agent: a guided tour of the code

Now that we’ve seen the complete implementation, let’s walk through the most important parts of the agent step by step. The goal here is not just to understand what the code does, but how the different pieces work together to form a real Agentic AI system.

1. The Agent class

class YouTubeTranscriptQAAgent extends Agent

This is the heart of the application. By extending the Agent class provided by Neuron AI, we are not building a simple wrapper around an LLM; we are defining a full AI agent with:

  • Its own instructions
  • Its own tools
  • Its own model provider
  • And its own behavior and constraints

From this point forward, Neuron AI will handle orchestration, tool invocation, and conversation flow.

2. Choosing the model provider

protected function provider(): AIProviderInterface

This method tells Neuron AI which provider and which model should power the agent. Here we use the Gemini class that ships with Neuron AI, but you can explore other providers, such as Ollama.

return new Gemini(
    key: $_ENV["GEMINI_API_KEY"],
    model: "gemini-2.5-flash",
);

Here, we use Gemini as the provider, but one of the strengths of Neuron AI is that the provider is easily swappable. The rest of the agent does not care whether you are using Gemini, OpenAI, or Ollama: the architecture stays exactly the same.

We also load the API key from a .env file, which is a best practice for security and configuration management.

3. The system prompt: the Agent’s brain

public function instructions(): string

The instructions() method defines the entire behavioral contract of the agent.

Instead of a single free-form prompt, we build a structured SystemPrompt with:

  • A background: what the agent is and what it must never do
  • A set of steps: how to approach the task
  • An output contract: how it should respond

This is one of the most important parts of the system. These instructions:

  • Force the agent to use the tool
  • Forbid it from using external knowledge
  • Greatly reduce hallucinations
  • Make the agent more predictable and reliable

In other words, this is where we transform a generic LLM into a specialized, constrained assistant.
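
To see why a structured prompt is easier to maintain than a free-form one, it helps to picture the sections being flattened into a single text. The sketch below is illustrative only: `renderPrompt()` is a hypothetical function, and its section titles and layout are invented here, so the text Neuron AI actually produces from a SystemPrompt may look different.

```php
<?php
declare(strict_types=1);

// Hypothetical renderer: turns titled lists of rules into one prompt string.
// The section titles and layout are invented for this example.
function renderPrompt(array $sections): string
{
    $blocks = [];

    foreach ($sections as $title => $lines) {
        if ($lines === []) {
            continue; // skip empty sections
        }
        $bullets = array_map(fn (string $line): string => '- ' . $line, $lines);
        $blocks[] = strtoupper($title) . PHP_EOL . implode(PHP_EOL, $bullets);
    }

    return implode(PHP_EOL . PHP_EOL, $blocks);
}

$prompt = renderPrompt([
    'background' => ['Answer using ONLY the provided transcript.'],
    'steps'      => ['Call the get_transcription tool.', 'Answer from the transcript.'],
    'output'     => ['Be concise and precise.'],
]);

echo $prompt, PHP_EOL;
```

Keeping each rule as its own array entry means a single constraint can be added, removed, or tested without rewriting a long paragraph of prose.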

4. Defining the Tool

protected function tools(): array

Here we declare what the agent is allowed to do in the real world.

Tool::make("get_transcription", "Retrieve the transcription of a YouTube video.")

This creates a tool named get_transcription. The model can decide to call it whenever it thinks it needs the transcript.

5. Describing the Tool interface

->addProperty(
    new ToolProperty(
        name: "video_id",
        type: PropertyType::STRING,
        required: true,
    ),
)

This is crucial: we are not just exposing a PHP function; we are defining a formal schema for it.

Thanks to this, the model knows:

  • What arguments are required
  • What type each argument must have
  • How to call the tool correctly

This is what makes tool calling reliable and structured, rather than fragile and heuristic.
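
To make the idea of a formal schema concrete, here is roughly what such a description looks like when written in a JSON-Schema style, a format many LLM tool-calling APIs use. This is an illustrative sketch: the exact payload Neuron AI sends to Gemini may differ, and `validateArguments()` is a hypothetical helper, not a library function.

```php
<?php
declare(strict_types=1);

// Illustrative JSON-Schema-style description of the get_transcription tool.
$schema = [
    'name'        => 'get_transcription',
    'description' => 'Retrieve the transcription of a YouTube video.',
    'parameters'  => [
        'type'       => 'object',
        'properties' => [
            'video_id' => [
                'type'        => 'string',
                'description' => 'The YouTube video ID (e.g. wT1lcJ_zn18).',
            ],
        ],
        'required' => ['video_id'],
    ],
];

// Hypothetical check: returns a list of problems with the model's arguments.
function validateArguments(array $schema, array $args): array
{
    $errors = [];
    $params = $schema['parameters'];

    foreach ($params['required'] as $name) {
        if (!array_key_exists($name, $args)) {
            $errors[] = "Missing required argument: {$name}";
        }
    }

    foreach ($args as $name => $value) {
        $type = $params['properties'][$name]['type'] ?? null;
        if ($type === null) {
            $errors[] = "Unknown argument: {$name}";
        } elseif ($type === 'string' && !is_string($value)) {
            $errors[] = "Argument {$name} must be a string";
        }
    }

    return $errors;
}

echo json_encode($schema, JSON_PRETTY_PRINT), PHP_EOL;
```

Because the contract is data rather than prose, the same description can guide the model and be checked mechanically before any PHP code runs.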

6. Connecting the Tool to real PHP code

->setCallable(function (?string $video_id): string {
    if (is_null($video_id)) {
        return "";
    }
    return $this->fetchTranscript($video_id);
})

This is the bridge between the AI world and your application.

When the model decides to call the tool, this PHP function is executed. At this point:

  • The AI stops “imagining”
  • And your real, deterministic PHP code takes over

This is the exact moment where the agent becomes more than a chatbot.
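
Since the system prompt lets the user supply either a URL or an ID, the callable is also a natural place to normalize whatever the model passes in. The helper below is a hypothetical sketch, not part of the original agent; it assumes the 11-character ID format YouTube uses today and covers only the most common URL shapes.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: accept a bare ID or a common YouTube URL and return
// the 11-character video ID, or null when nothing usable is found.
function extractVideoId(string $input): ?string
{
    $input = trim($input);

    // Already a bare ID: letters, digits, "-" and "_", exactly 11 characters.
    if (preg_match('/^[A-Za-z0-9_-]{11}$/', $input) === 1) {
        return $input;
    }

    // Covers watch?v=ID, youtu.be/ID, /embed/ID and /shorts/ID.
    $pattern = '~(?:[?&]v=|youtu\.be/|/embed/|/shorts/)([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])~';
    if (preg_match($pattern, $input, $matches) === 1) {
        return $matches[1];
    }

    return null;
}
```

Inside the tool callable, a null result could be turned into a short message asking for a valid video ID, which gives the model something concrete to relay to the user.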

7. The actual transcript fetching logic

private function fetchTranscript(...)

Everything here is plain, familiar, reliable PHP, and that’s a good thing.

To fetch the transcript, normalize it, and return it as a plain string, we use:

  • Symfony’s HTTP client
  • PSR-7 / PSR-18 interfaces
  • The mrmysql/youtube-transcript library

From the agent’s point of view, this is just a tool. From your perspective, this provides full control over data access and behavior.
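
Fetching can fail, for example when a video has no captions in the requested language or the network is unavailable. One option is to convert such failures into a short message the model can pass on. The wrapper below is a hypothetical sketch; the two closures are stubs standing in for the real fetchTranscript() method, and the error text is invented.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper: run a fetcher and turn failures into a short message
// the model can relay, instead of letting the exception propagate.
function transcriptOrMessage(callable $fetch, string $videoId): string
{
    try {
        $text = trim($fetch($videoId));
    } catch (Throwable $e) {
        return "Transcript unavailable for video {$videoId}: " . $e->getMessage();
    }

    return $text !== '' ? $text : "Transcript for video {$videoId} is empty.";
}

// Stub fetchers standing in for the real fetchTranscript() method.
$working = fn (string $id): string => 'Hello and welcome to the talk.';
$failing = function (string $id): string {
    throw new RuntimeException('no transcript found for language "en"');
};

echo transcriptOrMessage($working, 'wT1lcJ_zn18'), PHP_EOL;
echo transcriptOrMessage($failing, 'wT1lcJ_zn18'), PHP_EOL;
```

In the agent, the tool callable could pass `fn (string $id) => $this->fetchTranscript($id)` as the fetcher, so the model always receives a string it can reason about.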

Using our new Agent

Once the agent is defined, using it in your application is simple.

$agent = new YouTubeTranscriptQAAgent();

$response = $agent->chat(
    new UserMessage("What is the main topic of the video 'wT1lcJ_zn18'?"),
);

echo $response->getContent();

We start by instantiating our agent just like any other PHP object. At this point, we are not dealing with a “model client” or a low-level API wrapper, but with a fully configured AI agent that already knows:

  • Which model to use
  • What its instructions are
  • Which tools it is allowed to call
  • And how it should behave

When we call the chat() method, we send a UserMessage to the agent. This message becomes part of the agent’s conversation context and triggers the whole reasoning process.

Behind the scenes, Neuron AI orchestrates everything:

  1. The agent analyzes the user’s question.
  2. It follows the rules and steps defined in the system prompt.
  3. It realizes that it needs the video transcript to answer.
  4. It decides to call the get_transcription tool.
  5. Because the get_transcription tool requires a video ID, the model extracts it from the question. If no ID is present or detected, the agent replies with a message about the missing video ID.
  6. The Tool PHP code fetches the transcript.
  7. The agent receives the result (a long text with the transcription) from the Tool and uses it as context.
  8. Finally, it generates an answer that strictly respects the defined constraints.

The return value of chat() is a response object. By calling:

echo $response->getContent();

we print the final answer produced by the agent as a string.

From the outside, this appears to be a single method call. In reality, it is a full agentic workflow involving reasoning, tool usage, and controlled generation, all hidden behind a clean, idiomatic PHP API.

Why this is real Agentic AI

This approach goes far beyond simply sending prompts to a language model. What we have built here is a real agentic system, where the model is no longer a passive text generator but an active component inside a well-defined software architecture.

The model can autonomously decide when to call a tool, and it can utilize your PHP code as real, concrete capabilities rather than merely as suggestions. At the same time, it operates under strict and explicit rules defined by the system prompt, which constrain its behavior and guide its reasoning.

Thanks to the carefully designed context, background description, step-by-step instructions, and clearly defined output expectations, the agent is firmly grounded in the task it must perform. This dramatically reduces the risk of hallucinations and makes the whole system far more reliable, predictable, and suitable for real-world applications.

Final thoughts

The PHP ecosystem is clearly entering a new era. What only a short time ago looked like an experimental trend is quickly becoming a solid and mature part of everyday application architecture.

With tools like Neuron AI, Symfony AI, Prism, and soon Laravel AI, we can now build reliable, production-grade AI agents entirely in PHP, and, more importantly, integrate them naturally into our existing applications and workflows.

The agent we built in this article is intentionally simple, but it already shows the real power of this approach. From here, you could extend it in many directions: you could add memory to enable reasoning across multiple interactions, introduce multiple tools and let the agent choose between them, connect it to a vector database for semantic search and retrieval-augmented generation, or even orchestrate multiple agents that collaborate on more complex tasks.

This is not about replacing your application logic with an LLM. It is about augmenting your software with reasoning, decision-making, and controlled autonomy, while keeping PHP firmly in charge of execution and structure.

Agentic AI is not a distant future.

It is something you can start building today.
