Gil Fink

Posted on Jun 30 • Originally published at itnext.io on Jun 25

Architectural Freedom: Implementing LCEL and Composable Chains in Browser AI Agents

#agents #ai #langchain #promptapi

Throughout this series on building enterprise-grade AI agents with Chrome’s Prompt API, we have steadily overcome the major physical limitations of client-side AI. We moved execution to a Web Worker, persisted state with IndexedDB, implemented RAG for dynamic tool retrieval, built autonomous error handling, and conquered token limits with background context compression.

Our engine is robust, but as a framework it suffers from a classic software engineering anti-pattern: tight coupling.

Until now, our agent’s execution logic was hardcoded into a monolithic while loop inside prompt-chain-worker.js. If a developer wanted to build a simple, linear pipeline (e.g., summarize a text, then translate it) rather than a full, autonomous ReAct (Reasoning and Acting) loop, they couldn't. The framework forced every interaction through the complex ReAct cycle.

To elevate prompt-chain from a single-purpose script into a versatile AI framework, we need to decouple the execution topology from the worker host.

Today, we are introducing a Pipeline Architecture inspired by LangChain Expression Language (LCEL). By refactoring our core into modular, composable Runnable components, developers can now pipe prompts into LLMs, attach output parsers, and build custom agent architectures declaratively without ever touching the core web worker code.

The Core Concept: The Runnable Interface

The foundation of LCEL is the Runnable interface. Everything in the system, including prompts, LLM calls, parsers, and even the agent loop itself, becomes a Runnable that implements a standard invoke() method.

By standardizing the inputs and outputs, we can chain these components together with RunnableSequence, which feeds the output of one step into the next like a pipeline.

The LCEL Primitives

We created a new file, runnable.js, to house our base primitives. Here is the implementation of the core Runnable class and of RunnableSequence, which lets us compose chains:

export class Runnable {
    async invoke(input, config = {}) {
        throw new Error("Abstract method invoke() must be implemented.");
    }

    pipe(nextRunnable) {
        if (typeof nextRunnable === "function") {
            nextRunnable = new RunnableLambda(nextRunnable);
        }
        return new RunnableSequence(this, nextRunnable);
    }

    bind(kwargs) {
        return new RunnableBinding(this, kwargs);
    }
}

export class RunnableBinding extends Runnable {
    constructor(boundRunnable, kwargs) {
        super();
        this.boundRunnable = boundRunnable;
        this.kwargs = kwargs;
    }

    async invoke(input, config = {}) {
        return await this.boundRunnable.invoke(input, { ...config, ...this.kwargs });
    }
}

export class RunnableSequence extends Runnable {
    constructor(first, second) {
        super();
        this.first = first;
        this.second = second;
    }

    static from(runnables) {
        if (!Array.isArray(runnables) || runnables.length === 0) {
            throw new Error("RunnableSequence.from expects a non-empty array of runnables.");
        }
        let seq = runnables[0];
        if (typeof seq === "function") seq = new RunnableLambda(seq);

        for (let i = 1; i < runnables.length; i++) {
            let next = runnables[i];
            if (typeof next === "function") next = new RunnableLambda(next);
            seq = seq.pipe(next);
        }
        return seq;
    }

    async invoke(input, config = {}) {
        const firstOutput = await this.first.invoke(input, config);
        return await this.second.invoke(firstOutput, config);
    }
}

export class RunnableParallel extends Runnable {
    constructor(runnablesMap) {
        super();
        this.runnablesMap = runnablesMap;
    }

    async invoke(input, config = {}) {
        const entries = Object.entries(this.runnablesMap);
        const results = await Promise.all(
            entries.map(async ([key, runnable]) => {
                let r = runnable;
                if (typeof r === "function") r = new RunnableLambda(r);
                const output = await r.invoke(input, config);
                return [key, output];
            })
        );
        return Object.fromEntries(results);
    }
}

export class RunnableLambda extends Runnable {
    constructor(func) {
        super();
        this.func = func;
    }

    async invoke(input, config = {}) {
        return await Promise.resolve(this.func(input, config));
    }
}

export class RunnablePassthrough extends Runnable {
    async invoke(input, config = {}) {
        return input;
    }

    static assign(mapping) {
        return new RunnableLambda(async (input, config) => {
            if (typeof input !== "object" || input === null) {
                throw new Error("RunnablePassthrough.assign expects an object input.");
            }
            const parallel = new RunnableParallel(mapping);
            const computedValues = await parallel.invoke(input, config);
            return { ...input, ...computedValues };
        });
    }
}

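Alongside Runnable and RunnableSequence, the same file defines RunnableLambda (wraps arbitrary functions), RunnableParallel (runs a map of runnables concurrently on the same input), RunnablePassthrough (carries inputs down the chain), and RunnableBinding (merges preset options into config on every call).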
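To make the composition concrete, here is a minimal sketch of the primitives in use. It restates condensed versions of the classes above so that it runs on its own in Node or a browser console. The steps themselves (normalize, analyze, and the final formatting lambda) are hypothetical stand-ins for real pipeline stages.

```javascript
// Condensed restatement of the primitives above, so the sketch runs standalone.
class Runnable {
    async invoke(input, config = {}) { throw new Error("invoke() must be implemented."); }
    pipe(next) {
        if (typeof next === "function") next = new RunnableLambda(next);
        return new RunnableSequence(this, next);
    }
}
class RunnableLambda extends Runnable {
    constructor(func) { super(); this.func = func; }
    async invoke(input, config = {}) { return await this.func(input, config); }
}
class RunnableSequence extends Runnable {
    constructor(first, second) { super(); this.first = first; this.second = second; }
    async invoke(input, config = {}) {
        return await this.second.invoke(await this.first.invoke(input, config), config);
    }
}
class RunnableParallel extends Runnable {
    constructor(map) { super(); this.map = map; }
    async invoke(input, config = {}) {
        const pairs = await Promise.all(
            Object.entries(this.map).map(async ([key, r]) => {
                const runnable = typeof r === "function" ? new RunnableLambda(r) : r;
                return [key, await runnable.invoke(input, config)];
            })
        );
        return Object.fromEntries(pairs);
    }
}

// Hypothetical steps: normalize the text, then compute two values from it concurrently.
const normalize = new RunnableLambda((text) => text.trim().toLowerCase());
const analyze = new RunnableParallel({
    wordCount: (text) => text.split(/\s+/).length,
    shout: (text) => text.toUpperCase(),
});

// pipe() nests RunnableSequence instances; plain functions are wrapped automatically.
const chain = normalize.pipe(analyze).pipe((r) => `${r.shout} (${r.wordCount} words)`);

chain.invoke("  Hello browser agents  ").then(console.log);
// logs "HELLO BROWSER AGENTS (3 words)"
```

Because every step shares the same invoke() contract, swapping one stage for another (for example, replacing a lambda with an LLM call) does not change how the chain is assembled.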

Refactoring the Architecture

With our primitives in place, we needed to refactor our existing monolithic codebase to conform to this new standard. We started with the prompt template.

Upgrading the Prompt Template

Our PromptTemplate previously just returned a string. Now, it extends Runnable, allowing it to sit at the beginning of an LCEL chain. It receives a dictionary of inputs, formats the string, and passes it down the pipeline.

import { Runnable } from "./runnable.js";

export class PromptTemplate extends Runnable {
    constructor() {
        super();
        this.systemInstruction = `You are an autonomous AI agent with long-term memory. Think step-by-step.
            You must STRICTLY output valid JSON matching the schema.

            Rules:
            1. If you need data, set "toolName" to a tool and "toolInput" to the query. Leave "finalAnswer" as "".
            2. If you know the answer, set "toolName" to "none" and put the answer in "finalAnswer".`;

        this.fewShotExamples = `
            --- Example 1: Using a Tool ---
            User: What is the current stock price of Apple?
            {"thought": "I need to look up the real-time stock price for Apple (AAPL).", "toolName": "FetchStockPrice", "toolInput": "AAPL", "finalAnswer": ""}
            Observation from FetchStockPrice: 175.50
            {"thought": "I have the observation. I can now provide the final answer.", "toolName": "none", "toolInput": "", "finalAnswer": "The current stock price of Apple is $175.50."}

            --- Example 2: Answering Directly ---
            User: What is the capital of France?
            {"thought": "I know the capital of France is Paris. No tool is needed.", "toolName": "none", "toolInput": "", "finalAnswer": "The capital of France is Paris."}
            `;
    }

    format(relevantTools, historyTurns, userPrompt, summary = "", skillInstructions = "") {
        const toolDescriptions = relevantTools.length > 0
            ? relevantTools.map(t => `- ${t.name}: ${t.description}`).join('\n')
            : "- none: No external tools available for this query.";

        const summaryPart = summary
            ? `Conversation Summary (Background Context):\n${summary}\n\n`
            : "";

        const skillPart = skillInstructions
            ? `Active Skill Instructions & Guidelines:\n${skillInstructions}\n\n`
            : "";

        return `${this.systemInstruction}
            Available tools for this request:
            ${toolDescriptions}
            - none: Use this if you do not need a tool.

            ${this.fewShotExamples}

            --- Current Conversation ---
            ${summaryPart}${skillPart}Prior History:
            ${historyTurns.length > 0 ? historyTurns.join('\n') : "No prior history."}

            User: ${userPrompt}
            Output your next step as JSON:`;
    }

    async invoke(input, config = {}) {
        const { relevantTools = [], historyTurns = [], userPrompt = "", summary = "", skillInstructions = "" } = input;
        return this.format(relevantTools, historyTurns, userPrompt, summary, skillInstructions);
    }
}

Componentizing the Inference Step

Previously, calling the LLM and parsing its JSON output was tangled together. We separated these into two distinct Runnable classes inside prompt-chain-worker.js:

export class LLMRunnable extends Runnable {
    constructor(askLLMFn, schema) {
        super();
        this.askLLMFn = askLLMFn;
        this.schema = schema;
    }
    async invoke(prompt, config = {}) {
        return await this.askLLMFn(prompt, this.schema);
    }
}

export class JSONOutputParserRunnable extends Runnable {
    async invoke(responseText, config = {}) {
        try {
            return { success: true, parsed: JSON.parse(responseText) };
        } catch (e) {
            return { success: false, error: "Invalid JSON format received. You must respond strictly in JSON syntax." };
        }
    }
}

By isolating these, we can now construct an inferenceStepChain like this: PromptTemplate -> LLMRunnable -> JSONOutputParserRunnable.
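A rough sketch of that wiring follows, with a stubbed LLM in place of the real Prompt API bridge. The names fakeAskLLM and TinyPrompt are hypothetical: TinyPrompt is a deliberately simplified substitute for the PromptTemplate above, and the Runnable and RunnableSequence classes are condensed restatements so the block runs on its own.

```javascript
// Condensed stand-ins for the primitives, so the sketch runs standalone.
class Runnable {
    async invoke(input, config = {}) { throw new Error("invoke() must be implemented."); }
    pipe(next) { return new RunnableSequence(this, next); }
}
class RunnableSequence extends Runnable {
    constructor(first, second) { super(); this.first = first; this.second = second; }
    async invoke(input, config = {}) {
        return await this.second.invoke(await this.first.invoke(input, config), config);
    }
}

// Simplified prompt step (hypothetical); the real PromptTemplate builds a much richer prompt.
class TinyPrompt extends Runnable {
    async invoke({ userPrompt = "" }, config = {}) {
        return `User: ${userPrompt}\nOutput your next step as JSON:`;
    }
}
class LLMRunnable extends Runnable {
    constructor(askLLMFn, schema) { super(); this.askLLMFn = askLLMFn; this.schema = schema; }
    async invoke(prompt, config = {}) { return await this.askLLMFn(prompt, this.schema); }
}
class JSONOutputParserRunnable extends Runnable {
    async invoke(responseText, config = {}) {
        try {
            return { success: true, parsed: JSON.parse(responseText) };
        } catch (e) {
            return { success: false, error: "Invalid JSON format received." };
        }
    }
}

// Hypothetical stub: returns canned JSON, or broken output when the prompt contains "break".
const fakeAskLLM = async (prompt, schema) =>
    prompt.includes("break")
        ? "not json"
        : JSON.stringify({ thought: "No tool needed.", toolName: "none", toolInput: "", finalAnswer: "Paris" });

const inferenceStepChain = new TinyPrompt()
    .pipe(new LLMRunnable(fakeAskLLM, { type: "object" }))
    .pipe(new JSONOutputParserRunnable());

inferenceStepChain
    .invoke({ userPrompt: "What is the capital of France?" })
    .then((step) => console.log(step.success, step.parsed.finalAnswer));
// logs: true Paris
```

Note that the parser reports malformed output as a value ({ success: false, ... }) instead of throwing, which lets a caller treat bad JSON as feedback for the model instead of a crash.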

Encapsulating the ReAct Loop

The most complex part of our previous implementation was the runReActLoop function. We encapsulated this entire beast into a dedicated ReActAgentExecutor class that also extends Runnable.

Instead of hardcoding the LLM calls, the Executor now relies on an injected inferenceStepChain.

export class ReActAgentExecutor extends Runnable {
    constructor({ tools = [], skills = [], memory, toolRetriever, skillRetriever, promptTemplate, inferenceStepChain, askLLM, logToMain }) {
        super();
        this.tools = tools;
        this.skills = skills;
        this.memory = memory;
        this.toolRetriever = toolRetriever;
        this.skillRetriever = skillRetriever;
        this.promptTemplate = promptTemplate;
        this.inferenceStepChain = inferenceStepChain;
        this.askLLM = askLLM;
        this.logToMain = logToMain;
    }

    async invoke({ userPrompt, sessionId }) {
        let isComplete = false;
        let finalResult = "";
        let loopCount = 0;

        let { history: historyTurns, summary: conversationSummary } = await this.memory.getHistory(sessionId);

        const relevantTools = await this.toolRetriever.getRelevantTools(userPrompt, 3);
        const relevantSkills = await this.skillRetriever.getRelevantSkills(userPrompt, 3);

        let skillInstructions = "";
        if (relevantSkills.length > 0) {
            for (const skill of relevantSkills) {
                this.logToMain(`System: Activating skill "${skill.name}"`);
                skillInstructions += `${skill.instructions} `;

                for (const skillTool of skill.tools) {
                    if (!relevantTools.some(t => t.name === skillTool.name)) {
                        relevantTools.push(skillTool);
                    }
                }
            }
        }

        const toolsMap = new Map(relevantTools.map(t => [t.name, t]));

        let currentTurnLog = `User: ${userPrompt}\n`;
        let chainInput = { relevantTools, historyTurns, userPrompt, summary: conversationSummary, skillInstructions };

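        // Cap the loop at 7 iterations so a confused model cannot run indefinitely.
        while (!isComplete && loopCount < 7) {
            loopCount++;

            const stepResult = await this.inferenceStepChain.invoke(chainInput);

            // On malformed JSON, feed the parser error back as an observation and retry.
            if (!stepResult.success) {
                chainInput = `Observation: ${stepResult.error}`;
                continue;
            }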

            const response = stepResult.parsed;

            if (response.thought) {
                this.logToMain(`Thought: ${response.thought}`);
                currentTurnLog += `Thought: ${response.thought}\n`;
            }

            if (response.finalAnswer && response.finalAnswer.trim() !== "") {
                finalResult = response.finalAnswer;
                currentTurnLog += `Assistant: ${response.finalAnswer}\n`;
                isComplete = true;
            }
            else if (response.toolName && response.toolName !== "none" && toolsMap.has(response.toolName)) {
                this.logToMain(`Action: Running ${response.toolName} with input "${response.toolInput}"`);

                const tool = toolsMap.get(response.toolName);
                let toolResult;
                let success = false;
                let retryCount = 0;
                const maxRetries = 3;

                while (retryCount <= maxRetries && !success) {
                    try {
                        toolResult = await runWithTimeout(tool.executeFn, response.toolInput, 3000);
                        success = true;
                    } catch (err) {
                        if (isRecoverableError(err) && retryCount < maxRetries) {
                            retryCount++;
                            this.logToMain(`Observation: Tool timed out. Retrying...`);
                            await delay(1000);
                        } else {
                            currentTurnLog += `Action: ${response.toolName}("${response.toolInput}")\nObservation: Tool failed with error: ${err.message}\n`;
                            this.logToMain(`Observation: Tool failed with error: ${err.message}`);
                            chainInput = `Observation: Tool '${response.toolName}' failed because: ${err.message}. Please correct the input/parameters, try a different approach, or check tool availability, and try again.`;
                            break;
                        }
                    }
                }

                if (success) {
                    currentTurnLog += `Action: ${response.toolName}("${response.toolInput}")\nObservation: ${toolResult}\n`;
                    this.logToMain(`Observation: ${toolResult}`);
                    chainInput = `Observation from ${response.toolName}: ${toolResult}\nGiven this observation, output your next step as JSON:`;
                }
            }
            else if (response.toolName === "none" || response.toolName === "") {
                chainInput = `Observation: You set toolName to "none" but omitted a finalAnswer. Provide your final answer text in the JSON.`;
            }
            else {
                chainInput = `Observation: Tool '${response.toolName}' is not loaded. Select from available tools or use 'none'.`;
            }
        }

        if (finalResult) {
            historyTurns.push(currentTurnLog.trim());
            const compressionResult = await compressHistory(historyTurns, conversationSummary, this.askLLM, this.logToMain);
            await this.memory.saveHistory(sessionId, compressionResult.historyTurns, compressionResult.updatedSummary);
        }

        return finalResult || "Error: Reached maximum iterations.";
    }
}
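The retry loop leans on three helpers (runWithTimeout, isRecoverableError, and delay) whose code is not shown in this post. The following is only a guess at their shape, written to match the call sites above. The TimeoutError class and the rule that only timeouts count as recoverable are assumptions, not the project's actual implementation.

```javascript
// Hypothetical helper: resolves after the given number of milliseconds.
function delay(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical error type used to mark timeouts as retryable.
class TimeoutError extends Error {
    constructor(ms) {
        super(`Timed out after ${ms} ms`);
        this.name = "TimeoutError";
    }
}

// Races the tool call against a timer and rejects with TimeoutError if the timer wins.
// The losing tool call is not cancelled; its result is simply ignored.
async function runWithTimeout(fn, input, ms) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
    });
    try {
        // Promise.resolve().then(...) also converts synchronous throws into rejections.
        return await Promise.race([Promise.resolve().then(() => fn(input)), timeout]);
    } finally {
        clearTimeout(timer); // avoid a dangling timer once the race settles
    }
}

// Assumption: only timeouts are worth retrying; other errors go back to the model.
function isRecoverableError(err) {
    return err instanceof TimeoutError;
}

runWithTimeout(async (query) => `result for ${query}`, "AAPL", 3000).then(console.log);
// logs "result for AAPL"
```

Under this sketch, a tool that keeps timing out is retried up to maxRetries times, while a tool that throws any other error is reported to the model right away as an observation.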

The Universal Agent Runtime

The final piece of the puzzle is the web worker entry point: createAgentWorker. It is now a polymorphic, universal runtime host.

If you pass it an array of tools (the legacy way), it automatically wires up the default ReActAgentExecutor. But if you pass it a custom Runnable chain, or any object that exposes an invoke() method, it bypasses the ReAct logic entirely and executes your custom topology.

export function createAgentWorker(toolsOrRunnable, skillsArray = []) {
    let msgId = 0;
    const resolvers = new Map();

    const memory = new AgentMemory();

    const agentSchema = {
        "type": "object",
        "properties": {
            "thought": { "type": "string" },
            "toolName": { "type": "string" },
            "toolInput": { "type": "string" },
            "finalAnswer": { "type": "string" }
        },
        "required": ["thought", "toolName", "toolInput", "finalAnswer"]
    };

    function askLLM(prompt, schema = agentSchema) {
        return new Promise((resolve, reject) => {
            const id = ++msgId;
            resolvers.set(id, { resolve, reject });
            self.postMessage({ id, type: MessageContext.llmRequest, payload: { prompt, schema } });
        });
    }

    function logToMain(message) {
        self.postMessage({ id: 0, type: MessageContext.agentLog, payload: message });
    }

    let agentExecutor;
    // Accept Runnable instances as well as any duck-typed object exposing invoke().
    if (toolsOrRunnable instanceof Runnable || typeof toolsOrRunnable?.invoke === "function") {
        agentExecutor = toolsOrRunnable;
    } else {
        const toolsArray = Array.isArray(toolsOrRunnable) ? toolsOrRunnable : [];
        const toolRetriever = new ToolRetriever(toolsArray);
        const skillRetriever = new SkillRetriever(skillsArray);
        const promptTemplate = new PromptTemplate();

        const llmRunnable = new LLMRunnable(askLLM, agentSchema);
        const parserRunnable = new JSONOutputParserRunnable();

        const inferenceStepChain = RunnableSequence.from([
            new RunnableLambda(async (promptInput) => {
                // Follow-up observations arrive as plain strings and bypass the template.
                if (typeof promptInput === "string") return promptInput;
                return await promptTemplate.invoke(promptInput);
            }),
            llmRunnable,
            parserRunnable
        ]);

        agentExecutor = new ReActAgentExecutor({
            tools: toolsArray,
            skills: skillsArray,
            memory,
            toolRetriever,
            skillRetriever,
            promptTemplate,
            inferenceStepChain,
            askLLM,
            logToMain
        });
    }

    self.addEventListener('message', async (e) => {
        const { id, type, payload } = e.data;

        if (type === MessageContext.llmResponse) {
            resolvers.get(id)?.resolve(payload);
            resolvers.delete(id);
        } else if (type === MessageContext.llmError) {
            resolvers.get(id)?.reject(new Error(payload));
            resolvers.delete(id);
        } else if (type === MessageContext.startLoop) {
            try {
                await memory.init();
                const answer = await agentExecutor.invoke({
                    userPrompt: payload.userPrompt,
                    sessionId: payload.sessionId,
                    memory,
                    askLLM,
                    logToMain
                });
                const finalStr = typeof answer === "string" ? answer : answer?.finalAnswer || JSON.stringify(answer);
                self.postMessage({ id, type: MessageContext.agentComplete, payload: finalStr });
            } catch (err) {
                self.postMessage({ id, type: MessageContext.agentError, payload: err.message });
            }
        }
    });
}
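As an illustration of the second path, here is a sketch of the summarize-then-translate pipeline mentioned at the start of the article. It relies on the fact that the worker passes userPrompt, sessionId, memory, askLLM, and logToMain in the input object. Everything else is an assumption: the textSchema shape, the prompts, the idea that the main thread honors a custom schema, and the stubAskLLM used to exercise the chain outside a worker. The primitives are condensed restatements so the block runs on its own.

```javascript
// Condensed stand-ins for the primitives, so the sketch runs standalone.
class Runnable {
    pipe(next) {
        if (typeof next === "function") next = new RunnableLambda(next);
        return new RunnableSequence(this, next);
    }
}
class RunnableLambda extends Runnable {
    constructor(func) { super(); this.func = func; }
    async invoke(input, config = {}) { return await this.func(input, config); }
}
class RunnableSequence extends Runnable {
    constructor(first, second) { super(); this.first = first; this.second = second; }
    async invoke(input, config = {}) {
        return await this.second.invoke(await this.first.invoke(input, config), config);
    }
}

// Assumed output schema for both steps; the default agent schema would not fit here.
const textSchema = { type: "object", properties: { text: { type: "string" } }, required: ["text"] };

// Step 1: summarize, using the askLLM and logToMain functions supplied in the input.
const summarize = new RunnableLambda(async (input) => {
    input.logToMain("Step 1: summarizing");
    const raw = await input.askLLM(`Summarize in one sentence:\n${input.userPrompt}`, textSchema);
    return { ...input, summary: JSON.parse(raw).text };
});

// Step 2: translate the summary. Returning a string lets the worker post it as-is.
const translate = async (input) => {
    input.logToMain("Step 2: translating");
    const raw = await input.askLLM(`Translate to French:\n${input.summary}`, textSchema);
    return JSON.parse(raw).text;
};

const summarizeThenTranslate = summarize.pipe(translate);

// In the worker file this would be: createAgentWorker(summarizeThenTranslate);
// Outside a worker, the chain can be exercised with stubbed dependencies (hypothetical):
const stubAskLLM = async (prompt) =>
    JSON.stringify({ text: prompt.startsWith("Summarize") ? "A short summary." : "Un court résumé." });

summarizeThenTranslate
    .invoke({ userPrompt: "Long article text", sessionId: "demo", askLLM: stubAskLLM, logToMain: console.log })
    .then(console.log);
// logs "Step 1: summarizing", "Step 2: translating", then "Un court résumé."
```

No ReAct loop, tool retrieval, or memory compression runs in this topology; the worker only hosts the chain and relays LLM requests to the main thread.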

Summary

This refactoring completely changes the trajectory of the prompt-chain project.

By adopting the LCEL pattern, we are no longer forcing developers into a single cognitive architecture. If you want to bypass the ReAct loop entirely and just run a fast, linear classification pipeline directly in the browser’s background thread, you can define a RunnableSequence and pass it right into the worker.

Modularity brings freedom. By explicitly separating prompt formatting, LLM execution, output parsing, and the looping mechanism, we have built a highly extensible foundation for whatever browser-native AI topologies developers dream up next.

If you are interested in the code, you can find it on my GitHub: https://github.com/gilf/prompt-chain.
