DEV Community: Daniel Malek

LLM Agent Observability with Langfuse

Daniel Malek — Tue, 16 Jun 2026 11:14:43 +0000

Repository link: https://github.com/Dan1618/langfuse-observability

Introduction

In this post I will show how to install Langfuse locally to add observability to an agent application. With it we can inspect the agent's internals, such as the prompt used, the agent's answer and token usage.
I will use the existing application from this article: https://dev.to/dan1618/building-ai-agent-with-langgraph-and-nestjs-4d86

Why is agent application observability important?

Observability means having deep visibility into the "how" and "why" of your software's behavior, a requirement that is rapidly evolving with the rise of AI. Traditional Application Performance Monitoring (APM) tools were built for deterministic systems: a user clicks a button, a predefined API is called, the database is queried, and a response is returned. If you try to use a traditional APM tool on an AI agent, you will only see that an HTTP request was made to an LLM provider and that it took 4 seconds. It won't tell you what the model was thinking. Using LLMs requires us to measure different things, such as cost and token usage, reasoning loops (including agents that spawn other agents and other agent-specific actions), and response quality (evals).

Why Langfuse?

Langfuse has emerged as a go-to open-source AI engineering platform because it was built specifically for the probabilistic nature of LLMs, rather than trying to shoehorn AI metrics into legacy APM dashboards.

Purpose-Built Tracing: It natively understands LLM concepts. It traces the full execution tree—sessions, individual exchanges, LLM calls, and external tool usage—so you can visually inspect an agent's exact decision path.
OpenTelemetry (OTel) Native: Modern agent frameworks (like AutoGen, Pydantic AI, or Amazon Bedrock AgentCore) emit traces via OpenTelemetry. Langfuse seamlessly ingests these standard OTel traces and maps them to its generative AI data model.
Prompt Management: Langfuse allows you to decouple your prompts from your codebase. You can test, version, and roll back prompts directly in the platform without needing to redeploy your application.
Built-in Evaluations: It allows you to run "evals" on production data continuously. You can track quality metrics (faithfulness, accuracy) alongside latency and cost, triggering alerts if the agent's performance degrades.

Langfuse features

Here is a visual overview on YouTube: https://www.youtube.com/watch?v=2E8iTvGo9Hs

The core features and metrics you can track and interact with inside the Langfuse dashboard are organized across several key areas:

1. Core Observability & In-Depth Tracing

Tracing is the foundation of the platform. The dashboard allows you to visually map out exactly what happens under the hood during an LLM execution:

Hierarchical Trace Views: See the entire execution journey of a request. It breaks down multi-step agentic workflows into a timeline or a visual graph, showing every tool invocation, retrieval step (RAG), and LLM call.
Session Tracking: Group individual traces into "Sessions" to track and debug multi-turn user conversations over time.
Deep Filtering: Quickly slice your execution data by user ID, specific models, releases/versions, or custom metadata tags.

2. Live Performance & Health Metrics

The home dashboard natively aggregates technical metrics to monitor system stability:

Latency Monitoring: Inspect overall response times and Time-to-First-Token (TTFT) distributions to find bottlenecks or slow steps in your agents.
Streaming Speed: Tracks tokens per second to optimize real-time user experiences.
Error Rates: Automatically flags failed LLM calls, timeout errors, or API hiccups.
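
To make these metrics concrete, here is a minimal sketch of how Time-to-First-Token and streaming speed could be derived from three timestamps. The record shape and the function name are hypothetical and only illustrate the arithmetic; Langfuse computes these values for you.

```typescript
// Hypothetical timing record for one streamed LLM response (timestamps in ms).
interface StreamTiming {
  requestSentAt: number;
  firstTokenAt: number;
  lastTokenAt: number;
  outputTokens: number;
}

interface LatencySummary {
  ttftMs: number; // Time-to-First-Token
  totalMs: number; // full response time
  tokensPerSecond: number; // streaming speed after the first token
}

function summarizeLatency(t: StreamTiming): LatencySummary {
  const streamingSeconds = (t.lastTokenAt - t.firstTokenAt) / 1000;
  return {
    ttftMs: t.firstTokenAt - t.requestSentAt,
    totalMs: t.lastTokenAt - t.requestSentAt,
    // Guard against division by zero for non-streamed responses.
    tokensPerSecond:
      streamingSeconds > 0 ? t.outputTokens / streamingSeconds : 0,
  };
}

console.log(
  summarizeLatency({
    requestSentAt: 0,
    firstTokenAt: 400,
    lastTokenAt: 2400,
    outputTokens: 100,
  }),
);
// { ttftMs: 400, totalMs: 2400, tokensPerSecond: 50 }
```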

3. Cost & Token Tracking

LLM costs can get out of hand quickly. The dashboard includes a robust cost-management suite:

Token Breakdowns: Track exact token consumption (input vs. output tokens) aggregated across different model types (e.g., GPT-4, Claude, Llama).
Cost Attribution: Compute spending trends and look at cost breakdowns by model provider or down to individual user IDs so you know who or what is driving your API bills.
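
As a rough illustration of cost attribution, the sketch below sums cost per user from token counts. The record shape, the model names and the prices are made-up placeholders, not Langfuse's data model and not real provider pricing.

```typescript
// Hypothetical usage record; Langfuse's own data model differs.
interface UsageRecord {
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Placeholder prices in USD per 1,000 tokens (illustrative, not real pricing).
const pricePer1kTokens: Record<string, { input: number; output: number }> = {
  "model-a": { input: 1, output: 2 },
  "model-b": { input: 0.5, output: 1 },
};

// Sum the cost of every record, grouped by user ID.
function costByUser(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    const price = pricePer1kTokens[r.model];
    if (!price) continue; // unknown model: skip rather than guess a price
    const cost =
      (r.inputTokens / 1000) * price.input +
      (r.outputTokens / 1000) * price.output;
    totals.set(r.userId, (totals.get(r.userId) ?? 0) + cost);
  }
  return totals;
}

const sampleUsage: UsageRecord[] = [
  { userId: "alice", model: "model-a", inputTokens: 1000, outputTokens: 500 },
  { userId: "bob", model: "model-b", inputTokens: 2000, outputTokens: 1000 },
  { userId: "alice", model: "model-b", inputTokens: 4000, outputTokens: 0 },
];

console.log(costByUser(sampleUsage)); // alice: 4, bob: 2
```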

4. Evaluation & Quality Analytics

The dashboard helps you understand if your app is actually outputting good, safe answers:

Multi-Method Score Tracking: Visualizes quality scores compiled from user feedback (thumbs up/down), manual human annotations, or automated "LLM-as-a-judge" evaluators (checking for hallucinations, correctness, toxicity, etc.).
Annotation Queues: Provides a space for teams to systematically review production traces, apply labels, and build "golden datasets" for future testing.

5. Custom Dashboards & Advanced Reporting

Langfuse includes a fully customizable dashboard builder so you don’t get overwhelmed by irrelevant charts:

Widget Builder: Create, move, and resize different charts (like histograms or line graphs) to build tailored views.
Pivot Tables: Built-in pivot table widgets allow for multi-dimensional data analysis—for instance, evaluating how quality scores stack up against model costs across different application features.
Curated Layouts: Pre-built, single-click templates focused heavily on Cost, Latency, Prompts, or Evals to get you started immediately.

6. Prompt Management Content System

The dashboard doubles as a low-code prompt registry:

Version Control: Create, test, and track variations of prompts over time without updating your core codebase.
Playground Integration: Open any real production input trace directly inside an interactive LLM Playground to test prompt tweaks and compare model behaviors side by side.

💡 Data Portability Note: If you prefer visualizing this data outside of Langfuse, the platform includes a Query API endpoint. This allows you to fetch any aggregated trace metric directly from the dashboard database and pipe it straight into your own internal business intelligence tools (like Tableau or Looker).

Installation

The installation example is based on the article and repository I created earlier: an agent built with NestJS and LangGraph.

https://dev.to/dan1618/building-ai-agent-with-langgraph-and-nestjs-4d86
https://github.com/Dan1618/nest-langgraph

Langfuse’s solution to LLM monitoring and observability consists of two parts:

Langfuse SDKs
Langfuse Server

The Langfuse SDKs are the coding side of Langfuse. They are available for various platforms and let you instrument your application's code, usually by adding only a few lines to the codebase.

The Langfuse server, on the other hand, is the UI-based dashboard together with the underlying services that log, persist and display all the traces and metrics. The Langfuse dashboard is accessible through any modern web browser.

1. Set up the Langfuse server with Docker
I suggest installing this part by following the docs, since they will be the most up to date:
https://langfuse.com/self-hosting/deployment/docker-compose
After a successful installation you can create a new project and generate the LANGFUSE_SECRET_KEY and LANGFUSE_PUBLIC_KEY. Both need to go into the .env file so that Langfuse recognizes your app.

2. Installing the SDK
https://langfuse.com/docs/observability/sdk/overview
For the NestJS app I provided:

a) Update your .env file

LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_BASE_URL="http://localhost:port"
OPENAI_API_KEY=""
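
Empty or missing keys are an easy mistake to make, so it can help to check the variables once at startup. The helper below is a hypothetical sketch (the function name is mine, not part of Langfuse or NestJS) that reports which of the variables listed above are missing or blank.

```typescript
// Names of the variables from the .env example above.
const requiredEnvVars = [
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_BASE_URL",
  "OPENAI_API_KEY",
] as const;

// Hypothetical helper: returns the names that are missing or empty.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  return requiredEnvVars.filter((name) => !env[name]?.trim());
}

// Example with an in-memory object; in the app you would pass process.env.
const missing = findMissingEnvVars({
  LANGFUSE_SECRET_KEY: "example-secret",
  LANGFUSE_PUBLIC_KEY: "",
  LANGFUSE_BASE_URL: "http://localhost:port",
});
console.log(missing); // [ 'LANGFUSE_PUBLIC_KEY', 'OPENAI_API_KEY' ]
```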

b) Create a file that starts OpenTelemetry and load it at the very top of your application's startup code

import * as dotenv from 'dotenv';
dotenv.config(); // ← must run before any Langfuse / OTel import reads env vars

import { NodeSDK } from '@opentelemetry/sdk-node';
import { LangfuseSpanProcessor } from '@langfuse/otel';

const sdk = new NodeSDK({
  spanProcessors: [new LangfuseSpanProcessor()],
});

sdk.start();

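c) For each appGraph invocation, add a CallbackHandler

    // CallbackHandler is assumed to come from the Langfuse LangChain integration:
    // import { CallbackHandler } from '@langfuse/langchain';
    // Each invocation gets its own handler so the trace is correctly scoped.
    const langfuseHandler = new CallbackHandler({
      tags: ['langgraph', 'portfolio-review'],
    });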

    // Invoke the graph — it will pause at the first interrupt in humanReview
    const result = await this.appGraph.invoke(initialState, {
      ...this.threadConfig,
      callbacks: [langfuseHandler],
    });

Observations in practice

Here are some examples of what we can observe in the Langfuse dashboard.

In the screenshot below you can see a call to the LLM with its most important properties, such as the prompt, user input, LLM output, duration and token usage (117):

Here you can see a detailed table of logs for every action taken by the agent. The table is interactive, filterable and customizable.

Summary

In this post we covered the basics of observability for agentic systems: the main features of Langfuse, an installation guide for NestJS apps with an example LangGraph integration and, finally, tracing examples from the working agent app.

Building AI Agent with LangGraph and NestJS

Daniel Malek — Wed, 03 Jun 2026 19:24:16 +0000

github link - https://github.com/Dan1618/nest-langgraph

What we are building and why

This project shows how to build an AI agent workflow that helps with data processing. The example application takes a stock market portfolio that we want to edit with the help of an agent, brings a human into the loop to verify each stock, and finally lets the agent write an overview of the portfolio.

The main goal of this article is to demonstrate the capabilities of LangGraph, an open-source library for building complex AI agents.

The UI and workflow

The user sees a portfolio that will be reviewed (this example consists of three items, but more could be added).
After the start, the agent automatically assigns an investment risk property to each item.
Each stock item is presented to the reviewer, who can approve, reject or edit it (human in the loop).
After all items have been reviewed by the human, the agent generates an overview of the picked stocks.

So the flow goes like this:

Tech stack

Backend framework - NestJS
Templating / UI - Handlebars
LLM orchestration - LangGraph
Chat model - OpenAI API

What is LangGraph?
LangGraph is a specialized framework designed for building complex, stateful AI applications. Instead of running a simple linear sequence, it lets you structure AI behavior as a graph of nodes (actions) and edges (transitions between them), which makes it well suited to robust, cyclical AI agent workflows. Major features:
a) Advanced Workflows
Traditional AI structures only go in one direction (Step 1 → Step 2 → Step 3). LangGraph allows for cyclical workflows (loops). This means an AI can try a task, evaluate the result, and loop back to fix its own mistakes if it fails.

It also manages a centralized "State," ensuring that every step of the workflow has access to the history of what happened before.
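
The loop-and-state idea can be sketched without any library. The toy example below is not LangGraph's API; the state shape and function names are hypothetical. It only shows a "try, evaluate, loop back" cycle in which every step reads and extends a shared state.

```typescript
// Toy state shared by every step (hypothetical shape).
interface DraftState {
  attempts: number;
  draft: string;
  history: string[];
}

// "Try" step: stands in for an LLM call; each attempt extends the draft.
function writeDraft(state: DraftState): DraftState {
  const draft = state.draft ? `${state.draft} more` : "draft";
  const attempts = state.attempts + 1;
  return {
    attempts,
    draft,
    history: [...state.history, `attempt ${attempts}: ${draft}`],
  };
}

// "Evaluate" step: decides whether to finish or loop back.
function isGoodEnough(state: DraftState): boolean {
  return state.draft.split(" ").length >= 3;
}

// Runs at least one attempt, then repeats until the draft passes
// the check or the attempt budget is used up.
function runLoop(maxAttempts: number): DraftState {
  let state: DraftState = { attempts: 0, draft: "", history: [] };
  do {
    state = writeDraft(state);
  } while (!isGoodEnough(state) && state.attempts < maxAttempts);
  return state;
}

const finalState = runLoop(5);
console.log(finalState.attempts, finalState.draft); // 3 draft more more
```

The attempt limit matters in practice: without it, a check that never passes would loop forever.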

b) Multi-Agent Systems
LangGraph excels at orchestration, allowing you to deploy multiple independent AI agents within the same system.

You can break a massive, complex problem down into smaller parts.
Each agent can be a specialist (e.g., one Researcher agent, one Writer agent, and one Editor agent).
They can collaborate, pass tasks to one another, and transfer control dynamically based on the graph's rules.

c) Native Observability via LangSmith
Debugging autonomous agents can be incredibly difficult because they make decisions on the fly. LangGraph integrates natively with LangSmith to solve this.
With LangSmith connected, you get full visibility into the system:

Traceability: See exactly which agent was called, what prompt was sent, and what the response was at every single node.
Cost & Latency Tracking: Monitor token usage and execution time for each step of your advanced workflow.
Time Travel: Pause the execution, look at the state, and even modify it to test how your agents react to different inputs.

Technical implementation

Setup — NestJS + Handlebars

NestJS gives us a structured, modular backend with dependency injection out of the box. Handlebars is wired in as the view engine so it can serve a lightweight UI without pulling in a full frontend framework.

If you are planning to run this code yourself, please remember to add your OPENAI_API_KEY to the .env file.

Codebase review

The codebase consists of a controller, which handles the endpoints, and a service, which handles the agent-related code.

The agent's graph is built in the constructor of graph.service.ts:

    const workflow = new StateGraph(StateAnnotation)
      .addNode("scoreCompanies", scoreCompanies)
      .addNode("humanReview", humanReview)
      .addNode("generateOverview", generateOverview)
      .addEdge(START, "scoreCompanies")
      .addEdge("scoreCompanies", "humanReview")
      .addEdge("humanReview", "generateOverview")
      .addEdge("generateOverview", END);

The graph has a start, an end, nodes and edges. It can also be built with other kinds of nodes and edges, such as ToolNode for MCP-like tools or conditional edges (addConditionalEdges) for routing to different nodes at runtime. As the code above shows, each addEdge call connects two nodes.
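
To show what "nodes connected by edges" means in plain code, here is a toy runner that walks the same three node names. It is only a mental model under my own assumptions (the state shape, the START/END values and runGraph are hypothetical), not how LangGraph is implemented.

```typescript
// Toy model of a graph: named nodes plus "from -> to" edges.
type PortfolioState = { log: string[] };
type NodeFn = (state: PortfolioState) => PortfolioState;

const START = "__start__";
const END = "__end__";

const nodes: Record<string, NodeFn> = {
  scoreCompanies: (s) => ({ log: [...s.log, "scored"] }),
  humanReview: (s) => ({ log: [...s.log, "reviewed"] }),
  generateOverview: (s) => ({ log: [...s.log, "overview"] }),
};

// Each entry connects exactly two nodes, like one addEdge call.
const edges: Record<string, string> = {
  [START]: "scoreCompanies",
  scoreCompanies: "humanReview",
  humanReview: "generateOverview",
  generateOverview: END,
};

// Follow the edges from START to END, running each node on the way.
function runGraph(initial: PortfolioState): PortfolioState {
  let state = initial;
  let current: string | undefined = edges[START];
  while (current !== undefined && current !== END) {
    const node = nodes[current];
    if (!node) throw new Error(`Unknown node: ${current}`);
    state = node(state);
    current = edges[current];
  }
  return state;
}

console.log(runGraph({ log: [] }).log);
// [ 'scored', 'reviewed', 'overview' ]
```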

Visualising the graph with Mermaid

Mermaid is a tool to go from text to a living diagram.
The graph can be visualised with this piece of code:

const graph = await this.appGraph.getGraph();
console.log(graph.drawMermaid());

It returns diagram code which, when pasted into an editor (such as https://mermaid.live/), is displayed as follows:

Although this workflow looks very simple, here is an example of what a more advanced LangGraph workflow with more features could look like:

Running the workflow

The workflow starts when the user clicks 'Start' in the UI and the @Post('start') endpoint runs. Data is read from a JSON file and passed to the first node (the data could come from any source, such as a database).

The first node invoked is scoreCompanies. For each stock element from the data source it asks OpenAI to generate a risk score, then passes the state to the next node.

The next node is humanReview, which interrupts the graph to wait for a response from the user. Note that the user's review has a separate endpoint, @Post('review'), which resumes the interrupted graph when the user clicks approve or reject in the UI. That endpoint contains the logic to process all stock items. When that is done, the graph continues to the generateOverview node, which again calls the LLM to produce a text review of the stock portfolio.
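
The pause-and-resume flow can be pictured as a small state machine. The sketch below is a simplified stand-in for the two endpoints, not the code from the repository and not LangGraph's interrupt mechanism; the session shape, function names and ticker symbols are hypothetical.

```typescript
type Decision = "approve" | "reject";

interface ReviewSession {
  pending: string[]; // stocks still waiting for a human decision
  approved: string[];
  status: "waiting_for_human" | "done";
}

// Like the 'start' endpoint: prepare the work, then pause for the human.
function startReview(stocks: string[]): ReviewSession {
  return {
    pending: [...stocks],
    approved: [],
    status: stocks.length > 0 ? "waiting_for_human" : "done",
  };
}

// Like the 'review' endpoint: apply one decision, then either pause
// again for the next stock or finish.
function submitDecision(
  session: ReviewSession,
  decision: Decision,
): ReviewSession {
  const [current, ...rest] = session.pending;
  if (current === undefined) return session; // nothing left to review
  return {
    pending: rest,
    approved:
      decision === "approve"
        ? [...session.approved, current]
        : session.approved,
    status: rest.length > 0 ? "waiting_for_human" : "done",
  };
}

let session = startReview(["AAA", "BBB", "CCC"]);
session = submitDecision(session, "approve");
session = submitDecision(session, "reject");
session = submitDecision(session, "approve");
console.log(session.status, session.approved); // done [ 'AAA', 'CCC' ]
```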

Summary

LangGraph provides a lot of tooling for creating flexible, robust and advanced agent workflows. It also works very well with observability tools for LLM apps such as LangSmith. In this post I presented an example workflow with several features, the codebase, and ideas for how it can be extended.

Building a Knowledge Base with RAG Using NestJS, LangChain and OpenAI

Daniel Malek — Tue, 03 Mar 2026 11:27:12 +0000

Source code: github.com/Dan1618/Articles-rag

1. What We're Building and Why

Retrieval-Augmented Generation (RAG) is a technique that enhances Large Language Model responses by grounding them in external data. Instead of relying solely on the model's training data, RAG retrieves relevant information from your own curated sources and injects it into the prompt, producing answers that are more accurate and up-to-date.
In this project we build a system where you can save articles from the web into a vector database and then ask questions about their content through a chat-like interface. Think of it as assembling a personal knowledge base: every article you feed in becomes searchable context for future queries.
Could you just point a bot at a live website each time? Sure — but by persisting the data in a vector store you are building a knowledge base that grows over time. There is nothing stopping you from combining both approaches, or extending the pipeline to ingest PDFs and other document types as well.

The Stack

Backend framework - NestJS

Templating / UI - Handlebars

LLM orchestration - LangChain

Vector store - FAISS (local)

Embeddings & chat model - OpenAI API

LangChain is an open-source framework that simplifies building applications powered by language models. It provides ready-made abstractions for document loading, text splitting, embedding, vector storage, and chaining LLM calls together — so you can focus on your application logic rather than low-level plumbing.

FAISS (Facebook AI Similarity Search) is a library for efficient similarity search over dense vectors. We use it here as a local vector store, which is simpler to set up and demonstrate than a hosted vector database, while still being fast enough for production-grade similarity lookups.

The UI

The user interface includes three input fields for links, a field for asking related questions, and buttons to save articles to FAISS or generate answers.

2. Technical Implementation

The application is split into two main services:

IngestService — loads articles from the web, splits them into chunks, creates embeddings, and saves them to the FAISS index.
AppService — loads the saved FAISS index, retrieves relevant chunks for a given question, and runs a Map-Reduce QA chain to produce an answer.

2.1 Setup — NestJS + Handlebars

NestJS gives us a structured, modular backend with dependency injection out of the box. Handlebars is wired in as the view engine so we can serve a lightweight chat-style UI without pulling in a full frontend framework. The two services above are standard NestJS @Injectable() providers.

2.2 Ingesting Articles

The IngestService.ingest() method handles the entire pipeline from raw URL to searchable vector store:

@Injectable()
export class IngestService {
    private readonly directory = 'faiss_store';
    private readonly urlsFile = path.join(this.directory, 'urls.json');

    async ingest(urls: string[]) {
        console.log('Building new FAISS store...');
        const embeddings = new OpenAIEmbeddings();

        // 1. Load data from each URL using Cheerio
        const docs = [];
        for (const url of urls) {
            const loader = new CheerioWebBaseLoader(url);
            const loadedDocs = await loader.load();
            docs.push(...loadedDocs);
        }

        // 2. Split documents into chunks
        const textSplitter = new RecursiveCharacterTextSplitter({
            separators: ['\n\n', '\n', '.', ','],
            chunkSize: 1000,
        });
        const splitDocs = await textSplitter.splitDocuments(docs);

        // 3. Create embeddings and persist the FAISS index
        const vectorStore = await FaissStore.fromDocuments(splitDocs, embeddings);
        await vectorStore.save(this.directory);
        fs.writeFileSync(this.urlsFile, JSON.stringify(urls));
    }
}

Let's walk through the key stages.

Loading — `CheerioWebBaseLoader`

CheerioWebBaseLoader is a LangChain document loader that fetches a web page and extracts its text content using the Cheerio HTML parser. Each URL becomes a Document object containing the page text and metadata.

Splitting — Why Chunks Matter

LLMs have a finite context window — the maximum number of tokens a model can process in a single request. If a model has a 4,000-token limit and your prompt already uses 3,500 tokens, only 500 tokens are left for the completion. A full article easily exceeds that budget, so we need to split it into smaller chunks that fit comfortably inside the context window.

RecursiveCharacterTextSplitter handles this by trying a hierarchy of separators (\n\n, \n, ., ,) to find natural break points. We set chunkSize: 1000 to keep each chunk under 1,000 characters.

Overlap Chunks

RecursiveCharacterTextSplitter also supports an overlap option (chunkOverlap), which allows adjacent chunks to share a portion of text at their boundaries. Think of it like the "previously on…" recap at the start of a TV episode followed by the "coming up next" teaser at the end — it ensures that context isn't lost at the seams between chunks.
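
To see what overlap does to the chunk boundaries, here is a deliberately simplified splitter. Unlike RecursiveCharacterTextSplitter it ignores separators and cuts at fixed positions, and the function name is my own, so treat it as an illustration of the overlap idea only.

```typescript
// Simplified fixed-size splitter with overlap (no separator handling).
function splitWithOverlap(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("Need chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far each chunk start advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

console.log(splitWithOverlap("abcdefghij", 4, 2));
// [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Each chunk repeats the last two characters of the previous one, which is the "previously on…" recap from the analogy above.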

Embedding & Saving — `FaissStore.fromDocuments`

FaissStore.fromDocuments(splitDocs, embeddings) sends each chunk to the OpenAI Embeddings API, converts the text into high-dimensional vectors, and indexes them in a FAISS store. The resulting index is then saved to disk with vectorStore.save(), so it can be reloaded later without re-embedding.

It is worth highlighting that the app connects to the OpenAI API to vectorize the data, and using that API is not free (although for some models, such as 'gpt-4o-mini', it is still quite cheap). If you want to use this application, please remember to include your OpenAI API key in the .env file. You can see the logs of the connection to OpenAI in the console. A similar process takes place when retrieving data from the FAISS index while answering questions.

2.3 Answering Questions

Once the articles are ingested, the AppService.answerQuestion() method handles the retrieval and answering:

async answerQuestion(question: string) {
    const embeddings = new OpenAIEmbeddings();
    const vectorStore = await FaissStore.load(this.directory, embeddings);

    const llm = new ChatOpenAI({
        modelName: 'gpt-4o-mini',
        temperature: 0.7,
        maxTokens: 1000,
    });

    const chain = loadQAChain(llm, { type: 'map_reduce' });

    // Retrieve the most relevant chunks
    const retriever = vectorStore.asRetriever();
    const retrievedDocs = await retriever.invoke(question);

    // Run the QA chain
    const result = await chain.invoke({
        input_documents: retrievedDocs,
        question: question,
    });

    // Get the sources
    const sources = Array.from(new Set(retrievedDocs.map(doc => doc.metadata.source)));

    return { status: 'done', answer: result.text, sources: sources };
}

Step by step:

Load the vector store — FaissStore.load() reads the previously saved index from disk.
Create the LLM — We use gpt-4o-mini with a temperature of 0.7 for a good balance of creativity and accuracy.
Retrieve relevant chunks — vectorStore.asRetriever() returns a retriever that performs a similarity search. When we call retriever.invoke(question), it embeds the question and finds the most similar chunks in the FAISS index.
Run the Map-Reduce chain — The retrieved documents and the question are passed into the chain, which produces the final answer.
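
The retrieval step boils down to comparing the question vector with every stored chunk vector and keeping the closest ones. The sketch below uses cosine similarity on tiny hand-made vectors purely for illustration; the types and function names are hypothetical, and the metric a FAISS index actually uses depends on how it is configured.

```typescript
interface EmbeddedChunk {
  text: string;
  vector: number[];
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query vector.
function topK(
  query: number[],
  chunks: EmbeddedChunk[],
  k: number,
): EmbeddedChunk[] {
  return [...chunks]
    .sort(
      (c1, c2) =>
        cosineSimilarity(query, c2.vector) -
        cosineSimilarity(query, c1.vector),
    )
    .slice(0, k);
}

// Tiny 2D vectors; real embeddings have hundreds or thousands of dimensions.
const embeddedChunks: EmbeddedChunk[] = [
  { text: "about cats", vector: [1, 0] },
  { text: "about dogs", vector: [0, 1] },
  { text: "about kittens", vector: [0.9, 0.1] },
];

console.log(topK([1, 0.05], embeddedChunks, 2).map((c) => c.text));
// [ 'about cats', 'about kittens' ]
```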

2.4 Map-Reduce: Reassembling the Chunks

When we split an article into chunks for ingestion, we eventually need a strategy to recombine those chunks when answering a question. This is where the Map-Reduce pattern comes in.

In the context of LLM applications, Map-Reduce operates in two phases:

Map — Each retrieved chunk is sent to the LLM individually. The model extracts or summarizes only the information relevant to the question, producing a filtered chunk (FC). This step runs in parallel, and its primary goal is to reduce the size of each chunk down to the essential content.
Reduce — All the filtered chunks are combined into a single summary, which is then sent to the LLM along with the original question to produce the final answer.

LangChain's loadQAChain with type: 'map_reduce' wires this up for you. Under the hood it uses two sub-chains:

An LLM chain that processes each individual document (the Map step).
A combine documents chain that merges the Map outputs into one cohesive input for the final LLM call (the Reduce step).
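
The two phases can be sketched with a stand-in model function. This is not LangChain's implementation: the Llm type, the prompts and mapReduceAnswer are hypothetical, and a fake model is used so the flow can be followed without an API key.

```typescript
// Stand-in for a model call. A real chain would call the LLM here.
type Llm = (prompt: string) => Promise<string>;

async function mapReduceAnswer(
  llm: Llm,
  chunks: string[],
  question: string,
): Promise<string> {
  // Map: ask about each chunk independently, in parallel.
  const filtered = await Promise.all(
    chunks.map((chunk) =>
      llm(`Extract what is relevant to "${question}" from:\n${chunk}`),
    ),
  );
  // Reduce: combine the filtered chunks and ask the final question once.
  const combined = filtered.join("\n");
  return llm(`Using these notes:\n${combined}\nAnswer: ${question}`);
}

// Fake model for the demo: records every prompt and returns a numbered reply.
const recordedPrompts: string[] = [];
const fakeLlm: Llm = async (prompt) => {
  recordedPrompts.push(prompt);
  return `reply ${recordedPrompts.length}`;
};

mapReduceAnswer(
  fakeLlm,
  ["chunk one", "chunk two", "chunk three"],
  "What is RAG?",
).then((answer) => console.log(answer, recordedPrompts.length));
// reply 4 4
```

Three chunks lead to four model calls: three Map calls plus one Reduce call. That call count is the cost the next section's optimization tries to avoid.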

The "Stuff" Optimization

Although in this implementation we set map_reduce explicitly, LangChain includes an internal optimization: if the total retrieved text (all chunks combined) is smaller than the LLM's context window or a pre-defined token_max limit, the chain detects that it is cheaper and faster to skip the Map phase entirely.

Instead of performing multiple Map calls followed by one Reduce call, it simply "stuffs" all the documents into a single prompt and makes one LLM call. This automatic fallback saves both time and API cost when the input is small enough to fit.
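
The decision described above can be pictured as a simple size check. The sketch below is my own approximation, not LangChain's code: the helper names are hypothetical, and the "four characters per token" estimate is a crude assumption rather than a real tokenizer.

```typescript
// Very rough token estimate: about 4 characters per token (an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

type Strategy = "stuff" | "map_reduce";

// Hypothetical helper mirroring the fallback: if everything fits under
// the limit, one "stuffed" prompt is enough.
function chooseStrategy(chunks: string[], tokenMax: number): Strategy {
  const total = chunks.reduce((sum, c) => sum + estimateTokens(c), 0);
  return total < tokenMax ? "stuff" : "map_reduce";
}

console.log(chooseStrategy(["a".repeat(400), "b".repeat(400)], 3000));
// stuff
console.log(chooseStrategy(["a".repeat(8000), "b".repeat(8000)], 3000));
// map_reduce
```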

3. Summary and Ideas for the Future

We've built a RAG pipeline that ingests articles from the web, stores them in a local FAISS vector store, and answers questions using a Map-Reduce QA chain powered by OpenAI. The key takeaways:

RAG grounds LLM responses in your own data, making answers more accurate and verifiable.
Chunking with overlap preserves context across split boundaries.
Map-Reduce elegantly handles cases where retrieved content exceeds the context window, with an automatic "Stuff" fallback for smaller inputs.
FAISS provides a zero-infrastructure vector store that is perfect for demos and small-to-medium workloads.

Where to Go from Here

Add PDF and file ingestion — extend the loader to support PDFLoader, TextLoader, and other LangChain document loaders for a richer knowledge base.
Persistent hosted vector store — migrate from local FAISS to a managed solution like Pinecone, Weaviate, or Qdrant for multi-user, production-grade deployments.
Streaming responses — use LangChain's streaming callbacks to deliver answers token-by-token for a more responsive chat experience.
Hybrid retrieval — combine vector similarity search with keyword-based (BM25) retrieval for better recall on exact-match queries.

You can not make a reverse engineering of “why” somebody made a decision

Daniel Malek — Thu, 01 May 2025 19:58:46 +0000

Reverse engineering is the process of disassembling and analyzing a finished product to understand how it works, often with the goal of recreating or improving upon it. It is also commonly encountered in software development, when you want to know how a program works.

To understand existing code you need special debugging tools and typically a lot of time. Once you understand it as a whole, you can do whatever you need… usually. What if you need to write a further feature, but the code was written in a non-standard way or contains a workaround that you cannot simply ignore? In such a situation you would want to know why the code was written this way in order to finish your task safely. But the reverse engineering process might not answer the question “why”.

Similar situations can occur in other engineering fields, such as building construction. Let me give some examples.

1. Space Planning and Layout:
The Observable: The kitchen is located on the north side of the house.
The Difficulty in Reverse Engineering: Was it to maximize natural light in the living areas on the south? Was it due to plumbing constraints? Did the homeowner prefer a cooler kitchen? Was it simply the most logical flow based on the overall footprint of the house? Maybe the architect had a specific design philosophy about the placement of service areas.

2. Material Selection:
The Observable: A specific type of brick was chosen for the facade.
The Difficulty in Reverse Engineering: Was it solely based on cost? Aesthetics? Durability in the local climate? A personal preference of the architect? A long-standing relationship with a particular supplier offering a "good deal"?
Perhaps the architect envisioned a certain texture or color that only this brick provided, even if alternatives existed. Maybe local material availability played a significant role that isn't explicitly documented.

If a new person is assigned to a building project and needs to make changes to the construction or architecture, they will definitely need to understand why things were designed a certain way before proceeding.

The same applies to software engineering. Let's look at some examples:

1. Choice of Programming Language:
The Observable: The application is built using Python.
The Difficulty in Reverse Engineering: Was it due to the team's existing expertise? The availability of specific libraries or frameworks? The perceived speed of development? Performance requirements for certain tasks? Perhaps the initial prototype was quickly built in Python, and the team decided to stick with it. Maybe the lead developer had a strong preference for Python's syntax and ecosystem.

2. Architectural Pattern Selection:
The Observable: The software follows a microservices architecture.
The Difficulty in Reverse Engineering: Was it chosen for scalability and independent deployments? To allow different teams to work on separate parts? Did the team anticipate a large and complex system from the outset? Perhaps a previous monolithic architecture proved difficult to maintain and evolve. Maybe the team wanted to experiment with a modern architectural style.

3. Use of a Specific Library or Framework Version:
The Observable: The project uses version 3.2.1 of a particular library.
The Difficulty in Reverse Engineering: Was this version chosen because it was the latest stable release at the time? Because it offered a specific feature that was required? Or perhaps because the team had prior experience with that exact version and felt comfortable with it? Maybe a newer version introduced breaking changes that the team wasn't ready to address. Without commit messages or project notes detailing the upgrade (or lack thereof), the exact reasoning is speculative.

4. Implementation Details of the Singleton Pattern:
The Observable: A class is implemented as a Singleton.
The Difficulty in Reverse Engineering: Was the Singleton pattern used to ensure a single instance for resource management? To provide a global point of access? Were the potential drawbacks of Singletons (like tight coupling and testability issues) fully considered? Were lazy initialization or thread-safe implementations specific requirements that influenced the chosen approach?

5. Implementation Details of a Specific Algorithm:
The Observable: A particular sorting algorithm is implemented in a specific way.
The Difficulty in Reverse Engineering: Was this implementation chosen for its time complexity? Its space complexity? Its readability? Did the developer optimize it for a specific use case or data distribution? Perhaps the developer simply remembered or copied this particular implementation without fully understanding the alternatives or the nuanced trade-offs.

In software engineering, the "why" behind a code decision can be influenced by a multitude of factors: technical constraints, team skills, time pressures, business requirements, personal preferences of developers, evolving understanding of the problem, and even legacy choices that were carried forward.

While code comments and documentation should ideally explain the reasoning, they are often incomplete or missing. This makes truly reverse engineering the original intent a challenging, if not impossible, task. We can analyze the code and its behavior, but the full story of the decisions that shaped it often remains within the minds of the developers who wrote it.

Managing software complexity – Simple is not easy

Daniel Malek — Thu, 01 May 2025 19:37:55 +0000

Recently I came across the sentence “Simple solutions scale better, are easier to maintain and deliver value faster.”

In this post I will show some examples where initial code that seems fine to implement may stop being adequate when the application grows and requires thinking in terms of scalability, maintainability and readability. Some parts of the code will require accepting different trade-offs, a broader perspective on what needs to be achieved and definitely more abstraction layers and a more sophisticated approach.

The examples below are written in JavaScript, but I believe they may be understood by any software developer.

1. Imperative vs. Declarative Programming (describing logic)

The Illusion of Simplicity (Imperative): Imperative programming focuses on how to achieve a result by explicitly stating the steps. It can seem simple for basic tasks.

const numbers = [1, 2, 3, 4, 5];
const doubled = [];
for (let i = 0; i < numbers.length; i++) {
  doubled.push(numbers[i] * 2);
}
console.log(doubled); // [2, 4, 6, 8, 10]

The hidden complexity: As the logic becomes more intricate, imperative code can become verbose and harder to reason about. Managing state and side effects explicitly can lead to more opportunities for errors.

The path to simplicity: Declarative programming focuses on what the result should be, abstracting away the control flow. Higher-order functions in JavaScript enable a more declarative style.

const numbers = [1, 2, 3, 4, 5];
const doubled = numbers.map(number => number * 2);
console.log(doubled); // [2, 4, 6, 8, 10]

The map function abstracts away the iteration process, making the code more concise and the intent (to double each number in the array) easier to see. While the map function itself has underlying complexity, it provides a simpler interface for the developer.

2. Reduce vs Map and Filter
This example shows that more sophisticated code, even when it does less work (here, a single pass over the array), might cause cognitive overload. Let’s say we want an array of doubled numbers, but only for values higher than 2. You may use 'reduce' for that: it loops through the array, multiplies the number if it is higher than 2, then pushes it to the output array.

const numbers = [1, 2, 3, 4, 5];
const doubledHigherThanTwo = numbers.reduce((acc, number) => {
  if (number > 2) {
    acc.push(number * 2);
  }
  return acc;
}, []);
console.log(doubledHigherThanTwo); // [6, 8, 10]

For this task you may also use filter and map combined, which reads more clearly. It keeps only the numbers higher than 2, then multiplies each remaining element by 2.

const numbers = [1, 2, 3, 4, 5];
const doubledHigherThanTwo =
  numbers
    .filter(number => number > 2)
    .map(number => number * 2);
console.log(doubledHigherThanTwo); // [6, 8, 10]

Both solutions are fine, and reduce may be more flexible in the future, but you can choose the second approach when code readability or maintainability (debugging) is your priority.
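The trade-off between the two versions can be made visible by counting callback invocations. The sketch below is only an illustration (the counter variables are my additions): reduce walks the array once, while filter followed by map walks the array and then the filtered result, creating an intermediate array on the way.

```javascript
const numbers = [1, 2, 3, 4, 5];

// Count how many times each approach invokes a callback.
let reduceCalls = 0;
const viaReduce = numbers.reduce((acc, number) => {
  reduceCalls++;
  if (number > 2) {
    acc.push(number * 2);
  }
  return acc;
}, []);

let chainCalls = 0;
const viaFilterMap = numbers
  .filter(number => { chainCalls++; return number > 2; })
  .map(number => { chainCalls++; return number * 2; });

console.log(viaReduce, reduceCalls);   // [6, 8, 10] 5
console.log(viaFilterMap, chainCalls); // [6, 8, 10] 8
```

For five elements the difference is irrelevant, which is why readability is usually the better deciding factor until profiling says otherwise.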

3. Factory design pattern
Imagine you're building a system to manage different types of notifications (email, SMS, push). Creating a notification might involve setting up API keys, formatting the message, and potentially logging the creation.

class EmailNotification {
  constructor(recipient, subject, body, apiKey) {
    this.recipient = recipient;
    this.subject = subject;
    this.body = body;
    this.apiKey = apiKey;
    this.setupEmailService();
    this.formatMessage();
    this.logCreation();
  }

  setupEmailService() {
    console.log(`Setting up email service with API key: ${this.apiKey.substring(0, 5)}...`);
    // Imagine actual API client initialization here
  }

  formatMessage() {
    this.formattedBody = `Subject: ${this.subject}\n\n${this.body}`;
    console.log("Email message formatted.");
  }

  send() {
    console.log(`Sending email to ${this.recipient}:\n${this.formattedBody}`);
    // Imagine actual sending logic here
  }

  logCreation() {
    console.log(`Email notification created for ${this.recipient} at ${new Date().toLocaleTimeString()}.`);
  }
}

class SMSNotification {
  constructor(phoneNumber, message, accountSid, authToken) {
    this.phoneNumber = phoneNumber;
    this.message = message;
    this.accountSid = accountSid;
    this.authToken = authToken;
    this.setupSMSService();
    this.truncateMessage();
    this.logCreation();
  }

  setupSMSService() {
    console.log(`Setting up SMS service with SID: ${this.accountSid.substring(0, 5)}...`);
    // Imagine actual API client initialization here
  }

  truncateMessage() {
    this.truncatedMessage = this.message.substring(0, 140); // Basic truncation
    console.log("SMS message truncated (if necessary).");
  }

  send() {
    console.log(`Sending SMS to ${this.phoneNumber}: ${this.truncatedMessage}`);
    // Imagine actual sending logic here
  }

  logCreation() {
    console.log(`SMS notification created for ${this.phoneNumber} at ${new Date().toLocaleTimeString()}.`);
  }
}

// Client code creating notifications directly
const email = new EmailNotification(
  "user@example.com",
  "Important Update",
  "This is the content of the important update.",
  "YOUR_EMAIL_API_KEY_SECRET"
);
email.send();

const sms = new SMSNotification(
  "+1234567890",
  "Hey, check out the latest news!",
  "ACCOUNTSID12345",
  "AUTHTOKEN_SECRET"
);
sms.send();

Problems with this approach:

Repetitive Initialization: The client code is responsible for knowing which concrete class to instantiate and providing all the necessary initialization parameters.

Tight Coupling: The client code is directly coupled to the concrete EmailNotification and SMSNotification classes. If you add a new notification type, you'll need to modify the client code.

Scattered Logic: The complex initialization steps are within each notification class. If the initialization logic becomes more involved or shared across notification types, it can lead to duplication.

Now, let's refactor this using a Factory Pattern. We'll create a NotificationFactory to handle the object creation.

class EmailNotification {
  constructor(recipient, subject, body) {
    this.recipient = recipient;
    this.subject = subject;
    this.body = body;
    this.isServiceSetup = false;
    this.isMessageFormatted = false;
  }

  setupService(apiKey) {
    console.log(`Setting up email service with API key: ${apiKey.substring(0, 5)}...`);
    this.apiKey = apiKey;
    this.isServiceSetup = true;
  }

  formatMessage() {
    this.formattedBody = `Subject: ${this.subject}\n\n${this.body}`;
    console.log("Email message formatted.");
    this.isMessageFormatted = true;
  }

  send() {
    if (!this.isServiceSetup || !this.isMessageFormatted) {
      console.error("Email service not properly set up or message not formatted.");
      return;
    }
    console.log(`Sending email to ${this.recipient}:\n${this.formattedBody}`);
    // Imagine actual sending logic here
  }

  logCreation() {
    console.log(`Email notification created for ${this.recipient} at ${new Date().toLocaleTimeString()}.`);
  }
}

class SMSNotification {
  constructor(phoneNumber, message) {
    this.phoneNumber = phoneNumber;
    this.message = message;
    this.isServiceSetup = false;
    this.isMessageTruncated = false;
  }

  setupService(accountSid, authToken) {
    console.log(`Setting up SMS service with SID: ${accountSid.substring(0, 5)}...`);
    this.accountSid = accountSid;
    this.authToken = authToken;
    this.isServiceSetup = true;
  }

  truncateMessage() {
    this.truncatedMessage = this.message.substring(0, 140); // Basic truncation
    console.log("SMS message truncated (if necessary).");
    this.isMessageTruncated = true;
  }

  send() {
    if (!this.isServiceSetup || !this.isMessageTruncated) {
      console.error("SMS service not properly set up or message not truncated.");
      return;
    }
    console.log(`Sending SMS to ${this.phoneNumber}: ${this.truncatedMessage}`);
    // Imagine actual sending logic here
  }

  logCreation() {
    console.log(`SMS notification created for ${this.phoneNumber} at ${new Date().toLocaleTimeString()}.`);
  }
}

class NotificationFactory {
  constructor(emailApiKey, smsAccountSid, smsAuthToken) {
    this.emailApiKey = emailApiKey;
    this.smsAccountSid = smsAccountSid;
    this.smsAuthToken = smsAuthToken;
  }

  createNotification(type, ...args) {
    switch (type) {
      case 'email': {
        const emailNotification = new EmailNotification(args[0], args[1], args[2]);
        emailNotification.setupService(this.emailApiKey);
        emailNotification.formatMessage();
        emailNotification.logCreation();
        return emailNotification;
      }
      case 'sms': {
        const smsNotification = new SMSNotification(args[0], args[1]);
        smsNotification.setupService(this.smsAccountSid, this.smsAuthToken);
        smsNotification.truncateMessage();
        smsNotification.logCreation();
        return smsNotification;
      }
      default:
        throw new Error(`Unknown notification type: ${type}`);
    }
  }
}

// Client code using the factory
const notificationFactory = new NotificationFactory(
  "YOUR_EMAIL_API_KEY_SECRET",
  "ACCOUNTSID12345",
  "AUTHTOKEN_SECRET"
);

const emailNotification = notificationFactory.createNotification(
  'email',
  "user@example.com",
  "Important Update",
  "This is the content of the important update."
);
emailNotification.send();

const smsNotification = notificationFactory.createNotification(
  'sms',
  "+1234567890",
  "Hey, check out the latest news!"
);
smsNotification.send();

Benefits of using the Factory Pattern here:

Centralized Initialization: The complex initialization logic (setting up services, formatting/truncating messages, logging) is now handled within the NotificationFactory. The client code doesn't need to know the specifics of how each notification type is set up.

Decoupling: The client code interacts with the NotificationFactory interface, not the concrete notification classes directly. This makes it easier to add new notification types in the future without modifying the client code. You would simply extend the factory.

Simplified Client Code: The client code for creating notifications becomes much cleaner and more focused on providing the necessary data (recipient, subject, body, phone number, message).

Improved Maintainability: Changes to the initialization process for a specific notification type are isolated within the factory, making the system easier to maintain and debug.

In this example, the NotificationFactory encapsulates the complex steps involved in creating and initializing different notification objects, providing a cleaner and more maintainable way to manage object creation.
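Extending the factory above means adding another case to its switch statement. One possible variation, sketched here under the assumption that each notification type can be built by a single creator function, replaces the switch with a lookup table. NotificationRegistry, register and the 'push' type are hypothetical names used only for illustration; they are not part of the example above.

```javascript
// Hypothetical registry-based variant of the factory.
class NotificationRegistry {
  constructor() {
    this.creators = new Map();
  }

  // Associate a type name with a function that builds a ready-to-send notification.
  register(type, creator) {
    this.creators.set(type, creator);
    return this; // allow chaining
  }

  create(type, ...args) {
    const creator = this.creators.get(type);
    if (!creator) {
      throw new Error(`Unknown notification type: ${type}`);
    }
    return creator(...args);
  }
}

// Adding a new type is a registration, not an edit of the factory itself.
const registry = new NotificationRegistry()
  .register('push', (deviceId, text) => ({
    send: () => `Push to ${deviceId}: ${text}`,
  }));

const push = registry.create('push', 'device-42', 'Hello!');
console.log(push.send()); // Push to device-42: Hello!
```

The price is one more level of indirection, which is the same "simple is not easy" trade-off this post is about.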

SUMMARY

These examples illustrate that what appears simple on the surface can often lead to significant complexity as the application scales or the requirements become more intricate. Achieving true simplicity often involves adopting more sophisticated patterns, architectures and language features that abstract away underlying complexities, leading to more maintainable, readable and robust code in the long run. "Simple is not easy" because it requires careful design, thoughtful abstraction, and often, embracing more powerful but initially seemingly more complex tools and paradigms.

Code as design

Daniel Malek — Wed, 11 Dec 2024 18:00:03 +0000

1. Introduction

For many years there has been an open discussion in the industry about whether software is better treated as design or as engineering, going back to the famous essay “What Is Software Design?” by Jack W. Reeves (1992).

While this post touches on a similar topic, I believe the primary focus should be on the desired outcome of the process rather than on the specific terminology used. The difference between "development" and "engineering" is ultimately secondary to the fundamental goal of a project.

In this post I will show some techniques through which well-designed code becomes something more than just a working program: it becomes communicative and understandable, and not only to experts.

2. Design patterns

2.1. Builder pattern

Let us say the goal is to create a line chart composed of an X axis, a Y axis, and a line.

You can use this simple piece of code:

chartBuilder
  .setXAxis()
  .setYAxis()
  .setLine()
  .build();

The code is easy to follow: it creates two axes and a line by invoking functions that display geometric objects. The order of invocation does not matter here. What if you wanted to add two more lines and a grid?

chartBuilder
  .setXAxis()
  .setYAxis()
  .setGrid()
  .setLine()
  .setLine()
  .setLine()
  .build();

Why do I find this example important? Because I suppose anyone can understand this code, which opens the way to making code collaborative for people of different specializations.

It is also worth mentioning that engineers have extra work to do: handling what happens under each function (e.g. setLine, setGrid) and exposing it in a comprehensible way.

The Builder is one of many Design Patterns: typical solutions to common problems in software design. You can learn more about them here https://refactoring.guru/design-patterns
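To give a feel for the work hidden behind a fluent interface such as chartBuilder, here is a minimal sketch. It assumes that each setter only records a part name and returns the builder itself; a real implementation would draw something, and the ChartBuilder class shown here is hypothetical.

```javascript
// Hypothetical ChartBuilder: collects part names instead of drawing them.
class ChartBuilder {
  constructor() {
    this.parts = [];
  }

  // Each setter returns `this`, which is what makes the chained calls possible.
  setXAxis() { this.parts.push('x-axis'); return this; }
  setYAxis() { this.parts.push('y-axis'); return this; }
  setGrid()  { this.parts.push('grid');   return this; }
  setLine()  { this.parts.push('line');   return this; }

  // Return a snapshot describing the finished chart.
  build() {
    return { parts: [...this.parts] };
  }
}

const chart = new ChartBuilder()
  .setXAxis()
  .setYAxis()
  .setGrid()
  .setLine()
  .setLine()
  .build();

console.log(chart.parts); // ['x-axis', 'y-axis', 'grid', 'line', 'line']
```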

2.2. Adapter pattern

Consider a scenario where there is an old library that uses XML to store data, but your new application requires JSON data. You can use the adapter pattern to convert the XML data to JSON before passing it to your application. The aim of the adapter pattern is to allow two incompatible interfaces to work together.

Here is a short code snippet in which the adapter is a single function, XMLToJSONAdapter.

After the XML data is fetched, it is passed as an argument to the XMLToJSONAdapter function (which is in charge of converting the XML format into JSON). The converted data is then passed through getJSONData, and the result is assigned to the constant JSONData.

const XMLData = fetchData();
const JSONData = getJSONData(XMLToJSONAdapter(XMLData));

I believe the entry level needed to understand this code is pretty low.

Another, slightly more complicated example implements the adapter pattern in an object-oriented way, doing much the same work inside the getJsonData method:

class XmlToJsonAdapter {
    constructor(xmlData) {
        this.xmlData = xmlData;
    }

    getJsonData() {
        // ... convert XML to JSON ...
        return jsonData;
    }
}
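The conversion step itself is left out above. As a rough sketch only, the adapter below handles flat XML (no attributes, no nesting, no escaping) with a regular expression; a real project would use a proper XML parser. The class name FlatXmlToJsonAdapter and the sample data are hypothetical.

```javascript
// Naive adapter for flat XML only; an assumption made to keep the sketch short.
class FlatXmlToJsonAdapter {
  constructor(xmlData) {
    this.xmlData = xmlData;
  }

  getJsonData() {
    const result = {};
    // Match leaf elements such as <name>Ann</name>.
    const leafElement = /<(\w+)>([^<]*)<\/\1>/g;
    for (const [, tag, text] of this.xmlData.matchAll(leafElement)) {
      result[tag] = text;
    }
    return JSON.stringify(result);
  }
}

const xml = '<user><name>Ann</name><city>Oslo</city></user>';
const json = new FlatXmlToJsonAdapter(xml).getJsonData();
console.log(json); // {"name":"Ann","city":"Oslo"}
```

The caller only ever sees getJsonData, so the naive parsing can later be swapped for a real parser without touching the application code.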

2.3. Facade pattern

Another example is the Facade design pattern, whose aim is to provide a simplified interface to a complex subsystem. This pattern defines a higher-level interface that makes the subsystem easier to use. In the code below, the difficult part of the code, such as connecting to an external system or any additional computation, is hidden in the Weather class (the getTemperature, getHumidity and getWindSpeed functions).


function getCurrentWeather() {
    const temperature = Weather.getTemperature();
    const humidity = Weather.getHumidity();
    const windSpeed = Weather.getWindSpeed();

    return {
        temperature,
        humidity,
        windSpeed,
    };
}

The getCurrentWeather function returns temperature, humidity and wind speed. A person reading the code sees where those indicators come from and how they are returned together as a cohesive weather state. Obtaining each value, which we assume for this example to be the complicated part, is moved into the getTemperature, getHumidity and getWindSpeed functions.

2.4. Example connecting different parts of code and multiple Design Patterns

Let us say there is an existing system that is no longer developed, but some of its features will be used by a new system that is being built.
In such a case, a good solution might be to create a new piece of software that translates between those systems and does not let the old system directly affect the new one. Such an approach is called an Anti-corruption layer (ACL).

Examples of usage of ACL in real life:
Core Banking and Mobile Banking: The ACL could be used to transform complex financial data from the core banking system into a simplified format suitable for mobile devices, ensuring a smooth user experience.
IoT Devices and Supply Chain Management: The ACL could be used to transform raw IoT data into meaningful insights that can be integrated into the supply chain management system, improving visibility and efficiency.

The Anti-corruption layer is also a Design Pattern, but in the example presented here it operates between two different systems and consists of multiple parts of code.

The diagram below shows an overview of two example systems connected by an Anti-corruption layer. On the right-hand side is the system we want to connect to, in the middle is the ACL, and on the left-hand side is your brand new system, which you are proud of. In the Anti-corruption layer you can see a Facade, whose role is to hide the complexity of the system you want to connect to. There are also two Adapters, whose role is to translate the data format of the previous system into the one that will be used from now on.

Creating a diagram that connects different sub-systems is a good idea when you want to make your project more understandable to both technical and non-technical people. A diagram can be treated as a map used to navigate the codebase. Thanks to its visual nature, anyone can see how things in your codebase are connected, and if a more detailed understanding is needed, people can read the code themselves.
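In code, an Anti-corruption layer built from one Facade and two Adapters might look roughly like the sketch below. Everything in it is made up for illustration: the legacy system, its field names, the sample values and the currency are assumptions, not taken from a real project.

```javascript
// Hypothetical legacy system with its own naming and formats.
const legacySystem = {
  fetchCustomerRecord: (id) => ({ CUST_ID: id, FULL_NM: 'DOE, JANE' }),
  fetchBalanceInCents: (id) => 123456,
};

// Adapters: translate legacy formats into the new system's model.
const customerAdapter = (record) => {
  const [lastName, firstName] = record.FULL_NM.split(', ');
  return { id: record.CUST_ID, firstName, lastName };
};
const balanceAdapter = (cents) => ({ amount: cents / 100, currency: 'USD' });

// Facade: the only entry point the new system is allowed to use.
const antiCorruptionLayer = {
  getAccountOverview(id) {
    return {
      customer: customerAdapter(legacySystem.fetchCustomerRecord(id)),
      balance: balanceAdapter(legacySystem.fetchBalanceInCents(id)),
    };
  },
};

console.log(antiCorruptionLayer.getAccountOverview(7));
```

The new system never sees CUST_ID or FULL_NM, so a change in the legacy format only requires touching the adapters.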

3. Infrastructure as code

This chapter shows yet another way to use code. I find it important to note, because Infrastructure as Code is closer to the physical world, as is Design at its roots.

Infrastructure as Code is a practice that involves managing and provisioning infrastructure through code. This means that instead of manually configuring servers or networks they can be defined in configuration files. These files can be versioned, tested, and deployed just like any other software code.

The diagram below visualizes the process. The user writes code, which is then version-controlled and mapped, through an Automation API or Server, directly onto infrastructure.

As mentioned before, IaC is closer to the physical world than regular programming. The idea of defining and building products through precise instructions, rather than manual processes, is also common in the manufacturing industry and has a strong connection with design.

Key similarities between IaC and Manufacturing:

Blueprint-Driven Approach:
IaC: Engineers write code (blueprints) to define the desired infrastructure.
Manufacturing: Engineers create blueprints (CAD models, schematics) to design physical products.

Automation and Repeatability:
IaC: Automation tools execute the code to provision and configure infrastructure, ensuring consistency.
Manufacturing: Automated machinery follows precise instructions to produce identical products.

Version Control and Traceability:
IaC: Code is version-controlled, allowing tracking of changes, collaboration, and rollback.
Manufacturing: Product designs and manufacturing processes are version-controlled to maintain quality and consistency.

Continuous Improvement:
IaC: Infrastructure code is continuously refined and optimized to improve performance and reliability.
Manufacturing: Manufacturing processes are constantly analyzed and improved to increase efficiency and reduce costs.

Finally, I would like to show an example IaC snippet to demonstrate that some basic configurations may be clear even to non-technical people. The configuration does describe technical infrastructure, though, so it requires some vocabulary. Moreover, code describing more advanced infrastructure might be much more complex.

The snippet below defines a reusable configuration for a Google Compute Engine virtual machine with a Debian 11 boot disk. By providing values for the deployment_identifier variable, you can create multiple virtual machines with unique names based on this configuration. You can also modify the machine_type and network configuration to suit specific needs.

variable "machine_type" {
  type    = string
  default = "n1-standard-1"
}

variable "zone" {
  type    = string
  default = "us-central1-a"
}

variable "deployment_identifier" {
  description = "The unique name for your instance"
  type        = string
}

resource "google_compute_instance" "default" {
  name         = "vm-${var.deployment_identifier}"
  machine_type = var.machine_type
  zone         = var.zone

  boot_disk {
    device_name = "boot"
    auto_delete = true
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = "default"
    access_config {
      // Ephemeral IP
    }
  }
}

Source: https://cloud.google.com/service-catalog/docs/terraform-configuration

4. Summary

Code is a flexible tool, based on the Latin alphabet (and special characters), which makes it possible to articulate processes of varying complexity and to give instructions that are understandable to both humans and machines, thereby making technology more inclusive. Despite the difficulties that come with writing software, code has traits that make it comprehensible.

Bibliography

Design: The Whole Story by Elizabeth Wilhide
Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
https://www.developerdotstar.com/mag/articles/reeves_design.html

Clean architecture with Next.js

Daniel Malek — Mon, 05 Aug 2024 18:38:07 +0000

1) Introduction and Clean architecture

Software architecture depends on many things, but there are still some concepts and good practices that are worth being familiar with. In this post I will show how to implement clean architecture using the Next.js framework, focusing on the most important parts. It may be helpful if you plan to start a new project with up to several software developers on board. Intermediate knowledge of software development is needed to understand this article.
I have created a basic Calendar app to show how such a project can be shaped; you can check it on GitHub.

The key concepts of Clean Architecture are:
Separation of Concerns: Different parts of the application handle distinct responsibilities, making the code base easier to understand and modify.
Dependency Rule: Inner layers should not depend on outer layers. This improves maintainability, flexibility and reusability, making code modular, easier to refactor and more technology agnostic.
Testability: The architecture facilitates thorough unit testing by isolating components.

2) Layers

From a technical point of view, the idea is based on dividing the application into layers and connecting them with a number of adapters, the repository pattern and dependency inversion. In the picture below you can see how well each concern of an app can be separated. Further on, I will describe each part with code examples.

Source: https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html

3) Entities and Use Cases

Entities represent core business concepts with their data and rules, forming the heart of the system.
Use Cases define the actions a system can perform, orchestrating how entities interact to achieve specific goals. Think of entities as the building blocks and Use Cases as the blueprints for constructing a software application.

Core business logic can stay in one place, framework-independent and well tested. Does that sound good? You can achieve it by using Entities and Use Cases. The database may change, the framework may change, but the core logic will remain independent and in one place. What happens when new business requirements arrive? Just edit the proper Use Case.

In the Calendar App example I provided, the Use Cases seem very simple (even the ICalendarEvent entity is just an interface), but their business logic might be extended by adding features such as:

Date and time consistency: Verify that start time precedes end time.
Event duration limits: Define minimum and maximum event durations.
Conflict detection: Check for overlapping events based on event times and locations.
Capacity limits: Enforce attendance restrictions for events.
Category-based filtering: Implement filtering events based on categories.
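Two of the rules listed above, time consistency and conflict detection, can be sketched as plain functions. This is an assumption-laden illustration, not code from the Calendar App: the event shape ({ title, start, end }) and the names assertValidEvent, overlaps and addEventUseCase are hypothetical, and locations are ignored.

```javascript
// Hypothetical event shape: { title, start: Date, end: Date }.
function assertValidEvent(event) {
  if (event.start >= event.end) {
    throw new Error('Start time must precede end time.');
  }
}

// Two events overlap when each one starts before the other ends.
function overlaps(a, b) {
  return a.start < b.end && b.start < a.end;
}

// Use case: pure business logic, no framework or database imports.
function addEventUseCase(existingEvents, newEvent) {
  assertValidEvent(newEvent);
  const conflict = existingEvents.find(event => overlaps(event, newEvent));
  if (conflict) {
    throw new Error(`Conflicts with "${conflict.title}".`);
  }
  return [...existingEvents, newEvent];
}

const standup = {
  title: 'Standup',
  start: new Date('2024-08-05T09:00:00Z'),
  end: new Date('2024-08-05T09:15:00Z'),
};
const review = {
  title: 'Review',
  start: new Date('2024-08-05T10:00:00Z'),
  end: new Date('2024-08-05T11:00:00Z'),
};

const events = addEventUseCase([standup], review);
console.log(events.length); // 2
```

Because nothing here touches React or a database, the rules can be unit tested in isolation.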

Remember:

Keep Use Cases focused on a single responsibility.
Prioritize business logic over technical implementation details (like database structures or UI elements).
Write clear and concise code for maintainability.
Test your Use Cases to ensure correct behavior.

4) Controllers, Presenters and Dependency inversion

The Controller captures an event that takes place in the application, invokes a Use Case and, if needed, passes the Use Case output to a Presenter, which maps the data to fit the UI.

The Controller is the place that connects the outside world with the application's core logic. Importantly, it also applies the dependency inversion principle. According to this principle, Use Cases should not depend on outer layers. The advantages of the dependency inversion principle are:

Improved maintainability: Changes in the underlying implementation (e.g., database) have minimal impact on the Use Cases.
Better code organization: Clear separation of concerns between business logic and technical details.
Enhanced flexibility: You can easily swap out different implementations of dependencies without affecting the core Use Cases.
Increased testability: By isolating Use Cases from external dependencies, you can easily write unit tests without relying on complex setups.
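The flow from Controller to Use Case to Presenter, with the repository handed in from outside, can be sketched as follows. The function signatures are my assumptions and the code is kept synchronous for brevity (real repositories are usually asynchronous); createFakeRepository, presentEvents and onDeleteClicked are hypothetical names.

```javascript
// Use case: depends only on the repository object it is handed.
function deleteEventUseCase(repository, eventId) {
  repository.deleteEvent(eventId);
  return repository.getEvents();
}

// Presenter: maps use case output to the shape the UI wants.
function presentEvents(events) {
  return events.map(event => `${event.id}: ${event.title}`);
}

// Controller: reacts to a UI event, runs the use case, hands the output to the presenter.
function onDeleteClicked(repository, eventId) {
  return presentEvents(deleteEventUseCase(repository, eventId));
}

// Fake repository for a unit test; no database involved.
function createFakeRepository(initialEvents) {
  let events = [...initialEvents];
  return {
    deleteEvent(id) { events = events.filter(event => event.id !== id); },
    getEvents() { return [...events]; },
  };
}

const fakeRepository = createFakeRepository([
  { id: 1, title: 'Standup' },
  { id: 2, title: 'Review' },
]);
console.log(onDeleteClicked(fakeRepository, 1)); // ['2: Review']
```

The use case never imports a database client, which is what makes the fake repository a drop-in replacement in tests.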

In the Calendar App that I have provided you can see exactly that: for example, when deleting a calendar event, deleteEventUseCase is invoked with a repository, and the output of the Use Case is used to refresh the view. The Controllers that I have made receive many arguments, which makes them hard to maintain. That might be improved in several ways, for example by using an external JavaScript dependency injection library or by creating your own system for that. A simple improvement in the Calendar Application might be to create hook controllers, which encapsulate state and interaction methods and return them:

useCalendarViewController:

export function useCalendarViewController(
  repository: IRepository,
) {
  const [calendarViewData, setCalendarViewData] = useState<TCalendarView | null>(null);

  // const nextCalendarView = async () => {
  // const prevCalendarView = async () => {
  // const fetchCalendarEventData = async (
  // const searchCalendarEvent = async (


  return { calendarViewData, nextCalendarView, prevCalendarView, fetchCalendarEventData, searchCalendarEvent }
}

CalendarViewComponent:

const { calendarViewData, nextCalendarView, prevCalendarView, fetchCalendarEventData, searchCalendarEvent } =  useCalendarViewController(repository);

5) Outer layer and Repository pattern

In Clean Architecture, the outer layer is the furthest from the core business logic. It's where all the implementation details reside. This layer is often referred to as the Frameworks and Drivers Layer.

In the Calendar App you can see that React components form yet another layer, which relies on data and abstractions provided by the inner layers. The repository with the database implementation details is also passed in from here. What is the advantage of using the repository pattern? At any time you can change the database (for example from MongoDB to MySQL, or just some implementation details) without touching the core business logic, as long as the interface fits.
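The "as long as the interface fits" part can be illustrated with two interchangeable storage backends. This is a minimal sketch under my own assumptions: the contract (getEvents and saveEvent) and all names are hypothetical, and in-memory structures stand in for real databases.

```javascript
// Hypothetical repository contract: getEvents() and saveEvent(event).
function createArrayRepository() {
  const rows = [];
  return {
    getEvents: () => [...rows],
    saveEvent: (event) => { rows.push(event); },
  };
}

function createMapRepository() {
  const byId = new Map();
  return {
    getEvents: () => [...byId.values()],
    saveEvent: (event) => { byId.set(event.id, event); },
  };
}

// Core logic knows only the contract, never the storage behind it.
function countEventsUseCase(repository) {
  return repository.getEvents().length;
}

for (const repository of [createArrayRepository(), createMapRepository()]) {
  repository.saveEvent({ id: 1, title: 'Standup' });
  console.log(countEventsUseCase(repository)); // 1
}
```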

6) Summary

We went through the most important parts of clean architecture. It provides good practices for setting up your project, and it is definitely worth considering, especially if you know that the project will grow to a medium or larger size. There are also drawbacks: for compact projects, applying all the rules might lead to increased boilerplate code or redundant complexity in the project structure, and you need to maintain more code when you pick the approach with a repository.

7) Bibliography

https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html
https://www.youtube.com/watch?v=wnxO4AT2N4o
https://betterprogramming.pub/clean-architecture-with-react-cc097a08b105
https://tooploox.com/yet-another-clean-architecture
