<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: lucasnscr</title>
    <description>The latest articles on DEV Community by lucasnscr (@lucasnscr).</description>
    <link>https://dev.to/lucasnscr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F786078%2Ff11f942a-d8fc-4be7-a0a7-d7ffa198fc7f.jpeg</url>
      <title>DEV Community: lucasnscr</title>
      <link>https://dev.to/lucasnscr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lucasnscr"/>
    <language>en</language>
    <item>
      <title>Customer Support Agent with ADK and LangGraph</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Mon, 26 Jan 2026 14:56:34 +0000</pubDate>
      <link>https://dev.to/lucasnscr/customer-support-agent-with-adk-and-langgraph-4k5n</link>
      <guid>https://dev.to/lucasnscr/customer-support-agent-with-adk-and-langgraph-4k5n</guid>
      <description>&lt;h1&gt;
  
  
  Customer Support Agent
&lt;/h1&gt;

&lt;p&gt;Customer support agent via WhatsApp with Jira integration, built with Python, FastAPI, LangGraph, and Redis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    WhatsApp     │────▶│     FastAPI     │────▶│   AgentRunner   │
│   (Webhook)     │     │   /webhooks     │     │   (ADK Style)   │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
│
▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Redis      │◀───▶│   StateStore    │◀───▶│    LangGraph    │
│   (Sessions)    │     │                 │     │   StateGraph    │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
│
┌───────────────┼───────────────┐
▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│ CustomerDB │  │    Jira    │  │  Parsing   │
│   (Tool)   │  │   (Tool)   │  │  (Utils)   │
└────────────┘  └────────────┘  └────────────┘

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conversational Flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Customer] “Hi”
│
▼
[Agent] “Hi! To help you, please send me your CNPJ.”
│
▼
[Customer] “11.111.111/0001-91”
│
▼
[System] Validates CNPJ in the database
│
├─── Invalid CNPJ ──▶ “I couldn’t find an active account. Please confirm the number.”
│                           │
│                           └──▶ [Back to requesting CNPJ]
│
└─── Valid CNPJ ──▶ “Perfect! Tell me the product and describe your request.”
│
▼
[Customer] “Product: ERP, Request: error in the report”
│
▼
[System] Creates ticket in Jira
│
▼
[Agent] “✅ Request created: SUP-123”

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
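The flow above is a small state machine: greet, validate the CNPJ, collect the request, open a ticket. A framework-free Python sketch of those transitions (the real project implements each step as a LangGraph node; `validate_cnpj` and `create_ticket` here are stubs standing in for the real tools):

```python
# Illustrative sketch of the conversational state machine shown above.
VALID_CNPJS = {"11111111000191", "22222222000191"}

def validate_cnpj(raw: str) -> bool:
    """Stand-in for the CustomerDB tool: normalize, then check the mock set."""
    return "".join(ch for ch in raw if ch.isdigit()) in VALID_CNPJS

def create_ticket(text: str) -> str:
    """Stand-in for the Jira tool; would return the created issue key."""
    return "SUP-123"

def step(state: dict, message: str) -> tuple[dict, str]:
    """Advance the conversation one turn; returns (new_state, reply)."""
    if state["step"] == "greet":
        return {"step": "ask_cnpj"}, "Hi! To help you, please send me your CNPJ."
    if state["step"] == "ask_cnpj":
        if validate_cnpj(message):
            return {"step": "ask_request"}, (
                "Perfect! Tell me the product and describe your request.")
        return state, "I couldn't find an active account. Please confirm the number."
    if state["step"] == "ask_request":
        return {"step": "done"}, f"✅ Request created: {create_ticket(message)}"
    return state, "This request is already resolved."

# Walk the happy path from the diagram
state = {"step": "greet"}
for msg in ["Hi", "11.111.111/0001-91",
            "Product: ERP, Request: error in the report"]:
    state, reply = step(state, msg)
```

An invalid CNPJ leaves the state unchanged, which is exactly the "back to requesting CNPJ" loop in the diagram.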



&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.12+&lt;/li&gt;
&lt;li&gt;Docker and Docker Compose&lt;/li&gt;
&lt;li&gt;Poetry (dependency manager)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Clone the repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;customer-support-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Install dependencies
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Poetry (if needed)&lt;/span&gt;
curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://install.python-poetry.org | python3 -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
poetry &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Configure environment variables&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Create a .env file at the root of the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Redis
REDIS_URL=redis://localhost:6379/0

# Jira
JIRA_BASE_URL=https://your-domain.atlassian.net
JIRA_EMAIL=your-email@example.com
JIRA_API_TOKEN=your-api-token
JIRA_PROJECT_KEY=SUP
JIRA_ISSUE_TYPE=Task

# WhatsApp (optional)
WHATSAPP_PROVIDER=cloudapi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
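`settings.py` reads these variables at startup. A stdlib-only sketch of that pattern (illustrative only; a real project might prefer a dedicated settings library with validation):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Hypothetical sketch of the env-var config described in the .env above."""
    redis_url: str = os.getenv("REDIS_URL", "redis://localhost:6379/0")
    jira_base_url: str = os.getenv("JIRA_BASE_URL", "")
    jira_email: str = os.getenv("JIRA_EMAIL", "")
    jira_api_token: str = os.getenv("JIRA_API_TOKEN", "")
    jira_project_key: str = os.getenv("JIRA_PROJECT_KEY", "SUP")
    jira_issue_type: str = os.getenv("JIRA_ISSUE_TYPE", "Task")
    whatsapp_provider: str = os.getenv("WHATSAPP_PROVIDER", "cloudapi")

settings = Settings()
```

Keeping every default in one frozen object makes the configuration easy to inspect and safe to share across the app.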



&lt;ol start="4"&gt;
&lt;li&gt;Start the services&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Redis and the application&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# Or start only Redis (for local development)&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt; redis

&lt;span class="c"&gt;# Run the application locally&lt;/span&gt;
poetry run uvicorn app.main:app &lt;span class="nt"&gt;--reload&lt;/span&gt; &lt;span class="nt"&gt;--port&lt;/span&gt; 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Testing with cURL
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Health check&lt;/span&gt;
curl http://localhost:8000/health

&lt;span class="c"&gt;# Simulate WhatsApp message - Greeting&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/webhooks/whatsapp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"from":"+5511999999999","text":"hi"}'&lt;/span&gt;

&lt;span class="c"&gt;# Send valid CNPJ&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/webhooks/whatsapp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"from":"+5511999999999","text":"11111111000191"}'&lt;/span&gt;

&lt;span class="c"&gt;# Send product and request&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/webhooks/whatsapp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"from":"+5511999999999","text":"Product: ERP System, Request: error when generating report"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Valid CNPJs for testing (mock)
&lt;/h3&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;CNPJ&lt;/th&gt;&lt;th&gt;Name&lt;/th&gt;&lt;th&gt;Status&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;11111111000191&lt;/td&gt;&lt;td&gt;Demo Client Ltd.&lt;/td&gt;&lt;td&gt;Active&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;22222222000191&lt;/td&gt;&lt;td&gt;Test Company S.A.&lt;/td&gt;&lt;td&gt;Active&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;33333333000191&lt;/td&gt;&lt;td&gt;Inactive Company&lt;/td&gt;&lt;td&gt;Inactive&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
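A sketch of how the mock lookup can accept both the formatted and the bare form of a CNPJ (names are illustrative; the real logic lives in `customer_db.py` and `parsing.py`):

```python
# Hypothetical sketch of the mock customer lookup: punctuation is stripped
# so "11.111.111/0001-91" and "11111111000191" both match the same record.
MOCK_CUSTOMERS = {
    "11111111000191": {"name": "Demo Client Ltd.", "status": "active"},
    "22222222000191": {"name": "Test Company S.A.", "status": "active"},
    "33333333000191": {"name": "Inactive Company", "status": "inactive"},
}

def normalize_cnpj(raw: str) -> str:
    """Keep only digits: '11.111.111/0001-91' -> '11111111000191'."""
    return "".join(ch for ch in raw if ch.isdigit())

def find_active_customer(raw: str):
    """Return the customer record only when it exists and is active."""
    customer = MOCK_CUSTOMERS.get(normalize_cnpj(raw))
    if customer and customer["status"] == "active":
        return customer
    return None
```

Inactive and unknown CNPJs both come back as `None`, which is what sends the flow back to "requesting CNPJ".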

&lt;h3&gt;
  
  
  API Endpoints
&lt;/h3&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Method&lt;/th&gt;&lt;th&gt;Endpoint&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;GET&lt;/td&gt;&lt;td&gt;/health&lt;/td&gt;&lt;td&gt;Health check&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;POST&lt;/td&gt;&lt;td&gt;/webhooks/whatsapp&lt;/td&gt;&lt;td&gt;Receives WhatsApp messages&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;GET&lt;/td&gt;&lt;td&gt;/sessions/{phone}&lt;/td&gt;&lt;td&gt;Retrieves session state&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;DELETE&lt;/td&gt;&lt;td&gt;/sessions/{phone}&lt;/td&gt;&lt;td&gt;Removes session&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2&gt;
  
  
  Tests
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run all tests&lt;/span&gt;
poetry run pytest

&lt;span class="c"&gt;# Run with coverage&lt;/span&gt;
poetry run pytest &lt;span class="nt"&gt;--cov&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;app

&lt;span class="c"&gt;# Run a specific test&lt;/span&gt;
poetry run pytest tests/test_graph_flow.py &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;customer-support-agent/
├── app/
│   ├── main.py                    # FastAPI webhook
│   ├── settings.py                # Config via env vars
│   └── agent/
│       ├── runner.py              # AgentRunner (ADK style)
│       ├── graph.py               # LangGraph StateGraph
│       ├── state.py               # SupportState TypedDict
│       ├── nodes.py               # Node functions
│       ├── tools/
│       │   ├── customer_db.py     # Customer validation (mock)
│       │   └── jira.py            # Jira REST integration
│       └── utils/
│           ├── parsing.py         # CNPJ, product, request parsing
│           └── store.py           # Redis state store
├── tests/
│   └── test_graph_flow.py         # Flow tests
├── docker-compose.yml
├── Dockerfile
├── pyproject.toml
└── README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Jira Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Creating an API Token
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://id.atlassian.com/manage-profile/security/api-tokens" rel="noopener noreferrer"&gt;https://id.atlassian.com/manage-profile/security/api-tokens&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Click “Create API token”&lt;/li&gt;
&lt;li&gt;Copy the token and add it to the .env&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Creating the Project
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create a project in Jira (e.g., “SUP” for Support)&lt;/li&gt;
&lt;li&gt;Configure the issue type (Task, Bug, etc.)&lt;/li&gt;
&lt;li&gt;Update the variables JIRA_PROJECT_KEY and JIRA_ISSUE_TYPE&lt;/li&gt;
&lt;/ol&gt;
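With the token and project in place, creating a ticket is one authenticated POST. A hedged sketch of the request the Jira tool has to assemble, shown against the Jira Cloud REST API v2 (which accepts a plain-text description; v3 expects Atlassian Document Format):

```python
import base64
import json

def build_jira_request(base_url, email, api_token, project_key, issue_type,
                       summary, description):
    """Build the URL, headers, and body for POST /rest/api/2/issue."""
    # Jira Cloud uses Basic auth with email:api_token
    token = base64.b64encode(f"{email}:{api_token}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    payload = {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": issue_type},
            "summary": summary,
            "description": description,
        }
    }
    return f"{base_url}/rest/api/2/issue", headers, json.dumps(payload)
```

The tool would send this with an HTTP client and read the created issue key (e.g. `SUP-123`) from the JSON response.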

&lt;h2&gt;
  
  
  WhatsApp Integration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  WhatsApp Cloud API
&lt;/h3&gt;

&lt;p&gt;Configure the webhook URL in Meta Business:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://your-domain.com/webhooks/whatsapp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Twilio
&lt;/h3&gt;

&lt;p&gt;Configure the webhook URL in the Twilio Console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://your-domain.com/webhooks/whatsapp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/lucasnscr/customer-support-agent" rel="noopener noreferrer"&gt;Repository URL&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>fastapi</category>
      <category>python</category>
    </item>
    <item>
      <title>Quarkus LangChain4j Book Recommendation System</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Mon, 14 Jul 2025 23:12:50 +0000</pubDate>
      <link>https://dev.to/lucasnscr/quarkus-langchain4j-book-recommendation-system-2cbo</link>
      <guid>https://dev.to/lucasnscr/quarkus-langchain4j-book-recommendation-system-2cbo</guid>
      <description>&lt;h1&gt;
  
  
  Quarkus Book Recommendation System
&lt;/h1&gt;

&lt;p&gt;An intelligent book recommendation system built with &lt;strong&gt;Quarkus&lt;/strong&gt; and &lt;strong&gt;LangChain4j&lt;/strong&gt;, demonstrating the power of AI to personalize reading experiences. The system uses multiple AI providers (&lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;Mistral AI&lt;/strong&gt;, and &lt;strong&gt;Ollama&lt;/strong&gt;) to deliver personalized recommendations based on user profiles and reading history.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/lucasnscr/book-recommendation" rel="noopener noreferrer"&gt;Repository&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🤖 &lt;strong&gt;Multi-AI Support&lt;/strong&gt;: Switch between OpenAI, Mistral AI, and Ollama&lt;/li&gt;
&lt;li&gt;📚 &lt;strong&gt;Personalized Recommendations&lt;/strong&gt;: AI analyzes preferences and history to suggest books&lt;/li&gt;
&lt;li&gt;🧠 &lt;strong&gt;Conversational Memory&lt;/strong&gt;: Maintains context for personalized interactions&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;Intelligent Search&lt;/strong&gt;: AI tools to search for books by multiple criteria&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Pattern Analysis&lt;/strong&gt;: AI analyzes reading habits and suggests diversification&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Similar Recommendations&lt;/strong&gt;: Finds similar books based on specific titles&lt;/li&gt;
&lt;li&gt;⚡ &lt;strong&gt;Quarkus Performance&lt;/strong&gt;: Fast startup and low memory footprint&lt;/li&gt;
&lt;li&gt;🔧 &lt;strong&gt;Simple Configuration&lt;/strong&gt;: Environment variable-based configuration&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  📦 Architecture and Tool Calling with LangChain4j
&lt;/h2&gt;

&lt;p&gt;This system demonstrates advanced usage of the &lt;strong&gt;Tool Calling&lt;/strong&gt; pattern with Quarkus and LangChain4j, integrating AI with external tools and databases. This approach allows language models to access APIs and local repositories to deliver personalized responses based on real-time and accurate data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integration with @Tool
&lt;/h3&gt;

&lt;p&gt;We use the &lt;code&gt;@Tool&lt;/code&gt; annotation to expose Java methods as tools that can be accessed by the AI model. This enables the AI to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query books stored in the database via JPA (Panache).&lt;/li&gt;
&lt;li&gt;Access user preferences and reading history.&lt;/li&gt;
&lt;li&gt;Perform filtering and smart analysis using real business logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example of a Service with AI Tools
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RegisterAiService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nc"&gt;BookSearchTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;UserProfileTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;BookRecommendationService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@UserMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
    Based on my preferences, which books do you recommend?
    """&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;recommendBooksForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output Validation Guardrails
&lt;/h3&gt;

&lt;p&gt;You can apply custom validations to the model’s responses using &lt;code&gt;@OutputGuardrails&lt;/code&gt;, ensuring result format, safety, and consistency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Logging and Observability
&lt;/h3&gt;

&lt;p&gt;Quarkus allows detailed logging of model interactions and external API calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;quarkus.langchain4j.log-requests&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;quarkus.langchain4j.log-responses&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Automated Testing with Quarkus
&lt;/h3&gt;

&lt;p&gt;To ensure reliability, endpoints are tested using &lt;code&gt;quarkus-junit5&lt;/code&gt; along with mocked dependencies via &lt;code&gt;quarkus-junit5-mockito&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;This highlights how the system uses Tool Calling robustly and aligns with modern intelligent-agent development practices.&lt;/p&gt;

&lt;p&gt;The system uses the &lt;strong&gt;AI Tools&lt;/strong&gt; pattern from LangChain4j, allowing the AI to access database data and perform complex operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│   REST Client   │───▶│ RecommendationController │───▶│BookRecommendationService│
└─────────────────┘    └──────────────────────┘    └─────────────────────┘
                       ┌──────────────────────┐    ┌─────────────────────┐
                       │   BookController     │───▶│   BookDataService   │
                       └──────────────────────┘    └─────────────────────┘
                                                           │
                                              ┌────────────┼────────────┐
                                              │            │            │
                                         ┌────▼────┐ ┌────▼────┐ ┌─────▼──────┐
                                         │  Book   │ │  User   │ │   Config   │
                                         │ Search  │ │ Profile │ │   Class    │
                                         │ Tools   │ │ Tools   │ └─────┬──────┘
                                         └────┬────┘ └────┬────┘       │
                                              │           │            │
                                    ┌─────────▼───┐   ┌───▼────────┐   │
                                    │    Book     │   │    User    │   │
                                    │ Repository  │   │Repositories│   │
                                    └─────────────┘   └────────────┘   │
                                              │                        │
                                         ┌────▼────┐                   │
                                         │   H2    │                   │
                                         │Database │                   │
                                         └─────────┘                   │
                                                                       │
                                              ┌────────────────────────▼────┐
                                              │         LangChain4j         │
                                              │        AI Providers         │
                                              └────────────────────────────┘  
                                                           │
                                        ┌──────────────────┼──────────────────┐
                                        │                  │                  │
                                ┌───────▼────────┐ ┌───────▼────────┐ ┌──────▼───────┐
                                │     OpenAI     │ │   Mistral AI   │ │    Ollama    │
                                │   (gpt-4o-mini)│ │ (mistral-small)│ │   (llama3.2) │
                                └────────────────┘ └────────────────┘ └──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Java 21 or higher&lt;/li&gt;
&lt;li&gt;Maven 3.8+ or Gradle&lt;/li&gt;
&lt;li&gt;Choose at least one AI provider:&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  OpenAI Configuration
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create an account at &lt;a href="https://platform.openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Generate an API key in the API keys section&lt;/li&gt;
&lt;li&gt;Set the environment variable &lt;code&gt;OPEN_AI_TOKEN&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Mistral AI Configuration
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create an account at &lt;a href="https://console.mistral.ai/" rel="noopener noreferrer"&gt;Mistral AI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Generate an API key in your dashboard&lt;/li&gt;
&lt;li&gt;Set the environment variable &lt;code&gt;MISTRAL_AI_TOKEN&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Ollama Configuration
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Install Ollama from &lt;a href="https://ollama.ai/" rel="noopener noreferrer"&gt;ollama.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Start the Ollama service: &lt;code&gt;ollama serve&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Download a model: &lt;code&gt;ollama pull llama3.2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Ensure Ollama is running at &lt;code&gt;http://localhost:11434&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Selecting the AI Provider
&lt;/h3&gt;

&lt;p&gt;Set the AI provider using the &lt;code&gt;AI_MODEL_PROVIDER&lt;/code&gt; environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For OpenAI (default)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;openai
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPEN_AI_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-openai-key

&lt;span class="c"&gt;# For Mistral AI&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mistral
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MISTRAL_AI_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-mistral-key

&lt;span class="c"&gt;# For Ollama&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Clone the Repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone &amp;lt;repository-url&amp;gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;quarkus-book-recommendation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Set Up Your AI Provider
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Option A: Using OpenAI&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;openai
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OPEN_AI_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-real-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option B: Using Mistral AI&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mistral
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MISTRAL_AI_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-real-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Option C: Using Ollama&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Ollama first&lt;/span&gt;
ollama serve

&lt;span class="c"&gt;# In another terminal, download a model&lt;/span&gt;
ollama pull llama3.2

&lt;span class="c"&gt;# Set up the environment&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AI_MODEL_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Run the Application
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Development Mode (with hot reload):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw quarkus:dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Production Mode:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw clean package
java &lt;span class="nt"&gt;-jar&lt;/span&gt; target/quarkus-app/quarkus-run.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The application will start at &lt;code&gt;http://localhost:8080&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  API Endpoints
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Book Generation Endpoints
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Generate Book List by Genre
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/books/generate/{genre}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Generates a list of 10 books using AI for the specified genre.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="s2"&gt;"http://localhost:8080/api/books/generate/fantasy?sessionId=user123"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Get Book by ID
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/books/{sessionId}/{id}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Retrieves a specific book by ID using AI with chat memory capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET http://localhost:8080/api/books/user123/1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Recommendation Endpoints
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Personalized User Recommendations
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/recommendations/user/{userId}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Uses AI with LangChain4j tools to analyze the user profile and recommend personalized books.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET http://localhost:8080/api/recommendations/user/1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Based on your profile and reading history, here are 5 perfect recommendations for you:

1. **Foundation** by Isaac Asimov
   - Perfectly matches your love of epic science fiction like Dune
   - Classic series with complex worldbuilding and deep political themes

2. **The Name of the Wind** by Patrick Rothfuss
   - Epic fantasy with immersive storytelling, similar to Madeline Miller's style
   - Poetic prose and exceptional character development

[...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Find Similar Books
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/recommendations/similar&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Finds similar books based on a specific title and author.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="s2"&gt;"http://localhost:8080/api/recommendations/similar?title=Dune&amp;amp;author=Frank Herbert"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Recommendations by Genre and Criteria
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/recommendations/genre/{genre}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Recommends books based on genre and specific criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="s2"&gt;"http://localhost:8080/api/recommendations/genre/science-fiction?minRating=4.0&amp;amp;language=English&amp;amp;yearAfter=2010"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Reading Pattern Analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint:&lt;/strong&gt; &lt;code&gt;GET /api/recommendations/analysis/{userId}&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Analyzes the user's reading patterns and suggests diversification.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET http://localhost:8080/api/recommendations/analysis/1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
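The endpoints above can be scripted just as easily as they can be curled. A small, hypothetical stdlib-only Python client that builds the same requests, assuming the Quarkus app is running locally on port 8080:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:8080"  # assumed local Quarkus instance

def similar_books_url(title: str, author: str) -> str:
    # GET /api/recommendations/similar?title=...&author=...
    return f"{BASE}/api/recommendations/similar?" + urlencode(
        {"title": title, "author": author})

def genre_url(genre: str, **criteria) -> str:
    # GET /api/recommendations/genre/{genre}?minRating=...&language=...
    query = f"?{urlencode(criteria)}" if criteria else ""
    return f"{BASE}/api/recommendations/genre/{genre}{query}"

def fetch(url: str) -> str:
    # Requires the application to be running; not executed here.
    with urlopen(url) as resp:
        return resp.read().decode()
```

`urlencode` takes care of escaping, e.g. the space in "Frank Herbert", which is easy to get wrong by hand.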



&lt;h2&gt;
  
  
  AI Tools
&lt;/h2&gt;

&lt;p&gt;The system demonstrates advanced AI capabilities using LangChain4j tools:&lt;/p&gt;

&lt;h3&gt;
  
  
  Book Search Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search by Genre&lt;/strong&gt;: Retrieve books by category&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search by Author&lt;/strong&gt;: Find books by specific authors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search by Rating&lt;/strong&gt;: Filter books by minimum rating&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search by Language&lt;/strong&gt;: Filter books by language&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search by Year&lt;/strong&gt;: Find books by publication period&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  User Profile Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User Preferences&lt;/strong&gt;: Access stored reading preferences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reading History&lt;/strong&gt;: Retrieve books read by the user&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern Analysis&lt;/strong&gt;: Identify trends in reading habits&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI Service Integration
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;BookRecommendationService&lt;/code&gt; combines both sets of tools to provide intelligent analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RegisterAiService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nc"&gt;BookSearchTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;UserProfileTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;BookRecommendationService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// AI can automatically call tools to analyze preferences&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;recommendBooksForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// AI can find similar books using multiple criteria&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;findSimilarBooks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;bookTitle&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;author&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Technology Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quarkus 3.24.3&lt;/strong&gt; - Supersonic Subatomic Java Framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain4j&lt;/strong&gt; - Java library for building AI applications

&lt;ul&gt;
&lt;li&gt;OpenAI Integration (v1.0.2)&lt;/li&gt;
&lt;li&gt;Mistral AI Integration (v1.0.2)&lt;/li&gt;
&lt;li&gt;Ollama Integration (v1.0.2)&lt;/li&gt;
&lt;li&gt;Support for AI Tools and Agents&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;JAX-RS (Quarkus REST)&lt;/strong&gt; - RESTful web services&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Hibernate ORM with Panache&lt;/strong&gt; - Database operations&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;H2 Database&lt;/strong&gt; - In-memory database for development&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Jackson&lt;/strong&gt; - JSON processing&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;CDI&lt;/strong&gt; - Dependency injection&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Development
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Project Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/main/java/com/example/books/
├── resources/
│   ├── BookController.java              # REST endpoints for books
│   └── RecommendationController.java    # Recommendation endpoints
├── service/
│   ├── BookDataService.java             # AI service interface
│   └── BookRecommendationService.java   # AI service with tools
├── model/
│   ├── Book.java                        # Book data model
│   ├── UserPreference.java              # User preferences
│   └── ReadingHistory.java              # Reading history
├── repository/
│   ├── BookRepository.java              # Database operations for books
│   ├── UserPreferenceRepository.java    # Preference operations
│   └── ReadingHistoryRepository.java    # History operations
├── tools/
│   ├── BookSearchTools.java             # Book search tools
│   └── UserProfileTools.java            # User profile tools
└── api/
    └── BookListResponse.java            # API response models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running Tests
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Development with Hot Reload
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw quarkus:dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables hot reload—changes to your code will be automatically reflected without restarting the application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demonstrated Use Cases
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Personalized Recommendations&lt;/strong&gt;: AI analyzes the user's complete profile&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligent Search&lt;/strong&gt;: AI tools for complex database queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern Analysis&lt;/strong&gt;: AI identifies trends and suggests diversification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conversational Memory&lt;/strong&gt;: Context maintained between interactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Generation&lt;/strong&gt;: AI creates realistic book data by genre&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Contributing
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Fork the repository&lt;/li&gt;
&lt;li&gt;Create a feature branch&lt;/li&gt;
&lt;li&gt;Make your changes&lt;/li&gt;
&lt;li&gt;Add tests if applicable&lt;/li&gt;
&lt;li&gt;Submit a pull request&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  License
&lt;/h2&gt;

&lt;p&gt;This project is open source and available under the MIT License (see the LICENSE file in the repository).&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Happy reading with AI! 📚🤖&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>quarkus</category>
      <category>langchain</category>
      <category>genai</category>
    </item>
    <item>
      <title>MCP and Spring AI</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Wed, 26 Mar 2025 11:46:53 +0000</pubDate>
      <link>https://dev.to/lucasnscr/mcp-and-spring-ai-1dj4</link>
      <guid>https://dev.to/lucasnscr/mcp-and-spring-ai-1dj4</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/lucasnscr" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F786078%2Ff11f942a-d8fc-4be7-a0a7-d7ffa198fc7f.jpeg" alt="lucasnscr"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/lucasnscr/mcp-protocol-with-spring-ai-l9b" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;MCP Protocol with Spring AI&lt;/h2&gt;
      &lt;h3&gt;lucasnscr ・ Mar 20&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#mcp&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#spring&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>mcp</category>
      <category>spring</category>
      <category>ai</category>
    </item>
    <item>
      <title>MCP Protocol with Spring AI</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Thu, 20 Mar 2025 11:41:39 +0000</pubDate>
      <link>https://dev.to/lucasnscr/mcp-protocol-with-spring-ai-l9b</link>
      <guid>https://dev.to/lucasnscr/mcp-protocol-with-spring-ai-l9b</guid>
      <description>&lt;h1&gt;
  
  
  &lt;strong&gt;Implementing a Real-World Credit Score Verification System&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;This project leverages &lt;strong&gt;Spring AI&lt;/strong&gt; with &lt;strong&gt;Tool Calling&lt;/strong&gt; capabilities to orchestrate multiple AI-driven tools, managed by the &lt;strong&gt;Qwen 2.5 model from Alibaba Cloud&lt;/strong&gt;. The solution is seamlessly integrated with &lt;strong&gt;Slack via the Model Context Protocol (MCP)&lt;/strong&gt;, enabling efficient and intelligent credit score verification for specific users and automatically posting the results in a designated channel within the workspace.&lt;/p&gt;

&lt;p&gt;Before explaining MCP, it is worth discussing why I usually reach for &lt;strong&gt;open-source&lt;/strong&gt; technologies, and why this time I chose &lt;strong&gt;Qwen 2.5&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Open Source LLMs?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open-source LLMs offer a blend of cost savings, adaptability, transparency, community-driven enhancements, scalability, and data control, making them a viable and often preferable alternative to proprietary models.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost-Effectiveness: Open-source LLMs eliminate licensing fees, reducing initial expenses. While operational costs like infrastructure remain, the absence of proprietary fees makes advanced AI capabilities more accessible, especially for smaller organizations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Customization and Flexibility: These models can be fine-tuned to specific business needs, allowing for tailored applications across various industries. This adaptability enables seamless integration into existing systems, enhancing operational efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Transparency and Security: Access to the source code provides insights into the model's architecture and training data, fostering trust and aiding in audits for ethical and legal compliance. This transparency ensures robust and accountable AI applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Community Support and Collaboration: The open-source nature encourages contributions from a diverse set of developers and researchers, leading to continuous improvements and innovation. This collaborative environment accelerates advancements and ensures the models remain up-to-date with emerging trends.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scalability: Open-source LLMs allow organizations to scale operations according to their requirements, such as handling large volumes of data or expanding into new applications and markets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Privacy and Independence: Deploying open-source models locally ensures that sensitive data remains within an organization's infrastructure, reducing the risk of data leaks and unauthorized access. This autonomy allows for greater control over data management and compliance with privacy regulations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Qwen 2.5
&lt;/h2&gt;

&lt;p&gt;Qwen 2.5 is an advanced language model developed by Alibaba Cloud, designed to handle a wide range of artificial intelligence tasks. This model stands out for its code generation capabilities, multilingual support, and extensive context processing, making it a versatile tool for various applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages of Using Qwen 2.5 in Implementations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Versatility:&lt;/strong&gt; With multilingual support and advanced capabilities in code generation and long-context processing, Qwen 2.5 can be applied across various industries, from software development to education and research.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost Efficiency:&lt;/strong&gt; Compared to other AI models on the market, Qwen 2.5 offers a competitive pricing structure, making it an attractive option for startups and companies seeking high-performance AI solutions without a significant investment.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Processing Large Volumes of Data:&lt;/strong&gt; Its ability to handle extensive contexts allows Qwen 2.5 to process long and complex texts without losing coherence, which is essential for tasks requiring detailed analysis and deep understanding.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Integration with Cloud Services:&lt;/strong&gt; The model is optimized for cloud computing environments, enabling efficient scalability and integration with other Alibaba Cloud services, simplifying implementation in existing infrastructures.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Community and Support:&lt;/strong&gt; As an open-source model, Qwen 2.5 benefits from an active community of developers, providing additional resources, support, and continuous updates, which is advantageous for organizations looking for robust and up-to-date AI solutions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://arxiv.org/abs/2412.15115" rel="noopener noreferrer"&gt;Qwen article&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct" rel="noopener noreferrer"&gt;Qwen Hugging Face Page&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is MCP Protocol?
&lt;/h2&gt;

&lt;p&gt;The Model Context Protocol (MCP) is an open standard designed to facilitate the integration of AI systems with various external tools, databases, and services. It acts as a universal connector for AI applications, much like USB-C for devices, enabling seamless interactions between Large Language Models (LLMs) and different data sources.&lt;/p&gt;

&lt;p&gt;MCP eliminates the need for multiple custom API integrations by providing a single, standardized method for AI tools to access and interact with external data dynamically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Use MCP Instead of Traditional APIs?&lt;/strong&gt;&lt;br&gt;
Traditional API-based integrations require separate connections for each external service, meaning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each API has unique authentication and documentation.&lt;/li&gt;
&lt;li&gt;Developers need to write and maintain custom code for every integration.&lt;/li&gt;
&lt;li&gt;Communication is often one-way, limiting real-time interactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb0p9hfth5101grr50t5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb0p9hfth5101grr50t5.png" alt="Traditional API" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP solves these issues by offering:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single, standardized integration instead of multiple separate APIs.&lt;/li&gt;
&lt;li&gt;Real-time, two-way communication, allowing AI models to not only retrieve data but also trigger actions dynamically.&lt;/li&gt;
&lt;li&gt;Dynamic discovery of tools, meaning AI models can access new services without needing manual reconfiguration.&lt;/li&gt;
&lt;li&gt;Easier scalability, as developers can connect multiple data sources with minimal additional effort.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis81qloqv9s6xo46eekg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis81qloqv9s6xo46eekg.png" alt="MCP Architecture" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who Created MCP?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Originally developed by Anthropic, MCP is now an open-source standard being adopted by various companies and developers, aiming to establish a universal method for AI-tool interactions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How MCP Works – Architecture Overview&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP follows a client-server architecture, consisting of three main components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP Hosts – AI-driven applications (e.g., Claude, AI-powered IDEs) that need access to external tools.&lt;/li&gt;
&lt;li&gt;MCP Clients – Maintain one-to-one connections with MCP servers to exchange data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbe95zai2m6sg2gpul94.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbe95zai2m6sg2gpul94.jpg" alt="MCP Client" width="800" height="781"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP Servers – Lightweight servers that expose specific functionalities, serving as bridges between AI models and data sources.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjrflcrby2jba7p45xz9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjrflcrby2jba7p45xz9k.png" alt="MCP Server" width="800" height="592"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Flow in MCP&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pull Data → AI queries an MCP server for context (e.g., checking a user’s calendar).&lt;/li&gt;
&lt;li&gt;Trigger Actions → AI instructs an MCP server to take action (e.g., reschedule meetings, send emails).&lt;/li&gt;
&lt;li&gt;Persistent Connection → Unlike APIs that require repeated calls, MCP maintains continuous communication between AI and tools.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Implementing MCP with Spring AI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Client Starters&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring-ai-mcp-client-spring-boot-starter&lt;/code&gt; - Core starter providing STDIO and HTTP-based SSE support&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;spring-ai-mcp-client-webflux-spring-boot-starter&lt;/code&gt; - WebFlux-based SSE transport implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Server Starters&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring-ai-mcp-server-spring-boot-starter&lt;/code&gt; - Core server with STDIO transport support&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;spring-ai-mcp-server-webmvc-spring-boot-starter&lt;/code&gt; - Spring MVC-based SSE transport implementation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;spring-ai-mcp-server-webflux-spring-boot-starter&lt;/code&gt; - WebFlux-based SSE transport implementation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Project Setup
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Dependency&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        &amp;lt;dependency&amp;gt;
            &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
            &amp;lt;artifactId&amp;gt;spring-ai-mcp-client-spring-boot-starter&amp;lt;/artifactId&amp;gt;
        &amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bean Setup&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Bean
    public ChatClient chatClient(ChatClient.Builder chatClientBuilder, ToolCallbackProvider tools) {
        return chatClientBuilder
                .defaultTools(tools)
                .build();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Config MCP Server File&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "mcpServers": {
    "slack": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "SLACK_BOT_TOKEN",
        "-e",
        "SLACK_TEAM_ID",
        "mcp/slack"
      ],
      "env": {
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
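&lt;p&gt;If you use Spring AI's MCP client starter, a file in this format can be referenced from the application configuration. The property name below is an assumption based on recent Spring AI milestone documentation; verify it against the version you are running:&lt;/p&gt;

```properties
# Assumed property name -- check your Spring AI version's reference docs.
# Points the STDIO MCP client at a Claude Desktop-style servers file.
spring.ai.mcp.client.stdio.servers-configuration=classpath:mcp-servers-config.json
```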



&lt;p&gt;Add the specific action you want the MCP server to perform to your prompt, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    **4. Formatting for Slack:**
                           - Present the information in a structured and well-organized format
                           - Use **bold headings** (`*Decision and Recommendation*`)
                           - Post the summary to the **'#all-ai-agent-alerts'** Slack channel \t
                          \s
                        The summary should be concise, clear, and actionable to help the credit team quickly understand what decision we have to take.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Slack Setup
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/AVIMBU/slack-mcp-server" rel="noopener noreferrer"&gt;https://github.com/AVIMBU/slack-mcp-server&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/news/model-context-protocol" rel="noopener noreferrer"&gt;MCP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/modelcontextprotocol" rel="noopener noreferrer"&gt;MCP Github Project&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mcp</category>
      <category>spring</category>
      <category>ai</category>
    </item>
    <item>
<title>AI Agent Patterns with Spring AI</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Mon, 24 Feb 2025 00:22:46 +0000</pubDate>
      <link>https://dev.to/lucasnscr/ai-agent-patterns-with-spring-ai-43gl</link>
      <guid>https://dev.to/lucasnscr/ai-agent-patterns-with-spring-ai-43gl</guid>
      <description>&lt;h1&gt;
  
  
  AI Agents and Their Patterns: A Practical Perspective
&lt;/h1&gt;

&lt;p&gt;As AI-driven applications evolve, developers are increasingly leveraging large language model (LLM) agents to execute complex tasks efficiently. However, the most effective implementations rely not on overly complex frameworks but on simple, composable design patterns.&lt;/p&gt;

&lt;p&gt;This article explores the distinction between workflows and agents, identifies key patterns used in AI-driven systems, and examines how these concepts apply to a Spring AI implementation designed for intelligent news retrieval based on user preferences.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are AI Agents?
&lt;/h2&gt;

&lt;p&gt;AI agents are systems that use LLMs to process information, interact with tools, and perform actions. They can be classified into two broad categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Workflows: Structured sequences where LLMs and external tools follow predefined execution paths. These systems prioritize predictability and are ideal for well-defined, repeatable tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Agents: More dynamic and autonomous systems where LLMs dictate their own processes, selecting tools and determining how to accomplish tasks. This allows for greater flexibility and adaptation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing between these approaches depends on the problem domain—workflows excel in structured automation, while agents are preferable when decisions need to be made dynamically at scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryed4nv20ly5wc7lzmge.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryed4nv20ly5wc7lzmge.gif" alt="Ai Agent Flow" width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Patterns in AI Agent Systems
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Chain Workflow&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A Chain Workflow organizes multiple steps in a linear sequence, where the output of one step feeds into the next. It ensures clarity and control while allowing for slight adaptability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tasks with clear sequential steps&lt;/li&gt;
&lt;li&gt;When you want to trade latency for higher accuracy&lt;/li&gt;
&lt;li&gt;When each step builds on the previous step's output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmf08z3fd28tnsjiah5o0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmf08z3fd28tnsjiah5o0.png" alt="Chain Workflow" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;Parallelization Workflow&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This pattern involves executing multiple tasks simultaneously, increasing efficiency in data-intensive operations. It is useful when fetching or processing large amounts of information concurrently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processing large volumes of similar but independent items&lt;/li&gt;
&lt;li&gt;Tasks requiring multiple independent perspectives&lt;/li&gt;
&lt;li&gt;When processing time is critical and tasks are parallelizable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffebbnyphui29v8vv2r22.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffebbnyphui29v8vv2r22.jpeg" alt="Parallelization Workflow" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;&lt;strong&gt;Routing Workflow&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A Routing Workflow dynamically directs execution paths based on input conditions, allowing the system to adapt to different cases without predefined sequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex tasks with distinct categories of input&lt;/li&gt;
&lt;li&gt;When different inputs require specialized processing&lt;/li&gt;
&lt;li&gt;When classification can be handled accurately&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklmzl6w50pr5f6egfb26.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklmzl6w50pr5f6egfb26.jpeg" alt="Routing workflow" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;strong&gt;Orchestrator-Workers&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An orchestrator AI delegates tasks to multiple specialized worker agents, each responsible for a distinct function (e.g., data retrieval, analysis, summarization).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex tasks where subtasks can't be predicted upfront&lt;/li&gt;
&lt;li&gt;Tasks requiring different approaches or perspectives&lt;/li&gt;
&lt;li&gt;Situations needing adaptive problem-solving&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxvz9eqf9jw7pz3ovkin.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftxvz9eqf9jw7pz3ovkin.jpeg" alt="Orchestrator-Workers Workflow" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;&lt;strong&gt;Evaluator-Optimizer&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This pattern involves an Evaluator assessing the quality of an agent’s output and an Optimizer refining future responses based on feedback, improving accuracy over time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clear evaluation criteria exist&lt;/li&gt;
&lt;li&gt;Iterative refinement provides measurable value&lt;/li&gt;
&lt;li&gt;Tasks benefit from multiple rounds of critique&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fng7tdgvzvk1ci522dytf.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fng7tdgvzvk1ci522dytf.jpeg" alt="Evaluator-Optimizer Workflow" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Agent Patterns Used
&lt;/h2&gt;

&lt;p&gt;The following patterns were used in the project:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Chain Workflow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Chaining prompts or tasks to improve accuracy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application in the Project&lt;/strong&gt;: The workflow involves retrieving user preferences and then fetching and analyzing news based on those preferences. This can be seen as a chain of tasks, where the output of one task (user preferences) is used as input for the next (news retrieval).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Routing Workflow&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Intelligent routing of inputs to specialized flows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application in the Project&lt;/strong&gt;: The &lt;code&gt;GetNewsByUserPreferences&lt;/code&gt; service routes the request to different API endpoints (crypto or stocks) based on the user's topic of interest. This is an example of &lt;strong&gt;Routing Workflow&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Orchestrator-Workers&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: A central AI (orchestrator) distributes tasks to specialized subprocesses (workers).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application in the Project&lt;/strong&gt;: The &lt;code&gt;NewsService&lt;/code&gt; acts as an orchestrator, coordinating news retrieval and analysis with the AI model. The AI model (OpenAiChatModel) functions as a specialized worker for text analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. &lt;strong&gt;Evaluator-Optimizer&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Description&lt;/strong&gt;: Use of LLMs for iterative evaluation and refinement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application in the Project&lt;/strong&gt;: The &lt;code&gt;extractNewsAndGiveOpinion&lt;/code&gt; method uses an AI model to summarize and analyze news, which can be seen as a form of content evaluation and refinement.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Spring AI and Tool Calling
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Spring AI&lt;/strong&gt; is a framework that facilitates the integration of AI functionalities into Spring applications. One of its key features is &lt;strong&gt;Tool Calling&lt;/strong&gt;, which allows AI models to interact with external APIs (tools) to extend their capabilities.&lt;/p&gt;

&lt;p&gt;This article details the implementation of &lt;a href="https://docs.spring.io/spring-ai/reference/api/tools.html" rel="noopener noreferrer"&gt;Tool Calling&lt;/a&gt; in Spring AI, focusing on the creation of AI agents that use tools to retrieve information and perform actions. Additionally, we evaluate which of the common workflow patterns in AI systems were used, such as &lt;strong&gt;Chain Workflow&lt;/strong&gt;, &lt;strong&gt;Parallelization Workflow&lt;/strong&gt;, &lt;strong&gt;Routing Workflow&lt;/strong&gt;, &lt;strong&gt;Orchestrator-Workers&lt;/strong&gt;, and &lt;strong&gt;Evaluator-Optimizer&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;UserPreferencesService&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This service is responsible for retrieving user preferences from a repository (e.g., a database). It is implemented as a function (&lt;code&gt;Function&amp;lt;UserPreferenceRequest, UserPreferenceResponse&amp;gt;&lt;/code&gt;) and exposed as a Spring bean.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;GET_USER_PREFERENCES_FUNCTION_NAME&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@Description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Get topic by userId"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;UserPreferenceRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;UserPreferenceResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getUserPreferencesInfo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nc"&gt;UserPreferencesRepository&lt;/span&gt; &lt;span class="n"&gt;userPreferencesRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;UserPreferencesService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userPreferencesRepository&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Features:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Takes a &lt;code&gt;userId&lt;/code&gt; as input.&lt;/li&gt;
&lt;li&gt;Queries the repository to retrieve user preferences.&lt;/li&gt;
&lt;li&gt;Returns the user's topic of interest (e.g., "crypto" or "stocks").&lt;/li&gt;
&lt;/ul&gt;
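&lt;p&gt;A minimal, self-contained sketch of what this function could look like, with the repository replaced by an in-memory map (the record shapes are assumed from the bean signature above):&lt;/p&gt;

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of UserPreferencesService as a Function bean.
// The in-memory map stands in for UserPreferencesRepository (e.g. a database).
public class UserPreferencesSketch {

    public record UserPreferenceRequest(String userId) {}
    public record UserPreferenceResponse(String topic) {}

    static final Map<String, String> PREFERENCES = Map.of("1", "crypto", "2", "stocks");

    public static Function<UserPreferenceRequest, UserPreferenceResponse> getUserPreferencesInfo() {
        // Default to "stocks" when no preference is stored, mirroring the
        // crypto/stocks branching used later in NewsService.
        return request -> new UserPreferenceResponse(
                PREFERENCES.getOrDefault(request.userId(), "stocks"));
    }
}
```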




&lt;h3&gt;
  
  
  2. &lt;strong&gt;GetNewsByUserPreferences&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This service retrieves news based on the user's topic of interest and analyzes them using an AI model. It is also implemented as a function (&lt;code&gt;Function&amp;lt;GetNewsByPreferenceRequest, GetNewsByPreferenceResponse&amp;gt;&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;GET_LATEST_NEWS_BY_TOPIC_FUNCTION_NAME&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@Description&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Get latest news from user topic"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;GetNewsByPreferenceRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;GetNewsByPreferenceResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getNewsFromPreference&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NewsService&lt;/span&gt; &lt;span class="n"&gt;newsService&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;GetNewsByUserPreferences&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newsService&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Features:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Takes a topic of interest (e.g., "crypto" or "stocks") as input.&lt;/li&gt;
&lt;li&gt;Queries an external service (e.g., AlphaClientNewsSentimentals) to retrieve related news.&lt;/li&gt;
&lt;li&gt;Uses an AI model (e.g., OpenAI) to summarize and analyze the news.&lt;/li&gt;
&lt;li&gt;Returns the summarized and analyzed news.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. &lt;strong&gt;NewsService&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This service is responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieving news from an external API (AlphaClientNewsSentimentals).&lt;/li&gt;
&lt;li&gt;Using an AI model (OpenAiChatModel) to summarize and analyze the news.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NewsService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;AlphaClientNewsSentimentals&lt;/span&gt; &lt;span class="n"&gt;alphaClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;NewsService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AlphaClientNewsSentimentals&lt;/span&gt; &lt;span class="n"&gt;alphaClient&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;alphaClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alphaClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;NewsAndSentimentals&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getNews&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;GetNewsRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"crypto"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;alphaClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestCrypto&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CryptoEnum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BITCOIN&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTicker&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;alphaClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestStock&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StockEnum&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTicker&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;GetNewsByPreferenceResponse&lt;/span&gt; &lt;span class="nf"&gt;extractNewsAndGiveOpinion&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;NewsAndSentimentals&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;newsResponses&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"You are a professional financial analyst..."&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"{news}"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;param&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"news"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newsResponses&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;entity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;GetNewsByPreferenceResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Implement &lt;strong&gt;Parallelization Workflow&lt;/strong&gt; to improve efficiency in news retrieval.&lt;/li&gt;
&lt;li&gt;Explore the use of &lt;strong&gt;Evaluator-Optimizer&lt;/strong&gt; for iterative refinement of the analyses generated by the AI model.&lt;/li&gt;
&lt;/ul&gt;
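&lt;p&gt;As a preview of the Parallelization Workflow mentioned above, news retrieval for several topics could run concurrently with &lt;code&gt;CompletableFuture&lt;/code&gt;; the fetch function here is a stand-in for the real news client:&lt;/p&gt;

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Illustrative sketch of the Parallelization Workflow: fetch news for several
// topics concurrently and join the results in order.
public class ParallelNewsFetch {

    public static List<String> fetchAll(List<String> topics, Function<String, String> fetch) {
        // Kick off all fetches first, then join, so the calls overlap in time.
        List<CompletableFuture<String>> futures = topics.stream()
                .map(topic -> CompletableFuture.supplyAsync(() -> fetch.apply(topic)))
                .toList();
        return futures.stream().map(CompletableFuture::join).toList();
    }
}
```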




&lt;h2&gt;
  
  
  How to Run the Project
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java 17+&lt;/li&gt;
&lt;li&gt;Spring Boot 3.x&lt;/li&gt;
&lt;li&gt;OpenAI API Key (for the AI model)&lt;/li&gt;
&lt;li&gt;Configuration of the user preferences repository (e.g., a database).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Setup&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add the OpenAI API Key in &lt;code&gt;application.properties&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt; &lt;span class="py"&gt;spring.ai.openai.api-key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;YOUR_KEY_HERE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Add the Alpha Vantage API Key in &lt;code&gt;application.properties&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt; &lt;span class="py"&gt;Alpha.api-key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;YOUR_KEY_HERE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;ul&gt;
&lt;li&gt;Run the SQL scripts: schema and data&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Execution&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run the project with:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; ./mvnw spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Testing&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use REST endpoints to test preference and news retrieval.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/lucasnscr/Ai-Function-SpringAi" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Applying GenAI Design Patterns to Build a Next-Generation Financial Analysis Application</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Sun, 09 Feb 2025 19:09:36 +0000</pubDate>
      <link>https://dev.to/lucasnscr/applying-genai-design-patterns-to-build-a-next-generation-financial-analysis-application-551k</link>
      <guid>https://dev.to/lucasnscr/applying-genai-design-patterns-to-build-a-next-generation-financial-analysis-application-551k</guid>
      <description>&lt;p&gt;Generative AI (GenAI) is transforming industries by enabling machines to create human-like content across text, images, and beyond. This revolutionary technology is not only powering innovative applications in creative fields but is also making significant inroads into domains such as finance, healthcare, and customer service. With GenAI, organizations can harness advanced models that generate context-rich outputs, augment traditional data retrieval methods, and deliver personalized insights, fundamentally reshaping how we interact with information and make decisions.&lt;/p&gt;

&lt;p&gt;Relying on substandard or poorly curated datasets to train LLMs poses significant risks. Without a solid foundation of accurate, representative data, these models may produce outputs that are both factually incorrect and misleading—a phenomenon known as hallucinations. Hallucinations occur when the model invents details or delivers confidently erroneous information that isn’t grounded in its training data. Such inaccuracies can drive poor decision-making, particularly in high-stakes fields like finance, healthcare, or legal services, where precision and trust are critical. Therefore, maintaining dataset integrity is essential for mitigating these risks and building reliable AI systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mitigating Risks with GenAI Patterns: Leveraging Embeddings, RAG, and Query Transformation to Prevent Hallucinations and Ensure Reliable Outputs
&lt;/h2&gt;

&lt;p&gt;Mitigating risks in GenAI systems is achievable by leveraging key design patterns that address the inherent challenges of training on unreliable data. By utilizing &lt;strong&gt;Embedding Generation&lt;/strong&gt;, raw financial text is transformed into meaningful vector representations, ensuring that the semantic nuances of the data are captured accurately. In parallel, &lt;strong&gt;Retrieval Augmented Generation (RAG)&lt;/strong&gt; enriches LLM prompts with context retrieved from a vector store, thereby bridging the gap in static knowledge bases and reducing the likelihood of hallucinations. Additionally, &lt;strong&gt;Query Transformation&lt;/strong&gt; refines user queries to align seamlessly with the underlying document embeddings, enhancing the retrieval of relevant information. Together, these patterns work synergistically to prevent the generation of misleading outputs and to ensure the delivery of reliable, context-aware responses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Embeddings: From Raw Data to Meaningful Vectors
&lt;/h3&gt;

&lt;p&gt;One of the foundational patterns discussed by &lt;a href="https://martinfowler.com/articles/gen-ai-patterns/#embedding" rel="noopener noreferrer"&gt;Fowler&lt;/a&gt; is the use of embeddings. In his article, he explains that embeddings transform large data blocks (whether text or images) into numeric vectors in a high-dimensional space so that items with similar semantic meaning end up near each other. In our application, this concept is critical for enabling similarity search across financial documents and news.&lt;/p&gt;

&lt;h4&gt;
  
  
  Embeddings implementation with SpringAI
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Tokenization &amp;amp; Splitting&lt;/strong&gt;&lt;br&gt;
The saveVectorDb method begins by using a TokenTextSplitter to break down a list of content strings into smaller chunks, converting each into a Document instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;List&amp;lt;Document&amp;gt; documentList = textSplitter.apply(   contentList.stream().map(Document::new).collect(Collectors.toList());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This aligns with the idea of chunking large documents into manageable pieces—a common practice to preserve contextual details in embeddings.&lt;/p&gt;
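&lt;p&gt;The chunking idea can be pictured with a simplified character-based splitter (Spring AI's &lt;code&gt;TokenTextSplitter&lt;/code&gt; splits by tokens, so this is only an illustration):&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;

// Simplified character-based chunker illustrating document splitting.
// TokenTextSplitter works on tokens; this sketch uses fixed-size character spans.
public class SimpleChunker {

    public static List<String> split(String text, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            chunks.add(text.substring(start, Math.min(start + chunkSize, text.length())));
        }
        return chunks;
    }
}
```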

&lt;p&gt;&lt;strong&gt;Embedding Generation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For each document, the method calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;embeddingModel.embed(document);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This invocation transforms the document text into an embedding vector. The vector is then stored in the vector store for later similarity searches. This step reflects the core purpose of embeddings in our GenAI architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Metadata Association&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Beyond generating embeddings, the method attaches additional metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;document.getMetadata().put("metadata", metadata);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of redundantly storing the embedding in the metadata, this field is used to store auxiliary information (such as document type or indexing timestamp) that can be used to filter or display results later. Fowler’s discussion on structuring the knowledge base stresses the importance of separating data (embeddings) from contextual information (metadata).&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval Augmented Generation (RAG): Enriching Prompts with Relevant Context
&lt;/h3&gt;

&lt;p&gt;Martin Fowler’s &lt;a href="https://martinfowler.com/articles/gen-ai-patterns/#embedding" rel="noopener noreferrer"&gt;article&lt;/a&gt; describes RAG as a way to overcome the limitations of LLMs by augmenting user prompts with relevant document fragments retrieved from an external knowledge base. This approach helps the LLM produce answers that are factually grounded and contextually appropriate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Implementing RAG with SpringAI
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Constructing an Advised Request&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The method retrieveRelevantDocuments builds an AdvisedRequest using the user query and a configured chat model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AdvisedRequest request = AdvisedRequest.builder()
        .chatModel(chatModel)
        .userText(query)
        .build();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This request serves as the input for the retrieval augmentation process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using the RetrievalAugmentationAdvisor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The request is then passed to the advisor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AdvisedRequest advisedRequest = retrievalAugmentationAdvisor.before(request);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood, this advisor orchestrates a series of operations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query Transformation:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The advisor uses a RewriteQueryTransformer (created in the private method createQueryTransformers()) to refine or rephrase the query. This transformation is crucial for overcoming the “minimalistic user query” problem discussed by Fowler.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Document Retrieval:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A VectorStoreDocumentRetriever (built in createDocumentRetriever()) searches the vector store using the refined query’s embedding. It applies a similarity threshold (set to 0.65 in our implementation) and limits the number of retrieved documents to the top 5.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query Augmentation:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, a ContextualQueryAugmenter (from createQueryAugmenter()) supplements the original query with the context retrieved from the vector store. In our implementation, the augmenter is configured with allowEmptyContext(true), meaning that even if no documents are deemed sufficiently similar, the original query is preserved instead of being overridden by a fallback message.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracting the Retrieved Documents:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After the advisor has processed the request, the method extracts the retrieved documents from the context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Object contextValue = advisedRequest.adviseContext().get(RetrievalAugmentationAdvisor.DOCUMENT_CONTEXT);
if (contextValue instanceof List&amp;lt;?&amp;gt; documents) {
    return (List&amp;lt;Document&amp;gt;) documents;
}
return List.of();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This final step directly implements the RAG pattern: enriching the query with retrieved context before passing it on to the LLM for answer generation.&lt;/p&gt;
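&lt;p&gt;The enrichment step amounts to assembling the retrieved fragments and the user query into one prompt. Below is a sketch of that assembly; the prompt layout is illustrative, not the advisor's internal format.&lt;/p&gt;

```java
import java.util.List;

// Sketch of the final RAG step: prepend retrieved context to the user query
// before sending it to the LLM. The layout is illustrative.
public class PromptAugmenter {

    public static String augment(String query, List<String> retrievedDocs) {
        StringBuilder prompt = new StringBuilder("Context:\n");
        for (String doc : retrievedDocs) {
            prompt.append("- ").append(doc).append('\n');
        }
        return prompt.append("\nQuestion: ").append(query).toString();
    }
}
```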

&lt;h4&gt;
  
  
  Hybrid Retriever and Query Transformation: Enhancing Search Efficiency
&lt;/h4&gt;

&lt;p&gt;Fowler’s discussion on the Hybrid Retriever pattern emphasizes that while dense embeddings are powerful, they can be complemented by traditional keyword or BM25 searches to overcome their limitations. Although our implementation in AIFinancialRepository currently focuses on vector-based retrieval, the integration of a query transformer shows an appreciation for combining multiple retrieval strategies.&lt;/p&gt;
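&lt;p&gt;A hybrid retrieval score could be sketched as a weighted blend of keyword overlap and embedding similarity; the weights below are arbitrary, and the keyword score is a crude stand-in for BM25:&lt;/p&gt;

```java
// Sketch of hybrid retrieval scoring: blend a keyword overlap score with the
// embedding cosine similarity. The 0.3/0.7 weights are arbitrary choices.
public class HybridScore {

    // Fraction of query terms that appear in the document text (crude BM25 stand-in).
    public static double keywordScore(String query, String document) {
        String[] terms = query.toLowerCase().split("\\s+");
        String doc = document.toLowerCase();
        long hits = java.util.Arrays.stream(terms).filter(doc::contains).count();
        return terms.length == 0 ? 0.0 : (double) hits / terms.length;
    }

    public static double combined(double keywordScore, double cosineSimilarity) {
        return 0.3 * keywordScore + 0.7 * cosineSimilarity;
    }
}
```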

&lt;p&gt;&lt;strong&gt;Query Transformer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The private method createQueryTransformers() builds a RewriteQueryTransformer using a ChatClient. This transformer adapts user queries to better match the representation of the documents in the vector store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;private QueryTransformer createQueryTransformers() {
    ChatClient chatClient = ChatClient.builder(chatModel).build();
    return RewriteQueryTransformer.builder()
            .chatClientBuilder(chatClient.mutate())
            .build();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By rewriting the query, the system addresses ambiguities and refines the search input, increasing the chances of retrieving relevant documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VectorStoreDocumentRetriever&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The method createDocumentRetriever() constructs a document retriever that searches the vector store using similarity thresholds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;private VectorStoreDocumentRetriever createDocumentRetriever() {
    return VectorStoreDocumentRetriever.builder()
            .vectorStore(vectorStore)
            .similarityThreshold(0.65)
            .topK(5)
            .build();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This component implements the core of the hybrid retrieval strategy by ensuring that only the most semantically similar document fragments are returned.&lt;/p&gt;

&lt;h1&gt;
  
  
  AI Financial Analyst Design System
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd5tr7bc7gvdwp2e9kvvo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd5tr7bc7gvdwp2e9kvvo.gif" alt="Design System" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Comparative Analysis of AI Financial Analyst Implementations
&lt;/h1&gt;

&lt;p&gt;In our journey to develop a robust AI Financial Analyst, we experimented with two distinct implementations that leverage different generative AI models and embedding algorithms. This section compares these two approaches in light of Martin Fowler’s GenAI patterns and our architectural goals.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 1: OpenAI’s GPT4o with text-embedding-ada-002
&lt;/h2&gt;

&lt;p&gt;The first version of our application was built around &lt;strong&gt;OpenAI’s&lt;/strong&gt; &lt;strong&gt;GPT4o&lt;/strong&gt; model—a general-purpose large language model designed for a wide range of natural language tasks. GPT4o excels in generating coherent, contextually rich responses across various domains. Its strength lies in its versatility and the ability to handle diverse queries, making it a strong candidate for customer-facing applications where broad knowledge and adaptability are critical.&lt;/p&gt;

&lt;p&gt;To complement GPT4o, we employed OpenAI’s &lt;strong&gt;text-embedding-ada-002&lt;/strong&gt; algorithm for generating embeddings. This algorithm transforms raw financial text into dense, high-dimensional vectors that capture semantic meaning. Text-embedding-ada-002 is known for its cost efficiency, robust performance, and general applicability across tasks such as similarity search, clustering, and semantic comparison. However, while it performs admirably in generating embeddings for general-purpose retrieval, its design is not explicitly optimized for deep reasoning or domain-specific nuances.&lt;/p&gt;

&lt;h2&gt;
  
  
  Version 2: DeepSeek R1 with nomic-embed-text on Ollama
&lt;/h2&gt;

&lt;p&gt;The second implementation pivots to a locally hosted solution using &lt;strong&gt;DeepSeek R1&lt;/strong&gt;, a model tailored for enhanced reasoning capabilities. Running on the Ollama platform, DeepSeek R1 was selected for scenarios where precise, domain-specific reasoning is paramount, such as detailed financial analysis. Unlike the more general-purpose GPT4o, DeepSeek R1 is &lt;strong&gt;fine-tuned&lt;/strong&gt; to process complex queries and provide reasoned responses that align closely with the intricate requirements of financial analytics.&lt;/p&gt;

&lt;p&gt;For embedding generation in this version, we use the &lt;strong&gt;nomic-embed-text&lt;/strong&gt; algorithm. This embedding algorithm is designed to complement the reasoning strengths of DeepSeek R1, focusing on extracting the nuanced relationships within financial texts. While text-embedding-ada-002 offers broad semantic representation, nomic-embed-text is optimized for scenarios where contextual reasoning is essential, ensuring that the resultant embeddings preserve subtle semantic details crucial for informed decision-making.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Benchmark Summary&lt;/strong&gt;
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Generality vs. Specialization&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GPT4o is a general-purpose model suited for a wide array of tasks, making it versatile for various applications. In contrast, DeepSeek R1 is specialized for reasoning, offering enhanced performance in scenarios that demand a higher level of analytical depth.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Embedding Algorithms&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Text-embedding-ada-002 provides robust and cost-effective embeddings that work well for generic semantic similarity tasks. On the other hand, nomic-embed-text is tailored to support DeepSeek R1’s reasoning abilities, potentially offering richer representations in context-specific financial analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.nomic.ai/blog/posts/nomic-embed-text-v1" rel="noopener noreferrer"&gt;Nomic AI Doc&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Deployment Considerations&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The GPT4o-based implementation leverages cloud-based, managed services from OpenAI, which can simplify scaling and maintenance. The DeepSeek R1 solution, running locally via Ollama, offers greater control over data and model tuning, which is beneficial for organizations that require enhanced privacy or need to fine-tune the model for domain-specific tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Database Setup&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both implementations leverage PGVector as the underlying vector database but differ in key configuration parameters that directly impact the effectiveness of the retrieval process.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Distance Type (COSINE_DISTANCE)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setting indicates that the similarity between vectors is measured using cosine distance, which evaluates the cosine of the angle between two vectors. It is a widely used metric in embedding spaces because it emphasizes the directional similarity rather than the magnitude of the vectors.&lt;/p&gt;
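&lt;p&gt;Cosine distance is simply one minus the cosine similarity, which can be computed directly from the two vectors:&lt;/p&gt;

```java
// Cosine similarity between two embedding vectors; cosine distance is
// 1 - similarity. Direction matters, magnitude does not.
public class Cosine {

    public static double similarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

&lt;p&gt;Note that scaling a vector leaves the similarity unchanged, which is exactly why cosine distance emphasizes direction over magnitude.&lt;/p&gt;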

&lt;p&gt;&lt;strong&gt;Index Type (HNSW)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Hierarchical Navigable Small World (HNSW) algorithm is employed for indexing. HNSW is an approximate nearest neighbor search method optimized for high-dimensional spaces. It provides fast and scalable retrieval, making it ideal for applications that require efficient similarity search among thousands or millions of vectors.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dimensions&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A crucial difference between the two versions is the dimensionality of the embeddings.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The DeepSeek R1 implementation uses .dimensions(1024), reflecting that the nomic-embed-text embedding algorithm produces 1024-dimensional vectors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In contrast, the GPT4o implementation uses .dimensions(1536), as the text-embedding-ada-002 algorithm generates higher-dimensional (1536) embeddings.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This difference affects the granularity and capacity of the embeddings to capture semantic nuances, with the higher dimension potentially encoding more detail but also demanding stricter similarity criteria.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Similarity Threshold&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The document retriever’s configuration also varies between the two implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;For DeepSeek R1, the similarity threshold is set at .similarityThreshold(0.65), which allows documents with a cosine similarity above 0.65 to be considered relevant.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For GPT4o, the threshold is higher at .similarityThreshold(0.82), demanding a closer match between the query embedding and stored document embeddings.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stricter threshold in GPT4o likely compensates for the richer, more detailed 1536-dimensional embeddings, ensuring that only highly relevant documents are retrieved.&lt;/p&gt;
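
&lt;p&gt;To make the effect of the two thresholds concrete, the sketch below filters a handful of hypothetical documents and similarity scores in plain Java. This illustrates only the gating behavior, not Spring AI's retrieval internals:&lt;/p&gt;

```java
import java.util.List;
import java.util.stream.IntStream;

public class ThresholdRetriever {

    // Keep only the documents whose similarity score meets the threshold.
    // docs[i] pairs with scores[i]; scores stand in for cosine similarities.
    public static List filterByThreshold(String[] docs, double[] scores, double threshold) {
        return IntStream.range(0, docs.length)
                .filter(i -> scores[i] >= threshold)
                .mapToObj(i -> docs[i])
                .toList();
    }

    public static void main(String[] args) {
        String[] docs = {"btc-news-1", "btc-news-2", "btc-news-3"};
        double[] scores = {0.70, 0.85, 0.60}; // hypothetical cosine similarities
        // The DeepSeek R1 configuration (0.65) keeps two documents:
        System.out.println(filterByThreshold(docs, scores, 0.65)); // [btc-news-1, btc-news-2]
        // The stricter GPT4o configuration (0.82) keeps only one:
        System.out.println(filterByThreshold(docs, scores, 0.82)); // [btc-news-2]
    }
}
```

&lt;p&gt;With the same scores, the 0.65 threshold passes two of the three documents to the model while the 0.82 threshold passes only one; that is the practical difference between the two configurations.&lt;/p&gt;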

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Output Results&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After ingesting the news and sentiment data about Bitcoin, we asked the following question:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "question": "What the sentiment and price expectations for Bitcoin in the next days. Make a deep analysis."
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response for GPT4o&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Analysis Type: Market Trend and Sentiment Analysis for Bitcoin

Key Metric/Insight: The primary focus is on Bitcoin's price movement, trading volume, and market sentiment indicators. Key metrics include recent price changes, trading volume fluctuations, and sentiment scores derived from social media and news sources.

Source &amp;amp; Reliability: Data is sourced from cryptocurrency exchanges for price and volume, and sentiment data is aggregated from social media platforms and news articles. The reliability of price and volume data is high due to real-time updates, while sentiment data is moderately reliable, subject to rapid changes in public opinion.

Reasoning/Impact: Bitcoin's price is highly sensitive to market sentiment, which can drive significant volatility. Positive sentiment often leads to increased buying pressure, potentially driving prices up, while negative sentiment can result in selling pressure and price declines. Understanding sentiment helps anticipate short-term price movements.

Data Comparisons: Recent data shows Bitcoin's price has been fluctuating within a specific range, with a slight upward trend over the past week. Trading volume has been consistent, indicating steady interest. Sentiment analysis reveals a neutral to slightly positive sentiment, with increased mentions of Bitcoin in the context of potential regulatory developments.

Historical Context/Comparative Insight: Historically, Bitcoin's price tends to experience increased volatility during periods of heightened regulatory discussions or macroeconomic changes. Comparatively, the current sentiment is more positive than during previous regulatory scares, suggesting a potential for price stability or moderate growth.

Final Opinion: Based on the current metrics and sentiment analysis, the short-term outlook for Bitcoin appears cautiously optimistic. If sentiment remains stable or improves, there is potential for a modest price increase. However, investors should remain vigilant for any sudden changes in sentiment or regulatory news. Recommendation: Hold, with a watchful eye on sentiment shifts and regulatory announcements.

Data Patterns &amp;amp; Observations: A recurring pattern is the correlation between sentiment spikes and short-term price movements. Positive news or sentiment often precedes price increases, while negative sentiment can lead to declines. Monitoring sentiment trends can provide early indicators of potential price changes.

Additional Considerations: Keep an eye on liquidity metrics, as they can impact price volatility. Additionally, consider macroeconomic factors such as interest rates and inflation, which can influence investor behavior in the cryptocurrency market.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response for Deepseek R1&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Everything inside the &amp;lt;think&amp;gt; tags shows how DeepSeek reasoned out the answer:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;think&amp;gt;
Alright, so I need to figure out how to analyze Bitcoin's sentiment and price expectations for the next few days. Hmm, where do I start? Well, first off, I know that Bitcoin's price is influenced by various factors like market trends, news, regulations, and even macroeconomic indicators. But since the question is about sentiment and price expectations in the next day or two, maybe looking at technical analysis would be helpful.

I remember hearing about things like moving averages, RSI (Relative Strength Index), and Volume as part of technical indicators. If Bitcoin's price crosses above a significant moving average, that might indicate positive sentiment because it could signal strength in the market. Conversely, failing to do so might mean negative sentiment.

RSI is another one I'm somewhat familiar with. It measures how overbought or oversold an asset is. An RSI below 30 usually suggests buying pressure, while above 70 indicates selling pressure. If Bitcoin's RSI is rising, that could be a good sign for price movement in the coming days.

Volume also plays a role. High trading volume can indicate strong interest and confidence in the market. If Bitcoin sees a surge in volume alongside positive indicators like moving averages or buying signals, it might signal bullish momentum.

I should consider looking at historical data too—maybe see how Bitcoin's price has moved when these indicators were signaling positive or negative trends. This could give me some patterns to predict future movements.

Wait, but sentiment analysis can also involve news events. Recent major news about regulations in countries like the U.S. or Europe regarding Bitcoin could impact its price and sentiment. If there are upcoming significant events, that's something I should monitor.

I'm a bit unsure about how reliable these indicators are on their own. Sometimes technical indicators can give false signals, so relying solely on them might not be enough. Maybe combining them with other factors like market news or geopolitical events would provide a more accurate picture.

Also, the behavioral aspect—how retail investors behave in the markets—could influence short-term price movements. If there's a lot of buying activity due to fear or excitement, it could drive prices up or down regardless of technical signals.

Looking at specific timeframes might help too. For instance, if Bitcoin shows bullish patterns during certain times of the day, that could predict its behavior in the coming hours.

I think I need to check some sources for reliable data—maybe websites like CoinMarketCap, CoinDesk, or financial news sites that track Bitcoin trends. Also, looking into Twitter sentiment for retail investors might provide real-time insights on market-moving news.

Wait, but sentiment analysis can be subjective. How do you quantify something as intangible as market sentiment? Maybe using sentiment scores from social media posts or news articles could give a gauge of overall feelings towards Bitcoin in the market.

I should also consider the correlation between Bitcoin's price and other assets like Ethereum or traditional cryptocurrencies such as gold, silver, or Bitcoin itself through time-series analysis. This might help identify if there are any patterns or dependencies that can predict future moves.

Another point is to look at the correlation with macroeconomic indicators—like inflation rates or interest rates—if they have an impact on Bitcoin's price due to its status as a hedge against inflation.

But I'm not entirely sure how all these pieces fit together. Maybe starting with a simple moving average analysis would be a good first step, then incorporating RSI and volume data for more depth. Combining that with recent news or geopolitical developments could provide a comprehensive outlook.

I think I'll outline the steps: First, check Bitcoin's current price and key technical indicators like MA (moving averages), RSI, and volume. Then, look at any upcoming significant events that might affect its market. Analyze how sentiment is shaping up on social media or news feeds. Finally, see if there are patterns in historical data correlating these factors with future price movements.

So, putting it all together, I can predict whether Bitcoin's price is expected to go up or down based on the combination of these indicators and current market sentiments.
&amp;lt;/think&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Answer:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;**Bitcoin Sentiment and Price Expectations Analysis (Next 7 Days)**

**1. Technical Indicators:**
   - **Moving Averages:** Bitcoin has recently crossed above a significant moving average, indicating bullish sentiment due to potential strength in market sentiment.
   - **RSI (Relative Strength Index):** The RSI is currently below 30, signaling buying pressure and positive sentiment among investors.
   - **Volume:** High trading volume indicates strong market interest and confidence, supporting the bullish trend.

**2. News and Events:**
   - Monitor recent news from major countries like the U.S., U.K., and European Union regarding Bitcoin regulations, as these could impact sentiment and price movements.

**3. Behavioral Analysis:**
   - Consider retail investor behavior—patterns of buying during specific times of the day that might influence short-term price actions.

**4. Sentiment Sources:**
   - Analyze social media sentiment, particularly from platforms like Twitter where retail investors often share market-moving news.

**5. Correlation with Other Assets and Indicators:**
   - Examine correlations between Bitcoin's price and other assets (Ethereum, gold) or macroeconomic indicators (inflation rates) for potential predictive patterns.

**6. Historical Context:**
   - Review historical data to identify patterns where these indicators coincided with future price movements, enhancing the reliability of predictions.

**Conclusion:**
Combining these factors—positive technical signals, bullish sentiment, and supportive news—it is recommended that Bitcoin's price may experience an upward trend in the next 7 days. However, this analysis should be closely monitored as market dynamics can change rapidly due to external shocks or shifts in investor behavior.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Final Consideration
&lt;/h2&gt;

&lt;p&gt;After gathering responses from each AI system, I posed the same question to OpenAI O3-mini, DeepSeek R1, and Google's Gemini. For OpenAI O3-mini I selected the OpenAI answer, while for both DeepSeek R1 and Gemini I opted for the DeepSeek response. In effect, this is a 2-to-1 preference in favor of DeepSeek.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Since the DeepSeek R1 model is running locally via Ollama, both the process of converting data into embedding vectors and the retrieval process experienced higher latency compared to cloud-based solutions like OpenAI’s GPT4o implementation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/lucasnscr/ai-financial-analyst" rel="noopener noreferrer"&gt;AI Financial Analyst Repository&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.spring.io/spring-ai/reference/index.html" rel="noopener noreferrer"&gt;Spring AI Doc&lt;/a&gt;&lt;/p&gt;

</description>
      <category>spring</category>
      <category>genai</category>
      <category>deepseek</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Multithreading and Patterns</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Mon, 05 Aug 2024 12:24:44 +0000</pubDate>
      <link>https://dev.to/lucasnscr/multithreading-and-patterns-4nmk</link>
      <guid>https://dev.to/lucasnscr/multithreading-and-patterns-4nmk</guid>
      <description>&lt;h1&gt;
  
  
  Multithreading-Design-Patterns
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gyuc4tl2omvslmnstqa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gyuc4tl2omvslmnstqa.png" alt="Multithreading"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article implements the most commonly used multithreading design patterns, with use cases and real-life examples, in Java and Spring Boot.&lt;/p&gt;

&lt;h3&gt;
  
  
  Introduction to Multithreading in Java
&lt;/h3&gt;

&lt;p&gt;Multithreading is a pivotal concept in modern software development that allows for the concurrent execution of two or more threads, enabling efficient utilization of CPU resources and improving the performance of applications. In Java, multithreading is a fundamental feature of the language, designed to enhance the responsiveness and throughput of applications by performing multiple tasks simultaneously within a single program.&lt;/p&gt;

&lt;p&gt;Java's robust support for multithreading is built into its core, providing developers with powerful tools to create and manage threads effortlessly. The &lt;code&gt;java.lang.Thread&lt;/code&gt; class and the &lt;code&gt;java.util.concurrent&lt;/code&gt; package form the backbone of Java's multithreading capabilities, offering a variety of classes and interfaces for thread manipulation, synchronization, and communication.&lt;/p&gt;

&lt;p&gt;At its essence, multithreading in Java involves dividing a program into smaller units of work, known as threads, which can run independently and concurrently. This approach not only maximizes the use of available CPU cores but also allows for more responsive and interactive applications, as time-consuming tasks like I/O operations or complex calculations can be performed in the background without freezing the main application thread.&lt;/p&gt;

&lt;p&gt;One of the key advantages of multithreading is the ability to design applications that remain responsive under heavy load. For instance, in a graphical user interface (GUI) application, multithreading can ensure that the user interface remains responsive while background tasks, such as data processing or network communication, are handled concurrently.&lt;/p&gt;

&lt;p&gt;However, developing multithreaded applications comes with its own set of challenges, such as thread synchronization, deadlocks, and race conditions. Java addresses these issues by providing synchronized methods and blocks, the &lt;code&gt;volatile&lt;/code&gt; keyword, and various concurrency utilities like locks, semaphores, and executors, which help in managing the complexities associated with multithreading.&lt;/p&gt;
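
&lt;p&gt;As a small example of those utilities, the sketch below uses &lt;code&gt;AtomicInteger&lt;/code&gt; so that concurrent increments are never lost; with a plain &lt;code&gt;int&lt;/code&gt; counter and unsynchronized increments, the same workload would typically lose updates to race conditions:&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {

    // AtomicInteger makes increment-and-read one indivisible operation,
    // so no update is lost even when many threads write at once.
    private static final AtomicInteger counter = new AtomicInteger();

    public static int countWithThreads(int threads, int incrementsPerThread) {
        counter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t != threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i != incrementsPerThread; i++) {
                    counter.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000)); // always 40000
    }
}
```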

&lt;h3&gt;
  
  
  Implementing Multithreading Design Patterns in Spring Boot
&lt;/h3&gt;

&lt;p&gt;In this project, we will implement the following multithreading design patterns using Java and Spring Boot:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/active-object-pattern" rel="noopener noreferrer"&gt;Active Object Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/barrier-pattern" rel="noopener noreferrer"&gt;Barrier Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/future-promises-pattern" rel="noopener noreferrer"&gt;Future Promises Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/monitor-object-pattern" rel="noopener noreferrer"&gt;Monitor Object Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/producer-consumer-pattern" rel="noopener noreferrer"&gt;Producer-Consumer Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/ready-writer-pattern" rel="noopener noreferrer"&gt;Reader-Writer Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://github.com/lucasnscr/Multithreading-Design-Patterns/tree/main/thread-pool-pattern" rel="noopener noreferrer"&gt;Thread Pool Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each pattern will be explained in detail, including its use cases and real-life examples. The implementation will demonstrate how to effectively use these patterns to handle concurrent tasks, ensuring responsive and efficient applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Active Object Pattern&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf40j3v81fy6pue3udxz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf40j3v81fy6pue3udxz.png" alt="Active Object Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Active Object Pattern helps manage concurrency by separating method invocation from execution. In this Spring Boot application, we demonstrated how to implement this pattern to process tasks asynchronously, ensuring that the server remains responsive even when handling long-running tasks. This approach can be extended to more complex scenarios such as handling multiple types of tasks or integrating with other services.&lt;/p&gt;

&lt;p&gt;The Active Object Pattern decouples method execution from method invocation to enhance concurrency and simplify synchronized object behavior. It consists of the following key components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Proxy&lt;/strong&gt;: Provides an interface for clients to send requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Method Request&lt;/strong&gt;: Defines a request as an object that implements a method to be executed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduler&lt;/strong&gt;: Responsible for queuing and executing Method Requests on a separate thread.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Servant&lt;/strong&gt;: Implements the methods exposed by the Proxy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation Queue&lt;/strong&gt;: Holds the Method Requests until they are executed by the Scheduler.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future&lt;/strong&gt;: Represents the result of an asynchronous computation.&lt;/li&gt;
&lt;/ol&gt;
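
&lt;p&gt;A minimal, dependency-free sketch of these components, with a hypothetical squaring operation standing in for the servant logic (the linked repository shows the full Spring Boot version):&lt;/p&gt;

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class ActiveObjectSketch {

    // Activation queue: buffers method requests until the scheduler runs them.
    private final LinkedBlockingQueue queue = new LinkedBlockingQueue();

    // Scheduler: a single background thread that drains the activation queue,
    // so servant logic never runs on the caller's thread.
    private final Thread scheduler = new Thread(() -> {
        try {
            while (true) {
                ((Runnable) queue.take()).run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop draining when interrupted
        }
    });

    public ActiveObjectSketch() {
        scheduler.setDaemon(true);
        scheduler.start();
    }

    // Proxy method: invocation returns immediately with a Future; the servant
    // logic (here, squaring a number) executes later on the scheduler thread.
    public CompletableFuture square(int n) {
        CompletableFuture result = new CompletableFuture();
        queue.add((Runnable) () -> result.complete(n * n)); // the method request
        return result;
    }

    public static void main(String[] args) {
        ActiveObjectSketch active = new ActiveObjectSketch();
        System.out.println(active.square(7).join()); // 49, computed asynchronously
    }
}
```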

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;GUI Applications&lt;/strong&gt;: Ensuring that the UI remains responsive by handling time-consuming tasks asynchronously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-Time Systems&lt;/strong&gt;: Managing tasks in robotics or real-time monitoring systems where operations need to be queued and executed asynchronously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server Applications&lt;/strong&gt;: Handling multiple client requests simultaneously without blocking.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;Imagine a web server handling multiple client requests to fetch data from a database. Using the Active Object Pattern, each client request is processed asynchronously, improving throughput and ensuring that the server remains responsive.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Barrier Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxt9wfb3yopodcdixbcl1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxt9wfb3yopodcdixbcl1.png" alt="Barrier Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Barrier Object Pattern ensures synchronization of multiple threads at a predefined point, preventing any thread from proceeding until all have reached the barrier. This Spring Boot application demonstrates the pattern by simulating tasks executed by multiple threads, synchronizing at a barrier before proceeding. This approach is useful in scenarios requiring coordinated completion of parallel tasks, such as parallel computing, batch processing, and gaming.&lt;/p&gt;

&lt;p&gt;The Barrier Object Pattern is used to synchronize multiple threads at a predefined point, ensuring that no thread proceeds until all threads have reached this barrier. This pattern is useful in scenarios where a set of tasks must be completed before any further steps can be taken.&lt;/p&gt;
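
&lt;p&gt;In the JDK this pattern maps directly onto &lt;code&gt;java.util.concurrent.CyclicBarrier&lt;/code&gt;. A minimal sketch in which every worker must reach the barrier before the barrier action runs:&lt;/p&gt;

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierSketch {

    // Each worker does its own step, then waits at the barrier; the barrier
    // action runs exactly once, only after all parties have arrived.
    public static int runWorkers(int parties) {
        AtomicInteger stepsDone = new AtomicInteger();
        CyclicBarrier barrier = new CyclicBarrier(parties,
                () -> System.out.println("all workers reached the barrier"));
        Thread[] workers = new Thread[parties];
        for (int i = 0; i != parties; i++) {
            workers[i] = new Thread(() -> {
                stepsDone.incrementAndGet(); // simulated sub-task
                try {
                    barrier.await(); // no thread proceeds until all arrive
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return stepsDone.get(); // equals parties: every sub-task finished
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(3)); // 3
    }
}
```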

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Parallel Computing&lt;/strong&gt;: When dividing a large computational task into smaller sub-tasks, each sub-task must be completed before the results can be combined.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Batch Processing&lt;/strong&gt;: Ensuring all tasks in a batch are completed before moving to the next batch.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Gaming&lt;/strong&gt;: Synchronizing the state of multiple players before advancing to the next level or stage.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;In a multiplayer online game, all players must complete their turns before the game can proceed to the next round. The Barrier Object Pattern can ensure that all players reach the synchronization point (end of their turn) before the game advances.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Future Promises Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqqnj0oyhudjjqbrrev8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqqnj0oyhudjjqbrrev8.jpg" alt="Future Promises Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Future Promise Pattern is used for handling asynchronous computations, allowing tasks to run in parallel without blocking the main thread. In this Spring Boot application, we demonstrated the pattern by simulating multiple asynchronous tasks and combining their results. This approach is useful in scenarios like asynchronous web requests, concurrent task execution, and long-running computations, ensuring non-blocking and efficient handling of tasks.&lt;/p&gt;

&lt;p&gt;The Future Promise Pattern is used for asynchronous programming to handle the result of a computation that may not be immediately available. It involves two main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Future&lt;/strong&gt;: Represents the result of an asynchronous computation. It provides methods to check if the computation is complete, to wait for its completion, and to retrieve the result.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Promise&lt;/strong&gt;: Represents a proxy for a value that is not yet known. It acts as a placeholder for the result and allows the computation to be done asynchronously.&lt;/li&gt;
&lt;/ol&gt;
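
&lt;p&gt;In Java, &lt;code&gt;CompletableFuture&lt;/code&gt; plays both roles: the caller holds it as a future while the asynchronous task completes it like a promise. A minimal sketch combining two hypothetical lookups without blocking the main thread:&lt;/p&gt;

```java
import java.util.concurrent.CompletableFuture;

public class FuturePromiseSketch {

    // supplyAsync returns immediately; a pool thread acts as the promise
    // side and completes the future once the value is ready.
    public static int combinedResult() {
        var priceTask = CompletableFuture.supplyAsync(() -> 100);   // hypothetical price lookup
        var sentimentTask = CompletableFuture.supplyAsync(() -> 5); // hypothetical sentiment score
        // thenCombine schedules the merge for when BOTH results exist;
        // nothing blocks until join() is called.
        var combined = priceTask.thenCombine(sentimentTask, (price, sentiment) -> price + sentiment);
        return combined.join();
    }

    public static void main(String[] args) {
        System.out.println(combinedResult()); // 105
    }
}
```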

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Asynchronous Web Requests&lt;/strong&gt;: Making non-blocking HTTP requests where the response is processed once it becomes available.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent Task Execution&lt;/strong&gt;: Running multiple tasks in parallel and processing their results once all tasks are completed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-Running Computations&lt;/strong&gt;: Handling computations that take a long time to complete without blocking the main thread.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;In an e-commerce application, processing a large number of orders simultaneously without blocking the main thread. Each order is processed asynchronously, and once all orders are processed, the results are combined and sent to the user.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Monitor Object Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwn4461fjht0c8scnt47.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwn4461fjht0c8scnt47.jpg" alt="Monitor Object Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Monitor Object Pattern is used to achieve mutual exclusion and synchronization in concurrent programming. This Spring Boot application demonstrates the pattern by simulating a print queue system where multiple print jobs are handled sequentially. This approach is useful in scenarios like resource management, producer-consumer problems, and thread-safe caching, ensuring thread-safe and synchronized access to shared resources.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Mutual Exclusion&lt;/strong&gt;: Ensures that only one thread can access the critical section of code at a time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Condition Variables&lt;/strong&gt;: Used to allow threads to wait for certain conditions to be met before continuing execution.&lt;/li&gt;
&lt;/ol&gt;
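
&lt;p&gt;A minimal print-queue monitor in plain Java, using &lt;code&gt;synchronized&lt;/code&gt; methods for mutual exclusion and &lt;code&gt;wait()&lt;/code&gt;/&lt;code&gt;notifyAll()&lt;/code&gt; as the condition variable (a simplified stand-in for the linked project):&lt;/p&gt;

```java
import java.util.ArrayDeque;

public class PrintQueueMonitor {

    private final ArrayDeque jobs = new ArrayDeque();
    private final int capacity;

    public PrintQueueMonitor(int capacity) {
        this.capacity = capacity;
    }

    // Mutual exclusion: synchronized makes this object the monitor.
    // Condition variable: wait()/notifyAll() on that same monitor.
    public synchronized void submit(String job) {
        while (jobs.size() >= capacity) {
            try {
                wait(); // block until the printer frees a slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        jobs.addLast(job);
        notifyAll(); // wake a printer waiting for work
    }

    public synchronized String takeNext() {
        while (jobs.isEmpty()) {
            try {
                wait(); // block until a job is submitted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        String job = (String) jobs.removeFirst();
        notifyAll(); // wake submitters waiting for a free slot
        return job;
    }

    public static void main(String[] args) {
        PrintQueueMonitor queue = new PrintQueueMonitor(2);
        queue.submit("report.pdf");
        queue.submit("invoice.pdf");
        System.out.println(queue.takeNext()); // report.pdf (jobs leave in FIFO order)
    }
}
```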

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Resource Management&lt;/strong&gt;: Ensuring that multiple threads can access a limited resource without conflict.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Producer-Consumer Problem&lt;/strong&gt;: Managing synchronization between producer and consumer threads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thread-Safe Caching&lt;/strong&gt;: Ensuring thread-safe access to a cache or shared resource.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;Consider a print queue system where multiple print jobs are submitted from different computers to a single printer. The Monitor Object Pattern can ensure that only one print job is processed by the printer at a time, while other jobs wait their turn.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Producer-Consumer Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsv59ze94xlrt35wd4kru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsv59ze94xlrt35wd4kru.png" alt="Producer-Consumer Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Producer-Consumer Pattern is used to manage concurrent access to a shared buffer by multiple producer and consumer threads. This Spring Boot application demonstrates the pattern by simulating an order processing system where multiple orders are produced and consumed. This approach is useful in scenarios like logging systems, web servers, and task queues, ensuring efficient and synchronized handling of tasks without data loss or corruption.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Concepts
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Producers&lt;/strong&gt;: Threads that create data and put it into the buffer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consumers&lt;/strong&gt;: Threads that take data from the buffer and process it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buffer&lt;/strong&gt;: A shared resource where produced data is stored before being consumed. This can be implemented as a queue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synchronization&lt;/strong&gt;: Ensures that producers and consumers do not access the buffer concurrently in a way that leads to data corruption or loss.&lt;/li&gt;
&lt;/ol&gt;
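
&lt;p&gt;These concepts map directly onto &lt;code&gt;java.util.concurrent.BlockingQueue&lt;/code&gt;, which handles the synchronization internally. A minimal sketch with one producer and one consumer sharing a bounded buffer of hypothetical orders:&lt;/p&gt;

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class OrderProcessing {

    // A bounded buffer: the producer blocks when it is full,
    // the consumer blocks when it is empty.
    public static int processOrders(int orderCount) {
        var buffer = new ArrayBlockingQueue(8);
        var processed = new AtomicInteger();

        Thread producer = new Thread(() -> {
            for (int i = 0; i != orderCount; i++) {
                try {
                    buffer.put("order-" + i); // blocks if the buffer is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        Thread consumer = new Thread(() -> {
            for (int i = 0; i != orderCount; i++) {
                try {
                    buffer.take(); // blocks if the buffer is empty
                    processed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get(); // every order consumed exactly once
    }

    public static void main(String[] args) {
        System.out.println(processOrders(100)); // 100
    }
}
```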

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Logging Systems&lt;/strong&gt;: Log messages are produced by various parts of an application and consumed by a logging thread that writes them to a file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Servers&lt;/strong&gt;: Handling incoming HTTP requests (produced by clients) and processing them (consumed by worker threads).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task Queues&lt;/strong&gt;: Tasks generated by one part of an application and processed by worker threads in the background.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;In a food ordering system, multiple customers (producers) place orders which are added to a queue. Chefs (consumers) take orders from the queue and prepare the food. The queue ensures that orders are handled in the order they are received, and no orders are lost or duplicated.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Reader-Writer Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpmmmezmm9vk9qlg4t7nk.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpmmmezmm9vk9qlg4t7nk.JPG" alt="Reader-Writer Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Reader-Writer Pattern is a synchronization pattern that allows multiple readers to read from a shared resource concurrently while ensuring exclusive access for writers. This Spring Boot application demonstrates the pattern by simulating a data store that supports concurrent reads and exclusive writes. This approach is useful in scenarios like database systems, caching, and file systems, where read operations are frequent, and write operations must be done safely.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Concepts
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Readers&lt;/strong&gt;: Threads that read data from the shared resource. Multiple readers can read concurrently as long as no writer is writing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Writers&lt;/strong&gt;: Threads that write data to the shared resource. Writers require exclusive access, meaning no other readers or writers can access the resource while a writer is writing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read-Write Lock&lt;/strong&gt;: A synchronization mechanism that allows concurrent read access or exclusive write access to a shared resource.&lt;/li&gt;
&lt;/ol&gt;
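
&lt;p&gt;A minimal sketch using &lt;code&gt;java.util.concurrent.locks.ReentrantReadWriteLock&lt;/code&gt; around a hypothetical in-memory article store:&lt;/p&gt;

```java
import java.util.HashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ArticleStore {

    private final HashMap articles = new HashMap();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    public String read(String id) {
        lock.readLock().lock();
        try {
            return (String) articles.get(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer gets exclusive access: no readers or other writers run meanwhile.
    public void write(String id, String body) {
        lock.writeLock().lock();
        try {
            articles.put(id, body);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ArticleStore store = new ArticleStore();
        store.write("btc-outlook", "v1");
        store.write("btc-outlook", "v2"); // the update is exclusive
        System.out.println(store.read("btc-outlook")); // v2
    }
}
```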

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Database Systems&lt;/strong&gt;: Allowing multiple clients to read data concurrently while ensuring that data modifications are done exclusively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt;: Ensuring that cached data can be read by multiple threads simultaneously while updates to the cache are done exclusively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Systems&lt;/strong&gt;: Allowing multiple processes to read from a file while ensuring that write operations are exclusive.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;In a news website, multiple users can read articles concurrently, but when an article is updated, the update operation must be exclusive to avoid data corruption.&lt;/p&gt;
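&lt;p&gt;In Java, this pattern maps directly onto java.util.concurrent.locks.ReentrantReadWriteLock. The sketch below (the class and field names are illustrative) guards an in-memory article store so that many readers can proceed in parallel while each update is exclusive:&lt;/p&gt;

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch: a shared store with concurrent reads and exclusive writes.
class ArticleStore {
    private final Map<String, String> articles = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Many threads may hold the read lock at the same time.
    String read(String id) {
        lock.readLock().lock();
        try {
            return articles.get(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock excludes all readers and all other writers.
    void write(String id, String content) {
        lock.writeLock().lock();
        try {
            articles.put(id, content);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

&lt;p&gt;ReentrantReadWriteLock also supports an optional fairness mode, which can prevent writers from being starved in read-heavy workloads like the news-site example above.&lt;/p&gt;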

&lt;h3&gt;
  
  
  &lt;strong&gt;Thread Pool Pattern&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyzv012op7vj94b0o9u3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyzv012op7vj94b0o9u3.png" alt="Thread Pool Pattern"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;The Thread Pool Pattern is used to manage and reuse a pool of threads to perform tasks concurrently, improving performance and resource management. This Spring Boot application demonstrates the pattern by simulating the handling of multiple tasks using a thread pool. This approach is useful in scenarios like web servers, database connection pools, and background processing, ensuring efficient and scalable handling of concurrent tasks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Concepts
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Thread Pool&lt;/strong&gt;: A collection of pre-initialized threads that stand by to perform tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task Queue&lt;/strong&gt;: A queue where tasks are submitted for execution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worker Threads&lt;/strong&gt;: Threads from the thread pool that pick up tasks from the task queue and execute them.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Web Servers&lt;/strong&gt;: Handling multiple incoming HTTP requests concurrently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Connection Pools&lt;/strong&gt;: Managing a pool of database connections for efficient reuse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background Processing&lt;/strong&gt;: Performing background tasks such as logging, data processing, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Real-Life Example
&lt;/h4&gt;

&lt;p&gt;A web server handles multiple incoming HTTP requests. Instead of creating a new thread for each request, the server uses a thread pool to handle the requests concurrently, ensuring efficient resource usage and reducing the overhead of thread management.&lt;/p&gt;
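&lt;p&gt;Using the JDK's java.util.concurrent package, the three concepts above collapse into a few lines: Executors.newFixedThreadPool creates the worker threads, and the executor's internal queue acts as the task queue. The arithmetic task below is a hypothetical stand-in for real request handling:&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal sketch: a pool of 4 worker threads draining a queue of submitted tasks.
class ThreadPoolDemo {
    static List<Integer> handleRequests(int requestCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                final int requestId = i;
                // Tasks are queued; the 4 workers pick them up as they become free.
                futures.add(pool.submit(() -> requestId * 2)); // stand-in for real work
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that task has completed
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // no new tasks accepted; queued tasks finish
        }
    }
}
```

&lt;p&gt;Note that the pool is created once and reused across all submitted tasks, which is exactly what saves the per-request thread-creation overhead described above.&lt;/p&gt;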

</description>
      <category>java</category>
      <category>programming</category>
      <category>cloud</category>
      <category>springboot</category>
    </item>
    <item>
      <title>Vector Database and Spring IA</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Sun, 11 Feb 2024 21:11:24 +0000</pubDate>
      <link>https://dev.to/lucasnscr/vector-database-and-spring-ia-4dll</link>
      <guid>https://dev.to/lucasnscr/vector-database-and-spring-ia-4dll</guid>
      <description>&lt;p&gt;A vector database is a specialized type of database optimized for handling vector data, which is fundamental in the field of Artificial Intelligence (AI), particularly in areas like machine learning, natural language processing, and image recognition.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Vector Data?
&lt;/h2&gt;

&lt;p&gt;Vector data refers to data represented in the form of vectors. In AI, a vector is often a numerical representation of complex data, such as text, images, or sound. For instance, words in natural language processing can be converted into vectors using techniques like word embeddings (e.g., Word2Vec, GloVe). These vectors capture the semantic meaning of the words and allow AI models to process and understand language.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does a Vector Database Work?
&lt;/h2&gt;

&lt;p&gt;Vector databases are designed to efficiently store and query vector data. Unlike traditional databases that perform queries based on exact matches or SQL queries, vector databases enable similarity searches. Here's how it works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Storing Data: Data (like text or images) is transformed into vectors using AI models and then stored in the vector database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Querying Data: When a query is made, it is also converted into a vector. The vector database then searches for vectors that are most similar to the query vector. This is known as a similarity or nearest neighbor search.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Similarity Measurement: The similarity between vectors is usually calculated using metrics like Euclidean distance, cosine similarity, or Manhattan distance. The choice of metric depends on the specific application and the nature of the data.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
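&lt;p&gt;As an example of the third step, cosine similarity compares the direction of two vectors rather than their length, which makes it a natural fit for comparing embeddings. A minimal plain-Java implementation:&lt;/p&gt;

```java
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
// A value near 1 means the vectors point in nearly the same direction.
class VectorMath {
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

&lt;p&gt;A vector database computes a score like this (or an approximation of it) between the query vector and the stored vectors, returning the nearest neighbors.&lt;/p&gt;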

&lt;h2&gt;
  
  
  Correlation with AI
&lt;/h2&gt;

&lt;p&gt;The use of vector databases is highly correlated with AI for several reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Enhanced AI Models: They enable AI models to access large amounts of relevant, context-rich data quickly. This is crucial for models that require contextual understanding, like chatbots or recommendation systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Retrieval Augmented Generation (RAG): This is a technique where, before generating a response, an AI model retrieves relevant information from a vector database. This helps in providing more accurate and context-aware outputs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Efficiency in Handling High-Dimensional Data: AI often deals with high-dimensional data (like images or complex text). Vector databases are optimized for such data, ensuring efficient storage and retrieval, which is a challenge in traditional databases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Real-Time Processing: In many AI applications, real-time response is crucial. Vector databases allow for quick retrieval of similar data, enabling real-time processing in AI applications.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In summary, vector databases play a crucial role in the AI ecosystem by enabling efficient storage and retrieval of vectorized data. They support AI models by providing a means to quickly access large volumes of contextually relevant data, which is essential for tasks requiring understanding and interpretation of complex data sets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff561u0suze5ndff1thwu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff561u0suze5ndff1thwu.png" alt="Vector Database"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Spring AI
&lt;/h2&gt;

&lt;p&gt;The Spring AI project aims to streamline the development of applications that incorporate artificial intelligence functionality without unnecessary complexity. In this example we use features such as Embeddings, Prompts, and an ETL pipeline, and save all embeddings in &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;PGvector&lt;/a&gt; (a Postgres vector database).&lt;/p&gt;

&lt;h3&gt;
  
  
  Embedding
&lt;/h3&gt;

&lt;p&gt;As a software engineer, when you're working with the &lt;strong&gt;Embeddings&lt;/strong&gt; API, think of the &lt;strong&gt;EmbeddingClient&lt;/strong&gt; interface as a bridge connecting your application to the power of AI-based text analysis. Its main role is to transform textual information into a format that machines can understand - numerical vectors, known as embeddings. These vectors are instrumental in tasks like understanding the meaning of text (semantic analysis) and sorting text into categories (text classification).&lt;/p&gt;

&lt;p&gt;From a software engineering perspective, the EmbeddingClient interface is built with two key objectives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Portability&lt;/strong&gt;: The interface acts as a universal adapter in the world of embedding models. It is crafted to fit seamlessly with various embedding techniques, so you can switch from one embedding model to another without having to overhaul your code. This flexibility is in sync with the principles of modularity and interchangeability, much like how the Spring Framework operates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity&lt;/strong&gt;: With methods like embed(String text) and embed(Document document), EmbeddingClient takes the heavy lifting off your shoulders. It converts text to embeddings without requiring you to get tangled in the complexities of text processing and embedding algorithms. This is particularly beneficial for those who are new to the AI field, allowing them to leverage the power of embeddings without a deep dive into the technicalities.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In essence, when you use EmbeddingClient you are leveraging a tool that simplifies the integration of advanced AI capabilities into your applications while keeping your code agile and adaptable to various embedding models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fedu02y8kd7xlhmo8mo1r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fedu02y8kd7xlhmo8mo1r.jpg" alt="Embedding Api"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Prompts
&lt;/h3&gt;

&lt;p&gt;Working with Spring AI, &lt;strong&gt;prompts&lt;/strong&gt; can be thought of as the steering wheel for AI models, guiding them to produce specific outputs. The way these prompts are crafted plays a critical role in shaping the responses you get from the AI.&lt;/p&gt;

&lt;p&gt;To draw a parallel with familiar concepts in software development, handling prompts in Spring AI is akin to how you manage the "View" component in the Spring MVC framework. In this scenario, creating a prompt is much like constructing an elaborate text template, complete with placeholders for dynamic elements. These placeholders are then substituted with actual data based on user input or other operations within your application, similar to how you might use placeholders in SQL queries.&lt;/p&gt;
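&lt;p&gt;To make the template analogy concrete, here is a plain-Java sketch of the idea (the class and method names are illustrative, not the actual Spring AI API): a prompt template with named placeholders is rendered against runtime values, much like a view template:&lt;/p&gt;

```java
import java.util.Map;

// Concept sketch: a prompt template with {placeholders} filled from runtime values.
// Spring AI provides its own template machinery; this class only illustrates the idea.
class PromptTemplateDemo {
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }
}
```

&lt;p&gt;For example, rendering "Tell me about {topic} in {style} style." with topic and style values produces the final text sent to the model, so application logic stays separate from prompt wording.&lt;/p&gt;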

&lt;p&gt;As Spring AI continues to evolve, it aims to introduce more sophisticated methods for interacting with AI models. At its core, the current classes and functionalities in Spring AI could be compared to JDBC in terms of their fundamental role. For example, the ChatClient class in Spring AI can be likened to the essential JDBC library provided in the Java Development Kit (JDK).&lt;/p&gt;

&lt;p&gt;Building on this foundation, just as JDBC is enhanced with utilities like JdbcTemplate and Spring Data Repositories, Spring AI is expected to offer analogous helper classes. These would streamline interactions with AI models, much like how JdbcTemplate simplifies JDBC operations.&lt;/p&gt;

&lt;p&gt;Looking further ahead, Spring AI is poised to introduce even more advanced constructs. These might include elements like ChatEngines and Agents that are capable of considering the history of interactions with the AI model. This progression mirrors the way that software development has evolved from direct JDBC usage to more abstract and powerful tools like ORM frameworks.&lt;/p&gt;

&lt;p&gt;In summary, as a software engineer working with Spring AI, you are at the forefront of integrating AI capabilities into applications, using familiar paradigms and patterns from traditional software development, but applied to the cutting-edge field of AI and machine learning.&lt;/p&gt;

&lt;h3&gt;
  
  
  ETL pipeline
&lt;/h3&gt;

&lt;p&gt;The Extract, Transform, and Load (ETL) framework is crucial for managing data processes in the Retrieval Augmented Generation (RAG) scenario. Essentially, the ETL pipeline is the mechanism that streamlines the journey of data from its raw state to an organized vector store. This process is vital for preparing the data in a way that makes it easily retrievable and usable by the AI model.&lt;/p&gt;

&lt;p&gt;In the RAG use case, the core objective is to enhance the capabilities of generative AI models. This is achieved by integrating text-based data, which involves sourcing relevant information from a dataset to improve both the quality and the contextual relevance of the outputs generated by the model. The ETL framework plays a pivotal role in this process by ensuring that the data is not only accurately extracted and transformed but also efficiently loaded and stored for optimal retrieval by the AI system. This process enhances the AI's ability to produce more precise and contextually rich responses.&lt;/p&gt;
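&lt;p&gt;A central part of the Transform step is splitting source documents into chunks small enough to embed. The sketch below illustrates the idea in plain Java (Spring AI ships its own document readers and splitters; this class is a hypothetical stand-in that splits on word count rather than tokens):&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;

// Concept sketch of the "Transform" step: split a document into fixed-size word chunks.
// Each chunk would then be embedded and loaded into the vector store.
class ChunkSplitter {
    static List<String> split(String text, int wordsPerChunk) {
        String[] words = text.trim().split("\\s+");
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < words.length; i += wordsPerChunk) {
            StringBuilder chunk = new StringBuilder();
            for (int j = i; j < Math.min(i + wordsPerChunk, words.length); j++) {
                if (j > i) chunk.append(' ');
                chunk.append(words[j]);
            }
            chunks.add(chunk.toString());
        }
        return chunks;
    }
}
```

&lt;p&gt;Real splitters usually count tokens and add overlap between neighboring chunks so that retrieval does not lose context at chunk boundaries.&lt;/p&gt;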

&lt;h2&gt;
  
  
  Details of Project
&lt;/h2&gt;

&lt;p&gt;We've developed a project that incorporates fundamental principles related to AI and the Spring library, focusing on concepts like Prompts, Embedding, ETL pipelines, and Vector Databases. Our aim is to provide a concise overview of each concept's functionality. The main goal is to integrate all these elements through a practical example and apply them to a routine solution.&lt;/p&gt;

&lt;p&gt;The first step is to select a Vector Database for our use. Spring AI offers integration with various databases. In this instance, we've chosen to use &lt;strong&gt;&lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;pgvector&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

version: '3.7'
services:
  postgres:
    image: ankane/pgvector:v0.5.0
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=admin
      - POSTGRES_DB=vector_db
      - PGPASSWORD=admin
    logging:
      options:
        max-size: 10m
        max-file: "3"
    ports:
      - '5433:5432'
    healthcheck:
      test: "pg_isready -U postgres -d vector_db"
      interval: 2s
      timeout: 20s
      retries: 10


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To run pgvector:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

docker compose up -d


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To use all the Spring AI functionality in the project, you will need to add some dependencies:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;spring-ai.version&amp;gt;0.8.0-SNAPSHOT&amp;lt;/spring-ai.version&amp;gt;

&amp;lt;dependency&amp;gt;
  &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
  &amp;lt;artifactId&amp;gt;spring-ai-openai-spring-boot-starter&amp;lt;/artifactId&amp;gt;
  &amp;lt;version&amp;gt;${spring-ai.version}&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;

&amp;lt;dependency&amp;gt;
  &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
  &amp;lt;artifactId&amp;gt;spring-ai-pdf-document-reader&amp;lt;/artifactId&amp;gt;
  &amp;lt;version&amp;gt;${spring-ai.version}&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;

&amp;lt;dependency&amp;gt;
  &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
  &amp;lt;artifactId&amp;gt;spring-ai-pgvector-store&amp;lt;/artifactId&amp;gt;
  &amp;lt;version&amp;gt;${spring-ai.version}&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We use the latest version of the library, 0.8.0-SNAPSHOT.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Command to Run application&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

mvn spring-boot:run -Dspring-boot.run.profiles=openai


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We have divided our approach into two distinct parts: data handling and question processing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Handling&lt;/strong&gt;: This involves several key operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Loading: Importing data into our system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Transforming: Modifying or processing the data to fit our needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Inserting: Adding new data entries into our database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Retrieving: Accessing data from the database as needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deleting: Removing data entries that are no longer required.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Question Processing&lt;/strong&gt;: In this part, we utilize the data that has been loaded and processed. The aim here is to provide responses that are directly related to, and informed by, the data we have in our resources.&lt;/p&gt;

&lt;p&gt;Regarding the data aspect, we have utilized a Technology Radar from ThoughtWorks as our primary data source.&lt;/p&gt;


&lt;h3&gt;
  
  
  Technology Radar
&lt;/h3&gt;

&lt;p&gt;The Technology Radar is a snapshot of tools, techniques, platforms, languages and frameworks based on the practical experiences of Thoughtworkers around the world. Published twice a year, it provides insights on how the world builds software today. Use it to identify and evaluate what’s important to you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4zhkts53b8mwoghb6pw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4zhkts53b8mwoghb6pw.png" alt="Technology Radar"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2023/09/tr_technology_radar_vol_29_en.pdf" rel="noopener noreferrer"&gt;Here the link from latest tech radar version&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the content from the ThoughtWorks Technology Radar as our reference, we are now equipped to utilize our API to recommend the best tools or offer insights and opinions on various technologies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F268169bbmqfx83q5xfey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F268169bbmqfx83q5xfey.png" alt="Response 1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbz9mm67ph64plwaq2gaq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbz9mm67ph64plwaq2gaq.png" alt="Response 2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/lucasnscr/SpringAI" rel="noopener noreferrer"&gt;Link of project&lt;/a&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>spring</category>
      <category>ai</category>
      <category>java</category>
    </item>
    <item>
      <title>Generative AI with SpringBoot Services</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Wed, 31 Jan 2024 20:32:22 +0000</pubDate>
      <link>https://dev.to/lucasnscr/generative-ai-with-springboot-services-471l</link>
      <guid>https://dev.to/lucasnscr/generative-ai-with-springboot-services-471l</guid>
      <description>&lt;h1&gt;
  
  
  Spring AI
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.spring.io/spring-ai/reference/" rel="noopener noreferrer"&gt;Spring AI Docs&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/spring-projects/spring-ai" rel="noopener noreferrer"&gt;Spring AI Project&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Spring AI library is a set of APIs and components that allow Java developers to integrate artificial intelligence (AI) into their Spring Boot applications. The library is based on the Spring Framework and provides a unified API to access a variety of AI technologies, including machine learning, deep learning, natural language processing, and computer vision.&lt;/p&gt;

&lt;p&gt;Spring AI provides a number of features to make it easy to integrate AI into Spring Boot applications. These features include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Support for a variety of AI technologies&lt;/strong&gt;: Spring AI supports machine learning, deep learning, natural language processing, and computer vision. This allows developers to choose the AI technology that best meets their needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unified API&lt;/strong&gt;: Spring AI provides a unified API to access all AI technologies. This simplifies the development of Spring Boot applications that integrate AI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ease of use&lt;/strong&gt;: Spring AI is designed to be easy to use. Developers can start using the library quickly, without the need to learn a new API.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Spring AI can be used for a variety of purposes, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recommendations&lt;/strong&gt;: generating personalized recommendations for users, based on their historical data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data analysis&lt;/strong&gt;: analyzing data to identify patterns and trends.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Anomaly detection&lt;/strong&gt;: detecting anomalies in data, such as fraud or cyber attacks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Natural language processing&lt;/strong&gt;: processing natural language, such as translation, summarization, and text generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Computer vision&lt;/strong&gt;: processing images and videos, such as object identification, image classification, and facial recognition.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Spring AI is a powerful tool that can help developers integrate AI into their Spring Boot applications. The library is easy to use and provides support for a variety of AI technologies.&lt;/p&gt;

&lt;p&gt;Here are some additional details about the Spring AI library:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The library is still under development, but it has already been used in a variety of production applications.&lt;/li&gt;
&lt;li&gt;The library is open source and available on GitHub.&lt;/li&gt;
&lt;li&gt;The library documentation is available on the Spring website.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Project Setup
&lt;/h1&gt;

&lt;p&gt;When configuring your Maven pom.xml for the Spring AI project, add the specified repository to access its dependencies. As Spring AI is currently in the experimental phase, only snapshot versions are available. Here is how you should include this in your pom.xml:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;repositories&amp;gt;
        &amp;lt;repository&amp;gt;
            &amp;lt;id&amp;gt;spring-snapshots&amp;lt;/id&amp;gt;
            &amp;lt;name&amp;gt;Spring Snapshots&amp;lt;/name&amp;gt;
            &amp;lt;url&amp;gt;https://repo.spring.io/snapshot&amp;lt;/url&amp;gt;
            &amp;lt;releases&amp;gt;
                &amp;lt;enabled&amp;gt;false&amp;lt;/enabled&amp;gt;
            &amp;lt;/releases&amp;gt;
        &amp;lt;/repository&amp;gt;
    &amp;lt;/repositories&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this project, I have selected OpenAI as the primary interface. However, it's important to note that Spring AI offers a variety of different ChatClient interfaces.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;li&gt;Azure OpenAI&lt;/li&gt;
&lt;li&gt;Hugging Face&lt;/li&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenAI Library&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;dependency&amp;gt;
  &amp;lt;groupId&amp;gt;org.springframework.experimental.ai&amp;lt;/groupId&amp;gt;
  &amp;lt;artifactId&amp;gt;spring-ai-openai-spring-boot-starter&amp;lt;/artifactId&amp;gt;
  &amp;lt;version&amp;gt;0.7.1-SNAPSHOT&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The most recent version of this library is 0.8.0-SNAPSHOT. We attempted to use this version, but currently, we only have access to the 0.7.1-SNAPSHOT version. As our studies progress, we will try to update to the latest version.&lt;/p&gt;

&lt;p&gt;After all the setup, you need to create your account and API key with one of the chosen providers. In my case, &lt;a href="https://platform.openai.com/docs/overview" rel="noopener noreferrer"&gt;OpenAI Platform&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was just the first study with Generative AI and the Spring AI library. We will develop some other features using the library, and I hope you follow my journey with Spring and AI.&lt;/p&gt;

&lt;p&gt;Part of this doc was generated with SpringAI and Model ChatGPT 3.5.&lt;/p&gt;

&lt;h1&gt;
  
  
  Check out this project from GitHub
&lt;/h1&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

git clone https://github.com/lucasnscr/SpringAI.git


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyi9dfqbsz24sihqlc00d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyi9dfqbsz24sihqlc00d.png" alt="Spring IA API response"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ia</category>
      <category>springboot</category>
      <category>java</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Deployment and Patterns</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Fri, 20 Jan 2023 15:30:02 +0000</pubDate>
      <link>https://dev.to/lucasnscr/deployment-and-patterns-18me</link>
      <guid>https://dev.to/lucasnscr/deployment-and-patterns-18me</guid>
      <description>&lt;p&gt;After a few years working as a Software Engineer, I received an invitation to work as an SRE and closer to my company's CI/CD processes. I think the beginning would be ideal to understand the main concepts and patterns. I believe patterns are ways to solve problems and ensure a robust solution for our service, product, platform or APIs. Today I intend to talk about deployments, but from the perspective of the Continuous Delivery process and talk about the main patterns of this technique.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What is Continuous Delivery?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Continuous Delivery is the ability to get changes of all types, including new features, configuration changes, bug fixes, and experiments, into production, or into the hands of users, safely and quickly in a sustainable way.&lt;/p&gt;

&lt;p&gt;Our goal is to make deployments—whether of a large-scale distributed system, a complex production environment, an embedded system, or an app—predictable, routine affairs that can be performed on demand.&lt;/p&gt;

&lt;p&gt;Continuous Delivery improves several aspects of the software lifecycle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Low risk releases. The primary goal of continuous delivery is to make software deployments painless, low-risk events that can be performed at any time, on demand.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Faster time to market. It’s not uncommon for the integration and test/fix phase of the traditional phased software delivery lifecycle to consume weeks or even months.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Higher quality. When developers have automated tools that discover regressions within minutes, teams are freed to focus their effort on user research and higher level testing activities such as exploratory testing, usability testing, and performance and security testing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Lower costs. Any successful software product or service will evolve significantly over the course of its lifetime. By investing in build, test, deployment, and environment automation, we substantially reduce the cost of delivering those incremental changes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Better products. Continuous delivery makes it economic to work in small batches. This means we can get feedback from users throughout the delivery lifecycle based on working software. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Deployment Pipeline&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The key pattern introduced in continuous delivery is the deployment pipeline. This pattern emerged from several ThoughtWorks projects where we were struggling with complex, fragile, painful manual processes for preparing testing and production environments and deploying builds to them. We’d already worked to automate a significant amount of the regression and acceptance testing, but it was taking weeks to get builds to integrated environments for full regression testing, and our first deployment to production took an entire weekend.&lt;/p&gt;

&lt;p&gt;We wanted to industrialize the process of taking changes from version control to production. Our goal was to make deployment to any environment a fully automated, scripted process that could be performed on demand in minutes (on the original project we got it down to less than an hour, which was a big deal for an enterprise system in 2005). We wanted to be able to configure testing and production environments purely from configuration files stored in version control. The apparatus we used to perform these tasks (usually in the form of a bunch of scripts in bash or ruby) became known as deployment pipelines, which Dan North, Chris Read and I wrote up in a paper presented at the Agile 2006 conference.&lt;/p&gt;

&lt;p&gt;In the deployment pipeline pattern, every change in version control triggers a process (usually in a CI server) which creates deployable packages and runs automated unit tests and other validations such as static code analysis. This first step is optimized so that it takes only a few minutes to run. If this initial commit stage fails, the problem must be fixed immediately—nobody should check in more work on a broken commit stage. Every passing commit stage triggers the next step in the pipeline, which might consist of a more comprehensive set of automated tests. Versions of the software that pass all the automated tests can then be deployed on demand to further stages such as exploratory testing, performance testing, staging, and production, as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NJLuFjbX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jqnet7hd8mopna7jscca.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NJLuFjbX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jqnet7hd8mopna7jscca.png" alt="pipeline description" width="880" height="521"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What is Deployment?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Deployment in software and web development means pushing changes or updates from one environment to another. When setting up a website, you will always have your live website, known as the live environment or production environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Deployment Patterns&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Deployment patterns are techniques used to introduce new features or updates to an application in a controlled and organized manner. The goal is to minimize downtime, ensure that the new features are stable and perform well, and sometimes allow for testing with a small group of users before making the update available to everyone.&lt;/p&gt;

&lt;p&gt;There are many deployment patterns. In this text I will cover the positive and negative points of the ones I have had contact with, either as a Software Engineer or as an SRE: Canary, Blue/Green, Dark Launching, and A/B Testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Canary Release&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fLcDeWAr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/f6kttphaq1rdigkrzu0i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fLcDeWAr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/f6kttphaq1rdigkrzu0i.png" alt="canary demonstration" width="880" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A canary deployment is a method that exposes a new feature to an early sub-segment of users. The goal is to test new functionality on a subset of customers before releasing it to the entire user base.&lt;/p&gt;

&lt;p&gt;You can choose users randomly or target a specific group, and roll back if anything breaks. If everything works as intended, you can gradually add more users while monitoring logs, errors, and software health.&lt;/p&gt;
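&lt;p&gt;A minimal sketch of how canary routing can work, assuming a hash-based percentage split; the user ids and percentages below are made up for illustration:&lt;/p&gt;

```python
import hashlib

# Sketch of canary routing: deterministically send a percentage of users
# to the canary version based on a hash of their id.

def route(user_id: str, canary_percent: int) -> str:
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"

# The same user always lands in the same bucket, so their experience stays
# stable while the rollout percentage is gradually increased.
print(route("user-42", 5))
```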

&lt;p&gt;&lt;strong&gt;Why use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Service Evaluation: You can evaluate multiple service versions side by side using canary deployments in real-world environments with actual users and use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost effective: Because two production environments are not required, it is less expensive than a blue-green deployment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Testing: Canary deployment can be used for A/B testing as it offers two alternatives to the users and selects the one with better reception.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In-built Capacity Test: Normally, it is difficult to test the capacity of a large production environment. Canary deployments offer in-built capacity tests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Feedback: With the release being made for targeted or a sample number of users, you receive invaluable input from real users and modify canary versions for improvements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No cold starts: Unlike entirely new systems, which take a while to warm up, canary deployments ramp up gradually, avoiding the slowness of a cold start.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Zero downtime: Just like blue-green deployments, a canary deployment does not cause downtime.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simple rollback mechanism: You can always easily roll back to the previous version.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why not use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Script Testing: Canary release scripting is challenging, since human verification and testing can take a significant amount of time, and the monitoring and instrumentation that is required for production testing may call for further research.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Complexity: Canary deployments share the same complexities as blue-green deployments, such as multiple production machines, user migrations, and system monitoring.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Time: It takes time to set up a healthy canary deployment pipeline. However, once done right, you can do more frequent and safer deployments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deployment at an enterprise scale: An enterprise Canary deployment is difficult to accomplish in an environment where the software is loaded on personal devices. Setting up an auto-update environment for end users may be one method to get around this.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Blue/Green Deployment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VGx7g0vS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gwtrdpcro94vyhmm9p37.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VGx7g0vS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gwtrdpcro94vyhmm9p37.jpeg" alt="blue green description" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blue green deployment is an application release model that gradually transfers user traffic from a previous version of an app or microservice to a nearly identical new release—both of which are running in production. &lt;/p&gt;

&lt;p&gt;The old version can be called the blue environment, while the new version can be known as the green environment. Once production traffic is fully transferred from blue to green, blue can stand by in case a rollback is needed, or be pulled from production and updated to become the template upon which the next update is made.&lt;/p&gt;

&lt;p&gt;There are downsides to this continuous deployment model. Not all environments have the same uptime requirements or the resources to properly perform CI/CD processes like blue green. But many apps evolve to support such continuous delivery as the enterprises supporting them digitally transform.&lt;/p&gt;
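&lt;p&gt;The routing flip at the heart of blue/green can be sketched like this (the environment contents and version numbers are illustrative):&lt;/p&gt;

```python
# Sketch of blue/green switching: two identical environments, a router
# pointing at one of them, and an instant flip for release or rollback.

class Router:
    def __init__(self):
        self.environments = {"blue": "v1.0", "green": "v1.1"}
        self.live = "blue"

    def switch(self):
        # releasing (or rolling back) is just a routing change
        self.live = "green" if self.live == "blue" else "blue"

    def serve(self):
        return self.environments[self.live]

router = Router()
router.switch()          # go live with green
print(router.serve())    # → v1.1
router.switch()          # instant rollback to blue
print(router.serve())    # → v1.0
```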

&lt;p&gt;&lt;strong&gt;Why use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Rapid releasing: You can release software practically any time. You don't need to schedule a weekend or off-hours release because, in most cases, all that is necessary to go live is a routing change. Because there is no associated downtime, these deployments have no negative impact on users.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simple rollbacks: The reverse process is equally fast. Because blue-green deployments utilize two parallel production environments, you can quickly flip back to the stable one should any issues arise in your live environment. This reduces the risks inherent in experimenting in production.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Built-in disaster recovery: Because blue-green deployments use two production environments, they implicitly offer disaster recovery for your business systems. A dual production environment is its own hot backup.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Load balancing: Blue-green parallel production environments also make load balancing easy. When the two environments are functionally identical, you can use a load balancer or feature toggle in your software to route traffic to different environments as needed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why not use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resource-intensive: As is evident by now, to perform a blue-green deployment, you will need to resource and maintain two production environments. The costs of this, in money and sysadmin time, might be too high for some organizations. For others, they may only be able to commit such resources for their highest value products. If that is the case, does the DevOps team release software in a CI/CD model for some products but not others? That may not be sustainable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Dark Launching&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NA9ZsoNb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xff7uql5fjau3qef33vl.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NA9ZsoNb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xff7uql5fjau3qef33vl.jpeg" alt="Dark launching" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A dark launch or dark release is a term that refers to releasing your features to a subset of users to gather their feedback and improve your new features accordingly. Hence, it is a way to deploy a feature but limit access to it to obtain useful feedback.&lt;/p&gt;

&lt;p&gt;You can think of it as a safe way to release your features to a small set of users to test whether they like this new feature.&lt;/p&gt;

&lt;p&gt;Based on the feedback received from these users, you would either release this feature to the rest of your users or you work on optimizing and improving the feature before doing a full release.&lt;/p&gt;

&lt;p&gt;Feature toggles — also known as feature flags — allow you to further decouple the deployment of different software versions from the release of features to users. You can deploy new versions of an application as often as needed, with certain features disabled: releasing a feature to users is simply a matter of toggling it “on.”&lt;/p&gt;
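&lt;p&gt;A toy sketch of such a feature toggle: the code is deployed for everyone, but the feature is only exposed to an allow-list of users. The flag name, users, and in-memory configuration are invented; real toggle systems keep this configuration externally.&lt;/p&gt;

```python
# Sketch of a feature toggle for a dark launch: deploy the code disabled,
# then "release" it by flipping the flag on for selected users.

FLAGS = {"new-checkout": {"enabled": True, "allow": {"alice", "bob"}}}

def is_enabled(flag: str, user: str) -> bool:
    cfg = FLAGS.get(flag)
    return bool(cfg and cfg["enabled"] and user in cfg["allow"])

def checkout(user: str) -> str:
    if is_enabled("new-checkout", user):
        return "new checkout flow"   # dark-launched feature
    return "old checkout flow"       # everyone else keeps the old path

print(checkout("alice"))  # → new checkout flow
print(checkout("carol"))  # → old checkout flow
```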

&lt;p&gt;&lt;a href="https://github.com/lucasnscr/ff4j-Security-Jdbc"&gt;Here one implementations of feature flag&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;More experimentation: Dark launches are a great way to gauge customers’ interest in a new feature you’re planning to release. It gives product teams in particular a way to test out their ideas with less risk as only a select number of users are seeing the feature. Teams can choose to run experiments for both front- and back-end features and then release the winning variation to everyone else.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Higher quality releases and faster time to market: Such a technique allows you to put forward high-quality releases, as you are updating your features according to feedback from your most relevant users. It allows developers to see how users respond to and interact with the new features to determine whether any improvements will need to be made. Thus, a dark launch is a way to test your new feature in a production environment with real-world users. This way, you’ll be able to gather essential metrics to analyze feature performance and to closely monitor how your users engage with the feature.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why not use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Putting microservices behind feature toggles adds complexity, which can increase the cost and time of debugging those microservices.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Additionally, to enable continuous development, teams must be able to move microservices behind feature toggles during development itself.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;A/B Testing&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2vM_Pc7q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bgn8wm1fwewh2p30wd9j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2vM_Pc7q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bgn8wm1fwewh2p30wd9j.png" alt="A/B testing description" width="880" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A/B Testing, also known as Split Testing or Bucket Testing, compares two variants of the same webpage or app and finds out which one gives better results. It is an experiment where two or more versions of the page are shown to users at random, and statistical methods determine which one performs better for a particular goal. This eliminates guesswork and allows data-driven decisions, which results in more conversions and sales.&lt;/p&gt;

&lt;p&gt;In an A/B test, a web page or an app screen is taken and modified to create a second version. The change can be as small as a button tweak or as large as a complete revamp. Half of the site traffic is shown the page’s original version (known as the control), and the other half is shown the modified version (known as the variation). Data is collected with each engagement and analyzed with statistical software to determine whether the change had a positive, negative, or neutral effect.&lt;/p&gt;
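&lt;p&gt;One common statistical method for this analysis is a two-proportion z-test, which checks whether the variation's conversion rate differs significantly from the control's. A small sketch with made-up traffic numbers:&lt;/p&gt;

```python
import math

# Sketch of evaluating an A/B test with a two-proportion z-test:
# did the variation (B) convert better than the control (A)?

def z_score(conv_a, n_a, conv_b, n_b):
    pa, pb = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (pb - pa) / se

# control: 200/5000 conversions; variation: 260/5000 (illustrative numbers)
z = z_score(200, 5000, 260, 5000)
print(f"z = {z:.2f}")  # |z| > 1.96 means significant at the 5% level
```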

&lt;p&gt;&lt;strong&gt;Why use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Get clear evidence: It’s easy to see how many users complete a transaction with site A over site B. The evidence is based on real behaviour, so it is hard data of the type that money men love (and can be presented in a simple-looking, hard-hitting chart).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Test new ideas: If you have an innovative idea for an existing site, A/B testing provides hard proof as to whether it works or not. However, you will need to implement that big idea in hard code before you can test it this way.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Optimise one step at a time: If you run a large site, or many sites, then A/B testing is a fantastic opportunity to “patch” test, by starting out in a small corner and then working up to the main pages of the site. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why not use it&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Can take lots of time and resources: A/B testing can take a lot longer to set up than other forms of testing. Setting up the A/B system can be a resource and time hog, although third-party services can help. Depending on the company size, there may be endless meetings about which variables to include in the tests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Only works for specific goals: This kind of testing is ideal if you want to solve one dilemma, such as which product page gives the best results. But if your goals are less easy to measure, pure A/B testing won’t provide those answers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Could end up with constant testing: Once the test is over, that is it. The data is useless for anything else. Further A/B tests will have to start from a new baseline and other types of testing will only likely be applied to the more successful site, when they could have found equally useful information from the rejected version.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>patterns</category>
      <category>aws</category>
    </item>
    <item>
      <title>Database types and Cap Theorem</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Wed, 04 Jan 2023 22:56:52 +0000</pubDate>
      <link>https://dev.to/lucasnscr/database-types-and-cap-theorem-56d9</link>
      <guid>https://dev.to/lucasnscr/database-types-and-cap-theorem-56d9</guid>
      <description>&lt;p&gt;&lt;strong&gt;NoSQL&lt;/strong&gt; databases are a class of persistence technology that provide a new storage mechanism that goes hand in hand with normalization and relational databases. Like any other database, it has the same objectives: inserting, updating, retrieving and deleting information, however, with new modeling concepts and storage structure.&lt;/p&gt;

&lt;p&gt;The term NoSQL initially meant "not SQL" and later came to be read as "Not Only SQL", opening up the concept of polyglot persistence (working with different types of databases to reach the application's objectives).&lt;/p&gt;

&lt;p&gt;The main characteristics of these databases are speed and high scalability, such as the ease of increasing the number of database servers. This prevents operational bottlenecks and avoids a single point of failure, in addition to allowing the data to be distributed geographically, bringing it closer to the users who make the requests.&lt;/p&gt;

&lt;p&gt;NoSQL databases are being adopted more frequently in different types of applications, including financial institutions. As a consequence, the number of suppliers for this type of database is also growing.&lt;/p&gt;

&lt;p&gt;Currently, NoSQL databases are classified into four groups (key-value, column family, document and graphs) defined by their storage model:&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Value
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2ydlimhxtibtmxcwqw2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2ydlimhxtibtmxcwqw2.png" alt="key-value description" width="403" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key-value databases have a structure similar to a java.util.Map, that is, information is retrieved only by the key. This type of database can be used, for example, to manage a user's session. Another interesting case is DNS, where the key is the address, for example &lt;a href="http://www.google.com" rel="noopener noreferrer"&gt;www.google.com&lt;/a&gt;, and the value is the IP of that server.&lt;/p&gt;

&lt;p&gt;Currently, there are several key-value database implementations, among which the most famous are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis&lt;/li&gt;
&lt;li&gt;Amazon DynamoDB&lt;/li&gt;
&lt;li&gt;Amazon S3&lt;/li&gt;
&lt;/ul&gt;
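&lt;p&gt;The key-value model can be sketched with a plain dictionary, which makes its limits obvious: you put and get by key, and nothing else. The session data below is invented for illustration.&lt;/p&gt;

```python
# Sketch of the key-value model: retrieval happens only by key, and the
# value is an opaque blob, much like a java.util.Map.

sessions = {}  # the "bucket"

def put(key, value):
    sessions[key] = value

def get(key):
    return sessions.get(key)  # no joins, no queries by value fields

put("session:42", {"user": "alice", "cart": ["book"]})
print(get("session:42"))
```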

&lt;p&gt;Comparing a relational database with the key-value type, some differences stand out. One is that the key-value structure is quite simple.&lt;/p&gt;

&lt;p&gt;It is not possible to perform operations such as joins between buckets, and the value is a single large block of information instead of being subdivided into columns as in a relational database.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Relational Database&lt;/th&gt;
&lt;th&gt;Key-Value Database&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table&lt;/td&gt;
&lt;td&gt;Bucket&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Row&lt;/td&gt;
&lt;td&gt;Key/Value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relationship&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Column Database
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayrcxyuuuwqfk2kj3rub.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayrcxyuuuwqfk2kj3rub.png" alt="Column description" width="533" height="315"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This model became popular through &lt;strong&gt;Google's BigTable&lt;/strong&gt; paper, with the aim of setting up a distributed data storage system, and designed to have a high degree of scalability and data volume. Just like the key-value, to perform a search or retrieve some information within the database, it is necessary to use the field that works as a unique identifier that would be similar to the key in the key-value structure. However, the similarities end there. Information is grouped into columns: a unit of information that is made up of the name and the information itself.&lt;/p&gt;

&lt;p&gt;These types of databases are important when dealing with a high degree of data volume, such that information needs to be distributed across multiple servers. But it is worth mentioning that its reading operation is quite limited, similar to the key-value, since the search for information is defined from a single field or a key. There are several databases that use these structures, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hbase &lt;/li&gt;
&lt;li&gt;Cassandra &lt;/li&gt;
&lt;li&gt;Scylla&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Among column-family databases, Apache Cassandra is the most famous. Thus, if an application needs to deal with a large volume of data and scale easily, Cassandra is certainly a good option.&lt;/p&gt;

&lt;p&gt;By contrasting the column family type database with relational databases, it is possible to notice that operations, in general, are much faster. It is simpler to work with large volumes of information and servers distributed all over the world, however, this comes at a cost: the reading of this type of database is very limited.&lt;/p&gt;

&lt;p&gt;For example, it is not possible to perform joins between column families as in a relational database. A column family allows an unlimited number of columns, which in turn are composed of a name and the information itself, exactly as shown in the following table:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Relational Database&lt;/th&gt;
&lt;th&gt;Column Database&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table&lt;/td&gt;
&lt;td&gt;Column Family&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Row&lt;/td&gt;
&lt;td&gt;Column&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column&lt;/td&gt;
&lt;td&gt;Name and Value to Column&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relationship&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
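&lt;p&gt;A toy sketch of the column-family idea described above: each row key holds an open-ended set of named columns, and lookups go through the row key only. The rows and columns below are invented.&lt;/p&gt;

```python
# Sketch of the column-family model: row key -> {column name: value},
# with no fixed schema shared between rows.

table = {}  # one column family

def put(row_key, column, value):
    table.setdefault(row_key, {})[column] = value

put("user:1", "name", "alice")
put("user:1", "city", "lisbon")
put("user:2", "name", "bob")     # rows need not share the same columns

print(table["user:1"])           # read is by row key only
```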

&lt;h2&gt;
  
  
  Document Database
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhxhnff15bos36f8tt9l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhxhnff15bos36f8tt9l.png" alt="Document description" width="754" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Document-oriented databases are very similar in structure to a JSON or XML file. They are composed of a large number of fields, which are created at runtime, generating great flexibility, both for reading and writing information. They allow the reading of information by fields other than the key. &lt;/p&gt;

&lt;p&gt;Some implementations, for example, have very high integration with search engines. Thus, this type of database is crucial when performing data analysis or system logs. There are a few implementations of document-type databases, the most famous of which is MongoDB.&lt;/p&gt;

&lt;p&gt;When comparing with a relational database, although it is possible to search by fields other than the unique identifier, document databases do not support relationships. Another point is that document databases, in general, are schemaless.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Relational Database&lt;/th&gt;
&lt;th&gt;Document Database&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table&lt;/td&gt;
&lt;td&gt;Collection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Row&lt;/td&gt;
&lt;td&gt;Document&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column&lt;/td&gt;
&lt;td&gt;Key/Value pair&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relationship&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
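&lt;p&gt;A toy sketch of the document model contrasted above: schemaless documents in a collection, queryable by any field rather than just the identifier. The documents and the &lt;em&gt;find&lt;/em&gt; helper are invented for illustration.&lt;/p&gt;

```python
# Sketch of the document model: flexible documents, lookups by any field.

collection = [
    {"_id": 1, "name": "alice", "city": "lisbon"},
    {"_id": 2, "name": "bob", "city": "porto", "tags": ["admin"]},
]

def find(**criteria):
    """Return all documents matching every given field."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in criteria.items())]

print(find(city="porto"))  # lookup by a non-key field
```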

&lt;h2&gt;
  
  
  Graphs Database
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwapq3o6uwvsj1sw71x93.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwapq3o6uwvsj1sw71x93.png" alt="Graphs description" width="800" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Graph databases use a data structure that connects a set of vertices through a set of edges. Modern databases in this category support &lt;strong&gt;multi relational&lt;/strong&gt; graph structures, where there are different types of vertices (representing people, places, items) and different types of edges. Recommendation systems such as those on social networks are the best-known use case for graph databases. Of the most famous database types in the NoSQL world, the graph model has the structure most distinct from the relational one. Well-known implementations include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Neo4j&lt;/li&gt;
&lt;li&gt;HyperGraphDB&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Relational Database&lt;/th&gt;
&lt;th&gt;Document Database&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table&lt;/td&gt;
&lt;td&gt;Collection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Row&lt;/td&gt;
&lt;td&gt;Document&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column&lt;/td&gt;
&lt;td&gt;Key/Value pair&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relationship&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
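&lt;p&gt;A toy sketch of the graph model: typed vertices connected by typed edges, which is the shape behind social-network style recommendations. The vertices, edge labels, and data are invented.&lt;/p&gt;

```python
# Sketch of the graph model: vertices with types, edges with labels,
# and traversal by following edges from a vertex.

vertices = {"alice": "person", "bob": "person", "sql-book": "item"}
edges = [("alice", "knows", "bob"), ("bob", "bought", "sql-book")]

def neighbors(vertex, edge_type):
    return [dst for src, label, dst in edges
            if src == vertex and label == edge_type]

# "people Alice knows also bought..." — a one-hop recommendation traversal
for friend in neighbors("alice", "knows"):
    print(friend, "bought", neighbors(friend, "bought"))
```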

&lt;h2&gt;
  
  
  Cap Theorem
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frl7nx96useccd1chb05v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frl7nx96useccd1chb05v.png" alt="Cap description" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One of the great challenges of NoSQL databases is that they deal with distributed persistence, that is, the information is located on more than one server. Several studies have been created to help with this distributed persistence challenge, the most famous being the CAP Theorem, presented by Eric Brewer in 2000.&lt;/p&gt;

&lt;p&gt;This theorem states that it is impossible for distributed data storage to simultaneously provide more than two of the following three guarantees:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistency&lt;/strong&gt;: A guarantee that every node in a distributed cluster returns the same most recent successful write. Consistency means that every client has the same view of the data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Availability&lt;/strong&gt;: Every request receives a (non-error) response, with no guarantee that it contains the most recent write.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Partition tolerance&lt;/strong&gt;: The system continues to operate despite network partitions, that is, despite messages being lost or delayed between nodes. A partition-tolerant system keeps working even when communication between parts of the cluster fails.&lt;/p&gt;

&lt;p&gt;In general, this theorem explains that there is no perfect world. When you choose one characteristic, you lose another as a consequence. In an ideal world, a distributed database would support all three properties, but in reality it is important for developers to know what they will be giving up when choosing between them.&lt;/p&gt;
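&lt;p&gt;A toy sketch of that trade-off: during a partition, a replica must either answer with possibly stale data (giving up consistency) or refuse to answer (giving up availability). The class below is purely illustrative.&lt;/p&gt;

```python
# Sketch of the CAP trade-off at a single replica during a network partition.

class Replica:
    def __init__(self, mode):
        self.mode = mode          # "AP" (available) or "CP" (consistent)
        self.value = "v1"
        self.partitioned = False  # can this node reach its peers?

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP choice: refuse to answer rather than risk staleness
            raise RuntimeError("unavailable: cannot confirm latest write")
        # AP choice: always answer, even if the value may be stale
        return self.value

replica = Replica("AP")
replica.partitioned = True
print(replica.read())  # → v1 (possibly stale, but available)
```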

</description>
    </item>
    <item>
      <title>Aws Lambda - Implementing lambda with Spring native and localstack</title>
      <dc:creator>lucasnscr</dc:creator>
      <pubDate>Thu, 22 Sep 2022 21:49:21 +0000</pubDate>
      <link>https://dev.to/lucasnscr/aws-lambda-implementing-lambda-with-spring-native-and-localstack-58e4</link>
      <guid>https://dev.to/lucasnscr/aws-lambda-implementing-lambda-with-spring-native-and-localstack-58e4</guid>
      <description>&lt;h1&gt;
  
  
  Serverless Architecture with Spring Native
&lt;/h1&gt;

&lt;h3&gt;
  
  
  Serverless Architecture
&lt;/h3&gt;

&lt;p&gt;Serverless differs from other cloud computing models in that the cloud provider is responsible for managing both the cloud infrastructure and the scaling of apps. Serverless apps are deployed in containers that automatically launch on demand when called.&lt;/p&gt;

&lt;p&gt;Under a standard Infrastructure-as-a-Service (IaaS) cloud computing model, users prepurchase units of capacity, meaning you pay a public cloud provider for always-on server components to run your apps. It’s the user’s responsibility to scale up server capacity during times of high demand and to scale down when that capacity is no longer needed. The cloud infrastructure necessary to run an app is active even when the app isn’t being used.&lt;/p&gt;

&lt;p&gt;With serverless architecture, by contrast, apps are launched only as needed. When an event triggers app code to run, the public cloud provider dynamically allocates resources for that code. The user stops paying when the code finishes executing. In addition to the cost and efficiency benefits, serverless frees developers from routine and menial tasks associated with app scaling and server provisioning.&lt;/p&gt;

&lt;p&gt;With serverless, routine tasks such as managing the operating system and file system, security patches, load balancing, capacity management, scaling, logging, and monitoring are all offloaded to a cloud services provider.&lt;/p&gt;

&lt;p&gt;It’s possible to build an entirely serverless app, or an app composed of partially serverless and partially traditional microservices components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feae9izs09qfp8822i616.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feae9izs09qfp8822i616.png" alt="Serverless architecture example"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Native Image
&lt;/h3&gt;

&lt;p&gt;Native image is a technology that compiles Java code ahead of time into a standalone executable. This executable contains the application classes, their dependency classes, runtime library classes, and native code linked in from the JDK. The build process produces an executable for the target operating system; this step is called the image build. JVM-based languages such as Java, Scala, Clojure and Kotlin have this capability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spring Native
&lt;/h3&gt;

&lt;p&gt;In March 2021, Pivotal released a still-experimental version of its native module, Spring Native, to the community. Given that competitors such as Quarkus and Micronaut already had a more fluid integration with GraalVM, Spring had to catch up to make its stack compatible with native images.&lt;/p&gt;

&lt;p&gt;Spring Native is well suited to workloads running in containers and Kubernetes. Native images provide instant startup, instant peak performance, and reduced memory consumption.&lt;/p&gt;

&lt;p&gt;Not everything is perfect, however: generating a native image is still a heavy, time-consuming process, and several steps in the workflow need improvement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Environment/Localstack
&lt;/h3&gt;

&lt;p&gt;We use &lt;em&gt;LocalStack&lt;/em&gt; to emulate the AWS environment on our local machine. The repository includes a &lt;code&gt;docker-compose.yml&lt;/code&gt; file to start LocalStack.&lt;/p&gt;
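&lt;p&gt;A minimal &lt;code&gt;docker-compose.yml&lt;/code&gt; for LocalStack could look like the sketch below (the enabled services are an assumption; check the repository for the actual file):&lt;/p&gt;

```yaml
version: "3.8"
services:
  localstack:
    image: localstack/localstack
    ports:
      - "4566:4566"          # single edge port for all emulated AWS services
    environment:
      - SERVICES=s3,lambda,iam
```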

&lt;p&gt;To start LocalStack, run this command: &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

docker-compose up -d


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Necessary Steps
&lt;/h3&gt;

&lt;p&gt;First, build the native executable by running the command below: &lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

mvn -ntp clean package -Pnative --settings ./settings-spring-native.xml


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To build a Docker image of the application, use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

mvn spring-boot:build-image --settings ./settings-spring-native.xml



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To create the S3 bucket:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

aws s3 mb s3://native-bucket --endpoint-url http://localhost:4566


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To create the AWS Lambda for Function1:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

aws lambda create-function \
--endpoint-url http://localhost:4566 \
--function-name function1 \
--runtime java17 \
--handler function1.S3Handler \
--region us-east-1 \
--zip-file fileb://function1-0.0.1-SNAPSHOT.jar \
--role arn:aws:iam::12345:role/ignoreme


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To create the AWS Lambda for Function2:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

aws lambda create-function \
--endpoint-url http://localhost:4566 \
--function-name function2 \
--runtime java17 \
--handler function2.S3Handler \
--region us-east-1 \
--zip-file fileb://function2-0.0.1-SNAPSHOT.jar \
--role arn:aws:iam::12345:role/ignoreme


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After following all the steps, executing the command returns the following response:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

{
    "FunctionName": "function1",
    "FunctionArn": "arn:aws:lambda:us-east-1:000000000000:function:function1",
    "Runtime": "java17",
    "Role": "arn:aws:iam::12345:role/ignoreme",
    "Handler": "function1.S3Handler",
    "CodeSize": 91807,
    "Description": "",
    "Timeout": 3,
    "LastModified": "2022-09-25T18:10:57.187+0000",
    "CodeSha256": "JDYOPtjvkoP17EKU5Fhu45GGFrDve0tJSe2iRccEb9g=",
    "Version": "$LATEST",
    "VpcConfig": {},
    "TracingConfig": {
        "Mode": "PassThrough"
    },
    "RevisionId": "1acedcab-1c5d-4b33-932a-bc2e6a2a67da",
    "State": "Active",
    "LastUpdateStatus": "Successful",
    "PackageType": "Zip",
    "Architectures": [
        "x86_64"
    ]
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Registering our Lambda to S3 bucket events&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Finally, we need to bind the Lambda to S3 bucket notifications within LocalStack, so that any time an object is created in our &lt;code&gt;native-bucket&lt;/code&gt; S3 bucket, it invokes our earlier Java code.&lt;/p&gt;
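&lt;p&gt;The &lt;code&gt;s3hook.json&lt;/code&gt; file referenced by the command below could look like this sketch (the ARN assumes the &lt;code&gt;function1&lt;/code&gt; created earlier and LocalStack's default account id):&lt;/p&gt;

```json
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:000000000000:function:function1",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
```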

&lt;p&gt;&lt;strong&gt;Function1&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

aws s3api put-bucket-notification-configuration --bucket native-bucket --notification-configuration file://s3hook.json --endpoint-url http://localhost:4566


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Invoking a Lambda within LocalStack&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a simple text file called uploadfile.txt and run the following command to copy it to S3, thereby triggering the Lambda:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

aws s3 cp uploadfile.txt s3://native-bucket/uploadfile.txt --endpoint-url http://localhost:4566


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now it is time to run the functions built with Spring Native. Note the startup time in the log below:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

INFO 38045 --- [main] c.l.s.f.Function1ApplicationTests : Started Function1ApplicationTests in 0.64 seconds (JVM running for 1.13)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Project description
&lt;/h3&gt;

&lt;p&gt;Project design and architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsx53kq3rib5pa825ocy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsx53kq3rib5pa825ocy.png" alt="Serverless Implementation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Check out this project on GitHub
&lt;/h1&gt;

&lt;p&gt;git clone &lt;a href="https://github.com/lucasnscr/serverless-spring-native" rel="noopener noreferrer"&gt;https://github.com/lucasnscr/serverless-spring-native&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>java</category>
      <category>spring</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
