<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sina Tavakkol</title>
    <description>The latest articles on DEV Community by Sina Tavakkol (@sina14).</description>
    <link>https://dev.to/sina14</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F420698%2F62dd2832-d79d-45e0-a53e-99d4f0d328af.PNG</url>
      <title>DEV Community: Sina Tavakkol</title>
      <link>https://dev.to/sina14</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sina14"/>
    <language>en</language>
    <item>
      <title>Building Local AI Agents: A Practical Guide to Frameworks and Deployment</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Wed, 19 Feb 2025 18:27:25 +0000</pubDate>
      <link>https://dev.to/sina14/building-local-ai-agents-a-practical-guide-to-frameworks-and-deployment-4hi1</link>
      <guid>https://dev.to/sina14/building-local-ai-agents-a-practical-guide-to-frameworks-and-deployment-4hi1</guid>
      <description>&lt;h2&gt;
  
  
  Part 2/3
&lt;/h2&gt;

&lt;p&gt;Welcome back to our three-part series on AI Agents. In the first article, "&lt;a href="https://dev.to/sina14/ai-agents-explained-architecture-benefits-and-real-world-applications-technical-deep-dive-17f5"&gt;AI Agents Explained: Architecture, Benefits, and Real-World Applications&lt;/a&gt;" we established a solid understanding of what AI Agents are, their internal components, and the advantages they offer.&lt;/p&gt;

&lt;p&gt;Now, we move into the practical realm. In this second article, we'll explore how to build and deploy AI Agents locally using popular frameworks and tools.&lt;/p&gt;

&lt;p&gt;We'll also present a step-by-step guide and a simple code example to get you started. The final article will dive into real-world examples with in-depth explanations and sample code. This hands-on guide will empower you to create your own AI Agents and leverage their capabilities on your own hardware.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction to Local AI Agent Deployment
&lt;/h2&gt;

&lt;p&gt;Deploying AI Agents locally offers several key benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Privacy&lt;/strong&gt;: Data is processed on your machine, keeping sensitive information under your control.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Low Latency&lt;/strong&gt;: Local processing removes network delays, allowing for quicker response times, which is crucial for real-time applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Offline Access&lt;/strong&gt;: Agents can work without an internet connection, making them perfect for remote areas.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost Savings&lt;/strong&gt;: You can save on ongoing cloud computing costs by running agents on your own setup.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article highlights practical tools and techniques to take advantage of these benefits.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frameworks and Tools for Local AI Agent Development
&lt;/h2&gt;

&lt;p&gt;Several frameworks and tools simplify the process of building and deploying AI Agents locally. We'll focus on three prominent options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ollama&lt;/strong&gt;: Ollama simplifies the process of deploying and running Large Language Models (LLMs) locally. It handles the complexities of model management, allowing you to quickly deploy and experiment with different LLMs without worrying about underlying infrastructure. Ollama is designed to work with a wide array of models, so we can easily integrate this with the other options.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;LangChain&lt;/strong&gt;: LangChain is a powerful framework for building applications powered by language models. It provides a modular and flexible architecture for model integration, data connection, agent creation, and more. Its modular design allows you to customize every aspect of your agent's behavior.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AutoGen (Microsoft)&lt;/strong&gt;: AutoGen enables the development of LLM applications with multiple agents that can converse with each other to solve tasks. It simplifies the orchestration, optimization, and automation of complex workflows involving multiple LLMs and tools.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing the right framework depends on your specific needs and the complexity of your project. For simple agents, LangChain might suffice, while AutoGen is better suited for multi-agent systems. LangChain and AutoGen are higher-level frameworks, so Ollama can serve as the local model backend for either.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setting up Your Environment
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Create a Virtual Environment:
&lt;/h3&gt;

&lt;p&gt;Let's walk through the process of setting up your development environment using LangChain, as it's very powerful for deploying agents. These instructions assume you have Python 3.7+ installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv
&lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate  &lt;span class="c"&gt;# On Linux/macOS&lt;/span&gt;
venv&lt;span class="se"&gt;\S&lt;/span&gt;cripts&lt;span class="se"&gt;\a&lt;/span&gt;ctivate  &lt;span class="c"&gt;# On Windows&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;Explanation&lt;/strong&gt;: A virtual environment isolates your project's dependencies from the system-wide Python installation, preventing conflicts and ensuring reproducibility.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Install Dependencies:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;langchain openai chromadb python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;Explanation&lt;/strong&gt;: pip is Python's package installer. This command installs the following packages:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;langchain&lt;/strong&gt;: The LangChain framework.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;openai&lt;/strong&gt;: OpenAI's Python library, used for interacting with OpenAI models (you'll need an API key).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;chromadb&lt;/strong&gt;: A vector database for storing embeddings. ChromaDB is lightweight and easy to use for local development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;python-dotenv&lt;/strong&gt;: For loading environment variables from a &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
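&lt;p&gt;A quick way to confirm the installation worked is to check that Python can locate each package. Note that &lt;code&gt;python-dotenv&lt;/code&gt; is imported under the name &lt;code&gt;dotenv&lt;/code&gt;:&lt;/p&gt;

```python
# Sanity check: can Python find the packages we just installed?
# (Assumes the virtual environment from step 1 is active.)
import importlib.util

for pkg in ("langchain", "chromadb", "dotenv"):
    found = importlib.util.find_spec(pkg) is not None
    print(pkg, "OK" if found else "MISSING")
```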

&lt;h3&gt;
  
  
  3. Install Ollama:
&lt;/h3&gt;

&lt;p&gt;Follow Ollama's installation steps at &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;ollama.com&lt;/a&gt;.&lt;br&gt;
Make sure to download and test your local LLM before continuing.&lt;/p&gt;
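&lt;p&gt;If you'd rather verify the server from Python than from the command line, the following sketch queries Ollama's local HTTP API for the models you have downloaded. It assumes the default port 11434; if Ollama isn't running, the function simply returns None:&lt;/p&gt;

```python
import json
import urllib.request
from urllib.error import URLError

def list_local_models(base_url="http://localhost:11434"):
    """Return the model names Ollama has downloaded, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (URLError, OSError):
        return None

models = list_local_models()
if models is None:
    print("Ollama is not reachable -- is the server running?")
else:
    print("Available models:", models)
```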
&lt;h3&gt;
  
  
  4. Set up API Keys:
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file in your project directory. This file will store sensitive information like API keys separately from your code.&lt;/p&gt;

&lt;p&gt;Add your OpenAI API key and Ollama URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
OLLAMA_BASE_URL="http://localhost:11434"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Load Environment Variables:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;openai_api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ollama_base_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_BASE_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;Explanation&lt;/strong&gt;: This code snippet loads the environment variables from the .env file into your Python script, making them accessible for use.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: You will need an OpenAI account and API key to use OpenAI's models. Alternatively, keep the Ollama URL if you want to use one of the local models you have previously downloaded with Ollama.&lt;/p&gt;




&lt;h2&gt;
  
  
  Designing Your Agent's Architecture
&lt;/h2&gt;

&lt;p&gt;Before diving into code, let's outline the key architectural considerations for our simple AI Agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Defining Goals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Our agent's goal is to answer questions based on a local document. This is a common use case for information retrieval and knowledge management.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Choosing appropriate LLMs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We will use OpenAI's gpt-3.5-turbo model or a local model, such as llama2, depending on your configuration.&lt;/li&gt;
&lt;li&gt;Consider the model's capabilities, cost, and latency when making your choice.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Choosing Memory/Storage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We will load a document and create vector embeddings from it so the agent can answer questions. A vector embedding is a numerical representation of text that captures its semantic meaning.&lt;/li&gt;
&lt;li&gt;ChromaDB will be used to store these embeddings.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Selecting Tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We do not need any external tools for this basic example. However, in more complex scenarios, agents might require access to tools like web search, calculators, or external databases.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
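&lt;p&gt;To build some intuition for vector embeddings before we hand them to ChromaDB, here is a toy Python sketch. The three-dimensional vectors below are made up for illustration (real embedding models produce hundreds or thousands of dimensions), but the idea is the same: semantically similar texts map to nearby vectors, and cosine similarity measures that closeness:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for three pieces of text.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
invoice = [0.0, 0.2, 0.95]

print(cosine_similarity(cat, kitten))   # close to 1.0: similar meaning
print(cosine_similarity(cat, invoice))  # close to 0.0: unrelated
```

A retriever like the one in the code example below the fold does essentially this comparison between your question's embedding and each stored document chunk.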




&lt;h2&gt;
  
  
  Basic Code Example: Question Answering Agent
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TextLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.vectorstores&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.llms&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Ollama&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;openai_api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ollama_base_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OLLAMA_BASE_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Load the Document
&lt;/span&gt;&lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_document.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Replace with your document path
&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Create Embeddings and Store in ChromaDB
&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# or HuggingFaceEmbeddings
&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Choose LLM and Create RetrievalQA Chain
&lt;/span&gt;
&lt;span class="n"&gt;use_local_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt; &lt;span class="c1"&gt;# Set it to false if you want to use OpenAI Model.
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;use_local_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ollama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ollama_base_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Replace with your model
&lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;qa&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# 4. Ask Questions
&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the main topic of this document?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Explanation&lt;/strong&gt;:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load the Document&lt;/strong&gt;: This code loads a text document using TextLoader. Replace "&lt;em&gt;your_document.txt&lt;/em&gt;" with the path to your local file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create Embeddings&lt;/strong&gt;: It creates embeddings from the document using OpenAI's embedding model (or another embedding model you select). Embeddings are numerical representations of the text, allowing the agent to understand the semantic meaning of the document. These embeddings are stored in a ChromaDB vector database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Choose LLM and Create QA Chain&lt;/strong&gt;: This is where we switch between the OpenAI model and a local model served by Ollama: set the use_local_model variable to True or False, and change the local model name to one you have available. The RetrievalQA chain combines the LLM with the ChromaDB retriever.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ask Questions&lt;/strong&gt;: The code prompts the agent with a question. The qa.run(query) method retrieves relevant information from the ChromaDB and uses the LLM to generate an answer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Before running&lt;/strong&gt;:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Replace "&lt;em&gt;YOUR_OPENAI_API_KEY&lt;/em&gt;" in your &lt;em&gt;.env&lt;/em&gt; file with your actual OpenAI API key or use local models with Ollama.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Replace "&lt;em&gt;your_document.txt&lt;/em&gt;" with the path to a local text file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure Ollama is running and your local model is downloaded.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Troubleshooting Tips
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;API Key Errors&lt;/strong&gt;: Double-check that your OpenAI API key is correctly set in the .env file. If you're using a local model, ensure that you haven't set OPENAI_API_KEY accidentally. Common errors include missing keys, incorrect formatting, or expired keys.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dependency Issues&lt;/strong&gt;: If you encounter ModuleNotFoundError errors, ensure that all required packages are installed using &lt;code&gt;pip install&lt;/code&gt;. If you recently updated Python or pip, try upgrading pip with &lt;code&gt;pip install --upgrade pip&lt;/code&gt; and then reinstalling the dependencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model Loading Errors&lt;/strong&gt;: If you're using a local LLM and encounter errors related to loading the model, verify that Ollama is running correctly and that the specified model (llama2 in the example) is downloaded. Check Ollama's logs for detailed error messages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Out of Memory Errors&lt;/strong&gt;: LLMs can be memory-intensive. If you encounter "out of memory" errors, try reducing the size of the document you're loading, using a smaller LLM, or increasing your system's RAM.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we've explored the practical steps for building and deploying AI Agents locally. We've discussed key frameworks, setting up the environment, and provided a basic code example. By using these tools, you can begin creating your own AI Agents and use their capabilities as you wish. In the next and final article, we'll look into advanced use cases and techniques to optimize AI Agent performance.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>rag</category>
    </item>
    <item>
      <title>AI Agents Explained: Architecture, Benefits, and Real-World Applications (Technical Deep Dive)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Wed, 12 Feb 2025 19:38:57 +0000</pubDate>
      <link>https://dev.to/sina14/ai-agents-explained-architecture-benefits-and-real-world-applications-technical-deep-dive-17f5</link>
      <guid>https://dev.to/sina14/ai-agents-explained-architecture-benefits-and-real-world-applications-technical-deep-dive-17f5</guid>
      <description>&lt;p&gt;Part 1/3&lt;/p&gt;




&lt;p&gt;Welcome to the first article in a three-part series exploring the fascinating world of AI Agents. In this series, we will dive into the details of AI Agents, including their definition, architecture, deployment strategies, and real-world applications.&lt;/p&gt;

&lt;p&gt;In this first article, we will lay the foundation by providing a detailed overview of what AI Agents are, their internal components, and the benefits they offer across various industries.&lt;/p&gt;

&lt;p&gt;In the following articles, we'll move on to practical implementation, focusing on local deployment using tools like Ollama, LangChain, and AutoGen, and then explore advanced techniques and real-world use cases. Let's start by demystifying AI Agents and understanding their core principles.&lt;/p&gt;

&lt;p&gt;Artificial Intelligence is rapidly becoming a part of many aspects of our lives, and one of its most dynamic forms is the AI Agent. This article provides a detailed exploration of AI Agents, covering their definition, internal architecture, key algorithms, and real-world applications, with an emphasis on understanding their functionality and capabilities.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is an AI Agent? (Technical Definition)
&lt;/h2&gt;

&lt;p&gt;An AI Agent is a software entity characterized by its autonomy, reactivity, pro-activeness, and social ability. Unlike passive AI systems, AI Agents actively perceive their environment, reason about it using internal models, and take actions designed to achieve pre-defined goals. This autonomy is implemented through a cycle of sense-think-act.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technically:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Autonomy&lt;/strong&gt;: Achieved through internal control structures that make decisions without constant external input. Often implemented using finite state machines, hierarchical task networks (HTNs), or behavior trees.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reactivity&lt;/strong&gt;: Agents respond to changes in the environment in a timely manner, using event-driven programming or reactive planning techniques.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pro-activeness&lt;/strong&gt;: Agents initiate actions to achieve their goals, using planning algorithms like &lt;em&gt;A*&lt;/em&gt;, &lt;em&gt;Monte Carlo Tree Search (MCTS)&lt;/em&gt;, or reinforcement learning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Social Ability&lt;/strong&gt;: Agents can communicate and interact with other agents or humans, using agent communication languages (&lt;em&gt;ACLs&lt;/em&gt;) like &lt;em&gt;KQML&lt;/em&gt; or &lt;em&gt;FIPA-ACL&lt;/em&gt;, or via APIs and standard internet protocols.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;** &lt;em&gt;FIPA-ACL&lt;/em&gt; stands for Foundation for Intelligent Physical Agents - Agent Communication Language.&lt;/p&gt;

&lt;p&gt;** &lt;em&gt;KQML&lt;/em&gt; stands for Knowledge Query and Manipulation Language.&lt;/p&gt;

&lt;p&gt;The core distinction between an AI Agent and a regular AI system is its goal-directed behavior and the ability to adapt its actions based on its perceptions.&lt;/p&gt;
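&lt;p&gt;The sense-think-act cycle can be sketched in a few lines of Python. This is a toy thermostat-style agent with an invented environment (a plain dict); a real agent would read sensors or call APIs, but the control loop has the same shape:&lt;/p&gt;

```python
# A minimal sense-think-act loop for a thermostat-style agent.

def sense(env):
    """Perceive the environment (here, just read the temperature)."""
    return env["temperature"]

def think(temperature, goal=21.0):
    """Decide on an action: a proportional rule that moves toward the goal."""
    return 0.3 * (goal - temperature)

def act(env, adjustment):
    """Apply the chosen action back to the environment."""
    env["temperature"] += adjustment

env = {"temperature": 15.0}
for step in range(20):
    act(env, think(sense(env)))

print(round(env["temperature"], 2))  # approaches the goal of 21.0
```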




&lt;h2&gt;
  
  
  Key Components of an AI Agent (Technical Breakdown)
&lt;/h2&gt;

&lt;p&gt;Let's dissect the key components of an AI Agent from a technical perspective:&lt;/p&gt;

&lt;h4&gt;
  
  
  Perception (Sensors and Data Ingestion):
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implementation&lt;/strong&gt;: Agents receive data through sensors or APIs. This can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct Sensors&lt;/strong&gt;: Physical sensors (cameras, microphones, LiDAR) that provide raw data. This data often requires pre-processing using techniques like image recognition (using convolutional neural networks - CNNs) or speech-to-text conversion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Ingestion&lt;/strong&gt;: Retrieving structured data from external services (e.g., weather APIs, stock market APIs). Requires handling authentication, data parsing (JSON, XML), and error handling.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Structures&lt;/strong&gt;: Perceived data is typically represented using structured data types like dictionaries, objects, or knowledge graphs.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Reasoning (Decision Making and Planning):
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Knowledge Representation&lt;/strong&gt;: Agents store knowledge about the world using various techniques:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logical Representation&lt;/strong&gt;: Using propositional logic, first-order logic, or description logic to represent facts and rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Probabilistic Representation&lt;/strong&gt;: Using Bayesian networks, Markov models, or hidden Markov models (HMMs) to represent uncertainty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Networks and Knowledge Graphs&lt;/strong&gt;: Representing relationships between entities using graph structures.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Planning Algorithms&lt;/strong&gt;: Agents use algorithms to plan actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Classical Planning&lt;/strong&gt;: Using algorithms like A*, STRIPS, or partial-order planning (POP) to find optimal plans.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reinforcement Learning (RL)&lt;/strong&gt;: Training agents to learn optimal policies through trial and error. Algorithms include Q-learning, SARSA, and deep Q-networks (DQNs).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rule-Based Systems&lt;/strong&gt;: Implementing decision-making using if-then-else rules.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Action (Effectors and API Interaction):
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implementation&lt;/strong&gt;: Agents interact with the environment through effectors or API calls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Effectors&lt;/strong&gt;: Physical actuators (motors, robotic arms) that directly manipulate the environment. Requires low-level control and feedback mechanisms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Calls&lt;/strong&gt;: Making requests to external services (e.g., sending emails, posting to social media). Requires handling API authentication, request formatting, and response parsing.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Learning (Adaptation and Optimization):
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Supervised Learning&lt;/strong&gt;: Training agents on labeled data to predict future outcomes. Algorithms include linear regression, logistic regression, support vector machines (SVMs), and neural networks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unsupervised Learning&lt;/strong&gt;: Discovering patterns and structures in unlabeled data. Algorithms include clustering (k-means, hierarchical clustering) and dimensionality reduction (principal component analysis - PCA).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reinforcement Learning (RL)&lt;/strong&gt;: Training agents to optimize their behavior through rewards and penalties. Algorithms include Q-learning, SARSA, deep Q-networks (DQNs), and actor-critic methods.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
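&lt;p&gt;To make the reinforcement-learning case concrete, here is a single tabular Q-learning update in Python. The states, actions, and reward below are invented for illustration; the update rule itself is the standard one:&lt;/p&gt;

```python
# One tabular Q-learning step:
#   Q(s,a) += alpha * (r + gamma * max over a' of Q(s',a') - Q(s,a))
alpha, gamma = 0.1, 0.9

Q = {
    ("s0", "left"): 0.0,
    ("s0", "right"): 0.0,
    ("s1", "left"): 2.0,
    ("s1", "right"): 5.0,
}

def q_update(Q, state, action, reward, next_state, actions=("left", "right")):
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Taking "right" in s0 yields reward 1.0 and lands the agent in s1.
q_update(Q, "s0", "right", 1.0, "s1")
print(Q[("s0", "right")])  # 0.1 * (1 + 0.9 * 5.0) = 0.55
```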




&lt;h2&gt;
  
  
  Types of AI Agents (Technical Categorization)
&lt;/h2&gt;

&lt;p&gt;Here's a more technical breakdown of AI Agent types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Simple Reflex Agents&lt;/strong&gt;: Implemented using simple conditional statements or look-up tables. Easy to implement but limited in complex environments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model-Based Reflex Agents&lt;/strong&gt;: Maintain an internal model of the world, typically implemented using state machines or Bayesian networks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Goal-Based Agents&lt;/strong&gt;: Use search algorithms like A* or planning algorithms like STRIPS to find optimal plans.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Utility-Based Agents&lt;/strong&gt;: Maximize expected utility, often using Markov decision processes (MDPs) or partially observable Markov decision processes (POMDPs).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Learning Agents&lt;/strong&gt;: Employ machine learning techniques like reinforcement learning, supervised learning, or unsupervised learning to adapt and improve their performance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
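&lt;p&gt;As a concrete illustration, the first category above fits in a few lines of Python. The percepts and actions are made up, but the sketch shows both why look-up-table agents are easy to implement and why they struggle with percepts they have never seen:&lt;/p&gt;

```python
# A simple reflex agent as a look-up table: percept -> action, no internal state.
RULES = {
    "obstacle_ahead": "turn_left",
    "path_clear": "move_forward",
    "goal_visible": "move_toward_goal",
}

def simple_reflex_agent(percept):
    # Unknown percepts fall through to a default action.
    return RULES.get(percept, "wait")

print(simple_reflex_agent("obstacle_ahead"))  # turn_left
print(simple_reflex_agent("fog"))             # wait
```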




&lt;h2&gt;
  
  
  Benefits of Using AI Agents (Quantifiable Advantages)
&lt;/h2&gt;

&lt;p&gt;The benefits of AI Agents can be quantified and measured:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Increased Efficiency&lt;/strong&gt;: Agents can automate tasks, reducing processing time and resource consumption by X% (where X depends on the specific application).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved Accuracy&lt;/strong&gt;: Agents can perform tasks with greater precision and consistency than humans, reducing error rates by Y% (where Y depends on the task and the quality of the training data).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Personalization&lt;/strong&gt;: Agents can tailor experiences to individual users, increasing customer satisfaction by Z% (where Z depends on the quality of the personalization algorithm and the user's preferences).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: Agents can be easily scaled to handle increasing workloads, allowing organizations to grow without adding headcount.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost Reduction&lt;/strong&gt;: By automating tasks and improving efficiency, AI Agents can significantly reduce operational costs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Real-World Applications (Technical Examples)
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Virtual Assistants:
&lt;/h4&gt;

&lt;p&gt;Technical Details: Use automatic speech recognition (ASR) models to convert speech to text, transformer-based NLP models (e.g., BERT, GPT) for natural language understanding (NLU) and natural language generation (NLG), and dialogue management systems to handle multi-turn conversations.&lt;/p&gt;

&lt;h4&gt;
  
  
  Autonomous Vehicles:
&lt;/h4&gt;

&lt;p&gt;Technical Details: Use computer vision (CNNs) for object detection and tracking, sensor fusion algorithms (e.g., Kalman filters) to combine data from multiple sensors (cameras, LiDAR, radar), and path planning algorithms (e.g., A*, RRT) to navigate roads.&lt;/p&gt;
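&lt;p&gt;To make the sensor-fusion step concrete, here is a toy one-dimensional Kalman filter that smooths a stream of noisy position readings. This is a minimal sketch with hand-picked noise constants; production systems track multi-dimensional state across several sensors:&lt;/p&gt;

```python
# Toy 1-D Kalman filter: fuse noisy position measurements into a
# smoothed estimate. Noise constants q and r are illustrative.

def kalman_1d(measurements, q=1e-3, r=0.1):
    x, p = 0.0, 1.0  # initial state estimate and its variance
    estimates = []
    for z in measurements:
        p += q                # predict: uncertainty grows by process noise
        k = p / (p + r)       # Kalman gain: trust in the new measurement
        x += k * (z - x)      # update the estimate toward the measurement
        p *= 1 - k            # shrink the uncertainty after the update
        estimates.append(x)
    return estimates

# A sensor sitting near position 1.0 with a little jitter:
readings = [1.02, 0.97, 1.05, 0.99, 1.01, 1.00, 0.98, 1.03]
print(kalman_1d(readings)[-1])  # converges toward ~1.0
```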

&lt;h4&gt;
  
  
  Fraud Detection:
&lt;/h4&gt;

&lt;p&gt;Technical Details: Use machine learning algorithms (e.g., logistic regression, support vector machines, neural networks) to identify fraudulent transactions based on historical data.&lt;/p&gt;
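&lt;p&gt;As a toy illustration of the approach, the core of a logistic-regression fraud scorer fits in a few lines. The features, weights, and bias below are invented for the example, not learned from real transaction data:&lt;/p&gt;

```python
import math

# Toy logistic-regression fraud scorer. Features, weights, and bias are
# hand-picked for illustration, not learned from historical data.
WEIGHTS = {"amount_usd": 0.002, "foreign_country": 1.5, "night_time": 0.8}
BIAS = -4.0

def fraud_probability(tx):
    """Logistic score in [0, 1]; higher means more suspicious."""
    z = BIAS + sum(WEIGHTS[k] * tx.get(k, 0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

normal = {"amount_usd": 40, "foreign_country": 0, "night_time": 0}
odd = {"amount_usd": 2500, "foreign_country": 1, "night_time": 1}
print(round(fraud_probability(normal), 3), round(fraud_probability(odd), 3))
# prints 0.019 0.964
```

&lt;p&gt;In practice the weights would be fitted on labeled historical transactions rather than set by hand.&lt;/p&gt;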

&lt;h4&gt;
  
  
  Personalized Recommendations:
&lt;/h4&gt;

&lt;p&gt;Technical Details: Use collaborative filtering algorithms, content-based filtering algorithms, or matrix factorization techniques to recommend products and services based on user preferences and browsing history.&lt;/p&gt;
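&lt;p&gt;A minimal user-based collaborative-filtering sketch shows the idea: score user similarity (cosine here) and borrow recommendations from the nearest neighbour. The ratings matrix is invented for the example:&lt;/p&gt;

```python
import math

# Toy user-based collaborative filtering on an invented ratings matrix.
RATINGS = {
    "alice": {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob":   {"book_a": 5, "book_b": 3, "book_c": 5, "book_d": 4},
    "carol": {"book_a": 1, "book_b": 5, "book_d": 2},
}

def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def recommend(user):
    """Suggest items the most similar user rated that `user` has not."""
    neighbour = max(
        (o for o in RATINGS if o != user),
        key=lambda o: cosine(RATINGS[user], RATINGS[o]),
    )
    return [item for item in RATINGS[neighbour] if item not in RATINGS[user]]

print(recommend("alice"))  # prints ['book_d']
```

&lt;p&gt;Production recommenders replace this brute-force neighbour search with matrix factorization or learned embeddings, but the similarity intuition is the same.&lt;/p&gt;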

&lt;h4&gt;
  
  
  Robotics:
&lt;/h4&gt;

&lt;p&gt;Technical Details: Use reinforcement learning algorithms to train robots to perform complex tasks autonomously in manufacturing, healthcare, and exploration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This technical deep dive has provided a comprehensive overview of AI Agents, covering their definition, architecture, key algorithms, and real-world implementations. By understanding the underlying technical principles, you can appreciate the power and versatility of AI Agents and their potential to transform various industries.&lt;/p&gt;

&lt;p&gt;In the next article, we will delve into the practical aspects of building and deploying your own AI Agents locally.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nlp</category>
      <category>llm</category>
      <category>agents</category>
    </item>
    <item>
      <title>Your Guide to Local LLMs: Ollama Deployment, Models, and Use Cases</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Wed, 05 Feb 2025 16:15:57 +0000</pubDate>
      <link>https://dev.to/sina14/your-guide-to-local-llms-ollama-deployment-models-and-use-cases-2jng</link>
      <guid>https://dev.to/sina14/your-guide-to-local-llms-ollama-deployment-models-and-use-cases-2jng</guid>
      <description>&lt;p&gt;Part 2/2&lt;/p&gt;




&lt;p&gt;Deploying Large Language Models (LLMs) locally with Ollama offers significant benefits in performance, security, and customization, addressing challenges like privacy concerns, latency, and recurring costs associated with cloud-based AI.&lt;/p&gt;

&lt;p&gt;Ollama is a groundbreaking tool that lets you run powerful LLMs on your local machine, and this guide covers everything you need to know, from requirements and deployment to model selection and use cases.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3n0us84km5lc80w1hpzb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3n0us84km5lc80w1hpzb.jpg" alt="OLLAMA-LOCAL-GUIDE" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Introduction: Embracing the Local AI Revolution
&lt;/h2&gt;

&lt;p&gt;Ollama is a game-changer for anyone seeking to explore the capabilities of large language models without the limitations of cloud-based solutions. As we briefly covered in &lt;a href="https://dev.to/sina14/unlocking-ais-potential-ollamas-local-revolution-in-ai-development-1945"&gt;Unlocking AI's Potential: Ollama's Local Revolution in AI Development&lt;/a&gt;, this tool simplifies the process of setting up and managing LLMs locally.&lt;/p&gt;

&lt;p&gt;With Ollama, you can take advantage of privacy, offline capabilities, and the low latency that comes with running AI on your own hardware. This guide is designed to equip you with the practical knowledge needed to start harnessing the power of local LLMs with Ollama today.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Minimum Requirements: What You Need to Get Started
&lt;/h2&gt;

&lt;p&gt;Before you begin your journey with Ollama, let's examine the essential hardware and software components needed for a smooth experience. Keep in mind that these requirements are general guidelines, and the specific demands may vary based on the model you choose and the complexity of your use cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hardware Requirements
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RAM&lt;/strong&gt;: Random Access Memory (RAM) is crucial for running LLMs. Aim for a minimum of 8GB, but 16GB is highly recommended; some larger models will require even more. The more RAM you have, the larger the models you can run smoothly without swapping or lag.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU&lt;/strong&gt;: A modern CPU is needed to run Ollama (Intel i5 or equivalent, preferably i7 or higher). While inference relies primarily on RAM and, when available, the GPU, the CPU still affects overall responsiveness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disk Space&lt;/strong&gt;: The LLM models are large and can consume a significant amount of disk space, so plan accordingly. Each model can easily range from a few to several gigabytes. Ensure your system has enough storage available. 50GB is a good start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GPU (Optional, but Highly Recommended)&lt;/strong&gt;: While not strictly required, a compatible GPU will greatly accelerate model inference. If you have an NVIDIA GPU with CUDA support or an Apple Silicon GPU (M1/M2/M3), you should certainly leverage it for faster speeds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Software Requirements
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Operating System&lt;/strong&gt;: Linux and macOS are fully supported. On Windows, use the native Windows build or the Windows Subsystem for Linux (WSL2).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Containerization&lt;/strong&gt;: Docker is recommended for isolating dependencies and ensuring reproducibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Programming Environment&lt;/strong&gt;: If you plan to build applications around Ollama, a stable Python setup with current machine learning libraries (e.g., PyTorch or TensorFlow) is useful; the core CLI itself needs no Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Internet Connectivity&lt;/strong&gt;: Required for initial setup, updates, and integration, though the core deployment runs locally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The requirements above are minimums and will change drastically based on the chosen model. For example, a small, lightweight model like phi-2 will work well with 8GB of RAM, but larger models like llama2:13b will need more. Always check the model's documentation for detailed requirements.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. How to Deploy a Model with Ollama (Minimal Steps)
&lt;/h2&gt;

&lt;p&gt;Here are the fundamental steps to get a model up and running:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation&lt;/strong&gt;: First, ensure you have Ollama installed on your system. Visit the official Ollama website &lt;em&gt;&lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;https://ollama.com&lt;/a&gt;&lt;/em&gt; to download the correct version for your OS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Download a Model&lt;/strong&gt;: Open your terminal and use the command &lt;code&gt;ollama run &amp;lt;model_name&amp;gt;&lt;/code&gt;. For example, to download and run the llama2 model, you would type &lt;code&gt;ollama run llama2&lt;/code&gt;. Ollama will automatically download the required model files if they are not already present; to fetch a model without starting a session, use &lt;code&gt;ollama pull &amp;lt;model_name&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interact with the Model&lt;/strong&gt;: After the model is loaded, you can start interacting with it by typing your prompts directly into the terminal. The model will generate text responses based on your input. For example, you can ask "&lt;em&gt;Hello, who are you&lt;/em&gt;?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stop the Model&lt;/strong&gt;: To leave the interactive session, type &lt;code&gt;/bye&lt;/code&gt; or press &lt;code&gt;CTRL+D&lt;/code&gt;; pressing &lt;code&gt;CTRL+C&lt;/code&gt; or closing the terminal window also works.&lt;/p&gt;
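&lt;p&gt;The same local model is also reachable programmatically: Ollama exposes a REST API on &lt;code&gt;localhost:11434&lt;/code&gt; by default. Below is a minimal Python sketch using only the standard library; it assumes the Ollama server is running and the model has already been pulled:&lt;/p&gt;

```python
import json
import urllib.request

# Minimal sketch: query a locally running Ollama server via its REST API.
# Assumes Ollama is serving on the default port and llama2 is pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt):
    """JSON body for a non-streaming /api/generate request."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt):
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (needs a running server):
#   print(ask("llama2", "Hello, who are you?"))
```

&lt;p&gt;The non-streaming response is a single JSON object whose &lt;code&gt;response&lt;/code&gt; field holds the generated text.&lt;/p&gt;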




&lt;h2&gt;
  
  
  4. Top 5 Open-Source LLMs for Ollama: A Curated List
&lt;/h2&gt;

&lt;p&gt;Ollama provides a wide array of available models. Here are five popular open-source LLMs that are excellent for local deployment, each selected for their versatility and performance:&lt;/p&gt;

&lt;h3&gt;
  
  
  Llama 2 (Various Sizes):
&lt;/h3&gt;

&lt;p&gt;Description: Llama 2, developed by Meta, is a powerful general-purpose language model with multiple size variants (7B, 13B, 70B).&lt;/p&gt;

&lt;p&gt;Use Cases: Text generation, summarization, content creation, and more.&lt;/p&gt;

&lt;p&gt;Ollama usage: You can select the different variations using tags in Ollama (e.g., &lt;code&gt;llama2&lt;/code&gt;, &lt;code&gt;llama2:13b&lt;/code&gt;, &lt;code&gt;llama2:70b&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why it's here: Provides good overall performance and several size options for different needs.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Mistral 7B:
&lt;/h3&gt;

&lt;p&gt;Description: Mistral 7B is a small, efficient model that achieves impressive performance.&lt;/p&gt;

&lt;p&gt;Use Cases: General-purpose tasks and fast inference; it is more efficient than the baseline Llama 2.&lt;/p&gt;

&lt;p&gt;Ollama usage: Download the model with &lt;code&gt;ollama run mistral&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why it's here: It provides a good balance between performance and low resource usage.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Gemma (Google):
&lt;/h3&gt;

&lt;p&gt;Description: Gemma, developed by Google, is the newest model in Google's lineup of open models. It provides a good range of performance and sizes.&lt;/p&gt;

&lt;p&gt;Use Cases: General-purpose tasks and experimenting with Google's latest model.&lt;/p&gt;

&lt;p&gt;Ollama usage: Download the model with &lt;code&gt;ollama run gemma&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why it's here: Gives the opportunity to experiment with one of the most recent open-source LLMs.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Phi-2 (Microsoft):
&lt;/h3&gt;

&lt;p&gt;Description: Phi-2 is a compact model that can perform complex language tasks.&lt;/p&gt;

&lt;p&gt;Use Cases: Efficient on mobile devices or devices with limited resources, yet it can still perform complex tasks with impressive results.&lt;/p&gt;

&lt;p&gt;Ollama usage: Download the model with &lt;code&gt;ollama run phi&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why it's here: Offers a great option for resource-constrained environments.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  CodeLlama (Various Sizes):
&lt;/h3&gt;

&lt;p&gt;Description: CodeLlama, also from Meta, is specifically designed for code generation and understanding.&lt;/p&gt;

&lt;p&gt;Use Cases: Code completion, bug finding, programming language assistance, generating new code snippets.&lt;/p&gt;

&lt;p&gt;Ollama usage: You can select the different variations using tags in Ollama (e.g., &lt;code&gt;codellama&lt;/code&gt;, &lt;code&gt;codellama:7b&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why it's here: An excellent tool for developers to assist with daily tasks.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Always check the Ollama library for the latest models and versions: &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;https://ollama.com/library&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Top 5 Common Usages for Local LLMs
&lt;/h2&gt;

&lt;p&gt;Now that you have your model running, what can you do with it? Here are five popular use cases for local LLMs, along with a suggested model for each:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text Generation &amp;amp; Creative Writing&lt;/strong&gt;: Generate articles, stories, poems, or creative marketing copy.&lt;/p&gt;

&lt;p&gt;Suggested Model: &lt;strong&gt;llama2&lt;/strong&gt; is a good general-purpose model for this task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code Assistance&lt;/strong&gt;: Generate code snippets, find bugs, or provide documentation.&lt;/p&gt;

&lt;p&gt;Suggested Model: &lt;strong&gt;codellama&lt;/strong&gt; is specifically designed for code-related tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Information Extraction &amp;amp; Summarization&lt;/strong&gt;: Summarize long documents, extract relevant data points, create short analysis from long texts.&lt;/p&gt;

&lt;p&gt;Suggested Model: &lt;strong&gt;mistral&lt;/strong&gt; is known for its good summarization capabilities, while being efficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text Translation&lt;/strong&gt;: Perform translations without relying on online APIs, maintaining your privacy.&lt;/p&gt;

&lt;p&gt;Suggested Model: &lt;strong&gt;gemma&lt;/strong&gt; is trained in multiple languages and can perform very well in translations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Personalized Chatbots&lt;/strong&gt;: Create custom chatbots with your own specific instructions, personality, style, and prompts.&lt;/p&gt;

&lt;p&gt;Suggested Model: &lt;strong&gt;phi-2&lt;/strong&gt; is great for creating a custom chatbot thanks to its efficiency, and is easy to customize.&lt;/p&gt;
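&lt;p&gt;For the chatbot use case, a persona is typically injected as a system message. Below is a minimal sketch against Ollama's &lt;code&gt;/api/chat&lt;/code&gt; endpoint; the persona text and model choice are illustrative, and a running local Ollama server is assumed:&lt;/p&gt;

```python
import json
import urllib.request

# Sketch of a personalized local chatbot over Ollama's /api/chat endpoint.
# Persona and model name are illustrative; a local Ollama server is assumed.
CHAT_URL = "http://localhost:11434/api/chat"
PERSONA = "You are a cheerful study buddy. Keep answers short and friendly."

def build_chat_payload(model, history, user_message):
    """Prepend the persona, then the running history, then the new turn."""
    messages = [{"role": "system", "content": PERSONA}]
    messages += history
    messages.append({"role": "user", "content": user_message})
    return {"model": model, "messages": messages, "stream": False}

def chat(model, history, user_message):
    data = json.dumps(build_chat_payload(model, history, user_message)).encode()
    req = urllib.request.Request(
        CHAT_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (needs a running server):
#   reply = chat("phi", [], "Quiz me on one Python fact.")
```

&lt;p&gt;Keeping the conversation history on your side of the call is what lets the bot stay personalized while everything remains local.&lt;/p&gt;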

&lt;h3&gt;
  
  
  Other suggestions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;- GPT-Neo/GPT-J&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview: Renowned for robust text generation and flexibility across various NLP tasks.&lt;/p&gt;

&lt;p&gt;Strengths: Delivers human-like text and is highly adaptable to different applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- LLaMA&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview: Balances efficiency with high performance, ideal for both research and practical applications.&lt;/p&gt;

&lt;p&gt;Strengths: Optimized for resource-constrained environments without compromising on accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- GPT-2&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview: An established model with extensive community support and reliable performance.&lt;/p&gt;

&lt;p&gt;Strengths: Versatile and well-documented, making it a dependable choice for many conversational applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- BLOOM&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview: A multilingual model adept at processing inputs in multiple languages.&lt;/p&gt;

&lt;p&gt;Strengths: Its multilingual capabilities make it a prime choice for global applications and diverse datasets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- T5 (Text-to-Text Transfer Transformer)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Overview: Converts all NLP tasks into a text-to-text format, offering a unified approach.&lt;/p&gt;

&lt;p&gt;Strengths: Exceptionally versatile, handling tasks ranging from translation to summarization with ease.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Related Features and Tips
&lt;/h2&gt;

&lt;p&gt;Ollama has many features worth exploring.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Model Management:&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Downloading Models: Use &lt;code&gt;ollama pull &amp;lt;model_name&amp;gt;&lt;/code&gt; to download a model, or &lt;code&gt;ollama run &amp;lt;model_name&amp;gt;&lt;/code&gt; to download and start it in one step.&lt;/p&gt;

&lt;p&gt;Listing Models: Use &lt;code&gt;ollama list&lt;/code&gt; to see all downloaded models.&lt;/p&gt;

&lt;p&gt;Removing Models: Use &lt;code&gt;ollama rm &amp;lt;model_name&amp;gt;&lt;/code&gt; to remove a specific model.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Customization:&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Custom Prompts: Adjust the behavior of the LLM with custom prompts and specific instructions.&lt;/p&gt;

&lt;p&gt;Changing Parameters: Experiment with parameters like &lt;code&gt;temperature&lt;/code&gt; and &lt;code&gt;top_p&lt;/code&gt; to get different types of responses.&lt;/p&gt;
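&lt;p&gt;Both kinds of customization can be captured in a Modelfile. The model name, parameter values, and system prompt below are illustrative:&lt;/p&gt;

```
# Modelfile: bake a system prompt and sampling parameters into a custom model
FROM llama2
PARAMETER temperature 0.8
PARAMETER top_p 0.9
SYSTEM You are a concise assistant that answers in plain language.
```

&lt;p&gt;Build and run it with &lt;code&gt;ollama create my-assistant -f Modelfile&lt;/code&gt; followed by &lt;code&gt;ollama run my-assistant&lt;/code&gt;.&lt;/p&gt;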

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;API and Integration:&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ollama provides an API that can be integrated into other applications. Check the docs for more information.&lt;/p&gt;

&lt;p&gt;It's straightforward to integrate it with other projects in your local environment.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Security and Privacy:&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By default, local LLMs do not send your data to the cloud: inference happens entirely on your machine, so nothing is shared with external services unless you explicitly configure it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Community:&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ollama's community is growing, and you can find lots of help and ideas on their GitHub page.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Conclusion
&lt;/h2&gt;

&lt;p&gt;Deploying local LLMs with Ollama offers a powerful and flexible way to use advanced AI technologies while keeping control over your data and infrastructure. By meeting the necessary hardware and software requirements, following a simple deployment process, and choosing from top open-source models for specific needs, you can create custom AI solutions that boost innovation and efficiency.&lt;/p&gt;

&lt;p&gt;Whether you aim to automate customer support, streamline content creation, improve personal assistance, enhance educational tools, or speed up data analysis, Ollama gives you the tools to fully utilize modern language models. Embrace the future of local AI deployment and transform how you tackle real-world challenges.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>ollama</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Unlocking AI's Potential: Ollama's Local Revolution in AI Development</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Wed, 29 Jan 2025 16:57:17 +0000</pubDate>
      <link>https://dev.to/sina14/unlocking-ais-potential-ollamas-local-revolution-in-ai-development-1945</link>
      <guid>https://dev.to/sina14/unlocking-ais-potential-ollamas-local-revolution-in-ai-development-1945</guid>
      <description>&lt;p&gt;Part 1/2&lt;/p&gt;




&lt;p&gt;Artificial Intelligence (AI) is no longer limited to high-powered servers and cloud platforms. With the introduction of Ollama, an open-source tool for running large language models (LLMs) locally, AI is now available to anyone with a regular laptop or desktop. Ollama provides powerful AI features without needing internet access or cloud services. This article covers Ollama's history, main features, versions, use cases, and alternatives, offering a complete guide to this innovative tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Ollama?
&lt;/h2&gt;

&lt;p&gt;Ollama is a lightweight runtime for transformer-based large language models that excel at natural language processing (NLP) tasks. Unlike many AI deployments that need cloud infrastructure, Ollama is made to run on local hardware. This makes it perfect for developers, researchers, and businesses that value data privacy and security. Being open-source, it encourages community-driven improvements and customization, promoting innovation and collaboration.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Brief History of Ollama
&lt;/h2&gt;

&lt;p&gt;Ollama was developed by a team of engineers who wanted to make powerful AI models easy to run locally. The project started with a focus on natural language processing (NLP) workloads and grew to support models built with advanced techniques like transformer-based architectures, reinforcement learning, and fine-tuning for specific industries.&lt;br&gt;
Over time, Ollama has released several versions, each improving on the last with better features, performance, and capabilities.&lt;br&gt;
The open-source nature of Ollama has been crucial to its fast development. Community contributions have significantly refined the tool, making it more efficient and versatile. This collaborative approach has kept Ollama at the forefront of local AI innovation, providing users with a tool that is both powerful and adaptable.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI Type and Core Features
&lt;/h2&gt;

&lt;p&gt;Ollama runs models built on the transformer architecture, which has transformed NLP by enabling models to process and understand complex text sequences. This allows Ollama to support various tasks, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Text Generation&lt;/strong&gt;: Ollama can create human-like text, such as stories, poems, articles, and even code snippets. This makes it a valuable tool for content creators, developers, and researchers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Translation&lt;/strong&gt;: The model can accurately translate text between different languages, making it useful for global businesses and multilingual applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Summarization&lt;/strong&gt;: Ollama can condense long pieces of text into concise summaries, saving time for professionals who need to handle large amounts of information.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Question Answering&lt;/strong&gt;: The model provides comprehensive and informative answers to a wide range of questions, making it a useful tool for educational and research purposes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Code Generation&lt;/strong&gt;: Ollama can help developers by generating code snippets, debugging code, and offering solutions to programming problems. This feature is particularly useful for speeding up software development.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Useful Use Cases for Ollama
&lt;/h2&gt;

&lt;p&gt;Ollama's versatility makes it applicable across a wide range of industries and tasks. Here are some of the most impactful use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI-Powered Development&lt;/strong&gt;:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Code Generation&lt;/strong&gt;: Ollama can automate the process of writing code, offering autocompletion and bug detection within integrated development environments (IDEs).&lt;br&gt;
&lt;strong&gt;Text-to-Code Conversion&lt;/strong&gt;: Developers can generate code snippets based on natural language prompts, speeding up the development process.&lt;br&gt;
&lt;strong&gt;API Integration&lt;/strong&gt;: Ollama can be seamlessly integrated into software applications, enabling AI-powered features without relying on external services.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Research and Experimentation&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt;: Researchers can use Ollama to explore various NLP tasks, such as sentiment analysis, text summarization, and question answering.&lt;br&gt;
&lt;strong&gt;Machine Learning (ML) Research&lt;/strong&gt;: Ollama is ideal for conducting ML experiments and prototyping locally, without the need for cloud resources.&lt;br&gt;
&lt;strong&gt;Data Analysis&lt;/strong&gt;: The model can preprocess data and identify patterns using AI techniques, making it a valuable tool for data scientists.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Creative Tasks&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Content Creation&lt;/strong&gt;: Ollama can generate text, poems, scripts, and other creative content, helping writers overcome writer's block and explore new styles.&lt;br&gt;
&lt;strong&gt;Image Generation&lt;/strong&gt;: Paired with dedicated image-generation models, it can help create original visuals, making it a useful tool for artists and designers.&lt;br&gt;
&lt;strong&gt;Music Generation&lt;/strong&gt;: Likewise, it can drive experiments in music generation based on custom datasets, offering new possibilities for musicians and composers.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Personal AI Assistants&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Local AI Chatbots&lt;/strong&gt;: Users can create personalized chatbots that run locally, ensuring data privacy and security.&lt;br&gt;
&lt;strong&gt;Personal Knowledge Bases&lt;/strong&gt;: Ollama can be used to develop custom knowledge bases that learn and adapt to individual needs, offering personalized assistance.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Accessibility and Education&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AI Education&lt;/strong&gt;: Ollama can be used to teach students AI concepts through hands-on experiments, making it a valuable tool for educators.&lt;br&gt;
&lt;strong&gt;Supporting People with Disabilities&lt;/strong&gt;: The model can be used to develop tools like text-to-speech applications, assisting individuals with disabilities.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Offline Applications&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mobile Applications&lt;/strong&gt;: Ollama can run AI features within mobile apps without requiring an internet connection, making it ideal for offline use.&lt;br&gt;
&lt;strong&gt;Embedded Systems&lt;/strong&gt;: The model can be implemented on edge devices with limited processing power, enabling AI capabilities in resource-constrained environments.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data Security&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Sensitive Data Analysis&lt;/strong&gt;: Ollama can process sensitive information locally, ensuring that data is not exposed to external services.&lt;br&gt;
&lt;strong&gt;Privacy-Preserving AI&lt;/strong&gt;: The model can implement secure algorithms for private data analysis, making it ideal for industries that prioritize data privacy.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Testing and Prototyping&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Local AI Application Testing&lt;/strong&gt;: Developers can test AI features locally before deploying them to cloud platforms, reducing the risk of errors.&lt;br&gt;
&lt;strong&gt;Rapid Prototyping&lt;/strong&gt;: Ollama enables quick iteration on prototypes, allowing developers to experiment with AI capabilities without extensive setup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-world Examples or Case Studies
&lt;/h2&gt;

&lt;p&gt;For example, a software development company might use Ollama to automate code generation and debugging, which can reduce development time and improve code quality. Tracking concrete metrics, such as a measurable drop in development time or fewer defects per release, is the clearest way to demonstrate that impact.&lt;/p&gt;

&lt;p&gt;Similarly, an educational institution could use Ollama to create personalized learning materials and virtual tutors, which can enhance student engagement and performance.&lt;/p&gt;

&lt;p&gt;Case studies that track outcomes, such as higher test scores or faster assignment completion with Ollama's help, would make the benefits concrete.&lt;/p&gt;




&lt;h2&gt;
  
  
  Versions and Updates
&lt;/h2&gt;

&lt;p&gt;Ollama has gone through several updates, with each version bringing major improvements in performance, efficiency, and features. Key versions include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Version 1.0&lt;/strong&gt;: Focused on basic NLP functions, allowing simple text-based interactions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version 2.0&lt;/strong&gt;: Introduced transformer-based models, improving accuracy and understanding of context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version 3.0&lt;/strong&gt;: Added multimodal capabilities and real-time adaptability for complex scenarios.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latest Release&lt;/strong&gt;: Offers advanced fine-tuning, integration with third-party apps, and improved scalability for enterprise-level applications.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Upcoming Features or Developments
&lt;/h2&gt;

&lt;p&gt;Ollama is actively being developed, and several exciting features and developments are on the horizon:&lt;/p&gt;

&lt;h3&gt;
  
  
  Enhanced Model Support:
&lt;/h3&gt;

&lt;p&gt;Expect the addition of even more cutting-edge models to the Ollama ecosystem. This could include newer versions of existing models like Llama, as well as entirely new models from various research institutions and organizations.&lt;br&gt;
Improved support for diverse model architectures, such as those specializing in specific tasks like code generation, translation, or multi-modal understanding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance and Efficiency:
&lt;/h3&gt;

&lt;p&gt;Ongoing optimizations to improve the speed and efficiency of running LLMs locally. This could involve advancements in hardware acceleration, more efficient memory usage, and optimized inference techniques.&lt;/p&gt;

&lt;h3&gt;
  
  
  User Experience:
&lt;/h3&gt;

&lt;p&gt;Refinements to the user interface and developer experience, making it easier to interact with models, manage configurations, and integrate Ollama into various applications.&lt;br&gt;
Potentially more user-friendly tools for fine-tuning models on specific datasets or tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Community and Ecosystem Growth:
&lt;/h3&gt;

&lt;p&gt;Continued expansion of the Ollama community, fostering collaboration, knowledge sharing, and the development of innovative applications built on the platform.&lt;br&gt;
Increased support for third-party tools and integrations, enabling users to seamlessly connect Ollama with other software and services.&lt;/p&gt;

&lt;h3&gt;
  
  
  Focus on Safety and Ethics:
&lt;/h3&gt;

&lt;p&gt;Continued efforts to address safety and ethical considerations, such as mitigating bias, preventing the generation of harmful content, and ensuring responsible AI development.&lt;/p&gt;

&lt;p&gt;Please note that this is not an exhaustive list, and specific features and timelines may change. The best way to stay updated on the latest developments is to follow the official Ollama channels, such as their website, GitHub repository, and social media.&lt;/p&gt;

&lt;p&gt;By staying informed about these upcoming features, you can leverage the full potential of Ollama and stay at the forefront of the evolving LLM landscape.&lt;/p&gt;




&lt;h2&gt;
  
  
  Alternatives to Ollama
&lt;/h2&gt;

&lt;p&gt;While Ollama is a powerful tool, several other open-source and proprietary models and platforms are worth considering based on specific needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Stable Diffusion&lt;/strong&gt;: A text-to-image model that generates stunning visuals from simple text descriptions. It's particularly popular in creative industries for generating art and design assets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Llama 2&lt;/strong&gt;: A family of large language models developed by Meta, offering a range of sizes and capabilities. Llama 2 is known for its versatility and is widely used in research and commercial applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GPT-NeoX&lt;/strong&gt;: A large-scale open-source transformer model developed by EleutherAI, designed to replicate the capabilities of closed-source models like GPT-3. It's popular among researchers and developers for its flexibility and open-source nature.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;BLOOM&lt;/strong&gt;: An open-source multilingual LLM developed by BigScience. BLOOM is designed to support multiple languages and is particularly useful for global applications requiring multilingual support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ChatGPT by OpenAI&lt;/strong&gt;: A widely known conversational AI model that excels in generating human-like text and engaging in interactive dialogues. It's accessible via API and is used in various applications, from customer support to content creation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Google Gemini&lt;/strong&gt;: Google's conversational AI model, designed to compete with ChatGPT. Gemini integrates with Google's ecosystem and is particularly strong in providing real-time information and search capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hugging Face Transformers&lt;/strong&gt;: A library offering a wide range of pre-trained models for NLP tasks. Hugging Face is known for its ease of use and extensive model repository, making it a go-to resource for developers and researchers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cohere&lt;/strong&gt;: A platform offering powerful language models for text generation, classification, and summarization. Cohere is known for its enterprise-grade solutions and ease of integration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Jurassic-1 by AI21 Labs&lt;/strong&gt;: A large language model designed for high-quality text generation and understanding. It's used in applications ranging from content creation to customer support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Falcon by TII (Technology Innovation Institute)&lt;/strong&gt;: An open-source LLM known for its high performance and efficiency. Falcon is designed for both research and commercial use, offering a strong alternative to other large models.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Ollama marks a major step forward in making AI accessible to everyone: it runs on regular consumer hardware and is fully open-source. Whether you're a developer, researcher, or creative professional, Ollama provides a powerful tool to explore AI's possibilities. As the project grows, we can look forward to more innovative applications and advancements in the future.&lt;/p&gt;
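&lt;p&gt;As a concrete taste of that accessibility, here is a minimal sketch of talking to a locally running Ollama instance over its HTTP API. The &lt;code&gt;localhost:11434&lt;/code&gt; endpoint is Ollama's documented default; the &lt;code&gt;llama2&lt;/code&gt; model name is purely illustrative and assumes you have pulled that model:&lt;/p&gt;

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> dict:
    # Minimal payload for the /api/generate endpoint; stream=False asks
    # for a single JSON response instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # Only runs if an Ollama server is actually listening locally.
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Inspect the payload without needing a running server.
    print(build_request("llama2", "Why is the sky blue?"))
```

&lt;p&gt;Because the API is plain HTTP and JSON, any language or tool that can make a POST request can drive a local model the same way.&lt;/p&gt;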




&lt;h2&gt;
  
  
  Stay Ahead of the Curve
&lt;/h2&gt;

&lt;p&gt;Explore Ollama today and see how it can transform your interactions with AI, whether you're involved in development, research, creative tasks, or personal assistance. With its versatility, accessibility, and strong capabilities, Ollama is set to become an important part of the rapidly changing world of AI-driven solutions.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>ollama</category>
      <category>llm</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Understanding Observability</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Thu, 23 Jan 2025 09:55:42 +0000</pubDate>
      <link>https://dev.to/sina14/understanding-observability-2k9j</link>
      <guid>https://dev.to/sina14/understanding-observability-2k9j</guid>
      <description>&lt;p&gt;The realm of observability and proactive monitoring in IT operations is undergoing rapid transformation, propelled by key trends and factors that are redefining this field.&lt;/p&gt;

&lt;p&gt;Observability in IT operations is about gaining deep insights into the health and performance of systems and applications. It involves a comprehensive approach to collecting, analyzing, and interpreting various types of data to ensure systems run smoothly and efficiently.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The landscape of observability and proactive monitoring in IT operations is rapidly evolving, driven by advancements in technology and methodologies. Key trends include the use of eBPF and OpenTelemetry for deeper insights, AI and machine learning for enhanced anomaly detection and predictive analysis, and full-stack and cloud-native observability for comprehensive visibility. The focus is shifting to proactive monitoring strategies, aligning IT operations with business needs, and leveraging automated and intelligent tools for improved system reliability and performance. This dynamic field emphasizes preventing issues before they impact users, ensuring high-quality service delivery.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The concept of observability in IT operations has evolved significantly over the years.&lt;/p&gt;

&lt;h3&gt;
  
  
  Early Beginnings
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1950s&lt;/strong&gt;: The roots of observability trace back to control theory, introduced by Rudolf E. Kálmán in 1959. Kálmán's work focused on understanding and managing complex systems by inferring their internal states from external outputs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Telemetry and Early Applications
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Early 20th Century&lt;/strong&gt;: Telemetry, the process of collecting data from remote or inaccessible systems, was used in aerospace and defense. This laid the groundwork for modern observability practices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Evolution in IT
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1980s-1990s&lt;/strong&gt;: As computer systems became more complex, the need for better monitoring tools grew.&lt;br&gt;&lt;br&gt;
Early monitoring tools focused on tracking predefined metrics and alerting when thresholds were breached.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2000s&lt;/strong&gt;: The rise of the internet and distributed systems led to the development of more sophisticated monitoring tools.&lt;br&gt;&lt;br&gt;
These tools began to provide deeper insights into system performance and reliability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Modern Observability
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;2010s-Present&lt;/strong&gt;: The shift to cloud computing and microservices architectures introduced new challenges in maintaining system health.&lt;br&gt;&lt;br&gt;
Observability tools evolved to provide real-time insights, predictive analysis, and automated alerting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI and Machine Learning&lt;/strong&gt;: The integration of AI and machine learning has further enhanced observability, enabling predictive analysis and automated issue resolution.&lt;/p&gt;




&lt;h2&gt;
  
  
  I. &lt;strong&gt;Key Components of Observability&lt;/strong&gt;
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Metrics&lt;/strong&gt;: These are numerical data points that measure various aspects of system performance, such as CPU usage, memory consumption, network throughput, etc. Metrics help you track and quantify the health of your systems over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Logs&lt;/strong&gt;: Logs are records of events that happen within your systems. They provide detailed information about what happened, when it happened, and why it happened. Logs are crucial for troubleshooting and identifying the root cause of issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Traces&lt;/strong&gt;: Traces track the flow of requests through your systems, from start to finish. They help you understand how different components interact and where delays or errors might be occurring. Traces are especially useful in distributed systems and microservices architectures.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
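&lt;p&gt;The three pillars above can be shown side by side in a toy request handler: it times the work (a metric), emits a structured record of what happened (a log), and tags everything with an id that follows the request (a trace). Names like &lt;code&gt;request_latency_ms&lt;/code&gt; are illustrative, not a standard:&lt;/p&gt;

```python
import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout")

def handle_request(trace_id: str, metrics: list) -> None:
    # Trace: every record carries the trace_id so the request can be
    # followed across components.
    start = time.perf_counter()
    time.sleep(0.01)  # simulated work
    latency_ms = (time.perf_counter() - start) * 1000

    # Metric: a numeric sample to track over time.
    metrics.append({"name": "request_latency_ms", "value": latency_ms})

    # Log: a structured record of what happened, when, and in which request.
    log.info(json.dumps({
        "event": "request_handled",
        "trace_id": trace_id,
        "latency_ms": round(latency_ms, 2),
    }))

metrics: list = []
handle_request(str(uuid.uuid4()), metrics)
print(len(metrics))  # one latency sample recorded
```

&lt;p&gt;Real systems hand each of these to dedicated backends (a time-series database, a log store, a tracing system), but the division of labor is the same.&lt;/p&gt;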




&lt;h2&gt;
  
  
  II. &lt;strong&gt;Benefits of Observability&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why Observability Matters: From Proactive Detection to Better User Experience&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Proactive Issue Detection&lt;/strong&gt;: With observability, you can detect and address issues before they impact users. By continuously monitoring metrics, logs, and traces, you can identify anomalies and potential problems early on.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Faster Troubleshooting&lt;/strong&gt;: Observability provides detailed insights into system behavior, making it easier to diagnose and resolve issues quickly. This reduces downtime and improves system reliability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Better Performance Optimization&lt;/strong&gt;: By analyzing performance data, you can identify bottlenecks and optimize your systems for better efficiency and scalability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved Customer Experience&lt;/strong&gt;: Ensuring your systems are running smoothly and efficiently leads to a better experience for your users. Observability helps you maintain high levels of service quality and reliability.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  III. &lt;strong&gt;How It Works&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Mechanics of Observability: From Data Collection to Automation&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Collection&lt;/strong&gt;: Observability tools collect data from various sources, including system metrics, application logs, and traces. This data is gathered in real-time and stored for analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Analysis&lt;/strong&gt;: The collected data is analyzed to identify patterns, anomalies, and trends. Advanced analytics and machine learning techniques can be used to predict potential issues and recommend solutions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Visualization&lt;/strong&gt;: Observability platforms provide dashboards and visualizations to help you understand and interpret the data. These visualizations make it easy to see the overall health of your systems and identify areas that need attention.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Alerting and Automation&lt;/strong&gt;: Automated alerts notify you of any issues or anomalies detected by the observability tools. Some platforms also offer automation capabilities to take corrective actions automatically.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
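&lt;p&gt;The four steps above can be sketched end to end in a few lines: collect a window of samples, analyze them, summarize the result, and raise alerts on anything that crosses a threshold. The 200&amp;nbsp;ms threshold and the hard-coded samples are illustrative assumptions:&lt;/p&gt;

```python
from statistics import mean

def analyze(samples, threshold_ms=200.0):
    """Return (average, alerts) for a window of latency samples."""
    avg = mean(samples)
    alerts = [f"latency {s:.0f}ms exceeds {threshold_ms:.0f}ms"
              for s in samples if s > threshold_ms]
    return avg, alerts

# 1. Data collection: a window of latency samples (stand-ins for real telemetry).
window = [120.0, 130.0, 450.0, 110.0]
# 2./3. Analysis and visualization: summarize the window.
avg, alerts = analyze(window)
print(f"avg={avg:.1f}ms, alerts={len(alerts)}")
# 4. Alerting and automation: act on anything flagged.
for a in alerts:
    print("ALERT:", a)
```

&lt;p&gt;Production observability platforms replace the fixed threshold with baselines, machine-learned models, and richer context, but this is the loop they all run.&lt;/p&gt;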




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmltajywfdb144xw3dmoy.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmltajywfdb144xw3dmoy.jpeg" alt="img1" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Observability in IT operations is evolving rapidly, driven by advancements in technology and methodologies. It focuses on gaining deep insights into system health and performance through metrics, logs, and traces. Key trends include the use of eBPF and OpenTelemetry for deeper insights, AI and machine learning for enhanced anomaly detection, and full-stack and cloud-native observability for comprehensive visibility. The emphasis is on proactive monitoring, aligning IT operations with business needs, and leveraging automated tools to prevent issues before they impact users, ensuring high-quality service delivery.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  IV. Observability Trends:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Trending Now: The Future of Observability in IT Operations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;eBPF (Extended Berkeley Packet Filter)&lt;/strong&gt;: This powerful technology allows you to run sandboxed programs within the Linux kernel. It's gaining popularity for its ability to provide low-overhead, deep visibility into system behavior, network traffic, and application performance, making it ideal for fine-grained observability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: High performance, low overhead, real-time insights without modifying applications.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt;: This CNCF project is rapidly becoming the standard for generating, collecting, and exporting telemetry data (traces, metrics, logs). It aims to unify observability by providing vendor-neutral APIs, SDKs, and tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Vendor-neutrality, interoperability, reduced lock-in, community driven.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Service Mesh Observability&lt;/strong&gt;: With the rise of microservices, service meshes (like &lt;em&gt;Istio&lt;/em&gt;, &lt;em&gt;Linkerd&lt;/em&gt;) are essential. Observability is moving towards deeper integration with service mesh capabilities, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traffic Analysis&lt;/strong&gt;: Understanding communication patterns between services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency Analysis&lt;/strong&gt;: Pinpointing slow communication paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request Tracing&lt;/strong&gt;: Following requests as they move across services.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Enhanced understanding of microservice interactions and dependencies.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Synthetic Monitoring for Observability&lt;/strong&gt;: Beyond monitoring real user traffic, synthetic monitoring (using scripts to emulate user behavior) is becoming critical for proactively detecting availability and performance issues in specific workflows. It can be used alongside observability tools to gain insights and detect early warning signs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Proactive detection, testing specific paths, identifying problems before they impact real users.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI-Powered Observability (AIOps)&lt;/strong&gt;: Machine learning and AI algorithms are being used to analyze observability data for anomaly detection, root cause analysis, predictive alerting, and capacity planning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Automates tasks, reduces human intervention, enables faster insights, improves predictive capabilities.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Full-Stack Observability&lt;/strong&gt;: The push for complete visibility across all layers of the technology stack (infrastructure, network, application, user experience) rather than just focusing on one area.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Provides a holistic view of the system, helping to identify issues spanning multiple layers.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cloud-Native Observability&lt;/strong&gt;: Solutions tailored for cloud environments, integrating with cloud-specific services and features. This is critical as cloud deployments become standard.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Native support for cloud services, automated deployment, increased flexibility and scalability.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shift-Left Observability&lt;/strong&gt;: Incorporating observability principles and tools earlier in the software development lifecycle, allowing developers to monitor application performance during development, testing, and deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Faster feedback loops, earlier issue detection, reduces the chance of production issues.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contextual Observability&lt;/strong&gt;: Integrating various data points with the application context to make the monitoring data more valuable. For example, combining observability data with deployment metadata, code versioning, and business-related metadata.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Provides a more comprehensive understanding of the application and its environment.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost-Aware Observability&lt;/strong&gt;: As the volume of data grows, so do the costs. New approaches are focused on optimizing data ingestion, storage, and analysis to reduce costs while maintaining observability quality.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Helps manage and optimize the costs associated with monitoring large-scale systems.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  V. Proactive Monitoring Trends:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Staying Ahead: Proactive Monitoring Strategies for IT Success&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Predictive Monitoring&lt;/strong&gt;: Utilizing machine learning to analyze historical patterns and identify potential problems before they occur. This shifts the focus from reacting to incidents to preventing them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Reduces downtime, improves service reliability, increases efficiency.&lt;/em&gt;&lt;/p&gt;
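&lt;p&gt;A minimal flavor of predictive monitoring is statistical anomaly detection: compare a new sample against recent history and flag it when it deviates too far. The z-score threshold of 3 below is a common rule of thumb, not a universal setting:&lt;/p&gt;

```python
from statistics import mean, stdev

def is_anomalous(history, value, z_threshold=3.0):
    """Flag a sample that deviates z_threshold standard deviations from history."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is anomalous
    return abs(value - mu) / sigma > z_threshold

history = [100, 102, 98, 101, 99, 100, 103, 97]  # e.g., request rates per minute
print(is_anomalous(history, 101))  # a typical value
print(is_anomalous(history, 250))  # flagged before users ever notice
```

&lt;p&gt;Real predictive systems layer seasonality, trend models, and machine learning on top, but the core idea is the same: learn what "normal" looks like and act on deviations early.&lt;/p&gt;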

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chaos Engineering&lt;/strong&gt;: Intentionally introducing controlled failures into the system to test its resilience and uncover weaknesses. This helps build more robust systems that are less likely to fail in production.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Verifies resilience, identifies blind spots, improves the overall stability of systems.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automated Remediation&lt;/strong&gt;: Triggering automated actions based on alerts, such as restarting services, scaling resources, or rolling back deployments. This reduces the need for manual intervention and accelerates the resolution of issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Faster recovery, reduced downtime, improved efficiency, lower operational costs.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Telemetry-Driven Alerting&lt;/strong&gt;: Moving beyond traditional threshold-based alerts by using telemetry data to trigger more sophisticated and context-aware alerts. This helps to reduce alert fatigue and focus on genuinely important issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Reduces alert noise, sharpens focus on crucial problems, enables more effective alert management.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Intent-Based Monitoring&lt;/strong&gt;: Defining monitoring objectives based on business needs and user requirements. This ensures that monitoring is aligned with business goals rather than just focusing on technical metrics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Links IT operations to business needs, ensures monitoring is relevant and effective.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Site Reliability Engineering (SRE) Principles&lt;/strong&gt;: Implementing SRE practices and methodologies, including service level objectives (SLOs), service level indicators (SLIs), and error budgets, to ensure consistent performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Promotes a disciplined and proactive approach to operations.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Self-Healing Systems&lt;/strong&gt;: Creating systems that can automatically detect and resolve issues without human intervention. This involves combining automated monitoring, alerting, and remediation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Maximizes uptime, minimizes impact of failures, reduces need for manual intervention.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Security Observability&lt;/strong&gt;: Integrating security metrics and events into the same observability platform to gain a unified view of performance and security.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Why it's trending: Enhances security posture, allows detection of security incidents within performance monitoring.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  VI. Some Other Key Terms:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Decoding Observability: Essential Terms You Need to Know&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Telemetry&lt;/strong&gt;: The collection of data (metrics, traces, logs, events) used for monitoring and observability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLO (Service Level Objective)&lt;/strong&gt;: A target for the desired performance or reliability of a service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLI (Service Level Indicator)&lt;/strong&gt;: A metric used to measure the actual performance or reliability of a service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error Budget&lt;/strong&gt;: The acceptable level of unreliability a service can experience before it negatively impacts users.&lt;/p&gt;
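&lt;p&gt;The error budget follows directly from the SLO. A sketch of the arithmetic, assuming a simple availability SLO measured over a 30-day window:&lt;/p&gt;

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime (in minutes) for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes

# A 99.9% availability SLO leaves roughly 43.2 minutes of
# error budget over a 30-day window.
print(round(error_budget_minutes(0.999), 1))
```

&lt;p&gt;Teams then spend that budget deliberately: risky deployments and experiments are fine while budget remains, and are frozen once it runs out.&lt;/p&gt;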

&lt;p&gt;&lt;strong&gt;AIOps&lt;/strong&gt;: Applying artificial intelligence and machine learning to IT operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;eBPF (Extended Berkeley Packet Filter)&lt;/strong&gt;: A powerful kernel technology for tracing and monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt;: A standard for observability telemetry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Synthetic Monitoring&lt;/strong&gt;: Monitoring by emulating user behavior.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy07ifvy482wngqn94id1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy07ifvy482wngqn94id1.jpeg" alt="img2" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  In Summary
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Evolving Landscape of Observability: Key Takeaways&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The landscape of observability and proactive monitoring in IT operations is dynamic and fast-paced. The trends are pointing towards:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deeper Insights&lt;/strong&gt;: Using new technologies like eBPF and OpenTelemetry to gain deeper visibility.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automation and Intelligence&lt;/strong&gt;: Using AI and machine learning for anomaly detection and predictive analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Holistic Views&lt;/strong&gt;: Combining performance metrics, traces, logs, and security data for comprehensive understanding.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Proactive Strategies&lt;/strong&gt;: Moving towards proactive and predictive monitoring through predictive analysis and chaos engineering.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Business Alignment&lt;/strong&gt;: Aligning IT monitoring with business needs and user requirements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Comprehensive Coverage&lt;/strong&gt;: Modern observability tools offer comprehensive coverage across various systems, applications, and infrastructure components.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Actionable Insights&lt;/strong&gt;: These tools provide detailed information to help diagnose and resolve issues quickly, improving overall system reliability and performance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These trends and terms represent a shift towards a more proactive, automated, and intelligent approach to managing IT systems, focusing on preventing problems rather than just reacting to them. Keeping up with these advancements is crucial for staying competitive and maintaining reliable IT services.&lt;/p&gt;

</description>
      <category>observability</category>
      <category>proactivemonitoring</category>
      <category>aiops</category>
      <category>cloudnative</category>
    </item>
    <item>
      <title>Go Harbor: A Deep Dive</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Fri, 17 Jan 2025 19:01:09 +0000</pubDate>
      <link>https://dev.to/sina14/go-harbor-a-deep-dive-2mn</link>
      <guid>https://dev.to/sina14/go-harbor-a-deep-dive-2mn</guid>
      <description>&lt;p&gt;Often referred to simply as &lt;strong&gt;Harbor&lt;/strong&gt;, it is an open-source registry designed for storing, managing, and distributing container images and other cloud-native artifacts. As a project under the Cloud Native Computing Foundation (&lt;strong&gt;CNCF&lt;/strong&gt;), it is a robust and well-supported solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Harbor&lt;/strong&gt; is especially suitable for enterprise environments because of its comprehensive set of features.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Harbor is a robust open-source container registry ideal for enterprises due to its comprehensive security features and scalability. As part of the Cloud Native Computing Foundation, it offers image storage, vulnerability scanning, and role-based access control among other features, making it suitable for organizations with strict compliance needs. However, its rich feature set and complexity may be overkill for simpler use cases. Alternatives like Docker Hub, GitLab Registry, AWS ECR, Google Container Registry, and Azure Container Registry may be better suited for different requirements. Installation on Ubuntu, while straightforward, requires attention to setup details and security configurations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What I Have to Say About Harbor:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Powerful but Complex:&lt;/em&gt; Harbor is feature-rich, but that also means it can be more complex to set up and manage compared to simpler registries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Great for Enterprises:&lt;/em&gt; Its strong focus on security and enterprise features makes it a good fit for organizations with strict compliance requirements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Active Community:&lt;/em&gt; The large and active community provides ample resources, documentation, and support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Regular Updates:&lt;/em&gt; It receives regular updates and improvements, ensuring it stays relevant in the ever-evolving cloud-native landscape.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figgx40ioehydsl2ymk5z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figgx40ioehydsl2ymk5z.png" alt="GoharborLogo" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Harbor: An Enterprise-Grade Solution&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Harbor&lt;/em&gt; stands out due to its robust feature set aimed at providing secure, scalable, and enterprise-ready image management. Key highlights include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Comprehensive Features:&lt;/em&gt; &lt;em&gt;Harbor&lt;/em&gt; offers extensive functionality covering image storage, role-based access control (RBAC), vulnerability scanning, image signing with Notary, replication, garbage collection, and more. This makes it ideal for organizations with demanding security and compliance needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Security Focus:&lt;/em&gt; Harbor prioritizes security through RBAC, vulnerability scanning, content trust, and audit logs, ensuring image integrity and controlling access.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Open Source and CNCF Project:&lt;/em&gt; Being open source and backed by CNCF guarantees transparency, community support, and long-term sustainability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Scalability and Extensibility:&lt;/em&gt; Harbor's architecture allows horizontal scaling for large deployments and supports plugins for custom integrations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="http://www.youtube.com/watch?feature=player_embedded&amp;amp;v=4zZiBcvZmgQ" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/http%3A%2F%2Fimg.youtube.com%2Fvi%2F4zZiBcvZmgQ%2F0.jpg" alt="Webinar" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Feature-Rich:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Image Storage and Distribution:&lt;/em&gt; Core functionality for storing, managing, and distributing container images.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Role-Based Access Control (RBAC):&lt;/em&gt; Granular control over who can access and manage images, enhancing security.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Vulnerability Scanning:&lt;/em&gt; Integrates with vulnerability scanners to identify security flaws in container images.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Image Signing and Notary:&lt;/em&gt; Ensures the integrity and authenticity of images using digital signatures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Replication:&lt;/em&gt; Supports replicating images across multiple Harbor instances or other registries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Garbage Collection:&lt;/em&gt; Automatically removes unused images to free up storage space.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;User Management:&lt;/em&gt; Allows managing users, groups, and their permissions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Web UI and API:&lt;/em&gt; Provides a user-friendly interface and API for interacting with the registry.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Integration with CI/CD Pipelines:&lt;/em&gt; Seamlessly integrates into CI/CD workflows for automated image builds and deployments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Helm Chart Support:&lt;/em&gt; Can also be used to store and manage Helm charts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Open Source and CNCF Project:&lt;/em&gt; Being open source means it's free to use, modify, and contribute to. Its CNCF backing ensures its long-term sustainability and community support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Scalable and Reliable:&lt;/em&gt; Designed for production environments and capable of handling a large number of images and requests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Designed for Enterprise:&lt;/em&gt; Offers the features and security required by larger organizations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
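&lt;p&gt;To illustrate the "Web UI and API" point, here is a minimal sketch of listing projects through Harbor's v2.0 REST API. The host and credentials below are placeholders; in practice you would use a robot account or token rather than embedding a password:&lt;/p&gt;

```python
import base64
import urllib.request

def project_list_request(base_url: str, user: str, password: str):
    # Harbor exposes a REST API; listing projects is a simple authenticated GET.
    url = f"{base_url.rstrip('/')}/api/v2.0/projects"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )

# Placeholder host and credentials for illustration only.
req = project_list_request("https://harbor.example.com", "admin", "changeme")
print(req.full_url)
```

&lt;p&gt;Sending the request (with &lt;code&gt;urllib.request.urlopen&lt;/code&gt; or any HTTP client) returns the project list as JSON, which is what CI/CD pipelines typically script against.&lt;/p&gt;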




&lt;h2&gt;
  
  
  &lt;strong&gt;Detailed Feature Breakdown&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Here's a more detailed look at some of Harbor's key features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Core Functionality:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Image Storage &amp;amp; Management:&lt;/em&gt; Stores and organizes container images, Helm charts, and other cloud-native artifacts. Supports multiple formats and versions.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Registry API:&lt;/em&gt; Provides a standard Docker Registry API (v2), making it compatible with Docker clients and other tools.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Multi-Tenancy:&lt;/em&gt; Supports multiple projects (organizations or teams) within a single Harbor instance, each with its own users, access control, and repositories.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Security:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Role-Based Access Control (RBAC):&lt;/em&gt; Allows fine-grained access control, letting you define different roles and permissions for users and groups at the project and global levels.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Vulnerability Scanning:&lt;/em&gt; Integrates with scanners like Trivy to scan container images for security vulnerabilities. Reports and dashboards help you address identified issues.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Image Signing &amp;amp; Notary:&lt;/em&gt; Uses Notary to sign images, ensuring their integrity and preventing tampering, so consumers can verify that images come from a trusted source.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Content Trust:&lt;/em&gt; Can enforce policies that only allow signed images to be pulled and deployed.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Audit Logging:&lt;/em&gt; Tracks all actions performed within the registry, providing audit trails for security and compliance.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;LDAP/AD Integration:&lt;/em&gt; Integrates with existing LDAP or Active Directory systems for user authentication.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Replication &amp;amp; Distribution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Replication Policies:&lt;/em&gt; Enables automated replication of images between Harbor instances or other registries. Useful for multi-region deployments and disaster recovery.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;P2P Distribution:&lt;/em&gt; Supports P2P image distribution through projects like Dragonfly to reduce the load on the registry and improve image pull speeds.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Garbage Collection (GC):&lt;/em&gt; Automatically removes unused or unreferenced images and layers to reclaim storage space.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Quota Management:&lt;/em&gt; Set storage quotas for projects to manage disk space usage and prevent overutilization.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;CI/CD Integration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Webhook Integration:&lt;/em&gt; Supports webhooks to trigger actions when images are pushed or deleted. This allows you to integrate Harbor into your CI/CD workflows for automated image building and deployments.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Helm Chart Support:&lt;/em&gt; Allows storing and managing Helm charts alongside container images.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;User Interface (UI):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Intuitive Web UI:&lt;/em&gt; Provides a user-friendly web interface for managing projects, users, images, scanning results, and other aspects of the registry.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Customizable Dashboard:&lt;/em&gt; Provides a customizable dashboard for monitoring the status of the registry.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;API Access:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Comprehensive REST API:&lt;/em&gt; Provides a fully documented REST API for programmatic interaction with the registry.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Performance and Scalability:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;High Availability:&lt;/em&gt; Designed for high availability deployments to ensure continuous operation.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Scalability:&lt;/em&gt; Supports scaling horizontally by adding more Harbor instances or scaling the underlying database and storage.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
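&lt;p&gt;As a quick illustration of the REST API mentioned above, you can list projects with &lt;code&gt;curl&lt;/code&gt;. The hostname below is a placeholder, and &lt;code&gt;Harbor12345&lt;/code&gt; is Harbor's documented default admin password, which you should change immediately:&lt;/p&gt;

```shell
# List projects through Harbor's v2.0 REST API (placeholder host and credentials)
curl -s -u "admin:Harbor12345" \
  "https://your-domain-or-ip/api/v2.0/projects" | jq -r '.[].name'
```

&lt;p&gt;Add &lt;code&gt;-k&lt;/code&gt; to &lt;code&gt;curl&lt;/code&gt; if the instance uses a self-signed certificate.&lt;/p&gt;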




&lt;h2&gt;
  
  
  &lt;strong&gt;Detailed Pros and Cons&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Pros:&lt;/em&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Robust Feature Set:&lt;/em&gt; Offers a wide array of enterprise-grade features, including RBAC, vulnerability scanning, image signing, replication, and more.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Security Focused:&lt;/em&gt; Designed with security in mind, providing features like RBAC, vulnerability scanning, content trust, and audit logs to secure your container images and registry.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Enterprise Ready:&lt;/em&gt; Suitable for large organizations with complex requirements for security, compliance, and scalability.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Open Source and CNCF Project:&lt;/em&gt; Benefits from the transparency and community support associated with being an open-source project under the Cloud Native Computing Foundation (CNCF).
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Scalability:&lt;/em&gt; Can be scaled horizontally to handle a large number of images, projects, and users.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Extensibility:&lt;/em&gt; Supports plugins and integrations to customize functionality and connect with other tools.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Active Community:&lt;/em&gt; Has a large and active community providing extensive documentation and support.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Regular Updates:&lt;/em&gt; Gets regular updates and improvements, ensuring it stays relevant and secure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Cons:&lt;/em&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Complexity:&lt;/em&gt; Harbor's rich feature set makes it more complex to set up, configure, and manage compared to simpler registries.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Resource Intensive:&lt;/em&gt; Can require significant resources (CPU, memory, storage) to run, especially for larger deployments.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Steeper Learning Curve:&lt;/em&gt; Requires a deeper understanding of container registries and related concepts.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Overkill for Simple Use Cases:&lt;/em&gt; Might be too heavy and complex for individuals or small teams with basic needs.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Initial Setup:&lt;/em&gt; The initial setup process can be more involved compared to some simpler registries.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Potential Cost:&lt;/em&gt; While the software itself is free, running Harbor in a production environment will require infrastructure resources that come with costs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmq2jp56stx4acc89ptjz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmq2jp56stx4acc89ptjz.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Best Alternatives: Container Registry Options&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;While &lt;em&gt;Harbor&lt;/em&gt; is powerful, it might not be the right fit for every situation. Here are some of the best alternatives to consider, along with their pros and cons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Docker Registry (Docker Hub / Docker Trusted Registry)&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Pros&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Simplicity&lt;/em&gt;: Very easy to set up and use, especially the public Docker Hub.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Widely Adopted&lt;/em&gt;: Most developers are already familiar with Docker Hub.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Free Tier (Docker Hub)&lt;/em&gt;: Offers a free tier for public repositories.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Docker Trusted Registry (DTR)&lt;/em&gt;: Enterprise-grade registry originally from Docker with additional features (since acquired by Mirantis and rebranded as Mirantis Secure Registry).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;em&gt;Cons&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Docker Hub Limits&lt;/em&gt;: The free tier limits private repositories, storage, and image pull rates.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;DTR Licensing&lt;/em&gt;: Docker Trusted Registry requires licensing.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Fewer Enterprise Features&lt;/em&gt;: Might lack some advanced features compared to Harbor (e.g., more granular RBAC, advanced replication).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;GitLab Container Registry&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Pros&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Integrated with GitLab&lt;/em&gt;: Seamless integration with GitLab's CI/CD pipelines.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Easy to Set Up&lt;/em&gt;: Relatively straightforward to configure if you already use GitLab.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Free with GitLab&lt;/em&gt;: Included as part of the GitLab package.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;em&gt;Cons&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Tightly Coupled to GitLab&lt;/em&gt;: Best used if you are already using GitLab for source control.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Limited Feature Set&lt;/em&gt;: Might not have all the advanced enterprise features of Harbor.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Elastic Container Registry (ECR)&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Pros&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Managed Service&lt;/em&gt;: Fully managed by AWS, reducing operational overhead.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Scalability&lt;/em&gt;: Scales automatically to handle increased demand.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Tight Integration with AWS Ecosystem&lt;/em&gt;: Seamlessly integrates with other AWS services.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Security&lt;/em&gt;: Inherits the security features of the AWS platform.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;em&gt;Cons&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Vendor Lock-in&lt;/em&gt;: You are locked into the AWS ecosystem.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cost&lt;/em&gt;: Can be more expensive than self-hosted options, especially at scale.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;Google Container Registry (GCR) / Artifact Registry&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Pros&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Managed Service&lt;/em&gt;: Fully managed by Google Cloud Platform.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Scalability&lt;/em&gt;: Designed to scale for large deployments.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Integration with GCP&lt;/em&gt;: Well-integrated with other Google Cloud services.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Artifact Registry&lt;/em&gt;: Supports various artifacts (not just containers).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;em&gt;Cons&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Vendor Lock-in&lt;/em&gt;: You are locked into the Google Cloud Platform.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cost&lt;/em&gt;: Similar to ECR, can become costly with high usage.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="5"&gt;
&lt;li&gt;
&lt;strong&gt;Azure Container Registry (ACR)&lt;/strong&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Pros&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Managed Service&lt;/em&gt;: Fully managed by Azure.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Scalability&lt;/em&gt;: Highly scalable to meet enterprise demands.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Integration with Azure Ecosystem&lt;/em&gt;: Deeply integrated with other Azure services.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;em&gt;Cons&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Vendor Lock-in&lt;/em&gt;: You are locked into the Microsoft Azure ecosystem.
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cost&lt;/em&gt;: Can be expensive based on storage and usage.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Which Alternative is "Best"?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The "best" alternative depends on your specific needs and context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;For Individual Developers or Small Teams:&lt;/em&gt; &lt;em&gt;Docker Hub&lt;/em&gt; is a good starting point for ease of use and the free tier.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;For Teams Using GitLab:&lt;/em&gt; &lt;em&gt;GitLab Container Registry&lt;/em&gt; is a great option due to its seamless integration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;For Organizations Using Cloud Platforms:&lt;/em&gt; &lt;em&gt;ECR, GCR/Artifact Registry, and ACR&lt;/em&gt; are excellent choices due to their scalability and managed nature.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;For Organizations Requiring Advanced Features and Control:&lt;/em&gt; &lt;em&gt;Harbor&lt;/em&gt; is often the preferred choice due to its robustness and feature-rich nature.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Key Considerations When Choosing a Registry:&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Scale:&lt;/em&gt; How many images will you be storing and distributing?
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Security:&lt;/em&gt; What are your security and compliance requirements?
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Integration:&lt;/em&gt; How well does the registry integrate with your CI/CD pipelines and other tools?
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Cost:&lt;/em&gt; What is your budget and how will the registry impact costs?
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Ease of Use:&lt;/em&gt; How easy is it to set up, manage, and use the registry?
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Community and Support:&lt;/em&gt; What level of community support is available?&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Harbor Installation on Ubuntu 24.04: A Basic Guide&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This is a basic guide for installing Harbor using Docker Compose on Ubuntu 24.04. This method is suitable for development or test environments. For production, you'll need to consider a more robust deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Prerequisites:&lt;/em&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A clean Ubuntu 24.04 server.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Docker Engine and Docker Compose installed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To install Docker on Ubuntu:
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
&lt;/span&gt;&lt;span class="gp"&gt;echo "deb [arch=$&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;dpkg &lt;span class="nt"&gt;--print-architecture&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; signed-by&lt;span class="o"&gt;=&lt;/span&gt;/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu &lt;span class="si"&gt;$(&lt;/span&gt;lsb_release &lt;span class="nt"&gt;-cs&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; stable&lt;span class="s2"&gt;" | sudo tee /etc/apt/sources.list.d/docker.list &amp;gt; /dev/null
&lt;/span&gt;&lt;span class="go"&gt;sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Verify Docker is working:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;sudo docker run hello-world
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;docker-compose-plugin&lt;/code&gt; installed above already provides the &lt;code&gt;docker compose&lt;/code&gt; command; if you also want the legacy standalone &lt;code&gt;docker-compose&lt;/code&gt; binary:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;sudo apt install docker-compose
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;A domain name or IP address that you will use to access Harbor.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Steps:&lt;/em&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Download Harbor Installer:&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;curl -LO https://github.com/goharbor/harbor/releases/latest/download/harbor-offline-installer-v2.10.0.tgz
tar xzvf harbor-offline-installer-v*.tgz
cd harbor
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;&lt;em&gt;Configure harbor.yml:&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Copy the template config file and customize it:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;    cp harbor.yml.tmpl harbor.yml
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Edit harbor.yml:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;    nano harbor.yml
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;&lt;em&gt;Key Parameters to configure:&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;hostname&lt;/em&gt;: This should be your domain name or IP address.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;http.port&lt;/em&gt;: Default is 80, change it if necessary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;https.port&lt;/em&gt;: Default is 443. For production use, you'll usually want HTTPS.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;https.certificate and https.private_key&lt;/em&gt;: If using HTTPS, point these to your SSL certificate and private key. For testing, you can generate a self-signed certificate, but this is not recommended for production.&lt;br&gt;&lt;br&gt;
You can use &lt;em&gt;Let's Encrypt&lt;/em&gt; for a valid certificate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;harbor_admin_password&lt;/em&gt;: Set the initial password for the Harbor administrator account.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;database.password&lt;/em&gt;: Set a strong password for the database user.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Example basic configuration:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;your-domain-or-ip&lt;/span&gt;
    &lt;span class="na"&gt;https&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;certificate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/path/to/your/certificate.crt&lt;/span&gt; &lt;span class="c1"&gt;#Path to the certificate&lt;/span&gt;
      &lt;span class="na"&gt;private_key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/path/to/your/private.key&lt;/span&gt; &lt;span class="c1"&gt;# Path to the private key&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;443&lt;/span&gt;
    &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;harbor_admin_password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;strongpassword&lt;/span&gt;
    &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dbstrongpassword&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;&lt;em&gt;Prepare SSL Certificates:&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If you are going to use HTTPS in production, set the &lt;code&gt;https.certificate&lt;/code&gt; and &lt;code&gt;https.private_key&lt;/code&gt; parameters to point to your certificate files.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you don't have a certificate, for testing purposes you can use a self-signed certificate with these steps:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;    sudo mkdir -p /etc/harbor/certs
&lt;/span&gt;&lt;span class="gp"&gt;    sudo openssl req -x509 -newkey rsa:4096 -keyout /etc/harbor/certs/harbor.key -out /etc/harbor/certs/harbor.crt -sha256 -days 365 -nodes -subj "/CN=$&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;hostname&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Set &lt;code&gt;https.certificate&lt;/code&gt; to &lt;code&gt;/etc/harbor/certs/harbor.crt&lt;/code&gt; and &lt;code&gt;https.private_key&lt;/code&gt; to &lt;code&gt;/etc/harbor/certs/harbor.key&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
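&lt;p&gt;When using a self-signed certificate, the Docker client on each machine that pulls from Harbor must also trust it. One common approach (the hostname is a placeholder) is to drop the certificate into Docker's per-registry trust store:&lt;/p&gt;

```shell
# Copy Harbor's self-signed certificate into Docker's per-registry trust store
sudo mkdir -p /etc/docker/certs.d/your-domain-or-ip
sudo cp /etc/harbor/certs/harbor.crt /etc/docker/certs.d/your-domain-or-ip/ca.crt
# Restart Docker so it picks up the new CA certificate
sudo systemctl restart docker
```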

&lt;ol start="5"&gt;
&lt;li&gt;
&lt;em&gt;Install and Start Harbor:&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;    sudo ./install.sh
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;This will start all necessary containers. It can take a few minutes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol start="6"&gt;
&lt;li&gt;&lt;em&gt;Access Harbor:&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Open your web browser and go to&lt;br&gt;
&lt;em&gt;&lt;a href="https://your-domain-or-ip" rel="noopener noreferrer"&gt;https://your-domain-or-ip&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
(or &lt;em&gt;&lt;a href="http://your-domain-or-ip" rel="noopener noreferrer"&gt;http://your-domain-or-ip&lt;/a&gt;&lt;/em&gt; if you configured for HTTP only).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Log in with username admin and the password set in &lt;code&gt;harbor.yml&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Important Notes:&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Production Readiness:&lt;/em&gt; This installation is for basic setup. For a production environment, you'll need to consider:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using an external PostgreSQL database instead of the bundled one.
&lt;/li&gt;
&lt;li&gt;Externalizing storage for registry data.
&lt;/li&gt;
&lt;li&gt;Configuring a load balancer.
&lt;/li&gt;
&lt;li&gt;Implementing proper backup and disaster recovery strategies.
&lt;/li&gt;
&lt;li&gt;Securing the environment with firewalls and network segmentation.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;em&gt;Docker Compose Version:&lt;/em&gt; Ensure you are using a compatible version of Docker Compose.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;em&gt;Port Conflicts:&lt;/em&gt; Ensure that the chosen ports (80 and 443) are not already in use on your system.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;em&gt;Firewall:&lt;/em&gt; Configure your firewall to allow traffic on the ports used by Harbor.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Further Configuration and Customization:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;After installation, you can further configure &lt;em&gt;Harbor&lt;/em&gt; through its web interface, or by editing &lt;code&gt;harbor.yml&lt;/code&gt;, re-running the &lt;code&gt;prepare&lt;/code&gt; script, and restarting the services with &lt;code&gt;sudo docker-compose down&lt;/code&gt; followed by &lt;code&gt;sudo docker-compose up -d&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Consult the official &lt;em&gt;Harbor&lt;/em&gt; documentation for more advanced configuration and customization options.&lt;/p&gt;
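&lt;p&gt;A typical reconfiguration cycle, assuming the installation directory from the steps above, might look like this:&lt;/p&gt;

```shell
cd harbor                   # directory extracted from the installer tarball
sudo docker-compose down    # stop the running Harbor services
sudo nano harbor.yml        # adjust the configuration
sudo ./prepare              # regenerate per-service config from harbor.yml
sudo docker-compose up -d   # start Harbor with the new configuration
```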

&lt;p&gt;&lt;a href="https://goharbor.io/" rel="noopener noreferrer"&gt;Official Website&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;In Conclusion:&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Harbor&lt;/em&gt; is a powerful, feature-rich, and enterprise-grade container registry that excels in environments with stringent security and compliance needs. However, it might be overkill for simpler use cases. Understanding your own requirements and exploring the alternatives will help you choose the best container registry for your projects.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>dockerregistry</category>
      <category>cloudnative</category>
      <category>harbor</category>
    </item>
    <item>
      <title>The Rise of AIOps: How AI is Transforming IT Operations</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Thu, 09 Jan 2025 19:50:27 +0000</pubDate>
      <link>https://dev.to/sina14/the-rise-of-aiops-how-ai-is-transforming-it-operations-4n1d</link>
      <guid>https://dev.to/sina14/the-rise-of-aiops-how-ai-is-transforming-it-operations-4n1d</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Is your IT operations team struggling to keep up with the growing complexity of systems? AIOps might just be the answer.&lt;/p&gt;

&lt;p&gt;Artificial Intelligence for IT Operations (AIOps) is revolutionizing how organizations manage their IT infrastructure and applications.&lt;/p&gt;

&lt;p&gt;By combining the power of AI, machine learning (ML), and big data analytics, AIOps is enabling a shift from reactive, manual processes to proactive, automated workflows.&lt;/p&gt;

&lt;p&gt;This transformation is not just about implementing new technology; it's about fundamentally changing how IT teams operate, improve efficiency, and drive strategic business outcomes.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxu4ow8v6h173ii91hb4p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxu4ow8v6h173ii91hb4p.png" alt="Image description" width="718" height="204"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Functionality of AIOps
&lt;/h2&gt;

&lt;p&gt;AIOps automates and improves IT processes through several key functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Collection and Aggregation&lt;/strong&gt;: AIOps platforms begin by ingesting data from diverse sources across the IT environment. This includes structured data like logs and metrics from applications, databases and systems, and unstructured data like tickets, events, and even emails. This ability to gather comprehensive data from various systems and tools is crucial for the platform's ability to see the full picture.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Advanced Analytics and Machine Learning&lt;/strong&gt;: Once data is collected, AIOps platforms use advanced AI and ML algorithms to identify patterns, anomalies, and trends. Techniques like anomaly detection, pattern recognition, and predictive analysis help filter out "noise" and identify underlying issues that may not be apparent with traditional methods.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Event Correlation and Root Cause Analysis&lt;/strong&gt;: AIOps goes beyond identifying issues by digging deeper to understand the underlying cause of incidents. By correlating events across different data sources, it can quickly pinpoint the root cause of problems, reducing Mean Time to Repair (MTTR) by giving engineers the context and candidate solutions they need to react faster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Intelligent Automation and Remediation&lt;/strong&gt;: AIOps automates responses to events and incidents, using AI-driven workflows for tasks like incident triage, alerting, ticketing, and remediation. This allows for automatic resolution of recurring issues, freeing up IT staff to focus on more strategic and value-added activities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Predictive Analytics and Forecasting&lt;/strong&gt;: By analyzing historical data, AIOps platforms can predict future trends and potential challenges. This foresight allows IT teams to proactively allocate resources, prevent bottlenecks, and optimize resource utilization, ensuring the overall health of their IT environment.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
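&lt;p&gt;To make the anomaly-detection idea concrete, here is a toy sketch (not taken from any AIOps product) that flags metric samples falling more than two standard deviations from the mean, using nothing but &lt;code&gt;awk&lt;/code&gt;:&lt;/p&gt;

```shell
# Flag samples more than 2 standard deviations from the mean of the series
printf '%s\n' 10 11 9 10 12 10 95 11 | awk '
  { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
  END {
    n = NR; mean = sum / n
    sd = sqrt(sumsq / n - mean * mean)   # population standard deviation
    for (i = 1; i <= n; i++)
      if (sd > 0 && (v[i] - mean > 2 * sd || mean - v[i] > 2 * sd))
        printf "anomaly at sample %d: %s\n", i, v[i]
  }'
# prints: anomaly at sample 7: 95
```

&lt;p&gt;Real AIOps platforms apply far more sophisticated models across many correlated signals, but the core principle of separating statistical "noise" from genuine outliers is the same.&lt;/p&gt;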




&lt;h2&gt;
  
  
  Key Benefits of AIOps
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcn3gkz656qq5m7rxcqrq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcn3gkz656qq5m7rxcqrq.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By leveraging the features mentioned above, AIOps delivers several significant benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Proactive Issue Resolution&lt;/strong&gt;: AIOps enables businesses to move from reactive to proactive IT management. By analyzing real-time data and predicting potential problems before they impact business operations, it minimizes downtime and maintains high service availability and performance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enhanced Decision-Making&lt;/strong&gt;: By sifting through large amounts of data from various sources, AIOps can provide IT teams with actionable insights and a deeper understanding of their infrastructure, enabling better, data-driven decisions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Increased Efficiency and Automation&lt;/strong&gt;: AIOps automates routine and time-consuming tasks, including monitoring, alerting, and incident response. This allows IT staff to focus on strategic initiatives and innovation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved Scalability and Flexibility&lt;/strong&gt;: AIOps can integrate with existing infrastructure and scale according to business needs. This allows IT operations to remain efficient and effective regardless of the organization's size and complexity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Strengthened Security and Compliance&lt;/strong&gt;: AIOps continuously monitors network traffic, user behavior, and system activities, detecting and responding to security threats in real-time. AIOps also helps ensure compliance with industry standards and regulations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90moiit8zrw5iy6aha3f.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90moiit8zrw5iy6aha3f.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AIOps Across Various Industries
&lt;/h2&gt;

&lt;p&gt;AIOps is increasingly being adopted across diverse industries globally. Some notable examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Financial Services&lt;/strong&gt;: In the financial sector, AIOps helps prevent transaction disruptions by detecting anomalies in real time, ensuring the smooth operation of banking and investment systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retail and E-commerce&lt;/strong&gt;: Ensuring seamless customer experiences during peak shopping times by monitoring online store performance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Healthcare&lt;/strong&gt;: Optimizing IT infrastructure for efficient data access and processing, such as detecting anomalies in health monitoring systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Energy and Utilities&lt;/strong&gt;: Managing critical infrastructure and reducing downtime.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Telecommunications&lt;/strong&gt;: Improving network performance and service reliability.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  AIOps as the Next Evolution of DevOps
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k5syt7qnkid7o7f9wbm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k5syt7qnkid7o7f9wbm.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While DevOps focuses on collaboration, automation, and continuous delivery, AIOps builds upon those principles by integrating AI and ML to analyze large datasets and automate decision-making in real time.&lt;br&gt;
AIOps can be seen as the next evolution of DevOps, enabling a more intelligent and proactive approach to IT operations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Implementing AIOps: A Phased Approach
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcgtqa2zxmu4c66imk1r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcgtqa2zxmu4c66imk1r.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Implementing AIOps successfully involves taking a phased approach; after the initial rollout, continuously monitor, learn, and adapt your strategy to maximize its effectiveness:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Identify Key Use Cases&lt;/strong&gt;: Determine specific areas where AIOps can deliver the most value.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Select the Right Platform&lt;/strong&gt;: Evaluate different AIOps vendors and solutions based on your specific needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start Small and Iterate&lt;/strong&gt;: Begin with a pilot project or a limited deployment to test and refine your AIOps strategy.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Gradually Scale Up&lt;/strong&gt;: Once the initial implementation proves successful, expand AIOps adoption to other areas of your IT operations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5psa57p3ubc9ep3hfun.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5psa57p3ubc9ep3hfun.jpg" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of AIOps
&lt;/h2&gt;

&lt;p&gt;AIOps is expected to evolve rapidly in the coming years. Future advancements include increased automation and autonomous IT operations, enhanced predictive capabilities, integration with emerging technologies, and democratization of AI, making it more user-friendly and accessible to a wide range of IT professionals.&lt;/p&gt;

&lt;p&gt;In the future, we will see AIOps become more user-friendly and accessible to a wider range of IT professionals, and even non-technical staff.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AIOps is transforming IT operations from a reactive and manual approach to a more strategic and automated one. Its ability to deliver deep, real-time, and context-rich insights is driving proactive decision-making, leading to more efficient, reliable, and innovative IT systems.&lt;br&gt;
AIOps represents a strategic imperative for organizations seeking to thrive in today's dynamic digital landscape.&lt;/p&gt;




</description>
      <category>aiops</category>
      <category>operations</category>
      <category>datadrivenit</category>
      <category>ai</category>
    </item>
    <item>
      <title>40 Days Of Kubernetes (40/40)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Tue, 17 Sep 2024 14:29:45 +0000</pubDate>
      <link>https://dev.to/sina14/40-days-of-kubernetes-4040-2n9d</link>
      <guid>https://dev.to/sina14/40-days-of-kubernetes-4040-2n9d</guid>
      <description>&lt;h2&gt;
  
  
  Day 40/40
&lt;/h2&gt;

&lt;h1&gt;
  
  
  JSONPath Tutorial - Advanced Kubectl Commands
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=l9_UDSaiFj4" rel="noopener noreferrer"&gt;Video Link&lt;/a&gt;&lt;br&gt;
&lt;a class="mentioned-user" href="https://dev.to/piyushsachdeva"&gt;@piyushsachdeva&lt;/a&gt; &lt;br&gt;
&lt;a href="https://github.com/piyushsachdeva/CKA-2024/" rel="noopener noreferrer"&gt;Git Repository&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/sina14/40daysofkubernetes" rel="noopener noreferrer"&gt;My Git Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this section, we will take a deep dive into JSONPath from a beginner's perspective and see how you can write advanced &lt;code&gt;kubectl&lt;/code&gt; commands using JSONPath.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzmnqbik5uvz5calvy3o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzmnqbik5uvz5calvy3o.png" alt="Image description" width="666" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Return the result in &lt;code&gt;json&lt;/code&gt; format from the &lt;code&gt;api-server&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes -o json

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Return the result in &lt;code&gt;yaml&lt;/code&gt; format from the &lt;code&gt;api-server&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes -o yaml

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;A sample JSONPath query:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.items[*].status.nodeInfo.osImage}{"\n"}'&lt;/span&gt;
Ubuntu 24.04.1 LTS Ubuntu 22.04.2 LTS Ubuntu 22.04.4 LTS

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
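&lt;p&gt;To make the expression above concrete, here is a minimal Python sketch (not part of the video) that evaluates the same path &lt;code&gt;.items[*].status.nodeInfo.osImage&lt;/code&gt; by hand; the node data below is a made-up, stripped-down stand-in for the real &lt;code&gt;kubectl get nodes -o json&lt;/code&gt; output:&lt;/p&gt;

```python
# Stripped-down, made-up stand-in for the JSON that `kubectl get nodes -o json` returns.
nodes_json = {
    "items": [
        {"status": {"nodeInfo": {"osImage": "Ubuntu 24.04.1 LTS"}}},
        {"status": {"nodeInfo": {"osImage": "Ubuntu 22.04.2 LTS"}}},
        {"status": {"nodeInfo": {"osImage": "Ubuntu 22.04.4 LTS"}}},
    ]
}

# .items[*].status.nodeInfo.osImage: for every element of .items,
# follow the keys status -> nodeInfo -> osImage.
os_images = [item["status"]["nodeInfo"]["osImage"] for item in nodes_json["items"]]
print(" ".join(os_images))
# → Ubuntu 24.04.1 LTS Ubuntu 22.04.2 LTS Ubuntu 22.04.4 LTS
```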



&lt;ul&gt;
&lt;li&gt;With custom columns:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'custom-columns=OsType:{.status.nodeInfo.osImage},KubeletVersion:{.status.nodeInfo.kubeletVersion}'&lt;/span&gt;
OsType               KubeletVersion
Ubuntu 24.04.1 LTS   v1.30.4
Ubuntu 22.04.2 LTS   v1.30.0
Ubuntu 22.04.4 LTS   v1.30.4

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;With custom columns and a filter expression:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;custom-columns&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Host:{.status.addresses[?(@.type=="Hostname")].address},OsType:{.status.nodeInfo.osImage}'&lt;/span&gt;
Host         OsType
cloudy.net   Ubuntu 24.04.1 LTS
jolly-net    Ubuntu 22.04.2 LTS
sinaops      Ubuntu 22.04.4 LTS

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
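&lt;p&gt;In the filter &lt;code&gt;?(@.type=="Hostname")&lt;/code&gt;, &lt;code&gt;@&lt;/code&gt; refers to the list element currently being tested, so the expression keeps only the entries of &lt;code&gt;.status.addresses&lt;/code&gt; whose &lt;code&gt;type&lt;/code&gt; field equals &lt;code&gt;Hostname&lt;/code&gt;. A rough Python equivalent, with made-up address data:&lt;/p&gt;

```python
# Made-up stand-in for one node's .status.addresses list.
addresses = [
    {"type": "InternalIP", "address": "192.168.1.10"},
    {"type": "Hostname", "address": "sinaops"},
]

# ?(@.type=="Hostname"): keep the matching elements, then project .address.
hostnames = [a["address"] for a in addresses if a["type"] == "Hostname"]
print(hostnames)
# → ['sinaops']
```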



&lt;ul&gt;
&lt;li&gt;Sort by a field:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get nodes
NAME         STATUS                     ROLES           AGE     VERSION
cloudy.net   Ready                      worker          6d      v1.30.4
jolly-net    Ready,SchedulingDisabled   worker          7d22h   v1.30.0
sinaops      Ready                      control-plane   8d      v1.30.4
root@sinaops:~# kubectl get nodes &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.status.nodeInfo.kubeletVersion
NAME         STATUS                     ROLES           AGE     VERSION
jolly-net    Ready,SchedulingDisabled   worker          7d22h   v1.30.0
cloudy.net   Ready                      worker          6d      v1.30.4
sinaops      Ready                      control-plane   8d      v1.30.4

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
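&lt;p&gt;&lt;code&gt;--sort-by&lt;/code&gt; takes a JSONPath expression, resolves it for each item, and orders the rows by the resulting value. A rough Python equivalent (node list made up to mirror the output above):&lt;/p&gt;

```python
# Made-up node list mirroring the kubectl output above.
nodes = [
    {"name": "cloudy.net", "status": {"nodeInfo": {"kubeletVersion": "v1.30.4"}}},
    {"name": "jolly-net", "status": {"nodeInfo": {"kubeletVersion": "v1.30.0"}}},
    {"name": "sinaops", "status": {"nodeInfo": {"kubeletVersion": "v1.30.4"}}},
]

# --sort-by=.status.nodeInfo.kubeletVersion: resolve the path for each item
# and sort on the resulting string (sorted() is stable, so ties keep order).
ordered = sorted(nodes, key=lambda n: n["status"]["nodeInfo"]["kubeletVersion"])
print([n["name"] for n in ordered])
# → ['jolly-net', 'cloudy.net', 'sinaops']
```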



</description>
      <category>kubernetes</category>
      <category>40daysofkubernetes</category>
    </item>
    <item>
      <title>40 Days Of Kubernetes (39/40)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Mon, 16 Sep 2024 14:31:07 +0000</pubDate>
      <link>https://dev.to/sina14/40-days-of-kubernetes-3940-3po5</link>
      <guid>https://dev.to/sina14/40-days-of-kubernetes-3940-3po5</guid>
      <description>&lt;h2&gt;
  
  
  Day 39/40
&lt;/h2&gt;

&lt;h1&gt;
  
  
  Troubleshooting Worker Nodes Failures in Kubernetes
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=U6PRwv7dJ-U" rel="noopener noreferrer"&gt;Video Link&lt;/a&gt;&lt;br&gt;
&lt;a class="mentioned-user" href="https://dev.to/piyushsachdeva"&gt;@piyushsachdeva&lt;/a&gt; &lt;br&gt;
&lt;a href="https://github.com/piyushsachdeva/CKA-2024/" rel="noopener noreferrer"&gt;Git Repository&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/sina14/40daysofkubernetes" rel="noopener noreferrer"&gt;My Git Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this video, we troubleshoot and fix worker node-related issues in Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl86mh5yr5l1a7nmf6j4n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl86mh5yr5l1a7nmf6j4n.png" alt="Image description" width="733" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rkjrliokin0z31l848a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rkjrliokin0z31l848a.png" alt="Image description" width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Photos from the video)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>40daysofkubernetes</category>
    </item>
    <item>
      <title>40 Days Of Kubernetes (38/40)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Sun, 15 Sep 2024 16:09:57 +0000</pubDate>
      <link>https://dev.to/sina14/40-days-of-kubernetes-3840-pic</link>
      <guid>https://dev.to/sina14/40-days-of-kubernetes-3840-pic</guid>
      <description>&lt;h2&gt;
  
  
  Day 38/40
&lt;/h2&gt;

&lt;h1&gt;
  
  
  Troubleshooting control plane failure in kubernetes
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=z6XjbuRl6LE" rel="noopener noreferrer"&gt;Video Link&lt;/a&gt;&lt;br&gt;
&lt;a class="mentioned-user" href="https://dev.to/piyushsachdeva"&gt;@piyushsachdeva&lt;/a&gt; &lt;br&gt;
&lt;a href="https://github.com/piyushsachdeva/CKA-2024/" rel="noopener noreferrer"&gt;Git Repository&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/sina14/40daysofkubernetes" rel="noopener noreferrer"&gt;My Git Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this section, we will troubleshoot failures of control plane components in a Kubernetes cluster, such as the API server, scheduler, and controller manager, and see how to fix those issues.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z0th90urdu9hmb2irxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z0th90urdu9hmb2irxf.png" alt="Image description" width="762" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0evhkw8kfyg8idbla8dr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0evhkw8kfyg8idbla8dr.png" alt="Image description" width="800" height="85"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpjziqb4wxml5z8n6a5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpjziqb4wxml5z8n6a5v.png" alt="Image description" width="800" height="177"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w8zfffif2k37jgmtz46.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8w8zfffif2k37jgmtz46.png" alt="Image description" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>40daysofkubernetes</category>
    </item>
    <item>
      <title>40 Days Of Kubernetes (37/40)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Sat, 14 Sep 2024 19:34:26 +0000</pubDate>
      <link>https://dev.to/sina14/40-days-of-kubernetes-3740-38co</link>
      <guid>https://dev.to/sina14/40-days-of-kubernetes-3740-38co</guid>
      <description>&lt;h2&gt;
  
  
  Day 37/40
&lt;/h2&gt;

&lt;h1&gt;
  
  
  Application Failure Troubleshooting From CKA
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=Mil0HUtPg6I" rel="noopener noreferrer"&gt;Video Link&lt;/a&gt;&lt;br&gt;
&lt;a class="mentioned-user" href="https://dev.to/piyushsachdeva"&gt;@piyushsachdeva&lt;/a&gt; &lt;br&gt;
&lt;a href="https://github.com/piyushsachdeva/CKA-2024/" rel="noopener noreferrer"&gt;Git Repository&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/sina14/40daysofkubernetes" rel="noopener noreferrer"&gt;My Git Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this section, we're looking at application failures.&lt;/p&gt;

&lt;p&gt;As an example, we will use this sample app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/piyushsachdeva/example-voting-app.git

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As mentioned in the source &lt;a href="https://github.com/piyushsachdeva/example-voting-app" rel="noopener noreferrer"&gt;repository&lt;/a&gt;, the app consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A front-end web app in Python which lets you vote between two options&lt;/li&gt;
&lt;li&gt;A Redis which collects new votes&lt;/li&gt;
&lt;li&gt;A .NET worker which consumes votes and stores them in…&lt;/li&gt;
&lt;li&gt;A Postgres database backed by a Docker volume&lt;/li&gt;
&lt;li&gt;A Node.js web app which shows the results of the voting in real time
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:/opt/example-voting-app# docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
...
root@sinaops:/opt/example-voting-app# docker compose ps
NAME                          IMAGE                       COMMAND                  SERVICE   CREATED          STATUS                    PORTS
example-voting-app-db-1       postgres:15-alpine          &lt;span class="s2"&gt;"docker-entrypoint.s…"&lt;/span&gt;   db        32 seconds ago   Up 31 seconds &lt;span class="o"&gt;(&lt;/span&gt;healthy&lt;span class="o"&gt;)&lt;/span&gt;   5432/tcp
example-voting-app-redis-1    redis:alpine                &lt;span class="s2"&gt;"docker-entrypoint.s…"&lt;/span&gt;   redis     32 seconds ago   Up 31 seconds &lt;span class="o"&gt;(&lt;/span&gt;healthy&lt;span class="o"&gt;)&lt;/span&gt;   6379/tcp
example-voting-app-result-1   example-voting-app-result   &lt;span class="s2"&gt;"nodemon --inspect=0…"&lt;/span&gt;   result    32 seconds ago   Up 25 seconds             127.0.0.1:9229-&amp;gt;9229/tcp, 0.0.0.0:5001-&amp;gt;80/tcp, :::5001-&amp;gt;80/tcp
example-voting-app-vote-1     example-voting-app-vote     &lt;span class="s2"&gt;"python app.py"&lt;/span&gt;          vote      32 seconds ago   Up 25 seconds &lt;span class="o"&gt;(&lt;/span&gt;healthy&lt;span class="o"&gt;)&lt;/span&gt;   0.0.0.0:5000-&amp;gt;80/tcp, :::5000-&amp;gt;80/tcp
example-voting-app-worker-1   example-voting-app-worker   &lt;span class="s2"&gt;"dotnet Worker.dll"&lt;/span&gt;      worker    32 seconds ago   Up 26 seconds

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run in &lt;code&gt;Kubernetes&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:/opt/example-voting-app# kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt;  k8s-specifications/
deployment.apps/db created
service/db created
networkpolicy.networking.k8s.io/access-redis created
deployment.apps/redis created
service/redis created
deployment.apps/result created
service/result created
deployment.apps/vote created
service/vote created
deployment.apps/worker created

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:/opt/example-voting-app# kubectl get pods,deploy,svc &lt;span class="nt"&gt;-o&lt;/span&gt; wide
NAME                          READY   STATUS    RESTARTS   AGE   IP           NODE         NOMINATED NODE   READINESS GATES
pod/db-597b4ff8d7-h4flp       1/1     Running   0          21m   10.85.0.10   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/redis-796dc594bb-dgglw    1/1     Running   0          21m   10.85.0.11   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/result-d8c4c69b8-ffc8n    1/1     Running   0          21m   10.85.0.16   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/vote-69cb46f6fb-ln4np     1/1     Running   0          21m   10.85.0.12   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/worker-5dd767667f-4csr5   1/1     Running   0          21m   10.85.0.15   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/worker-5dd767667f-l2589   1/1     Running   0          21m   10.85.0.13   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
pod/worker-5dd767667f-m6xk2   1/1     Running   0          21m   10.85.0.14   cloudy.net   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                  SELECTOR
deployment.apps/db       1/1     1            1           21m   postgres     postgres:15-alpine                      &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;db
deployment.apps/redis    1/1     1            1           21m   redis        redis:alpine                            &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis
deployment.apps/result   1/1     1            1           21m   result       dockersamples/examplevotingapp_result   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;result
deployment.apps/vote     1/1     1            1           21m   vote         dockersamples/examplevotingapp_vote     &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;vote
deployment.apps/worker   3/3     3            3           21m   worker       dockersamples/examplevotingapp_worker   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;worker

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT&lt;span class="o"&gt;(&lt;/span&gt;S&lt;span class="o"&gt;)&lt;/span&gt;          AGE   SELECTOR
service/db           ClusterIP   10.110.219.117   &amp;lt;none&amp;gt;        5432/TCP         21m   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;db
service/kubernetes   ClusterIP   10.96.0.1        &amp;lt;none&amp;gt;        443/TCP          2d    &amp;lt;none&amp;gt;
service/redis        ClusterIP   10.109.149.22    &amp;lt;none&amp;gt;        6379/TCP         21m   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis
service/result       NodePort    10.100.83.247    &amp;lt;none&amp;gt;        5001:31001/TCP   21m   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;results
service/vote         NodePort    10.98.179.36     &amp;lt;none&amp;gt;        5000:31000/TCP   21m   &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;vote

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>kubernetes</category>
      <category>40daysofkubernetes</category>
    </item>
    <item>
      <title>40 Days Of Kubernetes (36/40)</title>
      <dc:creator>Sina Tavakkol</dc:creator>
      <pubDate>Fri, 13 Sep 2024 06:51:26 +0000</pubDate>
      <link>https://dev.to/sina14/40-days-of-kubernetes-3640-300n</link>
      <guid>https://dev.to/sina14/40-days-of-kubernetes-3640-300n</guid>
      <description>&lt;h2&gt;
  
  
  Day 36/40
&lt;/h2&gt;

&lt;h1&gt;
  
  
  Kubernetes Logging and Monitoring
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=cNPyajLASms" rel="noopener noreferrer"&gt;Video Link&lt;/a&gt;&lt;br&gt;
&lt;a class="mentioned-user" href="https://dev.to/piyushsachdeva"&gt;@piyushsachdeva&lt;/a&gt; &lt;br&gt;
&lt;a href="https://github.com/piyushsachdeva/CKA-2024/" rel="noopener noreferrer"&gt;Git Repository&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/sina14/40daysofkubernetes" rel="noopener noreferrer"&gt;My Git Repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Because we will start troubleshooting issues like the following, we need to know how &lt;code&gt;logging&lt;/code&gt; and &lt;code&gt;monitoring&lt;/code&gt; work in &lt;code&gt;Kubernetes&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Application failures&lt;/li&gt;
&lt;li&gt;Control-plane issues&lt;/li&gt;
&lt;li&gt;Worker node issues&lt;/li&gt;
&lt;li&gt;Cluster component issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As we discussed, &lt;code&gt;Kubernetes&lt;/code&gt; doesn't come with an embedded monitoring tool, so we use &lt;code&gt;metrics-server&lt;/code&gt; as an add-on to get exposed metrics from our cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's also another daemon running on each node, named &lt;code&gt;cAdvisor&lt;/code&gt;, which collects and aggregates metrics from the container runtime and then publishes them to the &lt;code&gt;kubelet&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But because only the &lt;code&gt;kubelet&lt;/code&gt; has the authority to expose this data, and it collects &lt;code&gt;pod&lt;/code&gt; data as well, it sends the metrics on to the &lt;code&gt;metrics-server&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;metrics-server&lt;/code&gt;, with the help of the &lt;code&gt;metrics-api&lt;/code&gt;, exposes the data to the &lt;code&gt;api-server&lt;/code&gt;, so when we run &lt;code&gt;kubectl top&lt;/code&gt; we can reach the data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; we may face an error when we apply the &lt;code&gt;metrics-server&lt;/code&gt; manifest, so let's troubleshoot and fix the issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system
NAME                              READY   STATUS    RESTARTS      AGE
coredns-7db6d8ff4d-2gdsx          1/1     Running   0             22h
coredns-7db6d8ff4d-tck9c          1/1     Running   0             22h
etcd-sinaops                      1/1     Running   0             22h
kube-apiserver-sinaops            1/1     Running   0             22h
kube-controller-manager-sinaops   1/1     Running   0             22h
kube-proxy-5t5bv                  1/1     Running   0             20h
kube-proxy-gt7vh                  1/1     Running   0             22h
kube-scheduler-sinaops            1/1     Running   0             22h
metrics-server-7ffbc6d68-9rw8z    0/1     Running   3 &lt;span class="o"&gt;(&lt;/span&gt;18m ago&lt;span class="o"&gt;)&lt;/span&gt;   22m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As we can see, the &lt;code&gt;metrics-server&lt;/code&gt; isn't ready yet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME                              READY
metrics-server-7ffbc6d68-9rw8z    0/1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's take a look at its logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;...
E0827 16:09:23.312687       1 scraper.go:149] "Failed to scrape node" err="Get \"https://{JOLLY-NET-IP}:10250/metrics/resource\": tls: failed to verify certificate: x509: cannot validate certificate for {JOLLY-NET-IP} because it doesn't contain any IP SANs" node="jolly-net"
I0827 16:09:23.331929       1 server.go:191] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;root@sinaops:~# kubectl describe pod metrics-server-7ffbc6d68-9rw8z -n kube-system
...
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  26m                  default-scheduler  Successfully assigned kube-system/metrics-server-7ffbc6d68-9rw8z to jolly-net
  Normal   Pulling    26m                  kubelet            Pulling image "registry.k8s.io/metrics-server/metrics-server:v0.7.1"
  Normal   Pulled     24m                  kubelet            Successfully pulled image "registry.k8s.io/metrics-server/metrics-server:v0.7.1" in 1.301s (2m22.567s including waiting). Image size: 68346568 bytes.
  Normal   Killing    23m (x2 over 23m)    kubelet            Container metrics-server failed liveness probe, will be restarted
  Warning  Unhealthy  23m (x4 over 23m)    kubelet            Readiness probe failed: Get "https://10.85.0.5:10250/readyz": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
...  
  Warning  Unhealthy  77s (x143 over 22m)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500

...

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we're facing an HTTP 500 error because:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;failed to verify certificate: x509: cannot validate certificate for {JOLLY-NET-IP} because it doesn't contain any IP SANs" node="jolly-net"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As it mentioned in the &lt;a href=""&gt;github repository&lt;/a&gt; of &lt;code&gt;metrics-server&lt;/code&gt;, &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Kubelet&lt;/code&gt; certificate needs to be signed by cluster Certificate Authority (or disable certificate validation by passing &lt;code&gt;--kubelet-insecure-tls&lt;/code&gt; to Metrics Server)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So we are going to edit the &lt;code&gt;deployment&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get deployment &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
coredns          2/2     2            2           22h
metrics-server   0/1     1            0           31m
root@sinaops:~# kubectl edit deployment metrics-server &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system
deployment.apps/metrics-server edited

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the container &lt;code&gt;spec&lt;/code&gt;, add the &lt;code&gt;--kubelet-insecure-tls&lt;/code&gt; option, then save and exit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;...&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--kubelet-insecure-tls&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--cert-dir=/tmp&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--secure-port=10250&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--kubelet-use-node-status-port&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--metric-resolution=15s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we are ready :)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system &lt;span class="nt"&gt;-w&lt;/span&gt;
NAME                              READY   STATUS    RESTARTS   AGE
coredns-7db6d8ff4d-2gdsx          1/1     Running   0          22h
coredns-7db6d8ff4d-tck9c          1/1     Running   0          22h
etcd-sinaops                      1/1     Running   0          22h
kube-apiserver-sinaops            1/1     Running   0          22h
kube-controller-manager-sinaops   1/1     Running   0          22h
kube-proxy-5t5bv                  1/1     Running   0          20h
kube-proxy-gt7vh                  1/1     Running   0          22h
kube-scheduler-sinaops            1/1     Running   0          22h
metrics-server-8455d49879-5mqr2   1/1     Running   0          2m46s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We have to wait some time for it to collect the metrics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;root@sinaops:~# kubectl top nodes
error: Metrics API not available
root@sinaops:~# kubectl top pods
error: Metrics API not available

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>kubernetes</category>
      <category>40daysofkubernetes</category>
    </item>
  </channel>
</rss>
