<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Eze Lanza</title>
    <description>The latest articles on DEV Community by Eze Lanza (@eze_lanza).</description>
    <link>https://dev.to/eze_lanza</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1390346%2F4ed56bb8-0578-42f6-97a5-536f940db0a9.jpg</url>
      <title>DEV Community: Eze Lanza</title>
      <link>https://dev.to/eze_lanza</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/eze_lanza"/>
    <language>en</language>
    <item>
      <title>Which IDE do you use? Cursor / Codeium / Zed / Fleet ?</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 13 Mar 2025 19:06:19 +0000</pubDate>
      <link>https://dev.to/eze_lanza/which-ide-do-you-use-cursor-codeium-zed-fleet--em4</link>
      <guid>https://dev.to/eze_lanza/which-ide-do-you-use-cursor-codeium-zed-fleet--em4</guid>
      <description></description>
      <category>discuss</category>
    </item>
    <item>
      <title>¿Quieres aprender sobre agentes en español? 🎥</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 13 Mar 2025 18:53:08 +0000</pubDate>
      <link>https://dev.to/eze_lanza/quieres-aprender-sobre-agentes-en-espanol-1aop</link>
      <guid>https://dev.to/eze_lanza/quieres-aprender-sobre-agentes-en-espanol-1aop</guid>
      <description>&lt;p&gt;He creado este video como parte de una serie de entrenamientos gratuitos para la comunidad latina. ¡Espero que te sea útil! 🚀&lt;/p&gt;

&lt;p&gt;🔗 Míralo aquí:&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/yDzYesxqRdg"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

</description>
      <category>llm</category>
      <category>rag</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>¿Quieres aprender sobre agentes en español? 🎥 He creado este video como parte de una serie de entrenamientos gratuitos para la comunidad latina. ¡Espero que https://www.youtube.com/watch?v=yDzYesxqRdg&amp;list=PLT8pS9J_8jJqe2eAlpAng7KJY7uqK2NyR&amp;index=2&amp;t=38s</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 13 Mar 2025 18:50:11 +0000</pubDate>
      <link>https://dev.to/eze_lanza/quieres-aprender-sobre-agentes-en-espanol-he-creado-este-video-como-parte-de-una-serie-de-4i09</link>
      <guid>https://dev.to/eze_lanza/quieres-aprender-sobre-agentes-en-espanol-he-creado-este-video-como-parte-de-una-serie-de-4i09</guid>
      <description></description>
      <category>español</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>S2E1 : Code &amp; Deploy: Data Contracts: Ensuring Reliable and Usable Data Products</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 20 Feb 2025 17:19:36 +0000</pubDate>
      <link>https://dev.to/eze_lanza/s2e1-code-deploy-data-contracts-ensuring-reliable-and-usable-data-products-2636</link>
      <guid>https://dev.to/eze_lanza/s2e1-code-deploy-data-contracts-ensuring-reliable-and-usable-data-products-2636</guid>
      <description>&lt;p&gt;Want to stop wasting time on data inconsistencies? Learn how the Open Data Contract Standard helps enforce data quality and structure, ensuring smooth data exchange between systems.&lt;/p&gt;

&lt;p&gt;Join Code &amp;amp; Deploy host Eze Lanza and his guests INNOQ's Jochen Christ and Chair of the TSC, Bitol, The Linux Foundation's Jean-Georges Perrin as they walk you through how to craft a data contract using the Open Data Contract Standard (ODCS) and how to use the open source Data Contract CLI tool to test the data contract and generate code.&lt;/p&gt;

&lt;p&gt;Project link Open Data Contract Standard (ODCS):&lt;a href="https://github.com/bitol-io/open-data-contract-standard/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/bitol-io/open-data-contract-standard/blob/main/README.md&lt;/a&gt;&lt;br&gt;
Book : Implementing Data Mesh: Design, Build, and Implement Data Contracts, Data Products, and Data Mesh (&lt;a href="https://www.amazon.com/Implementing-Data-Mesh-Implement-Contracts-ebook/dp/B0DGYRBCZ2?ref_=ast_author_mpb" rel="noopener noreferrer"&gt;https://www.amazon.com/Implementing-Data-Mesh-Implement-Contracts-ebook/dp/B0DGYRBCZ2?ref_=ast_author_mpb&lt;/a&gt;)&lt;/p&gt;

</description>
      <category>ai</category>
      <category>developer</category>
      <category>livecoding</category>
    </item>
    <item>
      <title>https://www.linkedin.com/events/code-deploy-datacontracts-ensur7293365820594823168/theater/</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Wed, 19 Feb 2025 16:31:10 +0000</pubDate>
      <link>https://dev.to/eze_lanza/httpswwwlinkedincomeventscode-deploy-datacontracts-ensur7293365820594823168theater-4ekk</link>
      <guid>https://dev.to/eze_lanza/httpswwwlinkedincomeventscode-deploy-datacontracts-ensur7293365820594823168theater-4ekk</guid>
      <description></description>
      <category>career</category>
      <category>webdev</category>
    </item>
    <item>
      <title>FEB 19 11am EST! Code &amp; Deploy: Data Contracts: Ensuring Reliable and Usable Data Products</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Wed, 19 Feb 2025 04:12:45 +0000</pubDate>
      <link>https://dev.to/eze_lanza/feb-19-11am-est-code-deploy-data-contracts-ensuring-reliable-and-usable-data-products-19ec</link>
      <guid>https://dev.to/eze_lanza/feb-19-11am-est-code-deploy-data-contracts-ensuring-reliable-and-usable-data-products-19ec</guid>
      <description>&lt;p&gt;Are you struggling with inconsistent or unclear data definitions? Data contracts let you define clear, standardized expectations for your datasets. &lt;/p&gt;

&lt;p&gt;Join us tomorrow for our live stream about Data contracts!&lt;/p&gt;

&lt;p&gt;Sign in &lt;a href="https://www.linkedin.com/events/code-deploy-datacontracts-ensur7293365820594823168/comments/" rel="noopener noreferrer"&gt;https://www.linkedin.com/events/code-deploy-datacontracts-ensur7293365820594823168/comments/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>deepseek</category>
    </item>
    <item>
      <title>Top 4 Takeaways from DeepSeek-R1</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 06 Feb 2025 03:20:55 +0000</pubDate>
      <link>https://dev.to/eze_lanza/top-4-takeaways-from-deepseek-r1-23h5</link>
      <guid>https://dev.to/eze_lanza/top-4-takeaways-from-deepseek-r1-23h5</guid>
      <description>&lt;p&gt;Ok, it's been a week or two since the DeepSeek fever shook the AI world again (as it often happens when a new model appears). The initial reaction is often to figure out how to get hands-on experience with it, though running it locally is only feasible with the distilled versions unless you opt for alternative methods.&lt;br&gt;
What makes it particularly interesting is its ability to transfer reasoning capabilities to Small Language Models (SLMs) via distillation—from the massive 671B R1 down to 8B models using architectures like Llama and Qwen.&lt;/p&gt;

&lt;p&gt;After testing these distilled models, which are still &lt;strong&gt;reasoning models&lt;/strong&gt; at their core, we can observe they behave differently from traditional LLMs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A reasoning model has a totally different goal&lt;/strong&gt;. It has to evaluate all the options available to be sure the answer it gives is correct. &lt;/p&gt;

&lt;p&gt;What does this mean?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;✅ Accuracy over speed: Reasoning models analyze multiple possibilities before generating responses, leading to longer processing times—but also ideally, more precise outputs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;✅ Local testing: You can quickly prototype with the distilled models by downloading Ollama and running them locally. This also helps on scale deployments, on-prem and using APIs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;❌ Open Source? : Not quite. Despite the trend toward openness, DeepSeek R1 itself is not open-source. (More on that in &lt;a href="https://www.linkedin.com/posts/jplorre_sam-altman-says-openai-is-on-the-wrong-side-activity-7292517736486715393-GEtP?utm_source=share&amp;amp;utm_medium=member_desktop&amp;amp;rcm=ACoAAAOrsgkBsW-GvMvp3mB3RlKCTcqeum_fJBc" rel="noopener noreferrer"&gt;this post&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🚀 Reasoning in SLMs: The most exciting takeaway: SLMs with reasoning capabilities could reshape how we think about efficient, intelligent AI at small scales.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Did you try it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Check the &lt;a href="https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Does-DeepSeek-Solve-the-Small-Scale-Model-Performance-Puzzle/post/1663390" rel="noopener noreferrer"&gt;full article&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>deepseek</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Folks, we would love to hear from you!</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Wed, 18 Dec 2024 16:03:04 +0000</pubDate>
      <link>https://dev.to/eze_lanza/folks-we-would-love-to-hear-from-you-1k66</link>
      <guid>https://dev.to/eze_lanza/folks-we-would-love-to-hear-from-you-1k66</guid>
      <description>&lt;p&gt;Intel is conducting an Open Source Community Survey to better&lt;br&gt;
understand the needs and challenges of developers like you. With just five minutes of your time, you can help us create a more supportive and effective&lt;br&gt;
ecosystem. &lt;/p&gt;

&lt;p&gt;Complete the survey and enter to win one of four $250 cash gift&lt;br&gt;
cards! 🎁 &lt;/p&gt;

&lt;p&gt;Take the Survey: &lt;a href="https://lnkd.in/g3w26ia7" rel="noopener noreferrer"&gt;https://lnkd.in/g3w26ia7&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  OpenSource #Intel
&lt;/h1&gt;

</description>
      <category>survey</category>
      <category>developers</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>S1E3 : Code &amp; Deploy: Craft a Very Demure, Very Mindful Skincare Routine With GenAI</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Thu, 26 Sep 2024 14:39:01 +0000</pubDate>
      <link>https://dev.to/eze_lanza/s1e3-code-deploy-craft-a-very-demure-very-mindful-skincare-routine-with-gena-33ll</link>
      <guid>https://dev.to/eze_lanza/s1e3-code-deploy-craft-a-very-demure-very-mindful-skincare-routine-with-gena-33ll</guid>
      <description>&lt;p&gt;Join Code &amp;amp; Deploy host Eze Lanza and special guest Aarushi Kansal, AI engineer at AutoGPT, as they show you how to build your own intelligent agent that can curate a personalized skincare routine which will leave you glowing. They’ll start with an introduction to the AutoGPT system, but you can choose to replicate the code in other frameworks or write it from scratch. Next, they’ll talk about retrieval augmented generation (RAG) and show you how to build a pipeline using a vector database—they’ll use Weaviate, but you can use whichever one you prefer. For this demo, Eze and Aarushi will be using Ollama [Mistral] so you can run the large language model (LLM) locally yourself.&lt;/p&gt;

&lt;p&gt;The main goal of the demo is to show you how to set up a reliable RAG-based system that minimizes hallucination, allows you to use your own data (without the overhead of fine-tuning), is controllable by you, and—bonus—leaves you with a plan for achieving great-looking skin.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>opensource</category>
      <category>programming</category>
    </item>
    <item>
      <title>Understanding Retrieval Augmented Generation (RAG)</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Wed, 28 Aug 2024 14:06:47 +0000</pubDate>
      <link>https://dev.to/eze_lanza/understanding-retrieval-augmented-generation-rag-f63</link>
      <guid>https://dev.to/eze_lanza/understanding-retrieval-augmented-generation-rag-f63</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Learn what a RAG system is and how to deploy it using OPEA’s open source tools and frameworks&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Originally posted on &lt;a href="https://medium.com/p/4d1d08f736b3" rel="noopener noreferrer"&gt;https://medium.com/p/4d1d08f736b3&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By this point, most of us have used a large language model (LLM), like ChatGPT, to try to find quick answers to questions that rely on general knowledge and information. These questions range from the practical (What’s the best way to learn a new skill?) to the philosophical (What is the meaning of life?).&lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ze087ij6zbnq72r8dot.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ze087ij6zbnq72r8dot.jpeg" alt="A screenshot listing the most common questions asked to an AI. The questions include topics like AI, homework help, the meaning of life, weather updates, creative writing, productivity, technology trends, and book or movie recommendations" width="720" height="620"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But how do you get answers to questions that are personal? How much does your LLM know about you? Or your family?&lt;br&gt;
Let’s test ChatGPT and see how much it knows about my parents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F635d47rjzly8n9t1eusc.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F635d47rjzly8n9t1eusc.jpeg" alt="A screenshot of a conversation where a user asks, ‘Do you know who is my mum?’ and the AI responds, explaining that it doesn’t have access to personal information about individuals." width="720" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s understandable to feel frustrated when a model doesn’t recognize you, but it’s important to remember that these models don’t have much information about our personal lives. Unless you’re a celebrity or have your own Wikipedia page (as Tom Cruise has), the training dataset used for these models likely doesn’t include our information, which is why they can’t provide specific answers about us.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhz3vqpht674syhvzw85x.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhz3vqpht674syhvzw85x.jpeg" alt="A screenshot displays a question asking if the model knows who is Tom Cruise mum? the model answered Mary Lee Pfeiffer is his mom" width="720" height="143"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, how do we get our LLMs to know us better?&lt;/p&gt;

&lt;p&gt;That’s the million-dollar question facing enterprises looking to boost productivity with GenAI. They need models that provide context-based results. In this post, we’ll explain the basics of how retrieval augmented generation (RAG) improves your LLM’s responses and show you how to easily deploy your RAG-based model using a modular approach with the open source building blocks that are part of the new &lt;a href="https://opea.dev" rel="noopener noreferrer"&gt;Open Platform for Enterprise AI (OPEA)&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RAG?
&lt;/h2&gt;

&lt;p&gt;We know that LLMs can greatly contribute to completing an extensive number of tasks, such as writing, learning, programming, translating, and more. However, the result we receive depends on what we ask the model, in other words, on how we meticulously build our prompts. For that reason, we spend too much time looking for the perfect prompt to get the answer we want; we’re starting to become experts in &lt;a href="https://en.wikipedia.org/wiki/Prompt_engineering" rel="noopener noreferrer"&gt;model prompting&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s return to the above question: “Who is my mum?” We know who our mum is, we have memories, and that information lives in our “mental” knowledge base, our brain.&lt;/p&gt;

&lt;p&gt;When building the prompt, we need to somehow provide it with memories of our mum and try to guide the model to use that information to creatively answer the question: Who is my mum? We’ll provide it with some of mum’s history and ask the model to take her past into account when answering the question.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmh930heknhfvzl44vll.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmh930heknhfvzl44vll.jpeg" alt="A screenshot displays a question asking for a creative response based on provided details: the user’s mother was born in the US, is 60, has strong Italian roots, and loves pizza. Below, a response discusses how her Italian heritage influences her personality, family traditions, and love for Italian cuisine, emphasizing her vibrant and lively nature." width="720" height="603"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, the model successfully gave us an answer that described my mum. Congratulations, we have used RAG!&lt;/p&gt;

&lt;p&gt;Let’s inspect what we did.&lt;/p&gt;

&lt;p&gt;Given the initial question, we tweaked the prompt to guide the model in how to use the information (context) we provided.&lt;/p&gt;

&lt;p&gt;We can think of the RAG process in three parts:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favqqeghcun9pg5sla7l3.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favqqeghcun9pg5sla7l3.jpeg" alt="A screenshot shows a section divided into three parts. The top part labeled “Instruct” gives instructions asking to answer the question creatively, considering how roots influence the person’s behavior. The middle section, labeled “Context,” lists facts about the user’s mother: born in the US, age 60, strong Italian roots, and a love for pizza. The bottom part, labeled “Initial Question,” asks “Who is my mum?”" width="720" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instruct: Guide the model. We have guided the model to use the information we provided (documents) to give us a creative answer and take into account my mum’s history. We used those instructions as an example; we could have used other guidance depending on the outcome we wanted to achieve. If we don’t want a creative answer, for example, this is the time to declare it.&lt;/li&gt;
&lt;li&gt;Context: Provide the context. In this example, we already knew the information about my mother since we retrieved that information from my memories, but in a real scenario, the challenge would be finding the relevant data in a knowledge base to feed the model so that it has the context needed to provide us with an accurate response, this process is called “retrieval.”&lt;/li&gt;
&lt;li&gt;Initial Question: The initial question we want answered.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s explore how an enterprise can implement a real-life RAG example using open source tools and models. We’ll deploy it using the standardized frameworks and tools made available through OPEA, which was created to help streamline the implementation of enterprise AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploring the OPEA Architecture
&lt;/h2&gt;

&lt;p&gt;Here’s the architecture we used for the previous example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpugmlds68qel02ix37nt.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpugmlds68qel02ix37nt.jpeg" alt="Image description" width="720" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;RAG can be understood as simply the steps mentioned above:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initial Question&lt;/li&gt;
&lt;li&gt;Context&lt;/li&gt;
&lt;li&gt;Instruct&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;However, implementing the process in practice can be challenging because multiple components are needed: retrievers, embedding models, and a knowledge base, as shown in the image above. Let’s explore how those parts can work together.&lt;/p&gt;

&lt;p&gt;The key lies in providing the right context. You can compare the process to how our memories help us answer questions. For a company, this might mean drawing from a knowledge base of historical financial data or other relevant documents.&lt;/p&gt;

&lt;p&gt;For example, when a user asks a chatbot a question before the LLM can spit out an answer, the RAG application must first dive into a knowledge base and extract the most relevant information (the retrieval process). But even before the retrieval happens, an embedding model plays a crucial role in converting the data in the knowledge base into vector representations — meaningful numerical embeddings that capture the essence of the information. These embeddings will live in the knowledge base (vector database) and will allow the retriever to efficiently match the user’s query with the most relevant documents.&lt;/p&gt;

&lt;p&gt;Once the RAG application finds the relevant documents, it performs a rerank process to check the quality of the information and then re-orders the information based on relevance. It then builds a new prompt based on the refined context from the top-ranked documents and sends this prompt to the LLM, enabling the model to generate a high-quality, contextually informed response. Easy, right?&lt;/p&gt;

&lt;p&gt;As you can see, the RAG architecture isn’t about just one tool or one framework; it’s composed of multiple moving pieces making it difficult to pay attention to each component. When deploying a RAG system in our enterprise, we face multiple challenges, such as ensuring scalability, handling data security, and integrating with existing infrastructure.&lt;/p&gt;

&lt;p&gt;The Open Platform for Enterprise AI (OPEA) aims to solve those problems by treating each component in the RAG pipeline as a building block that is easily interchangeable. Say, for example, you’re using &lt;a href="https://huggingface.co/docs/transformers/en/model_doc/mistral" rel="noopener noreferrer"&gt;Mistral&lt;/a&gt;, but want to easily replace it with &lt;a href="https://huggingface.co/docs/transformers/en/model_doc/falcon" rel="noopener noreferrer"&gt;Falcon&lt;/a&gt;. Or, say you want to replace a vector database on the fly. You don’t want to have to rebuild the entire application. That would be a nightmare. OPEA makes deployment easier by providing robust tools and frameworks designed to streamline these processes and facilitate seamless integration.&lt;/p&gt;

&lt;p&gt;You can see this process in action by running the ChatQnA example: &lt;a href="https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA" rel="noopener noreferrer"&gt;https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA&lt;/a&gt;. There, you’ll find all the steps needed to create the building blocks for your RAG application on your server or your AIPC.&lt;/p&gt;

&lt;h2&gt;
  
  
  Call to Action
&lt;/h2&gt;

&lt;p&gt;We have shown you the basics of how RAG works and how to deploy a RAG pipeline using the OPEA framework. While the process is straightforward, deploying a RAG system at scale can introduce complexities. Here’s what you can do next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explore &lt;a href="https://github.com/opea-project/GenAIComps" rel="noopener noreferrer"&gt;GenAIComps&lt;/a&gt;: Gain insights into how generative AI components work together and how you can leverage them for real-world applications. OPEA provides detailed examples and documentation to guide your exploration.&lt;/li&gt;
&lt;li&gt;Explore &lt;a href="https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA" rel="noopener noreferrer"&gt;RAG demo(ChatQnA)&lt;/a&gt;: Each part of a RAG system presents its own challenges, including ensuring scalability, handling data security, and integrating with existing infrastructure. OPEA, as an open source platform, offers tools and frameworks designed to address these issues and make the deployment process more efficient. Explore our demos to see how these solutions come together in practice.&lt;/li&gt;
&lt;li&gt;Explore &lt;a href="https://github.com/opea-project/GenAIExamples/tree/main" rel="noopener noreferrer"&gt;GenAI Examples&lt;/a&gt;: OPEA is not focused only on RAG; it is about generative AI as a whole. Multiple other demos, such as &lt;a href="https://github.com/opea-project/GenAIExamples/tree/main/VisualQnA" rel="noopener noreferrer"&gt;VisualQnA&lt;/a&gt;, showcase different GenAI capabilities. These examples demonstrate how OPEA can be leveraged across various tasks, expanding beyond RAG into other innovative GenAI applications.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/opea-project/docs/blob/main/community/CONTRIBUTING.md" rel="noopener noreferrer"&gt;Contribute to the project&lt;/a&gt;! OPEA is built by a growing community of developers and AI professionals. Whether you’re interested in contributing code, improving documentation, or building new features, your involvement is key to our success.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Join us on the OPEA &lt;a href="https://github.com/opea-project" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; to start contributing or explore our issues list for ideas on where to start.&lt;/p&gt;

</description>
      <category>rag</category>
      <category>llm</category>
      <category>ai</category>
    </item>
    <item>
      <title>S1E2: Code &amp; Deploy: Build Your First Gen AI Agent with Haystack</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Mon, 26 Aug 2024 19:22:13 +0000</pubDate>
      <link>https://dev.to/eze_lanza/s1e2-code-deploy-build-your-first-gen-ai-agent-with-haystack-1c4h</link>
      <guid>https://dev.to/eze_lanza/s1e2-code-deploy-build-your-first-gen-ai-agent-with-haystack-1c4h</guid>
      <description>&lt;p&gt;Follow along with Intel Open Source AI Evangelist Ezequiel Lanza and his guest Bilge Yücel, developer relations engineer at deepset, as they show you how to build an intelligent GenAI agent using Haystack, an open source Python framework.&lt;/p&gt;

&lt;p&gt;In this step-by-step live demo, you'll learn how to build a customer support chatbot that can handle a wide range of user queries in real-time.&lt;/p&gt;

&lt;p&gt;Don't worry if you're new to Haystack! We'll start with a brief introduction to the framework, covering its core features and how it can be used to build robust and flexible AI applications. Following the introduction, we'll walk you through the process of creating a retrieval-augmented generation (RAG) pipeline using Haystack and the open source Mistral-7B-Instruct-v0.3 model, which is licensed under Apache-2.&lt;/p&gt;

&lt;p&gt;By integrating custom components, such as a Weather API Tool, you'll see how the chatbot handles specific requests, like checking the weather forecast, enhancing its functionality and providing a seamless user experience. This use case highlights the practical application of building adaptable AI agents with Haystack that can be deployed across various industries to improve customer interaction and satisfaction.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Haystack Basics + Web QA: &lt;a href="https://lnkd.in/dA_FU5Hg" rel="noopener noreferrer"&gt;https://lnkd.in/dA_FU5Hg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Build a Custom AI Agent with Haystack: &lt;a href="https://lnkd.in/dhcCuEUJ" rel="noopener noreferrer"&gt;https://lnkd.in/dhcCuEUJ&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>coding</category>
      <category>ai</category>
      <category>rag</category>
    </item>
    <item>
      <title>S1E1: Code &amp; Deploy: Build and Deploy an ML Binary Classifier</title>
      <dc:creator>Eze Lanza</dc:creator>
      <pubDate>Mon, 12 Aug 2024 23:05:59 +0000</pubDate>
      <link>https://dev.to/eze_lanza/code-deploy-build-and-deploy-an-ml-binary-classifier-4ho9</link>
      <guid>https://dev.to/eze_lanza/code-deploy-build-and-deploy-an-ml-binary-classifier-4ho9</guid>
      <description>&lt;p&gt;Join me with my guest William Arias, developer advocate at GitLab, as they show you step-by-step how to code and deploy a binary classifier using machine learning with open source tools. You’ll first prototype a solution in a Jupyter Notebook, then set up a basic CI pipeline to automate a machine learning app experimentation with MLFlow. Next, you’ll refactor the application (a bit) and create an API endpoint and simple UI for it to test manually (vibe testing). Before you’re done, you’ll learn how to combine everything and deploy the application to a runtime in the cloud using CI/CD principles. &lt;/p&gt;

&lt;p&gt;In this session, you’ll learn how to:&lt;/p&gt;

&lt;p&gt;• Prototype a binary classifier using machine learning in a Jupyter Notebook.&lt;/p&gt;

&lt;p&gt;• Set up a basic continuous integration (CI) pipeline with MLFlow for app experimentation.&lt;/p&gt;

&lt;p&gt;• Refactor the application to create an API endpoint and a simple user interface.&lt;/p&gt;

&lt;p&gt;• Deploy the application to a cloud runtime using continuous deployment (CD) principles.&lt;/p&gt;

&lt;p&gt;Links to repo&lt;br&gt;
 &lt;a href="https://gitlab.com/gitlab-da/use-cases/devsecops-platform/deep-learning/meowsky-classifier/-/blob/main/classifier/modeling/train.py" rel="noopener noreferrer"&gt;https://gitlab.com/gitlab-da/use-cases/devsecops-platform/deep-learning/meowsky-classifier/-/blob/main/classifier/modeling/train.py&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gitlab.com/gitlab-da/use-cases/devsecops-platform/deep-learning/meowsky-classifier/-/blob/main/classifier/modeling/train.py" rel="noopener noreferrer"&gt;https://gitlab.com/gitlab-da/use-cases/devsecops-platform/deep-learning/meowsky-classifier/-/blob/main/classifier/modeling/train.py&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>coding</category>
    </item>
  </channel>
</rss>
