DEV Community

Jeffrey Ip for Confident AI



🔪 6 Killer Open-Source Libraries to Achieve AI Mastery in 2024 🔥🪄

TL;DR

AI has traditionally been a very difficult field for web developers to break into... until now 😌 With the introduction of large language models (LLMs) like ChatGPT, it seems like nowadays anyone can become an AI engineer. But make no mistake, this could not be further from the truth.

In this article, I will reveal the current top AI libraries that make a mediocre AI engineer exceptional. As an ex-Google, ex-Microsoft AI engineer myself, I will show you how exceptional AI engineers use these libraries to build great applications.

Are you ready to up-skill yourself and be one step closer to becoming an AI wizard before 2024? Let's begin 🤗


1. DeepEval - Open-source Evaluation Infrastructure for LLMs


A good engineer can build, but an exceptional engineer can communicate the value of what they've built. DeepEval allows you to do exactly that.

DeepEval allows you to unit test and debug your large language model (LLM, or just AI) applications at scale, in both development and production, in under 10 lines of code.

Why is this valuable, you ask? Because companies nowadays want to be seen as innovative AI companies, so stakeholders prefer engineers who can not only build like an indie hacker but also ship reliable AI applications like a seasoned AI specialist.

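import pytest
from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric

# Assumption: `chatbot` is your own module exposing a `chatbot(prompt) -> str` function.
from chatbot import chatbot

def test_chatbot():
    # Named `user_input` to avoid shadowing the built-in `input`
    user_input = "How to become an AI engineer in 2024?"
    test_case = LLMTestCase(input=user_input, actual_output=chatbot(user_input))
    # Checks how relevant the chatbot's answer is to the question
    answer_relevancy_metric = AnswerRelevancyMetric()
    assert_test(test_case, [answer_relevancy_metric])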

🌟 Star DeepEval on GitHub


2. Unstructured - Pre-processing for Unstructured Data

LLMs thrive because they are versatile and can handle a large variety of inputs, but not all of them. Unstructured helps you easily transform unstructured data such as webpages, PDFs, and tables into formats that LLMs can read.

What does this mean? It means you can now customize your AI application on your internal documents. Unstructured is amazing because, in my opinion, it operates at the right level of abstraction: it handles the boring hard work while still giving you enough control as a developer.

from unstructured.partition.auto import partition

elements = partition(filename="example-docs/eml/fake-email.eml")
print("\n\n".join([str(el) for el in elements]))

🌟 Star Unstructured


3. Airbyte - Data Integration for LLMs


With Airbyte you can connect data sources and move data around, which covers most of what you need to build a real-time AI application. It allows your LLMs to be connected to information beyond the data they were trained on.

Like Unstructured, Airbyte provides a great level of abstraction over the work an AI engineer does.

🌟 Star Airbyte


4. Qdrant - Fast Vector Search for LLMs

Ever wondered what happens if you feed too much data into ChatGPT? That's right, you'll encounter a context overflow error.

That's because LLMs cannot take in infinite information. To help with that, we need a way to feed in only the relevant information. This process is known as retrieval-augmented generation (RAG). Here's another great article on what RAG is.

Qdrant is a vector database that helps you do just that. It stores and retrieves relevant information at blazing-fast speed, ensuring your application stays up to date with the real world.
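
To make the retrieval step concrete, here is a minimal, standard-library-only sketch of the idea behind RAG. It does not use Qdrant or its client: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `retrieve`, `build_prompt`, and `DOCUMENTS` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Put only the retrieved documents into the prompt instead of the whole corpus."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base for illustration
DOCUMENTS = [
    "Qdrant is a vector database for similarity search.",
    "Airbyte moves data between sources and destinations.",
    "LiteLLM exposes many models behind one interface.",
]

print(build_prompt("what is a vector database", DOCUMENTS, top_k=1))
```

A vector database plays the role of `retrieve` here, but with real embeddings and an index built to stay fast over millions of documents instead of a linear scan.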

🌟 Star Qdrant


5. MemGPT - Memory Management for LLMs

So Qdrant helps give LLMs "long-term memory", but what happens if there's too much to "remember"? MemGPT helps you manage memory for this exact use case.

MemGPT is like a cache for vector databases, with its own way of clearing caches. It helps you manage redundant information in your knowledge bases, making your AI application more performant and accurate.
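
As a rough illustration of the "too much to remember" problem, the sketch below keeps recent messages in a bounded context and moves older ones to an archive. This is not MemGPT's API or its actual algorithm: the `TieredMemory` class is hypothetical, word counts stand in for tokens, and a keyword match stands in for the vector search a real archive would use.

```python
from collections import deque

class TieredMemory:
    """Keeps recent messages in a bounded context and archives whatever is evicted."""

    def __init__(self, max_context_words: int):
        self.max_context_words = max_context_words
        self.context = deque()  # what the LLM would actually see
        self.archive = []       # overflow storage (a vector database in a real system)

    def _context_size(self) -> int:
        # Word count is a crude proxy for the token count a real system would track
        return sum(len(message.split()) for message in self.context)

    def add(self, message: str) -> None:
        self.context.append(message)
        # Evict the oldest messages until we fit the budget, always keeping the newest one
        while self._context_size() > self.max_context_words and len(self.context) > 1:
            self.archive.append(self.context.popleft())

    def recall(self, keyword: str) -> list[str]:
        """Look up archived messages; a keyword match stands in for similarity search."""
        return [m for m in self.archive if keyword.lower() in m.lower()]

memory = TieredMemory(max_context_words=6)
memory.add("my name is Ada")
memory.add("I like vector databases")  # pushes the first message into the archive
memory.add("hello")

print(list(memory.context))
print(memory.recall("ada"))
```

The design choice worth noting is that nothing is deleted: evicted messages stay searchable, so the application can pull them back into the context when they become relevant again.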

🌟 Star MemGPT


6. LiteLLM - LLM Proxy

LiteLLM is a proxy for multiple LLMs. It is great for experimentation and, combined with DeepEval, allows you to pick the best model for your use case. The best part? It lets you call any model it supports through the same OpenAI-style interface.

from litellm import completion
import os

# Set environment variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
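
The "pick the best model" workflow mentioned above could look roughly like the sketch below. It runs offline and calls neither LiteLLM nor DeepEval: `pick_best_model`, `fake_complete`, and `keyword_overlap_score` are hypothetical names, and the model names are placeholders. In practice, `complete` would wrap a call like the `completion(...)` one above and `score` would wrap an evaluation metric.

```python
from statistics import mean
from typing import Callable

def pick_best_model(
    models: list[str],
    prompts: list[str],
    complete: Callable[[str, str], str],
    score: Callable[[str, str], float],
) -> tuple[str, dict[str, float]]:
    """Return the model with the highest mean score, plus every model's mean score."""
    averages = {
        model: mean(score(prompt, complete(model, prompt)) for prompt in prompts)
        for model in models
    }
    best = max(averages, key=averages.get)
    return best, averages

# Stand-in for a real completion call so the sketch runs without API keys
def fake_complete(model: str, prompt: str) -> str:
    canned = {"model-a": "I am not sure.", "model-b": f"Here is how: {prompt}"}
    return canned[model]

# Stand-in for a real evaluation metric: fraction of prompt words echoed in the answer
def keyword_overlap_score(prompt: str, answer: str) -> float:
    prompt_words = set(prompt.lower().split())
    answer_words = set(answer.lower().split())
    return len(prompt_words & answer_words) / len(prompt_words)

best_model, scores = pick_best_model(
    ["model-a", "model-b"],
    ["become an AI engineer"],
    fake_complete,
    keyword_overlap_score,
)
print(best_model, scores)
```

Because every model sits behind the same call signature, swapping candidates in and out only means changing the list of model names.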

🌟 Star LiteLLM


Closing Remarks

That's all, folks. Thanks for reading, and I hope you learned a few things along the way!

Please like and comment if you enjoyed this article, and as always, don't forget to give open source some love by starring these repos as a token of appreciation 🌟.

Top comments (23)

Matija Sosic

Great list! I agree, it's so hard to choose just 5. I'd add usemage.ai/ -> it's not a library per se, but if you want to generate a full React/Node.js app from a short description, this is the best tool out there (and it's free, no OpenAI key required!)

Keep up the great work :)

Jeffrey Ip

Great addition!

Ranjan Dailata

Jeffrey Ip

Excluded for a reason :) Quality over quantity

uliyahoo

Great stuff. Coding with LLMs is just getting started and people need help finding all the great tools.

Would also check out CopilotKit - React library for building in-app chatbots & Textareas.
github.com/CopilotKit/CopilotKit

Jeffrey Ip

Nice work!

Rajesh Joshi

Here's an OpenSource project, helpful in running ML models in the background.

Get Job Execution Reminders ⏰ via Webhook using WebhookPlan

View on GitHub

Jeffrey Ip

Seems cool!

val von vorn

Another killer! Thanks for your spam post!

Jeffrey Ip

Ur welcome!

val von vorn

did you inspire from Devs Killer website maybe?
devskiller.com/

Saurabh Rai

MemGPT being on this list is awesome. It's a nice little cache for your "Vector Databases."

Jeffrey Ip

I see what you did there!

Saurabh Rai

😂

Bap

I love your banner! Really cool list thanks for sharing!

Jeffrey Ip

Thank you, glad you liked it!

Kuong Ao Ieong

I am using DeepEval for almost all of my AI projects and so far I love it! Honestly love the platform and the intuitive design

Biplob Sutradhar

Great list. ✨

Jeffrey Ip

Anytime :)

Marine

Nice list to have some motivation to try new things!

Jeffrey Ip

Glad you liked it!
