
Namee for LLMWare


Easy RAG and LLM tutorials to learn how to get started to become an "AI Developer" (Beginner Friendly)

We have all heard stories of "AI" experts making hundreds of thousands to millions of dollars each year. As an example, total compensation for an L5 engineer at OpenAI is reported to be around US$916,000.

It is estimated that there are 30 million developers, 300,000 Machine Learning engineers, and only 30,000 ML researchers in the world.

Does that mean that only 1% of the developers in the world are qualified to be AI Developers? Do you need to be an expert in ML to qualify as an "AI Developer"?



You do NOT need to be an ML Expert to be a GEN AI Developer

Because Gen AI is becoming so prevalent, almost every developer will interact with and use it in their job in some way. Think of all the code generation and image generation apps you may already be using today.

Even if you only use LLMs like ChatGPT and the OpenAI APIs to help generate content, once you are dealing with a large amount of data you will still need to learn how to automate workflows and retrieve the right data to send to the LLM.

I have heard the argument that, in a general sense, all developers will soon be AI developers. But it will be difficult to command those pay rates if you are merely a consumer of AI, as opposed to a skilled developer who can create and enable AI use cases and workflows.

Using resources available for FREE, you can start to learn the fundamentals of Generative AI so that you too can become a skilled AI Developer.

To get started in your AI Developer journey, here are the first steps to help you on your way. 👇

(Remember that Rome wasn't built in a day. You have to start digging in to learn anything, right?)



Learn About RAG

Retrieval Augmented Generation (RAG) is one of the fundamental patterns of Generative AI. RAG is shorthand for the workflow of retrieving relevant documents or knowledge and supplying it to an LLM as context for generation. While this is foundational to using Gen AI in many enterprise workflows, many developers have not yet had first-hand experience with it.
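
Before diving into any particular framework, it helps to see the shape of the pattern. Here is a minimal, library-agnostic sketch in Python; `embed`, `vector_store`, and `llm` are hypothetical placeholders for whatever embedding model, vector database, and LLM you end up choosing:

```python
# Minimal RAG loop (illustrative only -- embed(), vector_store, and llm are
# hypothetical placeholders for your embedding model, vector DB, and LLM)

def answer_with_rag(question, vector_store, llm, top_k=3):
    # 1. Retrieve: find the document chunks most similar to the question
    query_vector = embed(question)
    chunks = vector_store.search(query_vector, top_k=top_k)

    # 2. Augment: pack the retrieved chunks into the prompt as context
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

    # 3. Generate: let the LLM answer, grounded in the retrieved context
    return llm.generate(prompt)
```

Every RAG framework is, at its core, a more robust version of this retrieve-augment-generate loop.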

LLMWare: Easy lessons in RAG for AI Beginners

A little about me: I am the founder of LLMWare. LLMWare is a RAG platform designed for beginners and experts alike.

In 'FAST-START: Learning RAG with LLMWare through 6 Examples', novice AI developers will learn the fundamentals of RAG in Python by:

1) Creating your first library;

2) Building embeddings (an embedding is a way of representing data as points in n-dimensional space so that similar data points cluster together; see the short sketch after this list);

3) Learning how to use prompts and models;

4) Text query using RAG;

5) Semantic query using RAG (natural language querying for retrieval); and

6) RAG with multi-step queries.
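
To make item (2) concrete, here is a small standalone sketch that uses the sentence-transformers library. It is separate from LLMWare and not part of the fast-start examples; it is shown here purely to illustrate that semantically similar text lands close together in embedding space:

```python
# Standalone illustration of embeddings (uses sentence-transformers, which is
# separate from llmware -- shown only to demonstrate the concept)
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedding model

sentences = [
    "The invoice total is due within 30 days.",
    "Payment for this bill must be made within a month.",
    "The cat chased a laser pointer around the room.",
]

# Each sentence becomes a point in n-dimensional space
vectors = model.encode(sentences)

# Cosine similarity: the two invoice sentences score much higher with each
# other than either does with the unrelated sentence about the cat
print(util.cos_sim(vectors, vectors))
```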

Explore LLMWare on GitHub ⭐️

In about an hour or two, you will have experimented with all of the basic components in RAG so that you will become familiar with how the pieces (like text parsing, chunking, indexing, vector databases, and LLMs) fit together in a very easy-to-use RAG workflow.
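
If you want a feel for what that code looks like before opening the repo, here is a condensed sketch of the first and fourth steps (creating a library and running a text query). The class and method names follow the published fast-start examples, but APIs evolve between versions, so treat the example scripts in the repo as the source of truth; the folder path and query string below are placeholders:

```python
# Condensed sketch of the llmware fast-start flow (steps 1 and 4).
# Names follow the published fast-start examples; check the repo for the
# current API. The folder path and query string are placeholders.
from llmware.library import Library
from llmware.retrieval import Query

# Step 1: create a library and parse your documents into it
lib = Library().create_new_library("my_first_library")
lib.add_files(input_folder_path="/path/to/your/documents")

# Step 4: run a simple text query against the parsed chunks
results = Query(lib).text_query("invoice total", result_count=5)
for result in results:
    print(result)  # each result is a dict with the matching text chunk and its source metadata
```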

Once you have passed the Fast-Start stage, there are over 50 'cut and paste' recipes to help you dive into the more advanced capabilities around RAG.



Learn about LLMs

Hugging Face Tutorials on LLMs

Once you have learned about RAG, you will need to learn about models.

There are significant benefits to using a pre-trained model. It reduces computation costs and your carbon footprint, and it lets you use state-of-the-art models without having to train one from scratch.

Transformers provides access to thousands of pretrained models for a wide range of tasks. When you use a pretrained model, you train it on a dataset specific to your task. This is known as fine-tuning, an incredibly powerful training technique.

In these tutorials, you will learn to fine-tune a pretrained model with a deep learning framework of your choice:

  • Fine-tune a pretrained model with 🤗 Transformers Trainer.
  • Fine-tune a pretrained model in TensorFlow with Keras.
  • Fine-tune a pretrained model in native PyTorch.

https://huggingface.co/docs/transformers/training
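
For a sense of what the first option looks like in practice, here is a condensed version of the Trainer example from the linked tutorial (the dataset and checkpoint follow the tutorial; the small subsets are just an illustrative choice to keep the run short):

```python
# Minimal fine-tuning sketch with the Hugging Face Transformers Trainer,
# condensed from the pattern in the linked tutorial
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("yelp_review_full")            # example dataset from the tutorial
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize(batch):
    return tokenizer(batch["text"], padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, batched=True)
small_train = tokenized["train"].shuffle(seed=42).select(range(1000))  # small subsets to keep it fast
small_eval = tokenized["test"].shuffle(seed=42).select(range(1000))

# 5-star review labels -> 5-class classification head on top of the pretrained model
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

args = TrainingArguments(output_dir="test_trainer", num_train_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=small_train, eval_dataset=small_eval)
trainer.train()
```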

Large Language Model Course by Maxime Labonne

Maxime Labonne is currently a senior applied researcher at JPMorgan Chase who has published a beloved open-source course that teaches you the true nuts and bolts of LLMs. It is a comprehensive, deep-dive course that ranges from the mathematics of Machine Learning (linear algebra, calculus, and probability & statistics) all the way to Natural Language Processing.

His LLM course is divided into three parts:

  • LLM Fundamentals covers essential knowledge about mathematics, Python, and neural networks.
  • The LLM Scientist focuses on building the best possible LLMs using the latest techniques.
  • The LLM Engineer focuses on creating LLM-based applications and deploying them.

Although this course is not going to be mastered in an afternoon, it is definitely worth checking out in detail if you want to truly take your AI skills to the next level.

https://github.com/mlabonne/llm-course



I hope you can use these resources to fast-track your way to becoming an AI Developer in 2024.

Please join our LLMWare community on discord! https://discord.gg/5mx42AGbHm
