title: [YouTube] Practical Data Considerations for building Production-Ready LLM Applications - Summary
published: false
date: 2023-09-04 00:00:00 UTC
tags:
canonical_url: http://www.evanlin.com/llama-index-talk/
Slides: https://docs.google.com/presentation/d/1wTEt3sy7ZHk3rYO3nFYhPZEFrfpG70l6WzY12wIaycE/edit?usp=sharing
Simple Summary:
RAG (Retrieval-Augmented Generation) is mainly about having an LLM generate responses grounded in your own data, retrieved at query time.
Preparing data: read the documents, split them into chunks, compute an embedding for each chunk, and store the embeddings in a vector DB. Retrieving data: convert the user input into an embedding, compare it against the stored vectors, find the most similar chunks, and pass them to the LLM to generate an answer.
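The two phases above can be sketched in plain Python. This is a minimal illustration, not LlamaIndex's actual implementation: `embed()` is a toy stand-in for a real embedding model, and an in-memory list stands in for a vector DB.

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: normalized character-frequency vector over a-z.
    # A real system would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def split_into_chunks(doc: str, chunk_size: int = 100) -> list[str]:
    # Naive fixed-size splitter; real parsers (LlamaHub loaders,
    # Unstructured-IO) produce much better chunks.
    return [doc[i:i + chunk_size] for i in range(0, len(doc), chunk_size)]

# --- Prepare: read -> split into chunks -> embed -> store in "vector DB" ---
document = "LlamaIndex helps you connect your data to large language models. " * 3
vector_db = [(chunk, embed(chunk)) for chunk in split_into_chunks(document)]

# --- Retrieve: embed the query -> compare vectors -> pick the best chunk ---
def retrieve(query: str) -> str:
    q = embed(query)
    best_chunk, _ = max(
        vector_db,
        key=lambda item: sum(a * b for a, b in zip(q, item[1])),  # cosine (unit vectors)
    )
    return best_chunk  # this chunk is what gets put into the LLM prompt

context = retrieve("connect data to language models")
```

The quality of the final answer hinges almost entirely on whether `retrieve()` returns the right chunks, which is exactly the difficulty the talk highlights next.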
Difficulties:
- If the retrieved data is not good enough, even "GPT-20" (the speaker's own words) won't help.
- How do you achieve real-time data updates at the system level?
Improvement Methods:
- Choose a good splitting tool (parser): recommended are LlamaHub and Unstructured-IO/unstructured.
- Enrich the retrieved data with metadata: e.g. page numbers, chapter descriptions, ...
- Keep a (doc_hash_id, ver_num) pair per document, so changed documents can be detected and updated faster.
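A hedged sketch of the (doc_hash_id, ver_num) idea: store a content hash and a version number per document, and only re-chunk/re-embed documents whose hash actually changed. The `index` dict and `upsert` helper are illustrative names, not part of any library API.

```python
import hashlib

# doc_id -> (doc_hash_id, ver_num); a real system would persist this alongside the vector DB.
index: dict[str, tuple[str, int]] = {}

def upsert(doc_id: str, text: str) -> bool:
    """Return True if the document was (re)ingested, False if unchanged."""
    doc_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    old = index.get(doc_id)
    if old and old[0] == doc_hash:
        return False                    # unchanged: skip re-chunking and re-embedding
    ver = old[1] + 1 if old else 1      # bump the version on every real change
    index[doc_id] = (doc_hash, ver)
    # ...here you would re-chunk, re-embed, and replace this doc's vectors...
    return True
```

Comparing hashes before re-embedding is what makes updates fast: unchanged documents cost one hash computation instead of a full embedding pass.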
Product Introduction: LlamaIndex https://www.llamaindex.ai/
- Supports many data loaders via LlamaHub
- Supports document updates (refreshing changed documents instead of rebuilding the index)
About Llama-Index Tutorial Resources