title: [YouTube] Practical Data Considerations for building Production-Ready LLM Applications - Summary
published: false
date: 2023-09-04 00:00:00 UTC
tags:
canonical_url: http://www.evanlin.com/llama-index-talk/
Slides: https://docs.google.com/presentation/d/1wTEt3sy7ZHk3rYO3nFYhPZEFrfpG70l6WzY12wIaycE/edit?usp=sharing
Simple Summary:
RAG (Retrieval-Augmented Generation) is mainly about having an LLM generate responses grounded in your own data, retrieved at query time.
Preparing data: read the documents, split them into chunks, compute an embedding for each chunk, and store the embeddings in a vector DB. Retrieving data: convert the user input into an embedding, compare it against the stored vectors, find the most similar chunks, and pass them to the LLM to generate an answer.
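The two phases above can be sketched in plain Python. This is a minimal illustration, not LlamaIndex's actual implementation: `embed()` is a toy stand-in for a real embedding model, and an in-memory list stands in for a vector DB.

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: normalized character-frequency vector over a-z.
    # A real system would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def split_into_chunks(doc: str, chunk_size: int = 100) -> list[str]:
    # Naive fixed-size splitter; real parsers (LlamaHub loaders,
    # Unstructured-IO) produce much better chunks.
    return [doc[i:i + chunk_size] for i in range(0, len(doc), chunk_size)]

# --- Prepare: read -> split into chunks -> embed -> store in "vector DB" ---
document = "LlamaIndex helps you connect your data to large language models. " * 3
vector_db = [(chunk, embed(chunk)) for chunk in split_into_chunks(document)]

# --- Retrieve: embed the query -> compare vectors -> pick the best chunk ---
def retrieve(query: str) -> str:
    q = embed(query)
    best_chunk, _ = max(
        vector_db,
        key=lambda item: sum(a * b for a, b in zip(q, item[1])),  # cosine (unit vectors)
    )
    return best_chunk  # this chunk is what gets put into the LLM prompt

context = retrieve("connect data to language models")
```

The quality of the final answer hinges almost entirely on whether `retrieve()` returns the right chunks, which is exactly the difficulty the talk highlights next.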
Difficulties:
- If the retrieved data is not good enough, even "GPT-20" (the speaker's own words) won't help.
- How do you achieve real-time data updates at the system level?
Improvement Methods:
- Choose a good splitting tool (parser): recommended are LlamaHub and Unstructured-IO/unstructured.
- Enrich the retrieved data with metadata: e.g. page numbers, chapter descriptions, ...
- Keep a (doc_hash_id, ver_num) pair per document, so changed documents can be detected and updated faster.
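A hedged sketch of the (doc_hash_id, ver_num) idea: store a content hash and a version number per document, and only re-chunk/re-embed documents whose hash actually changed. The `index` dict and `upsert` helper are illustrative names, not part of any library API.

```python
import hashlib

# doc_id -> (doc_hash_id, ver_num); a real system would persist this alongside the vector DB.
index: dict[str, tuple[str, int]] = {}

def upsert(doc_id: str, text: str) -> bool:
    """Return True if the document was (re)ingested, False if unchanged."""
    doc_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    old = index.get(doc_id)
    if old and old[0] == doc_hash:
        return False                    # unchanged: skip re-chunking and re-embedding
    ver = old[1] + 1 if old else 1      # bump the version on every real change
    index[doc_id] = (doc_hash, ver)
    # ...here you would re-chunk, re-embed, and replace this doc's vectors...
    return True
```

Comparing hashes before re-embedding is what makes updates fast: unchanged documents cost one hash computation instead of a full embedding pass.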
Product Introduction: LlamaIndex https://www.llamaindex.ai/
- Supports many data loaders via LlamaHub
- Supports document updates (refreshing changed documents instead of rebuilding the index)
About Llama-Index Tutorial Resources