
Evan Lin

Originally published at evanlin.com on 2023-09-04

[YouTube] Practical Data Considerations for Building Production-Ready LLM Applications - Summary

title: [YouTube] Practical Data Considerations for building Production-Ready LLM Applications - Summary
published: false
date: 2023-09-04 00:00:00 UTC
tags: 
canonical_url: http://www.evanlin.com/llama-index-talk/

Slides: https://docs.google.com/presentation/d/1wTEt3sy7ZHk3rYO3nFYhPZEFrfpG70l6WzY12wIaycE/edit?usp=sharing


Simple Summary:

RAG (Retrieval-Augmented Generation) is about having an LLM generate responses that are grounded in data retrieved for the user's query.

Preparing data: read the documents, split them into chunks, compute an embedding for each chunk, and store the results in a vector DB. Retrieving data: convert the query into an embedding, compare it against the stored vectors, find the most similar chunks, and pass them to the LLM to generate an answer.
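The two phases above can be sketched in a few lines of plain Python. This is a toy illustration, not the talk's actual stack: the bag-of-words "embedding" stands in for a real embedding model, and the list of (chunk, vector) pairs stands in for a vector DB.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector (stand-in for a real model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# --- Prepare: read, split into chunks, embed, store in a "vector DB" ---
document = "RAG retrieves relevant chunks. The LLM generates an answer from them."
chunks = [s.strip() for s in document.split(".") if s.strip()]
vector_db = [(chunk, embed(chunk)) for chunk in chunks]

# --- Retrieve: embed the query, compare vectors, pick the best chunk ---
query = "how does the LLM generate an answer"
q_vec = embed(query)
best_chunk = max(vector_db, key=lambda item: cosine(q_vec, item[1]))[0]
# best_chunk would then be placed into the LLM prompt as context.
```

In a real system the splitter, embedding model, and vector store are each swappable components, which is exactly where the "difficulties" below come from.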

Difficulties:

  1. The retrieved data is not good enough; if retrieval is poor, even a hypothetical "GPT-20" (the speaker's own words) won't help.
  2. How do you achieve real-time data updates at the system level?

Improvement Methods:

  • Choose a good parsing/splitting tool: the speaker recommends LlamaHub and Unstructured-IO/unstructured.
  • Enrich the retrieved chunks with metadata: e.g. page numbers, chapter descriptions, etc.
  • Track a (doc_hash_id, ver_num) pair per document so changed documents can be detected and re-indexed quickly.
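The last point can be sketched as follows. This is an assumed implementation of the (doc_hash_id, ver_num) idea, not a specific LlamaIndex API: hash each document's content, bump its version only when the hash changes, and re-embed only changed documents.

```python
import hashlib

# doc_id -> (doc_hash_id, ver_num); stand-in for a real document store.
index: dict[str, tuple[str, int]] = {}

def upsert(doc_id: str, content: str) -> bool:
    """Return True if the doc is new or changed and needs re-embedding."""
    doc_hash = hashlib.sha256(content.encode()).hexdigest()
    if doc_id in index and index[doc_id][0] == doc_hash:
        return False  # unchanged: skip the expensive re-embedding step
    ver = index[doc_id][1] + 1 if doc_id in index else 1
    index[doc_id] = (doc_hash, ver)
    return True

changed_1 = upsert("faq", "v1 text")  # new doc: needs indexing
changed_2 = upsert("faq", "v1 text")  # same hash: no work to do
changed_3 = upsert("faq", "v2 text")  # content changed: version bumps
```

Because unchanged documents are skipped entirely, ingestion cost scales with the amount of changed data rather than the corpus size, which is what makes near-real-time updates feasible.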

Product Introduction: LlamaIndex https://www.llamaindex.ai/


About LlamaIndex Tutorial Resources


