RAG retrieval has a quiet problem: a short user question often makes a poor search vector. "How do I cancel?" is four words; the answer doc is a detailed paragraph. The two often don't land close together in embedding space. HyDE works around this with a clever trick.
📝 See both retrieval paths race: https://dev48v.infy.uk/prompt/day13-hyde.html
The HyDE idea
HyDE = Hypothetical Document Embeddings. Instead of embedding the question, you first ask the LLM to write a hypothetical answer — a full, detailed passage as if it already knew the answer. Then you embed THAT and search with it.
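To make the flow concrete, here is a minimal sketch of the retrieval half of HyDE. Everything in it is a stand-in chosen for illustration: `fake_llm` returns a canned draft where a real system would call an LLM, `toy_embed` is a bag-of-words counter where a real system would call an embedding model, and `DOCS` is a made-up three-document corpus. Only the order of steps (draft, embed the draft, search) is the point.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fake_llm(question: str) -> str:
    """Hypothetical stand-in for the LLM call that drafts a hypothetical answer."""
    return (
        "To cancel your subscription, open account settings, choose billing, "
        "and select cancel subscription. Your plan stays active until the period ends."
    )


def hyde_retrieve(
    question: str,
    docs: List[str],
    generate: Callable[[str], str] = fake_llm,
    embed: Callable[[str], Counter] = toy_embed,
    k: int = 1,
) -> List[str]:
    hypothetical = generate(question)  # 1. draft a hypothetical answer
    probe = embed(hypothetical)        # 2. embed the draft, not the question
    # 3. rank the real documents by similarity to the draft
    ranked = sorted(docs, key=lambda d: cosine(probe, embed(d)), reverse=True)
    return ranked[:k]


# Made-up corpus for illustration only.
DOCS = [
    "Subscriptions can be cancelled from account settings under billing. "
    "Select cancel subscription and the plan remains active until the end of the billing period.",
    "Our API rate limits are 100 requests per minute per key.",
    "To reset a password, use the forgot password link on the sign in page.",
]

top = hyde_retrieve("How do I cancel?", DOCS)
print(top[0])
```

Passing `generate` and `embed` as parameters keeps the sketch independent of any particular model provider; swapping in real calls should not change the shape of `hyde_retrieve`.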
Why a fake answer beats the real question
A hypothetical answer looks like the real answer docs: similar vocabulary, similar length, similar shape. So in vector space it tends to land much closer to the true evidence than the bare query does, and retrieval pulls back better matches.
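A toy comparison can illustrate the "same vocabulary" effect. The sketch below uses a bag-of-words counter as a stand-in for an embedding model, so it only measures word overlap; real dense embeddings capture more than that, and the strings here are invented for the example. Under those assumptions, the drafted answer scores closer to the answer doc than the bare question does.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Invented strings for illustration only.
question = "How do I cancel?"
hypothetical = (
    "To cancel your subscription, open account settings, choose billing, "
    "and select cancel subscription."
)
answer_doc = (
    "Subscriptions can be cancelled from account settings under billing. "
    "Select cancel subscription to stop renewal."
)

q_sim = cosine(toy_embed(question), toy_embed(answer_doc))
h_sim = cosine(toy_embed(hypothetical), toy_embed(answer_doc))

print(f"question vs doc:     {q_sim:.2f}")
print(f"hypothetical vs doc: {h_sim:.2f}")
```

The question shares a single word with the doc, while the draft shares most of its vocabulary, which is the intuition behind searching with the draft.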
The draft can be wrong — that's fine
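You never show the hypothetical doc to the user. It's only a search probe, so factual mistakes in it matter little as long as it reads like the right kind of document. The final answer is generated from the REAL documents you retrieved with it. Cost: one extra LLM call per query. Payoff: often noticeably better recall, with zero training.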
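One way to keep the draft out of the final answer is to build the answering prompt from the retrieved documents alone. The sketch below assumes a hypothetical helper, `build_answer_prompt`, and an invented prompt wording; the draft and the retrieved doc are made up, and the draft is deliberately wrong to show that it never reaches the prompt.

```python
from typing import List


def build_answer_prompt(question: str, retrieved_docs: List[str]) -> str:
    """Hypothetical helper: ground the final answer in real documents only."""
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


question = "How do I cancel?"

# The draft was only a search probe. It may be wrong, and it is discarded here.
hypothetical = "You can cancel by emailing support within 30 days."

# Invented example of what retrieval returned for that probe.
retrieved = ["Subscriptions can be cancelled from account settings under billing."]

prompt = build_answer_prompt(question, retrieved)
print(prompt)
```

Because the function never takes the draft as an argument, a wrong draft can only affect which documents are retrieved, not what the answering model is told.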
🔨 Full pipeline (query → draft hypo → embed → retrieve → answer from real docs) on the page: https://dev48v.infy.uk/prompt/day13-hyde.html
Part of PromptFromZero. 🌐 https://dev48v.infy.uk