From the course: Hands-On AI: RAG using LlamaIndex

Document summary index

- [Instructor] By this point in the course, I hope you're familiar with how a retrieval augmented generation pipeline works. We have source documents, which we split up into text chunks. These text chunks get embedded using some embedding model and stored in a vector database. At query time, we retrieve chunks by looking at the embedding similarity between what's in our vector database and the user's query. Then we synthesize the response by packaging up that retrieved context into a prompt and sending it to our LLM. All of this, as we've seen, works quite well and we get some decent responses, but there is a problem here. The problem is that the best way to represent text for retrieval might not be the best way to represent it for synthesis. For example, a raw text chunk might have some really important details that the LLM needs to synthesize a good response. However, it could also contain irrelevant information that will bias…
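The retrieve-then-synthesize flow described above can be sketched in a few lines of plain Python. This is a toy illustration, not LlamaIndex: the hashed bag-of-words `embed` function stands in for a real embedding model, and the final prompt string stands in for an actual LLM call.

```python
import math
import zlib

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy embedding: hash each word into a fixed-size bag-of-words vector,
    # then L2-normalize so a dot product acts as cosine similarity.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def chunk(document: str, size: int = 12) -> list[str]:
    # Split a document into fixed-size word chunks.
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# 1. Split source documents into chunks, embed them, store in a "vector store".
docs = [
    "The document summary index stores a summary for each document.",
    "At query time embeddings are compared by cosine similarity.",
]
store = [(c, embed(c)) for d in docs for c in chunk(d)]

# 2. Retrieve the chunks most similar to the query embedding.
def retrieve(query: str, top_k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(store, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# 3. Package the retrieved context into a prompt for the LLM.
context = "\n".join(retrieve("How does retrieval use embeddings?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
```

Note that the same raw chunks serve double duty here: they are both the retrieval representation (what gets embedded and ranked) and the synthesis representation (what gets stuffed into the prompt). That coupling is exactly the problem the document summary index is meant to address.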
