From the course: Building RAG Solutions with Azure AI Foundry (Formerly Azure AI Studio)
The basics of RAG: Adding custom data to your LLM
Large language models are trained on a large set of data, mainly from the internet. However, they do have limitations. First, if you ask one questions about current events, it will not be able to respond accurately. Each model has a cutoff date for how recent its training data is. The free version of ChatGPT, for example, was trained on data up to January 2022. So it will reply that Queen Elizabeth II is still alive when we already know she has passed away. Second, if you ask questions about your own domain data, it may also not respond accurately. Worse, it may even make up a fabricated answer. In the given example, the model provides an answer, but the source links, when clicked, do not match the actual product.

RAG, in the LLM context, is a popular acronym for retrieval-augmented generation. It is the technique of adding data to an LLM from an external data source. This data can be your legal contracts, product manuals, customer information sheets, software designs, and even your code. A good analogy for RAG is taking an open-book exam as a student. In an open-book exam, you can refer to any books you have brought into the classroom to answer questions. Imagine your brain as the LLM: you still need to open the books you came with to get the information needed to answer the questions.

To further understand RAG, let us discuss the workflow. First, every time a user makes a query, the system retrieves from an external data source the relevant information that will answer that query. Second, the user's query and the retrieved content are augmented, or added together; this becomes the new prompt. Third, the new prompt is fed into the LLM to generate a response. To simplify, the main difference between RAG and a typical LLM system is that a typical LLM system answers user queries based on its training data set, while RAG provides answers to queries from an external source you have provided. How the relevant data is retrieved based on the user's initial prompt is best explained by discussing two other concepts, tokens and embeddings, in the next chapters.
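To make those three steps concrete, here is a minimal Python sketch of the retrieve-augment-generate loop. The sample documents, the keyword-overlap retriever, and the prompt template are all illustrative placeholders, not anything from Azure AI Foundry; a production system would retrieve from a real search index using vector embeddings and send the augmented prompt to a deployed model.

```python
import re


# Step 0: a tiny stand-in for the external data source. In practice this
# would be your legal contracts, product manuals, or customer sheets,
# stored in a search index rather than a Python list.
documents = [
    "The AcmeWidget 3000 supports input voltages from 110V to 240V.",
    "AcmeWidget warranty claims must be filed within 90 days of purchase.",
    "The AcmeWidget mobile app is available for iOS and Android.",
]


def words(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Step 1: retrieve the documents most relevant to the query.
    Relevance here is naive keyword overlap; a real system would use
    vector embeddings, covered in a later chapter."""
    ranked = sorted(docs, key=lambda d: len(words(query) & words(d)), reverse=True)
    return ranked[:top_k]


def augment(query: str, context: list[str]) -> str:
    """Step 2: combine the user's query with the retrieved content
    to form the new prompt."""
    return (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n".join(context) + "\n\n"
        "Question: " + query
    )


# Step 3: the augmented prompt, not the bare query, is what gets sent to
# the LLM. The model call is stubbed out here; in a real system this would
# go to your deployed model, for example an Azure OpenAI chat deployment.
query = "When must a warranty claim be filed?"
prompt = augment(query, retrieve(query, documents))
print(prompt)
```

Running this prints the augmented prompt, which makes the key point visible: the model answers from the context you supplied, not from its training data.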
Contents
- The basics of RAG: Adding custom data to your LLM (2m 40s)
- Understanding tokens: A key factor of costs in your system (2m 39s)
- Vector embeddings: How words connect to each other (3m 28s)
- How RAG works: Understanding the process under the hood (2m 38s)
- RAG high-level architecture: The required components (2m 35s)