From the course: Agentic AI Design Patterns for GenAI and Predictive AI
Autonomous self-RAG
- When it comes to using generative AI systems to produce content for us, we're not always guaranteed that the content the system creates is accurate or complete. For example, if we ask the system to produce a complete list of all the traffic laws in a specific region, it will do its best to put this together for us. But depending on how well it was trained and how reliable its data sources are, the list may not be 100% accurate or complete. We then need to manually review the list, verify that the laws listed are correct, and further check whether any more laws need to be added. In other words, the generative AI system will do the bulk of the work for us, but it's up to us to proof and edit this content before it can be used and relied upon.

Retrieval-augmented generation, or RAG for short, is a traditional technique used with LLMs whereby they're given access to external data sources they can use to retrieve facts from a knowledge base, which enables them to generate more accurate content and helps reduce hallucinations. But a problem with RAG is that it will often result in the pre-programmed retrieval of data, much of which may not be relevant or actually needed.

With the autonomous self-RAG pattern, we involve a self-correcting agent, which is capable of critically evaluating what new information is needed and is then further capable of autonomously retrieving that information from whatever external sources it has access to. The self-correcting agent can have different types of logic that enable it to carry out this role, including self-reflection logic, which is used to determine if the current output is complete and correct. It essentially asks, does this make sense, and can I back it up? And confidence logic, which is used to assess the agent's own confidence in the facts it has generated.
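The self-reflection and confidence logic described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not a real library: the `Claim` class, the confidence scores, and the `reflect_on_draft` function are all assumptions standing in for what would be LLM-driven self-critique in a real agent.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    """One statement in the agent's draft answer (illustrative structure)."""
    text: str
    confidence: float  # the agent's own confidence estimate, 0.0 to 1.0

def reflect_on_draft(claims, threshold=0.7):
    """Self-reflection logic: for each claim, ask 'can I back this up?'
    Claims below the confidence threshold are flagged for external
    verification before the answer is finalized."""
    return [c for c in claims if c.confidence < threshold]

# Hypothetical draft produced by the generative step
draft = [
    Claim("The highway speed limit is 100 km/h.", 0.9),
    Claim("U-turns are banned at all intersections.", 0.4),  # possibly outdated
]

needs_verification = reflect_on_draft(draft)
```

Here the second claim would be flagged, which is the cue for the retrieval steps covered next.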
For example, it might flag statements that it knows are based on potentially outdated training data. Then there's decision-making logic, which is used to decide whether external information is needed to verify or augment a response. It's basically the logic that triggers the external data retrieval process. And targeting logic, which is used to formulate specific queries to find the exact information the agent needs to fill a knowledge gap, as opposed to doing a broad search, which may not be required. Then there's relevance and verification logic, which is used to evaluate the retrieved information to ensure that it's relevant and credible before it gets incorporated into the final response. And finally, we have synthesis and integration logic, which the agent uses to figure out how to best merge the newly found data with its initial response. This is where it corrects any errors and produces its final output.

The autonomous self-RAG pattern introduces highly sophisticated logic that enables an agent to significantly enhance the quality and completeness of generated content. But the downside is that this logic adds complexity and cost to the overall solution. It can demand significant computational power to manage the continuous cycles of self-critique, search, and synthesis. And even with the necessary infrastructure in place, the overall solution can be slower than a standard generative AI system.
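Putting all of the logic types together, the full self-RAG cycle can be sketched end to end. Everything here is an assumption for demonstration purposes: the in-memory knowledge base, the hard-coded draft answer, the keyword-based relevance check, and every function name are placeholders for what would, in a real agent, be LLM calls and live data-source queries at each step.

```python
# Stand-in for an external data source the agent has access to
KNOWLEDGE_BASE = {
    "u-turn rules": "U-turns are permitted unless a sign prohibits them.",
}

def generate_initial(question):
    # Generation step: a hard-coded (answer, confidence) pair standing in
    # for the LLM's first draft and its self-assessed confidence
    return ("U-turns are banned at all intersections.", 0.4)

def needs_external_info(confidence, threshold=0.7):
    # Decision-making logic: trigger retrieval only when confidence is low
    return confidence < threshold

def formulate_query(question):
    # Targeting logic: a narrow query aimed at the knowledge gap,
    # rather than a broad search
    return "u-turn rules"

def retrieve(query):
    # Autonomous retrieval from the external source
    return KNOWLEDGE_BASE.get(query)

def is_relevant(doc, query):
    # Relevance and verification logic: a crude keyword check as a
    # placeholder for real credibility and relevance evaluation
    return doc is not None and query.split()[0] in doc.lower()

def synthesize(draft, doc):
    # Synthesis and integration logic: correct the draft using the
    # verified fact to produce the final output
    return f"Corrected: {doc}"

def self_rag(question):
    draft, confidence = generate_initial(question)
    if needs_external_info(confidence):
        query = formulate_query(question)
        doc = retrieve(query)
        if is_relevant(doc, query):
            return synthesize(draft, doc)
    return draft

answer = self_rag("Are U-turns allowed at intersections?")
```

Note that the extra cycle of critique, retrieval, and synthesis is exactly where the added latency and compute cost discussed above comes from: every low-confidence draft triggers additional round trips before an answer is produced.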