From the course: LLMOps in Practice: A Deep Dive

Extending your app with RAG

Now that you've seen how to create a vector database in Chroma, and you've sliced up the book, created embeddings, stored them, and written code to retrieve context-appropriate snippets, the next step is to update the chatbot we've been working on to use those snippets. As a first step, let's revisit the architecture of the chatbot so we can see how to update it with RAG.

Typically the chatbot, usually called an assistant in the API, is primed with a system prompt like "You are an expert in public speaking who can...," et cetera. It then emits a welcome message. The user enters a prompt like "Please help me write a speech about whatever," and the assistant replies with some kind of speech outline. Everything in between is the conversation history, which is maintained in the context window for your LLM. This is why LLMs with larger context windows are so valuable for complex tasks: they can remember very long conversations and refer back to pieces…
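To make that concrete, here is a minimal sketch of how the RAG wiring might look in Python, assuming a Chroma collection like the one built in the earlier videos. The collection name, storage path, model name, and the answer_with_rag helper are illustrative assumptions, not the course's actual code.

    import chromadb
    from openai import OpenAI

    chroma = chromadb.PersistentClient(path="chroma_db")        # assumed on-disk path
    collection = chroma.get_collection("public_speaking_book")  # assumed collection name
    llm = OpenAI()

    def answer_with_rag(user_prompt, history):
        # Retrieve the top 3 most similar book snippets for this prompt.
        results = collection.query(query_texts=[user_prompt], n_results=3)
        snippets = "\n\n".join(results["documents"][0])

        # Prime the assistant with the system prompt plus the retrieved context,
        # then replay the conversation history so the model keeps its "memory".
        messages = [
            {"role": "system",
             "content": "You are an expert in public speaking. "
                        "Use these book excerpts when relevant:\n" + snippets},
            *history,
            {"role": "user", "content": user_prompt},
        ]
        response = llm.chat.completions.create(model="gpt-4o-mini", messages=messages)
        reply = response.choices[0].message.content

        # Append both turns so the growing history stays in the context window.
        history.append({"role": "user", "content": user_prompt})
        history.append({"role": "assistant", "content": reply})
        return reply

In this sketch the retrieved snippets are injected into the system message rather than the user's prompt, so the user's question reaches the model untouched. Notice also that the history list grows with every turn, which is exactly why a larger context window buys you longer, more coherent conversations.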
