Extending your app with RAG - Python Tutorial
From the course: LLMOps in Practice: A Deep Dive
Now that you've seen how to create a vector database in Chroma, slice up the book, create embeddings, store them, and write code to retrieve context-appropriate snippets, the next step is to update the chatbot we've been working on to use those snippets. As a first step, let's revisit the architecture of the chatbot so we can see how to update it with RAG. Typically, the chatbot, usually called an assistant in the API, is primed with a system prompt like "You are an expert in public speaking who can...," et cetera. It then emits a welcome message. The user then enters a prompt like "Please help me write a speech about whatever," and the assistant replies with some kind of speech outline. All this stuff in the middle is the conversation history, which is maintained in the context window for your LLM. This is why LLMs with larger context windows are so valuable for complex tasks: they can remember very long conversations and refer back to pieces…
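To make this concrete, here is a minimal sketch of how these pieces might fit together in Python. It assumes a Chroma collection like the one built in the earlier videos and an OpenAI-style chat API; the collection name (public_speaking_book), model name, and prompt wording are illustrative assumptions, not the course's exact code.

```python
import chromadb
from openai import OpenAI

# Assumed names: a persisted Chroma collection "public_speaking_book"
# (built in the earlier videos) and the "gpt-4o" chat model.
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_collection("public_speaking_book")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

# The conversation history is just this list of messages; it is what
# gets sent to the LLM's context window on every turn.
messages = [{
    "role": "system",
    "content": "You are an expert in public speaking who can help "
               "users write and improve speeches.",
}]

def chat(user_prompt: str) -> str:
    # RAG step: retrieve the most context-appropriate snippets
    # from the vector database for this particular prompt.
    results = collection.query(query_texts=[user_prompt], n_results=3)
    snippets = "\n\n".join(results["documents"][0])

    # Augment the user's prompt with the retrieved snippets before
    # appending it to the conversation history.
    messages.append({
        "role": "user",
        "content": f"Use these excerpts from the book as context:\n"
                   f"{snippets}\n\nQuestion: {user_prompt}",
    })
    reply = llm.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    return answer

print(chat("Please help me write a speech about resilience."))
```

The design point to notice is that retrieval happens on every user turn: the prompt is used to query the vector database, the nearest snippets are spliced into the message, and only then does the message join the conversation history that fills the context window.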
Contents
- Retrieval augmented generation (RAG) (8m 14s)
- Installing and setting up a VectorDB (4m 50s)
- Create a VectorDB (15m 2s)
- BYOD to a VectorDB (6m 14s)
- VectorDB: Hands-on use case (14m 10s)
- Querying the VectorDB (3m 47s)
- Demonstration: Querying the VectorDB (11m 33s)
- Extending your app with RAG (5m 23s)
- RAG: Showing it in action (9m 48s)
- Challenge: Complete RAG application (1m 8s)
- Solution: Complete RAG application (5m 16s)