From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)

RAG: Knowledge curation process

How do we build a RAG system? There are two steps here: the knowledge curation process and the inference process. We will discuss these processes in this chapter. We will implement them and build a RAG system in the next chapter.

Let's look at the workflow for the curation process now. We can have one or more sources of data for the RAG system. These could be websites, ticketing systems, traditional RDBMS databases, document hubs like SharePoint or Google Drive, and ad hoc documents. Do note that the structure of these data sources will be vastly different. Some may only have unstructured text data, while others may have a mixture of numeric, structured, and unstructured data. For each of these data sources, we need to build an acquisition module. This module will fetch data from the source, filter it for relevant information, and then cleanse it to eliminate any kind of noise. This module may also do continuous and incremental fetches to catch up with new additions and changes on…
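
To make the acquisition module concrete, here is a minimal sketch in Python of the fetch, filter, cleanse, and incremental-fetch steps described above. It is an illustration under stated assumptions, not the course's implementation: the names `Record`, `AcquisitionModule`, `cleanse`, and `fake_ticket_source` are hypothetical, the "source" is an in-memory list standing in for a real ticketing system, and noise removal is reduced to stripping HTML-like tags and extra whitespace.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Optional


@dataclass
class Record:
    """One item pulled from a data source (hypothetical shape)."""
    source: str
    doc_id: str
    text: str
    updated_at: datetime


def cleanse(text: str) -> str:
    """Remove HTML-like tags and collapse whitespace (a stand-in for noise removal)."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", no_tags).strip()


class AcquisitionModule:
    """Fetches, filters, and cleanses records from a single data source."""

    def __init__(
        self,
        fetch: Callable[[], Iterable[Record]],
        is_relevant: Callable[[Record], bool],
    ) -> None:
        self.fetch = fetch
        self.is_relevant = is_relevant
        # High-water mark: newest timestamp seen so far, used for incremental runs.
        self.last_seen: Optional[datetime] = None

    def run(self) -> List[Record]:
        """Return cleansed, relevant records that are newer than the previous run."""
        results: List[Record] = []
        newest = self.last_seen
        for rec in self.fetch():
            # Incremental step: skip anything already covered by an earlier run.
            if self.last_seen is not None and rec.updated_at <= self.last_seen:
                continue
            if newest is None or rec.updated_at > newest:
                newest = rec.updated_at
            # Filter step: drop records the caller considers irrelevant.
            if not self.is_relevant(rec):
                continue
            # Cleanse step: keep the metadata, clean only the text.
            results.append(Record(rec.source, rec.doc_id, cleanse(rec.text), rec.updated_at))
        self.last_seen = newest
        return results


def fake_ticket_source() -> List[Record]:
    """Hypothetical stand-in for a ticketing system API."""
    return [
        Record("tickets", "T-1", "<p>Printer   offline</p>",
               datetime(2024, 1, 1, tzinfo=timezone.utc)),
        Record("tickets", "T-2", "",
               datetime(2024, 1, 2, tzinfo=timezone.utc)),
    ]


module = AcquisitionModule(fake_ticket_source, lambda r: bool(r.text.strip()))
first = module.run()   # picks up the one non-empty ticket
second = module.run()  # nothing new since the first run
```

In this sketch each data source would get its own `AcquisitionModule` with its own `fetch` and `is_relevant` functions, which reflects the point that source structures differ widely. The timestamp high-water mark is only one possible way to do incremental fetches; a real source may offer change feeds or version numbers instead.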
