From the course: Oracle Cloud Infrastructure Generative AI Professional
Process documents - Oracle Cloud Infrastructure Tutorial
Process documents
(light instrumental music) - [Instructor] In the previous lesson, we discussed that a RAG pipeline consists of ingestion, retrieval, and generation. Now let us discuss each of these in detail.

We'll begin with ingestion. The first step in ingestion is to load documents. The documents can come from a variety of sources and in multiple formats: PDF, comma-separated values (CSV), HTML, JSON, and many other types. Most LLM frameworks, including LangChain, offer classes to load different types of documents. The loader classes support loading either a single document or all the documents in a given directory.

Once the documents are loaded, the next step is to split them into smaller pieces, also referred to as chunks. There are a few things to consider while splitting the documents. Let us understand each of these. The first consideration is the size of the chunk, that is, how big or small the chunk should be. Most LLMs have a maximum input size…
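The two ingestion steps described so far, loading and splitting, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's code and not LangChain's API: it uses the standard library alone, handles plain-text files only, and the names `load_documents` and `split_into_chunks` as well as the 200-character default chunk size are hypothetical choices made for this example.

```python
from pathlib import Path


def load_documents(source: str, pattern: str = "*.txt") -> list[dict]:
    """Load a single file, or every file matching `pattern` in a directory.

    Hypothetical helper: a stand-in for the loader classes that LLM
    frameworks provide. It only reads plain text.
    """
    path = Path(source)
    files = [path] if path.is_file() else sorted(path.glob(pattern))
    # Keep the origin of each document next to its text as simple metadata.
    return [
        {"text": f.read_text(encoding="utf-8"), "source": str(f)}
        for f in files
    ]


def split_into_chunks(text: str, chunk_size: int = 200) -> list[str]:
    """Cut text into consecutive chunks of at most `chunk_size` characters.

    The chunk size is an assumed value; in practice it is chosen with the
    model's maximum input size in mind.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

Calling `load_documents` with a file path returns a one-element list, while a directory path returns one entry per matching file. Fixed-size character splitting is the simplest possible strategy and can cut a sentence in half, which is one reason chunk size needs to be chosen deliberately.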