From the course: Oracle Cloud Infrastructure Generative AI Professional

Customize LLMs with your data

(gentle music) - [Rohit] Hello, everyone. Welcome to this lesson on customizing LLMs with your data. Before we dive deeper, let us look at whether you can train LLMs from scratch with your data. It's not a great idea to do so. Why? There are three main reasons. The first reason is that it's very expensive to train these models. You can see some numbers here. It costs roughly $1 million to train a language model with 10 billion parameters. The second reason is that you need a lot of data to train these models. For example, Meta's Llama 2 model was trained on 2 trillion tokens. That's something like 1 billion legal briefs. And you need a lot of annotated data, basically data which is labeled, categorized, and tagged, so it's also very labor-intensive. And third, you need a lot of expertise. Pre-training these models is hard: it requires a thorough understanding of model performance and how to monitor it, the ability to detect and mitigate hardware failures, and an understanding of the limitations of the model. So it's not a…
