Chinchilla
From the course: Introduction to Large Language Models
- [Instructor] Over the years, the trend has been to increase the model size. Although we won't look at any of these models in detail, I'll mention them briefly now because we'll be comparing them later. Megatron-Turing, released through a collaboration between Microsoft and Nvidia in January of 2022, had 530 billion parameters. The Google DeepMind team released details about Gopher, which had 280 billion parameters and was one of the best models out there at the time. You can see that the model sizes were getting very large, and this was because of the scaling laws. But what if the scaling laws didn't capture the entire picture? The DeepMind team's hypothesis was that large language models were significantly undertrained: you could get much better performance with the same computational budget by training a smaller model for longer. Now, the way you would try and test out a hypothesis is to do a whole lot of…
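To make the "same compute budget, smaller model, more training" idea concrete, here is a minimal sketch assuming the commonly cited approximation that training compute is roughly C ≈ 6 × N × D FLOPs, where N is the parameter count and D is the number of training tokens. The token counts below are illustrative, not figures from the course.

```python
# Rough sketch of the Chinchilla-style trade-off, assuming the common
# approximation C ~= 6 * N * D training FLOPs (N = parameters, D = tokens).
# The token counts are illustrative; only the 280B (Gopher) and 70B
# (Chinchilla) parameter counts are fixed reference points.

def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute in FLOPs (C ~= 6 * N * D)."""
    return 6 * params * tokens

# A Gopher-sized run: 280B parameters, ~300B tokens (illustrative).
budget = training_flops(params=280e9, tokens=300e9)
print(f"Compute budget: {budget:.2e} FLOPs")

# Spend the same budget on a smaller model trained on more tokens.
small_params = 70e9                        # a Chinchilla-sized model
small_tokens = budget / (6 * small_params)
print(f"With {small_params:.0e} parameters, the same budget covers "
      f"{small_tokens:.2e} tokens (~{small_tokens / small_params:.0f} tokens per parameter)")
```

Under this rough accounting, a model a quarter of the size can be trained on four times as many tokens for the same budget, which is exactly the trade-off the DeepMind team set out to test.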