From the course: Artificial Intelligence Foundations: Thinking Machines

Foundation models

- In 2021, a computer science team at Stanford University wrote a paper that described a new type of machine learning model. These new foundation models would be trained on broad data and could be adapted to a wide range of downstream tasks, giving them far more power and flexibility. That's because most AI systems still use machine learning models that are trained for one very specific task. Right now, most organizations use supervised machine learning to create a model that classifies massive amounts of data. That means you could create a supervised machine learning model to help identify email messages as spam. But to do that, you have to train the model using data that's already been labeled as spam. Then, once you've trained the model, you can test it with more data. As you can imagine, this is a pretty difficult process. First, you have to identify high-quality training data. That means you need to find hundreds, thousands, or even millions of messages that are identified as spam. And once you've trained the system, you can only really use it for this one task. You can't use the same system you trained for spam to look for messages that might contain profanity, dull content, or viruses. To do that, you have to gather more data and retrain the system. The Stanford team theorized that if you could feed the system enough data, it could create a foundation model that wouldn't need retraining. That means that instead of just identifying spam, you could train an AI system on millions or even billions of email messages. The system would start to find patterns in these messages. The foundation model would understand spam, profanity, viruses, and maybe even some patterns you hadn't considered. Once you have this foundation model, you could then classify email messages by mood, emotional impact, or even effectiveness.
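To make the supervised workflow above concrete, here is a minimal sketch of a spam classifier: a tiny naive Bayes model trained on hand-labeled messages. The messages, labels, and function names are invented for illustration; real systems would train on far larger labeled datasets.

```python
# A minimal sketch of supervised spam classification with naive Bayes.
# The training messages and labels below are invented examples.
from collections import Counter
import math

def tokenize(text):
    return text.lower().split()

def train(messages, labels):
    """Count word frequencies per class from labeled training data."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter(labels)
    for text, label in zip(messages, labels):
        counts[label].update(tokenize(text))
    return counts, totals

def classify(text, counts, totals):
    """Pick the class with the higher log posterior probability."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    best_label, best_score = None, float("-inf")
    for label in counts:
        # Log prior from the class frequency in the training data
        score = math.log(totals[label] / sum(totals.values()))
        n = sum(counts[label].values())
        for word in tokenize(text):
            # Laplace smoothing so unseen words don't zero out the score
            score += math.log((counts[label][word] + 1) / (n + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Labeled training data -- the expensive part the lesson describes
messages = [
    "win a free prize now",
    "claim your free money",
    "meeting agenda for monday",
    "lunch plans this week",
]
labels = ["spam", "spam", "ham", "ham"]

counts, totals = train(messages, labels)
print(classify("free prize money", counts, totals))      # -> "spam"
print(classify("monday lunch meeting", counts, totals))  # -> "ham"
```

Note that this model only knows the spam-versus-ham distinction it was trained on; repurposing it for profanity or virus detection would mean gathering new labeled data and retraining, which is exactly the limitation the lesson points out.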
With enough data, you could even use this foundation model to generate human-like replies to messages. These systems could match the mood and even add data from other sources. Foundation models have supercharged the development of generative artificial intelligence. You may have heard of some common foundation models, like large language models, or LLMs. These models can generate text that sounds remarkably human. There are also foundation models that can generate images. Some of these early systems used generative adversarial networks, where two competing artificial neural networks create very lifelike images. Advanced diffusion models can generate captivating images that seamlessly blend the elements of different objects, such as cats and boats. These foundation models train themselves by destroying and recreating millions of images that they find online. In many ways, foundation models are one of the key innovations in generative artificial intelligence. These models have shown that if you can gather enough data, you can develop systems that mix and create data in innovative new ways.
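The "destroying" half of that diffusion training process can be sketched in a few lines: an image is gradually corrupted with Gaussian noise, and a model is then trained to reverse the corruption. The four-pixel "image", the noise schedule values, and the function name below are all invented for illustration; real diffusion models apply this to millions of full-size images with learned neural networks.

```python
# A rough sketch of forward diffusion: gradually destroying an image
# with Gaussian noise. The tiny "image" and schedule are invented.
import math
import random

def noise_step(image, alpha_bar, rng):
    """Blend the original image with Gaussian noise.

    alpha_bar near 1 keeps the image mostly intact; near 0, the
    result is almost pure noise. A generative model is trained to
    reverse this destruction, recreating the image from noise.
    """
    return [
        math.sqrt(alpha_bar) * pixel
        + math.sqrt(1 - alpha_bar) * rng.gauss(0, 1)
        for pixel in image
    ]

rng = random.Random(0)
image = [0.2, 0.8, 0.5, 0.1]  # a tiny fake grayscale "image"

# Walk the image from nearly clean to nearly pure noise
for alpha_bar in (1.0, 0.7, 0.3, 0.01):
    noisy = noise_step(image, alpha_bar, rng)
    print(alpha_bar, [round(p, 2) for p in noisy])
```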
