From the course: Introduction to Transformer Models for NLP
Off-the-shelf results with T5
- Section 11.1, off-the-shelf results with T5. So we've talked about T5, the text-to-text transfer transformer, and how it tries to take transfer learning to its limits. The authors are basically saying, "Hey, BERT is great for natural language understanding through its encoder architecture. GPT is great for generating language through its decoder stack, based on the decoder of the original transformer. But there has to be a way to go back to basics, back to the original idea of the transformer, something that can both encode and decode, and see how far we can push it." So while they were pre-training T5, they also threw in both unsupervised and supervised tasks. In our last session, we talked about the Common Crawl dataset it was trained on using a language modeling task, both auto-regressive and auto-encoding. But T5 was also pre-trained on supervised tasks, including translation, linguistic acceptability, semantics, text similarity, and summarization. So let's go ahead…
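As a minimal sketch of what "off the shelf" means here (assuming the Hugging Face transformers library and the public "t5-small" checkpoint, which the course may or may not use), the snippet below feeds one pre-trained T5 model several of the supervised tasks it saw during pre-training, switching between them only by changing the text prefix. The example prompts are illustrative, not taken from the course.

```python
# A hedged sketch: one off-the-shelf T5 checkpoint, multiple tasks,
# selected purely by the task prefix in the input text.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

prompts = [
    # Translation (English -> German)
    "translate English to German: The house is wonderful.",
    # Linguistic acceptability (CoLA): is the sentence grammatical?
    "cola sentence: The book was written by John quickly reading.",
    # Semantic textual similarity (STS-B): score two sentences
    "stsb sentence1: A man is playing a guitar. sentence2: Someone plays the guitar.",
    # Summarization
    "summarize: T5 casts every NLP problem as text-to-text, so the same "
    "encoder-decoder weights can translate, judge grammar, score similarity, "
    "and summarize, depending only on the task prefix.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because every task is phrased as text in and text out, no task-specific heads or fine-tuning are needed to get these first, off-the-shelf results.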