© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
GluonNLP
A Natural Language Processing toolkit
gluon-nlp.mxnet.io
Three common myths …
Motivations for GluonNLP
Common myth 1
• I will write clean and reusable code when I’m prototyping this time.
• Variant: I will write clean and reusable code next time.
=> Well-crafted reusable APIs
(Slide annotations: function or script? hard-coded parameter?)
Common myth 2
• My code will still run next year.
• Sometimes, it’s not our fault.
=> Integrated testing of examples
Common myth 3
• I will finish setting up the baseline model this afternoon.
• Though it may not be our fault again.
=> Re-implementation of SOTA results
Goals
1. Problem: prototype code is not reusable without copying.
Solution: carefully designed API for versatile needs.
2. Problem: code may break due to API changes.
Solution: integrated testing for examples.
3. Problem: setting up baselines for NLP tasks is hard.
Solution: implementation for state-of-the-art models.
• Designed for engineers and researchers
• Enable fast prototyping for NLP applications and research
GluonNLP goals
GluonNLP Community
• Internal users
• Amazon Comprehend
• Amazon Lex
• Amazon Transcribe
• Amazon Translate
• Amazon Personalize
• Alexa NLU
• Alexa Brain
• External users
• High-level packages
• gluonnlp.data, gluonnlp.model, gluonnlp.embedding
• Low-level packages
• gluonnlp.data.batchify, gluonnlp.model.StandardRNN
• Datasets:
• gluonnlp.data.SQuAD, gluonnlp.data.WikiText103
Designed for practitioners: researchers and engineers
http://gluon-nlp.mxnet.io/api/modules/data.html#public-datasets
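The low-level batchify utilities collate variable-length sequences into a single padded mini-batch. As a rough illustration of what a `Pad`-style batchify function does (the real `gluonnlp.data.batchify.Pad` returns NDArrays; this is a minimal pure-Python sketch with an assumed helper name):

```python
def pad_batchify(samples, pad_val=0):
    """Pad variable-length token-id lists to the longest sample in the batch."""
    max_len = max(len(s) for s in samples)
    return [list(s) + [pad_val] * (max_len - len(s)) for s in samples]

batch = pad_batchify([[4, 7, 2], [9, 1], [3]])
# Every row now has length 3; shorter rows are padded with 0.
```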
GluonNLP Models
• Language Modeling
• Machine Translation
• Word Embedding (100+)
• Text Classification
• Text Generation
• Sentence Embedding
• Dependency Parsing
• Entailment
• Question Answering
• Named Entity Recognition
• Keyphrase Extraction
• Semantic Role Labeling
• Summarization
(Status legend from slide: Released / WIP / Planned)
APIs: Data Loading: Bucketing
How do we generate the mini-batches?
• No bucketing + directly pad the samples: average padding = 11.7. Be frugal! Use bucketing.
• Sorted bucketing: average padding = 3.7
• Fixed bucketing: average padding = 1.7. Shorter sequences can have larger batch sizes.
• Fixed bucketing + length-aware batch size: average padding = 1.8, with per-bucket batch sizes of 8, 11, and 18. Better throughput! ✌️
(Figure annotations: batch-size ratio; length of the buckets.)
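The padding savings above can be reproduced in a few lines. A toy sketch (illustrative lengths, not the slide's exact data) comparing the padding wasted by naive in-order batching against sorted bucketing:

```python
def avg_padding(lengths, batch_size):
    """Average padding per sample when each batch is padded to its own max length."""
    total_pad, n = 0, len(lengths)
    for i in range(0, n, batch_size):
        batch = lengths[i:i + batch_size]
        total_pad += sum(max(batch) - l for l in batch)
    return total_pad / n

lengths = [3, 15, 4, 12, 7, 2, 14, 5, 9, 6, 11, 8]
naive = avg_padding(lengths, batch_size=4)             # pad in arrival order
bucketed = avg_padding(sorted(lengths), batch_size=4)  # sort first, then batch
# Sorting groups similar lengths together, so bucketed <= naive.
```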
Improvement over published results
Table 1: fastText n-gram embedding scores, trained on the Text8 dataset, evaluated on WordSim353
Table 2: Machine translation model BLEU scores, same standard and settings
Table 3: AWD [1] language model on WikiText2, test perplexity:
• GluonNLP: 66.9 (250 epochs)
• PyTorch: 67.8 (250 epochs)
• Diff: -0.9
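For reference, test perplexity is the exponential of the average per-token negative log-likelihood on the test set. A minimal sketch with toy numbers (not WikiText2 data):

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(nlls) / len(nlls))

# Toy per-token NLLs; a model with mean NLL of ln(66.9) would score 66.9 PPL.
print(perplexity([4.2, 4.2, 4.2]))
```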
Machine Translation: Google Neural Machine Translation (GNMT)
• Encoder: bidirectional LSTM + residual
• Decoder: LSTM + residual + MLP attention
• GluonNLP: BLEU 26.22 on IWSLT2015, 10 epochs, beam size = 10
• Tensorflow/nmt: BLEU 26.10 on IWSLT2015, beam size = 10
Wu, Yonghui, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv preprint arXiv:1609.08144 (2016).
Machine Translation: Transformer
• Encoder: 6 layers of self-attention + feed-forward
• Decoder: 6 layers of masked self-attention, attention over the encoder output, + feed-forward
• GluonNLP: BLEU 26.81 on WMT2014 en_de, 40 epochs
• Tensorflow/t2t: BLEU 26.55 on WMT2014 en_de
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
Transfer learning: ELMo (Embeddings from Language Models)
• Feature-based approach
• Pre-training a bidirectional language model
• Character embedding + stacked bidirectional LSTMs
• GluonNLP tutorial
Deep contextualized word representations, Peters et al., 2018
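In the feature-based approach, a downstream task consumes a learned weighted sum of the biLM's layer representations. A minimal pure-Python sketch of that combination step (toy 2-d vectors; the function name and weight values are illustrative, not GluonNLP API):

```python
import math

def elmo_combine(layer_reps, scalars, gamma=1.0):
    """Softmax-normalize per-layer scalars, then mix the layer vectors."""
    exps = [math.exp(s) for s in scalars]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(layer_reps[0])
    return [gamma * sum(w * layer[i] for w, layer in zip(weights, layer_reps))
            for i in range(dim)]

# Three layers (char-CNN output + two biLSTM layers), toy 2-d vectors:
reps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = elmo_combine(reps, scalars=[0.0, 0.0, 0.0])  # equal scalars -> equal weights
```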
Transfer Learning: BERT (Bidirectional Encoder Representations from Transformers)
• Fine-tuning approach
• Pre-training: masked language model + next sentence prediction
• Stacked Transformer encoder + BPE
• GluonNLP tutorial
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., 2018
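BERT's masked-language-model objective replaces a fraction of input tokens (15% in the paper) with a [MASK] symbol and trains the model to recover them. A minimal, deterministic sketch of just the masking step (hypothetical helper; the real recipe also sometimes keeps or randomizes the chosen token):

```python
def mask_tokens(tokens, positions, mask_token="[MASK]"):
    """Replace the tokens at the given positions and record the targets to predict."""
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = mask_token
    return masked, targets

masked, targets = mask_tokens(["the", "cat", "sat", "down"], positions=[1])
# masked  -> ["the", "[MASK]", "sat", "down"]
# targets -> {1: "cat"}
```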
BERT model zoo
Go build!
• http://gluon-nlp.mxnet.io/
Get help:
• https://discuss.mxnet.io/


Editor's Notes

  • #2 First-call deck for a high-level introduction to Apache MXNet.
  • #8 This is the core value proposition of GluonNLP: SOTA results and reproducing scripts for baselines; APIs that reduce implementation complexity; tutorials to get people started in NLP. GluonNLP provides a dynamic-graph workload and motivates static memory for Gluon, dynamic graph optimization, and the round-up GPU memory pool.
  • #17 Over 300 pre-trained word embeddings; intrinsic evaluation tools and datasets; embedding training. Transformer: 13.36 no static, 59.02 static.