GluonNLP

Latest version: v0.10.0

0.9.0

News
====
- GluonNLP was featured in EMNLP 2019 Hong Kong! [Check out the code accompanying the tutorial](https://github.com/leezu/EMNLP19-D2L).
- "[GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing](http://jmlr.org/papers/volume21/19-429/19-429.pdf)" has been published in the Journal of Machine Learning Research.

0.8.3

- Add int32 support for importance sampling (`model.ISDense`) and noise contrastive estimation (`model.NCEDense`); a construction sketch follows below.
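
A minimal construction sketch of these sampled output layers, assuming the `NCEDense(vocab_size, num_sampled, in_unit)` signature from the model zoo documentation; the sizes are hypothetical and the full training loop (candidate sampler, loss) is omitted.

```python
import mxnet as mx
import gluonnlp as nlp

# Illustrative sizes only (hypothetical values for this sketch).
vocab_size, num_sampled, num_hidden = 10000, 128, 650

# NCEDense is the noise-contrastive-estimation output layer in gluonnlp.model;
# ISDense (importance sampling) is constructed the same way. As of 0.8.3 the
# sampled class indices may also be provided as int32 arrays.
nce_decoder = nlp.model.NCEDense(vocab_size, num_sampled, num_hidden)
nce_decoder.initialize(ctx=mx.cpu())
```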

0.8.2

This release covers a few fixes for reported bugs:
- Fixed argument passing in the `bert/embedding.py` script
- Updated `SimVerb3500` dataset URL to the aclweb hosted version
- Removed multiprocessing from the DataLoader in `bert/pretraining_utils.py`, which could cause a crash when Horovod MPI is used for training
- Before MXNet 1.6.0, Gluon `Trainer` assumes a **deterministic parameter creation order** for distributed training. The attention cell for BERT and the Transformer had a non-deterministic parameter creation order in v0.8.0 and v0.8.1, which could cause divergence during distributed training. It is now fixed.

Note that as of v0.8.2, the default branch of the gluon-nlp GitHub repository is switched to the latest stable branch, instead of the master branch under development.

0.8.1

News
====
- GluonNLP was featured in KDD 2019 Alaska! Check out our tutorial: [From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond](http://kdd19.mxnet.io).
- GluonNLP 0.8.1 will no longer support Python 2. (721, 838)
- Interested in BERT int8 quantization for deployment? Check out the blog post [here](https://medium.com/apache-mxnet/optimization-for-bert-inference-performance-on-cpu-3bb2413d376c).

Models and Scripts
==================
RoBERTa
- The RoBERTa model introduced by *Yinhan Liu et al.* in "[RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692)". The model checkpoints are converted from the [original repository](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md). Check out the usage [here](http://gluon-nlp.mxnet.io/model_zoo/bert/index.html) and the loading sketch below. (870)
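
A minimal loading sketch, assuming the model and dataset names listed on the model zoo page linked above; verify them against the documentation for your GluonNLP version.

```python
import mxnet as mx
import gluonnlp as nlp

# Load the converted RoBERTa base checkpoint. The dataset name
# 'openwebtext_ccnews_stories_books_cased' is taken from the model zoo page
# and should be treated as an assumption for this sketch.
model, vocab = nlp.model.get_model(
    'roberta_12_768_12',
    dataset_name='openwebtext_ccnews_stories_books_cased',
    pretrained=True,
    ctx=mx.cpu(),
    use_decoder=False)
```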

Transformer-XL
- The Transformer-XL model introduced by *Zihang Dai et al.* in "[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860)". (846)

Bug Fixes
=========
- Fixed hybridization for the BERT model (877)
- Change the variable `model` to `bert_classifier` (828), thank you LindenLiu
- Revert "Add axis argument to squeeze()" (857)
- [BUGFIX] Remove incorrect vocab.padding_token requirement in CorpusBPTTBatchify
- [BUGFIX] Fix Vocab with unknown_token remapped to != 0 via token_to_idx arg (862)
- [BUGFIX] Fix AMP in finetune_classifier.py (848)
- [BUGFIX] fix broken multi-head attention cell (878), thank you ZiyueHuang
- [FIX] fix chnsenticorp dataset download link (873)
- fix the usage of pad in bert (850)


Documentation
=============
- Clarify that BERT does not require MXNet nightly anymore (860)
- [DOC] fix broken links (833)
- [DOC] Update BERT index.rst (844)
- [DOC] Add GluonCV/NLP archive (823)
- [DOC] add missing dataset document (832)
- [DOC] remove wrong tutorial header level (826)
- [DOC] Fix a typo in attention_cell's docstring (841) thank you shenfei
- [DOC] Upgrade mxnet dependency to 1.5.0 and use Cuda 10.1 on CI (842)
- Remove Py2 icon from Readme. Add 3.7 (856)
- [DOC] Improve help message (855) thank you apeforest
- Update index.rst (853)
- [DOC] Fix Machine Translation with Transformers example (865)
- update button style (869)
- [DOC] doc fix for vocab.subwords (885) thank you liusy182

Continuous Integration
======================
- [CI] Support py3-master_gpu_doc CI run on arbitrary branches (829)
- Enforce AWS Batch jobName rules (836)
- dump linkcheck errors to comments (827)
- Enable Sphinx Autodoc typehints (830)
- [CI] Preserve stderr and stdout streams in doc CI stage for Cloudwatch

0.8.0

News
====
- GluonNLP is featured in KDD 2019 Alaska! Check out our tutorial: [From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond](https://kdd19.mxnet.io).
- GluonNLP 0.8.0 will no longer support Python 2. (721)

Models
======
RoBERTa
- [RoBERTa](https://ai.facebook.com/blog/roberta-an-optimized-method-for-pretraining-self-supervised-nlp-systems/) is now available in the GluonNLP BERT model zoo. (870)

Transformer-XL
- [Transformer-XL](https://arxiv.org/abs/1901.02860) is now available in the GluonNLP language model zoo. (846)

0.7.1

News
====
- GluonNLP will be featured in KDD 2019 Alaska! Check out our tutorial: [From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond](https://www.kdd.org/kdd2019/hands-on-tutorials).
- GluonNLP was featured at JSALT 2019 in Montreal on 2019-06-14! Check out https://jsalt19.mxnet.io.
- This is the last release of GluonNLP that will officially support Python 2. (721)

Models and Scripts
==================
BERT
- Added a BERT BASE model pre-trained on a large corpus including the [OpenWebText Corpus](https://skylion007.github.io/OpenWebTextCorpus/), BooksCorpus, and English Wikipedia, with performance comparable to Google's BERT LARGE model. Test scores on the GLUE benchmark are reported in the table below, followed by a loading sketch. The usability of the BERT pre-training script was also improved: on-the-fly training data generation, SentencePiece, Horovod, etc. (799, 687, 806, 669, 665). Thank you davisliang, vanyacohen, Skylion007

| Source | GluonNLP | google-research/bert | google-research/bert |
|-----------|-----------------------------------------|-----------------------------|-----------------------------|
| Model | bert_12_768_12 | bert_12_768_12 | bert_24_1024_16 |
| Dataset | `openwebtext_book_corpus_wiki_en_uncased` | `book_corpus_wiki_en_uncased` | `book_corpus_wiki_en_uncased` |
| SST-2 | **95.3** | 93.5 | 94.9 |
| RTE | **73.6** | 66.4 | 70.1 |
| QQP | **72.3** | 71.2 | 72.1 |
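
As a quick illustration, a minimal sketch of loading this checkpoint through `nlp.model.get_model`, assuming the dataset name from the GluonNLP column of the table above; keyword arguments may differ across GluonNLP versions.

```python
import mxnet as mx
import gluonnlp as nlp

# Load the BERT BASE model pre-trained on OpenWebText + BooksCorpus + Wikipedia.
# The dataset name mirrors the table above; treat this as a sketch rather than
# a fixed recipe.
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='openwebtext_book_corpus_wiki_en_uncased',
    pretrained=True,
    ctx=mx.cpu(),
    use_pooler=True,
    use_decoder=False,
    use_classifier=False)
```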
