# Tensor2Tensor

### 1801.10198

* New [`RTransformer`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/research/r_transformer.py) model, a recurrent Transformer
* New [English-Estonian translation dataset](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/translate_enet.py) thanks to stefan-it
* New `ROC_AUC` metric thanks to jjtan
* Various fixes, improvements, additions, etc.

### 1.13.4

Minor fix to 1.13.3; please see the release notes there.

### 1.13.3

TODO(afrozm): Document more.

* Various PRs.
* Development on TRAX

### 1.13.2

* jax, jaxlib moved to extras in setup.py

PRs:

* Fixed get_standardized_layers spelling, thanks cbockman in 1529
* Serving utils fixes in 1495, thanks Drunkar!
* Fixing a checkpoint name bug in 1487, thanks lzhang10

Enhancements:

* [DeepMind Math dataset](https://github.com/tensorflow/tensor2tensor/commit/9dc3d1274ce8cb25513adb071262cadb4ba7e5d3).
* [VideoGlow paper added to T2T Papers.](https://github.com/tensorflow/tensor2tensor/commit/b6a9bbbd7c04e69ccfbf8f8d9c4b5b8947729bea)
* [Mixture Transformer](https://github.com/tensorflow/tensor2tensor/commit/151dc27eb1b9f169c7e08e9e1b660f011ea99796)
* A very basic PPO implementation in TRAX.
* More TRAX and RL changes.

Bugs:

* [Correct flat CIFAR modality to not consider 0 as padding](https://github.com/tensorflow/tensor2tensor/commit/2d2d160c4773e38ecdac03d9862b2a90e0170ef6)

### 1.13.1

Bug Fixes:

* RL fixes for Model Based RL in 1505 - thanks koz4k
* Serving util corrections in 1495 by Drunkar -- thanks!
* Fix step size extraction in checkpoints by lzhang10 in 1487 -- thanks!

### 1.13.0

**Modalities refactor**: Thanks to Dustin, all modalities are now an enum and just functions, making it easier to understand what's happening in the model. Thanks Dustin!

**[Model-Based Reinforcement Learning for Atari](https://arxiv.org/abs/1903.00374)** using T2T. Please find a nice writeup at https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/rl/README.md -- thanks a lot to all the authors: lukaszkaiser, mbz, piotrmilos, blazejosinski, Roy Campbell, konradczechowski, doomie, Chelsea Finn, koz4k, Sergey Levine, rsepassi, George Tucker and henrykmichalewski!

**[TRAX](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/trax) = T2T + [JAX](https://github.com/google/jax)** - please try it out and give us feedback at 1478

New Models:

* Evolved Transformer, thanks stefan-it for adding the paper in 1426
* textCNN model by ybbaigo in 1421

Documentation and Logging:

* MultiProblem by cwbeitel in 1399
* ML Engine logging in 1390 by lgeiger

Thanks again cwbeitel and lgeiger -- good docs and logging go a long way for understandability.

Bugs fixed:

* t2t_decoder checkpoint fix in 1471 by wanqizhu
* xrange fix for py3 in 1468 by lgeiger
* Fixing COCO dataset in 1466 by hbrylkowski
* Fix math problems by artitw
* Decoding rev problems enzh by googlehjx in 1389
* And honourable mentions to qixiuai, 1440

Many many thanks wanqizhu, lgeiger, hbrylkowski, artitw, googlehjx and qixiuai for finding and fixing these, and sorry for missing anyone else -- this is really really helpful.

Code Cleanups:

* Registry refactor and optimizer registry by jackd in 1410 and 1401
* Numerous very nice cleanup PRs, e.g. 1454, 1451, 1446, 1444, 1424, 1411, 1350 by lgeiger

Many thanks for the cleanups jackd and lgeiger -- and sorry if I missed anyone else.

### 1.12.0

Summary of changes:

PRs:

* A lot of code cleanup, thanks a ton to lgeiger! This goes a long way with regards to code maintainability and is much appreciated. Ex: PR 1361, 1350, 1344, 1346, 1345, 1324
* Fixing LM decode, thanks mikeymezher - PR 1282
* More fast decoding by gcampax, thanks! - PR 999
* Avoid error on beam search - PR 1302 by aeloyq, thanks!
* Fix invalid list comprehension, unicode simplifications, py3 fixes 1343, 1318, 1321, 1258, thanks cclauss!
* Fix is_generate_per_split hard-to-spot bug, thanks a lot to kngxscn in PR 1322
* Fix py3 compatibility issues in PR 1300 by ywkim, thanks a lot again!
* Separate train and test data in MRPC and fix broken link in PR 1281 and 1247 by ywkim - thanks for the hawk-eyed change!
* Fix universal transformer decoding by artitw in PR 1257
* Fix babi generator by artitw in PR 1235
* Fix transformer moe in 1233 by twilightdema - thanks!
* Universal Transformer bugs corrected in 1213 by cfiken - thanks!
* Change beam decoder stopping condition, makes decode faster in 965 by mirkobronzi - many thanks!
* Bug fix, problem_0_steps variable by senarvi in 1273
* Fixing a typo, by hsm207 in PR 1329, thanks a lot!

New Model and Problems:

* New problem and model by artitw in PR 1290 - thanks!
* New model for scalar regression in PR 1332 thanks to Kotober
* Text CNN for classification in PR 1271 by ybbaigo - thanks a lot!
* en-ro translation by lukaszkaiser!
* CoNLL2002 Named Entity Recognition problem added in 1253 by ybbaigo - thanks!

New Metrics:

* Pearson Correlation metrics in 1274 by luffy06 - thanks a lot!
* Custom evaluation metrics, this was one of the most asked features, thanks a lot ywkim in PR 1336
* Word Error Rate metric by stefan-falk in PR 1242, many thanks!
* SARI score for paraphrasing added.

Enhancements:

* Fast decoding!! Huge thanks to aeloyq in 1295
* Fast GELU unit
* Relative dot product visualization PR 1303 thanks aeloyq!
* New MTF models and enhancements, thanks to Noam, Niki and the MTF team
* Custom eval hooks in PR 1284 by theorm - thanks a lot!

RL:

* Lots of commits to Model Based Reinforcement Learning code by konradczechowski, koz4k, blazejosinski, piotrmilos - thanks all!
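The Word Error Rate metric listed in this release is conventionally defined as the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The sketch below illustrates that definition in plain Python; it is not T2T's implementation, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because the denominator is the reference length, a hypothesis with many inserted words can score above 1.0.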

### 1.11.0

PRs:

* Bug fixes in the insight server thanks to haukurb!
* Fix weights initialization in 1196 by mikeymezher - thanks!
* Fix Universal Transformer convergence by MostafaDehghani and rllin-fathom in 1194 and 1192 - thanks!
* Fix add problem hparams after parsing the overrides in 1053 thanks gcampax!
* Fixing error of passing wrong dir in 1185 by stefan-falk, thanks!

New Problems:

* Wikipedia Multiproblems by urvashik - thanks!
* New LM problems in de, fr, ro by lukaszkaiser - thanks!

RL:

* Continual addition to Model Based RL by piotrmilos, konradczechowski, koz4k and blazejosinski!

Video Models:

* Many continual updates thanks to mbz and MechCoder - thanks all!

### 1.10.0

NOTE:

- MTF code in Tensor2Tensor has been moved to github.com/tensorflow/mesh - thanks dustinvtran

New Problems:

- English-Setswana translation problem, thanks jaderabbit

New layers, models, etc:

- Add Bayesian feedforward layer, thanks dustinvtran
- Lots of changes to the RL pipeline, thanks koz4k, blazejosinski, piotrmilos, lukaszkaiser, konradczechowski
- Lots of work on video models, thanks mbz, MechCoder
- Image transformer with local 1d and local 2d spatial partitioning, thanks nikiparmar, vaswani

Usability:

- Support DistributionStrategy in Tensor2Tensor for multi-GPU, thanks smit-hinsu!
- Pass data_dir to feature_encoders, thanks stefan-falk
- variable_scope wrapper for avg_checkpoints, thanks Mehrad0711
- Modalities cleanup, thanks dustinvtran
- Avoid NaN while adding sinusoidal timing signals, thanks peakji
- Avoid an ASCII codec error in CNN/DailyMail, thanks shahzeb1
- Allow exporting T2T models as tfhub modules, thanks cyfra

### 1.9.0

PRs accepted:

* Cleaning up the code for gru/lstm as transition function for universal transformer. Thanks MostafaDehghani!
* Clipwrapper by piotrmilos!
* Corrected transformer spelling mistake - Thanks jurasofish!
* Fix to universal transformer update weights - Thanks cbockman and cyvius96!
* Common Voice problem fixes and refactoring - Thanks tlatkowski!
* Infer observation datatype and shape from the environment - Thanks koz4k!

New Problems / Models:

* Added a simple discrete autoencoder video model. Thanks lukaszkaiser!
* DistributedText2TextProblem, a base class for Text2TextProblem for large datasets. Thanks afrozenator!
* Stanford Natural Language Inference problem added `StanfordNLI` in [stanford_nli.py](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/stanford_nli.py). Thanks urvashik!
* `Text2TextRemotedir` added for problems with a persistent remote directory. Thanks rsepassi!
* Add a separate binary for vocabulary file generation for subclasses of Text2TextProblem. Thanks afrozenator!
* Added support for non-deterministic ATARI modes and sticky keys. Thanks mbz!
* Pretraining schedule added to MultiProblem and reweighting losses. Thanks urvashik!
* `SummarizeWikiPretrainSeqToSeq32k` and `Text2textElmo` added.
* `AutoencoderResidualVAE` added, thanks lukaszkaiser!
* Discriminator changes by lukaszkaiser and aidangomez
* Allow scheduled sampling in basic video model, simplify default video modality. Thanks lukaszkaiser!

Code Cleanups:

* Use standard vocab naming and fixing translate data generation. Thanks rsepassi!
* Replaced manual ops w/ dot_product_attention in masked_local_attention_1d. Thanks dustinvtran!
* Eager tests! Thanks dustinvtran!
* Separate out a [video/](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/video) directory in models/. Thanks lukaszkaiser!
* Speed up RL test - thanks lukaszkaiser!

Bug Fixes:

* Don't daisy-chain variables in Universal Transformer. Thanks lukaszkaiser!
* Corrections to mixing, dropout and sampling in autoencoders. Thanks lukaszkaiser!
* WSJ parsing only to use 1000 examples for building vocab.
* Fixed scoring crash on empty targets. Thanks David Grangier!
* Bug fix in transformer_vae.py

Enhancements to MTF, Video Models and much more!

### 1.8.0

Introducing [**MeshTensorFlow**](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/mesh_tensorflow/README.md) - this enables training really big models, with O(billions) of parameters.

Models/Layers:

* Layers Added: NAC and NALU from https://arxiv.org/abs/1808.00508 Thanks lukaszkaiser!
* Added a [sparse graph neural net message passing layer](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/layers/common_layers.py) to tensor2tensor.
* Targeted dropout added to ResNet. Thanks aidangomez!
* Added VQA models in `models/research/vqa_*`
* Added [`Weight Normalization`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/layers/common_layers.py) layer from https://arxiv.org/abs/1602.07868.

Datasets/Problems:

* MSCoCo paraphrase problem added by tlatkowski - many thanks!
* `VideoBairRobotPushingWithActions` by mbz!

Usability:

* Code cleanup in autoencoder, works both on image and text. Thanks lukaszkaiser
* Set the default value of Text2TextProblem.max_subtoken_length to 200; this prevents very long vocabulary generation times. Thanks afrozenator
* Add examples to distributed_training.md, update support for async training, and simplify run_std_server codepath. Thanks rsepassi!
* Store variable scopes in T2TModel; add T2TModel.initialize_from_ckpt. Thanks rsepassi!
* Undeprecate exporting the model from the trainer. Thanks gcampax!
* Doc fixes, thanks to stefan-it :)
* Added t2t_prune: simple magnitude-based pruning script for T2T. Thanks aidangomez!
* Added task sampling support for more than two tasks. Thanks urvashik!

Bug Fixes:

* Override serving_input_fn for video problems.
* `StackWrapper` eliminates problem with repeating actions. Thanks blazejosinski!
* Calculated lengths of sequences using _raw in lstm.py
* Update universal_transformer_util.py to fix TypeError. Thanks zxqchat!

Testing:

* Serving tests re-enabled on Travis using Docker. Thanks rsepassi!

Many more fixes, tests and work on RL, Glow, SAVP, Video and other models and problems.
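The `t2t_prune` entry in this release describes simple magnitude-based pruning. The general technique, zeroing the fraction of weights with the smallest absolute values, can be sketched in plain Python as below. This only illustrates the idea and is not the script's code; `prune_by_magnitude` and its `sparsity` parameter are hypothetical names.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of entries to zero, between 0.0 and 1.0.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned
```

For example, `prune_by_magnitude([0.5, -0.1, 2.0, 0.05], 0.5)` zeroes the two entries closest to zero and leaves `0.5` and `2.0` untouched.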

### 1.7.0

* Added a MultiProblem class for Multitask Learning. Thanks urvashik!
* Added decoding option to pass through the features dictionary to predictions. Thanks rsepassi!
* Enabled MLEngine path to use Cloud TPUs. Thanks rsepassi!
* Added a simple One-Hot Symbol modality. Thanks mbz!
* Added Cleverhans integration. Thanks aidangomez!
* Problem definitions added for:
  * Allen Brain Atlas problems. Thanks cwbeitel!
  * [LSUN Bedrooms](http://lsun.cs.princeton.edu/2017/) dataset.
  * Added various NLP datasets. Thanks urvashik!
    * [MSR Paraphrase Corpus](https://www.microsoft.com/en-us/download/details.aspx?id=52398),
    * [Quora Question Pairs](https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs),
    * [Stanford Sentiment Treebank](https://nlp.stanford.edu/sentiment/treebank.html),
    * [Question Answering NLI classification problems](https://gluebenchmark.com/tasks),
    * [Recognizing Textual Entailment](https://gluebenchmark.com/tasks),
    * [Corpus of Linguistic Acceptability](https://gluebenchmark.com/tasks),
    * [Winograd NLI](https://gluebenchmark.com/tasks).
  * Added a data generator for WSJ parsing.
* Model additions:
  * Implemented Targeted Dropout for Posthoc Pruning. Thanks aidangomez!
  * Added self attention to VQA attention model.
  * Added fast block parallel transformer model
  * Implemented auxiliary losses from [Stochastic Activation Pruning for Robust Adversarial Defense](https://arxiv.org/abs/1803.00144). Thanks alexyku!
  * Added probability based scheduled sampling for SV2P problem. Thanks mbz!
  * Reimplemented Autoencoder and Eval. Thanks piotrmilos!
  * Relative memory efficient unmasked self-attention.
* Notable bug fixes:
  * Bug with data_gen in style transfer problem. Thanks tlatkowski!
  * wmt_enfr dataset should not use vocabulary based on "small" dataset. Thanks nshazeer!
* **Many more fixes, tests and work on Model based RL, Transformer, Video and other models and problems.**

### 1.6.6

* Added Mozilla Common Voice as a Problem, a style transfer one, and others!
* Improvements to ASR data preprocessing (thanks to jarfo)
* Decoding works for Transformer on TPUs and for timeseries problems
* Corrections and refactoring of the RL part
* Removed deprecated Experiment API code, and support SessionRunHooks on TPU.
* Many other corrections and work on video problems, latent variables and others

Great thanks to everyone!

### 1.6.5

* `registry.hparams` now returns an `HParams` object instead of a function that returns an `HParams` object
* New `MultistepAdamOptimizer` thanks to fstahlberg
* New video models and problems and improvements to `VideoProblem`
* Added `pylintrc` and lint tests to Travis CI
* Various fixes, improvements, and additions
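The `registry.hparams` change means callers no longer invoke the looked-up value themselves. The toy registry below mimics the before and after behaviour in plain Python. Every name in it (`HParams` as a stand-in class, `_HPARAMS_FNS`, `register_hparams`, `hparams_old`, `hparams_new`, `my_hparams_set`) is hypothetical and does not reproduce T2T's actual registry code.

```python
class HParams(dict):
    """Minimal stand-in for an hparams container (hypothetical)."""


_HPARAMS_FNS = {}


def register_hparams(fn):
    """Record a function that builds an HParams object under its own name."""
    _HPARAMS_FNS[fn.__name__] = fn
    return fn


@register_hparams
def my_hparams_set():
    return HParams(hidden_size=512, num_layers=6)


def hparams_old(name):
    """Old-style lookup: returns the function, so the caller must call it."""
    return _HPARAMS_FNS[name]


def hparams_new(name):
    """New-style lookup: returns the constructed object directly."""
    return _HPARAMS_FNS[name]()


old = hparams_old("my_hparams_set")()  # note the extra call
new = hparams_new("my_hparams_set")    # no extra call needed
```

Code written against the old behaviour would need the trailing `()` removed after upgrading.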

### 1.6.3

* `--random_seed` is unset by default now. Set it to an integer value to get reproducible results.
* [bAbI text understanding tasks added](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/babi_qa.py)
* Have the ML Engine and TPU codepaths use TF 1.8
* Various cloud-related bug fixes
* `WikisumWeb` data generation fixes
* Various other fixes
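The `--random_seed` note follows the usual rule that a fixed integer seed gives repeatable pseudo-random draws, while an unset seed does not. The sketch below shows that rule with Python's standard `random` module rather than T2T itself; `sample_run` is a hypothetical helper.

```python
import random


def sample_run(seed=None, n=3):
    """Draw n pseudo-random numbers; seed=None stands for an unset seed."""
    rng = random.Random(seed)  # None seeds from OS entropy or the clock
    return [rng.random() for _ in range(n)]


# Two runs with the same integer seed produce identical draws.
assert sample_run(seed=1234) == sample_run(seed=1234)
```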

### 1.6.2

* Lambada and wikitext103 datasets.
* ASR model with Transformer and iPython notebook.
* Many other improvements including RL code, autoencoders, the latent transformer (transformer_vae) and more.

### 1.6.1

### 1.6.0

* `--problems` command-line flag renamed to `--problem`
* `hparams.problems` renamed to `hparams.problem_hparams` and `hparams.problem_instances` renamed to `hparams.problem` (and neither are lists now)
* Dropped support for TensorFlow 1.4
* Various additions, fixes, etc.

### 1.5.7

* Distillation codepath added
* Improved support for serving language models
* New `TransformerScorer` model that returns the log prob of targets on `infer`
* Support for `bfloat16` weights and activations on TPU
* SRU gate added to `common_layers`
* `--checkpoint_path` supported in interactive decoding
* Improved support for multiple outputs
* `VideoProblem` base class
* Various fixes, additions, etc.

### 1.5.6

* Scalar summary support on TPUs
* New `Squad` and `SquadConcat` problem for question answering (and relevant base class)
* New video problems
* `bfloat16` support for `Transformer` on TPUs
* New `SigmoidClassLabelModality` for binary classification
* Support batch prediction with Cloud ML Engine
* Various fixes, improvements, additions

### 1.5.5

* Updates to experimental RL codebase
* `ImageTransformer` on TPU
* Various updates, fixes, additions, etc.

### 1.5.4

* Updates to the RL codebase
* Tests updated to use TensorFlow 1.6
* Various fixes, additions, etc.

### 1.5.3

* More flexible Cloud ML Engine usage thanks to bbarnes52
* Fixes thanks to stefan-it, wes-turner, deasuke, bwilbertz
* Various other additions, fixes, etc.

### 1.5.2

**Note**: The `Text2TextProblem` has been refactored so if you have subclassed it you may need to rename some methods. Some vocabulary files may need to be renamed as well.

* `Text2TextProblem`, `Text2ClassProblem` and `Text2SelfProblem` base classes make specifying new text-based problems easy. See [text_problems.py](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/text_problems.py).
* New models and problems, including for image generation and speech-to-text
* Various bug fixes, feature additions, improvements, etc.
* Test model export and serving for Python 2.7 and TensorFlow 1.5
* Update Travis tests to test against TensorFlow version 1.4, 1.5, and 1.6

### 1.5.1

* TF 1.4 compatibility bug fix for Cloud ML Engine

### 1.5.0

* Launch training on [Cloud TPUs](https://github.com/tensorflow/tensor2tensor/blob/master/docs/cloud_tpu.md)
* Launch training and hyperparameter tuning on [Cloud ML Engine](https://github.com/tensorflow/tensor2tensor/blob/master/docs/cloud_mlengine.md)
* New [`models/research`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/research) subdirectory for more experimental models
* Some documentation updates
* Bug fixes

### 1.4.4

* Cloud ML Engine support added
* New experimental RL module thanks to piotrmilos
* Various bug fixes, improvements, etc.

### 1.4.3

**Note**: Tensor2Tensor now requires TensorFlow 1.5.

* Working `t2t-bleu` thanks to martinpopel
* Improvements to image models: `resnet`, `revnet`, and `shake_shake`
* Image problems refactor: faster input pipeline, richer ImageNet data preprocessing. Note that `ImageModality.bottom` no longer normalizes images; that's now done in the input pipeline.
* Improvements for running on Google's Cloud TPUs, coming to you soon...
* Various bug fixes, improvements, and additions

### 1.4.2

* New [export method](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/serving) for exporting to TensorFlow Serving
* [Script for BLEU evaluation](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/bin/t2t_bleu.py) thanks to martinpopel
* Better TensorBoard metrics (what was removed has returned), with options to summarize gradients (`--hparams='summarize_grads=True'`)
* Various bug fixes, doc updates, new features, as usual

Internals:

* Scripts in `bin/` are now thin and executable
* Main training utility library moved to [`trainer_lib.py`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/trainer_lib.py)

### 1.4.1

* Support for multi-device evaluation
* Support for early stopping in distributed training
* Refactor Librispeech problem to use a new speech recognition base class

### 1.4.0

This release is a significant refactor of T2T internals.

* [`T2TModel`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/t2t_model.py) subclasses now have the ability to override the entire Estimator model function with the `estimator_model_fn` method, making them much more flexible. Subclasses can also now override `bottom`, `body`, `top`, `loss`, and `optimize`.
* [`Problem`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/problem.py) subclasses now have the ability to override the entire Estimator input function with the `input_fn` method, making them much more flexible.
* The key components of the trainer and decoder - `Experiment`, `Estimator`, `RunConfig`, `HParams` - are all much more easily constructed and used by library callers through [`tpu_trainer_lib.py`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/tpu/tpu_trainer_lib.py).
* We decided to drop support for MultiModel, i.e. training on multiple problems, because it added too much code complexity for the benefit gained. We will consider adding support back in a way that doesn't overcomplicate things too much if there's sufficient interest.

There are also the usual new models, feature improvements, bug fixes.

* New `image_fashion_mnist` dataset
* New `revnet104` model, implementing a large [Reversible Residual Network](https://arxiv.org/abs/1707.04585)
* Set `--decode_hparams=write_beam_scores=True` to include beam scores when writing to a file
* Beginnings of new interactive visualization server at [insights/](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/insights)