Changelogs » Tensor2tensor



* New `RTransformer` model, a recurrent Transformer
* New English-Estonian translation dataset, thanks to stefan-it
* New `ROC_AUC` metric thanks to jjtan
* Various fixes, improvements, additions, etc.
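
For readers unfamiliar with the `ROC_AUC` metric mentioned above, here is a minimal, framework-free sketch of how the area under the ROC curve can be computed from labels and scores using the rank-sum identity. The function name `roc_auc` is hypothetical, and this is not the Tensor2Tensor implementation, only an illustration of what the metric measures.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 ground-truth values.
    scores: sequence of predicted scores, higher meaning "more positive".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks by score, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative example")

    # AUC is the fraction of (positive, negative) pairs ranked correctly,
    # with ties counted as half.
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

A value of 1.0 means every positive example is scored above every negative one, and 0.5 corresponds to random ranking.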


* Bug fixes in the insight server thanks to haukurb!
* Fix weights initialization in #1196 by mikeymezher - thanks!
* Fix Universal Transformer convergence by MostafaDehghani and rllin-fathom in #1194 and #1192 - thanks!
* Fix: add problem hparams after parsing the overrides in #1053, thanks gcampax!
* Fix error of passing the wrong dir in #1185 by stefan-falk, thanks!

New Problems:
* Wikipedia Multiproblems by urvashik - thanks!
* New LM problems in de, fr, ro by lukaszkaiser - thanks!

* Continual addition to Model Based RL by piotrmilos, konradczechowski, koz4k and blazejosinski!

Video Models:
* Many continual updates thanks to mbz and MechCoder - thanks all!


- MTF code in Tensor2Tensor has been moved to - thanks dustinvtran

New Problems:
- English-Setswana translation problem, thanks jaderabbit

New layers, models, etc:
- Add Bayesian feedforward layer, thanks dustinvtran
- Lots of changes to the RL pipeline, thanks koz4k, blazejosinski, piotrmilos, lukaszkaiser, konradczechowski
- Lots of work on video models, thanks mbz, MechCoder
- Image transformer with local 1D and local 2D spatial partitioning, thanks nikiparmar, vaswani

- Support DistributionStrategy in Tensor2Tensor for multi-GPU, thanks smit-hinsu!
- Pass data_dir to feature_encoders, thanks stefan-falk
- variable_scope wrapper for avg_checkpoints, thanks Mehrad0711
- Modalities cleanup, thanks dustinvtran
- Avoid NaN while adding sinusoidal timing signals, thanks peakji
- Avoid an ASCII codec error in CNN/DailyMail, thanks shahzeb1
- Allow exporting T2T models as tfhub modules, thanks cyfra
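
To illustrate the "Avoid NaN while adding sinusoidal timing signals" entry above, the following is a minimal pure-Python sketch of a sinusoidal timing signal. It assumes the NaN came from dividing by `num_timescales - 1` when only one timescale is used; that cause, the function name `timing_signal_1d`, and its parameters are assumptions made for illustration, not a copy of the actual fix.

```python
import math


def timing_signal_1d(length, channels, min_timescale=1.0, max_timescale=1.0e4):
    """Return a length x channels list of sinusoidal position encodings."""
    num_timescales = channels // 2

    # Assumed guard: with a single timescale, (num_timescales - 1) is 0.
    # In floating-point tensor arithmetic that division yields inf, and the
    # later 0 * inf produces NaN, so the divisor is clamped to at least 1.
    log_increment = math.log(max_timescale / min_timescale) / max(num_timescales - 1, 1)

    # Geometrically spaced inverse timescales from 1/min to 1/max.
    inv_timescales = [
        min_timescale * math.exp(-i * log_increment) for i in range(num_timescales)
    ]

    signal = []
    for pos in range(length):
        scaled = [pos * inv for inv in inv_timescales]
        row = [math.sin(s) for s in scaled] + [math.cos(s) for s in scaled]
        if channels % 2:
            row.append(0.0)  # pad odd channel counts with a zero column
        signal.append(row)
    return signal
```

Calling `timing_signal_1d(3, 2)` exercises the single-timescale case that the guard is meant to protect.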


PRs accepted:
* Cleaning up the code for gru/lstm as transition function for universal transformer. Thanks MostafaDehghani!
* Clipwrapper by piotrmilos!
* Corrected transformer spelling mistake. Thanks jurasofish!
* Fix to universal transformer update weights. Thanks cbockman and cyvius96!
* Common Voice problem fixes and refactoring. Thanks tlatkowski!
* Infer observation datatype and shape from the environment. Thanks koz4k!

New Problems / Models:
* Added a simple discrete autoencoder video model. Thanks lukaszkaiser!
* `DistributedText2TextProblem`, a base class for Text2TextProblem for large datasets. Thanks afrozenator!
* Stanford Natural Language Inference problem added as `StanfordNLI`. Thanks urvashik!
* `Text2TextRemotedir` added for problems with a persistent remote directory. Thanks rsepassi!
* Add a separate binary for vocabulary file generation for subclasses of Text2TextProblem. Thanks afrozenator!
* Added support for non-deterministic ATARI modes and sticky keys. Thanks mbz!
* Pretraining schedule added to MultiProblem and reweighting losses. Thanks urvashik!
* `SummarizeWikiPretrainSeqToSeq32k` and `Text2textElmo` added.
* `AutoencoderResidualVAE` added, thanks lukaszkaiser!
* Discriminator changes by lukaszkaiser and aidangomez
* Allow scheduled sampling in basic video model, simplify default video modality. Thanks lukaszkaiser!

Code Cleanups:
* Use standard vocab naming and fix translate data generation. Thanks rsepassi!
* Replaced manual ops with dot_product_attention in masked_local_attention_1d. Thanks dustinvtran!
* Eager tests! Thanks dustinvtran!
* Separate out a `video/` directory in `models/`. Thanks lukaszkaiser!
* Speed up RL test - thanks lukaszkaiser!

Bug Fixes:
* Don't daisy-chain variables in Universal Transformer. Thanks lukaszkaiser!
* Corrections to mixing, dropout and sampling in autoencoders. Thanks lukaszkaiser!
* WSJ parsing only to use 1000 examples for building vocab.
* Fixed scoring crash on empty targets. Thanks David Grangier!
* Bug fix in

Enhancements to MTF, Video Models and much more!


Introducing **MeshTensorFlow**, which enables training really big models with O(billions) of parameters.

* Layers added: NAC and NALU. Thanks lukaszkaiser!
* Added a sparse graph neural net message passing layer to tensor2tensor.
* Targeted dropout added to ResNet. Thanks aidangomez!
* Added VQA models in `models/research/vqa_*`
* Added `Weight Normalization` layer.

* MSCoCo paraphrase problem added by tlatkowski - many thanks!
* `VideoBairRobotPushingWithActions` by mbz!

* Code cleanup in autoencoder, which now works on both image and text. Thanks lukaszkaiser
* Set the default value of Text2TextProblem.max_subtoken_length to 200; this prevents very long vocabulary generation times. Thanks afrozenator
* Add examples to, update support for async training, and simplify run_std_server codepath. Thanks rsepassi!
* Store variable scopes in T2TModel; add T2TModel.initialize_from_ckpt. Thanks rsepassi!
* Undeprecate exporting the model from the trainer. Thanks gcampax!
* Doc fixes, thanks to stefan-it :)
* Added t2t_prune: simple magnitude-based pruning script for T2T. Thanks aidangomez!
* Added task sampling support for more than two tasks. Thanks urvashik!
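
As background for the magnitude-based pruning entry above, here is a minimal sketch of the general technique: zero out the fraction of weights with the smallest absolute values. The helper `magnitude_prune` and its `sparsity` argument are hypothetical names for illustration; the sketch does not reproduce the `t2t_prune` script or its flags.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    weights: flat sequence of floats.
    sparsity: fraction in [0, 1] of weights to set to zero.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)

    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

In practice pruning is usually applied per weight matrix and followed by an evaluation pass to measure the accuracy cost at each sparsity level.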

Bug Fixes:
* Override serving_input_fn for video problems.
* `StackWrapper` eliminates problem with repeating actions. Thanks blazejosinski!
* Calculated lengths of sequences using _raw in
* Update to fix TypeError. Thanks zxqchat!

* Serving tests re-enabled on Travis using Docker. Thanks rsepassi!

Many more fixes, tests and work on RL, Glow, SAVP, Video and other models and problems.


* Added a MultiProblem class for Multitask Learning. Thanks urvashik!
* Added decoding option to pass through the features dictionary to predictions. Thanks rsepassi!
* Enabled MLEngine path to use Cloud TPUs. Thanks rsepassi!
* Added a simple One-Hot Symbol modality. Thanks mbz!
* Added Cleverhans integration. Thanks aidangomez!

* Problem definitions added for:
  * Allen Brain Atlas problems. Thanks cwbeitel!
  * LSUN Bedrooms dataset.
  * Added various NLP datasets. Thanks urvashik!
    * MSR Paraphrase Corpus,
    * Quora Question Pairs,
    * Stanford Sentiment Treebank,
    * Question Answering NLI classification problems,
    * Recognizing Textual Entailment,
    * Corpus of Linguistic Acceptability,
    * Winograd NLI
  * Added a data generator for WSJ parsing.

* Model additions:
  * Implemented Targeted Dropout for Posthoc Pruning. Thanks aidangomez!
  * Added self attention to VQA attention model.
  * Added fast block parallel transformer model
  * Implemented auxiliary losses from "Stochastic Activation Pruning for Robust Adversarial Defense". Thanks alexyku!
  * Added probability based scheduled sampling for SV2P problem. Thanks mbz!
  * Reimplemented Autoencoder and Eval. Thanks piotrmilos!
  * Relative memory efficient unmasked self-attention.

* Notable bug fixes:
  * Bug with data_gen in style transfer problem. Thanks tlatkowski!
  * wmt_enfr dataset should not use vocabulary based on "small" dataset. Thanks nshazeer!

* **Many more fixes, tests and work on Model based RL, Transformer, Video and other models and problems.**


* added Mozilla Common Voice as a Problem, a style transfer problem, and others!
* improvements to ASR data preprocessing (thanks to jarfo)
* decoding works for Transformer on TPUs and for timeseries problems
* corrections and refactoring of the RL part
* Removed deprecated Experiment API code, and support SessionRunHooks on TPU.
* many other corrections and work on video problems, latent variables and more

Great thanks to everyone!


* `registry.hparams` now returns an `HParams` object instead of a function that returns an `HParams` object
* New `MultistepAdamOptimizer` thanks to fstahlberg
* New video models and problems and improvements to `VideoProblem`
* Added `pylintrc` and lint tests to Travis CI
* Various fixes, improvements, and additions
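
The `registry.hparams` change above affects callers that previously had to invoke the returned function themselves. The following toy registry sketches the before and after behaviour. Everything in it (`register_hparams`, `hparams_old`, `hparams_new`, `my_tiny_hparams`, and the use of `SimpleNamespace` as a stand-in for `HParams`) is hypothetical and only mimics the shape of the change, not the library's code.

```python
from types import SimpleNamespace

_HPARAMS_FNS = {}  # toy registry: name -> function that builds hyperparameters


def register_hparams(fn):
    """Decorator that records a hyperparameter-building function by name."""
    _HPARAMS_FNS[fn.__name__] = fn
    return fn


def hparams_old(name):
    """Old behaviour: hand back the registered function itself."""
    return _HPARAMS_FNS[name]


def hparams_new(name):
    """New behaviour: call the function and hand back the resulting object."""
    return _HPARAMS_FNS[name]()


@register_hparams
def my_tiny_hparams():
    # SimpleNamespace stands in for an HParams object here.
    return SimpleNamespace(hidden_size=64, num_layers=2)


old_style = hparams_old("my_tiny_hparams")()  # caller had to add the trailing ()
new_style = hparams_new("my_tiny_hparams")    # already the hyperparameter object
```

Code written against the old behaviour would need its trailing call removed, since calling the returned object a second time would fail.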


* `--random_seed` is unset by default now. Set it to an integer value to get reproducible results.
* bAbI text understanding tasks added
* Have the ML Engine and TPU codepaths use TF 1.8
* Various cloud-related bug fixes
* `WikisumWeb` data generation fixes
* Various other fixes
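
The `--random_seed` note above describes a common pattern: leave the seed unset for nondeterministic runs and pass an integer for reproducible ones. Here is a minimal standard-library sketch of that pattern. It uses `argparse` and `random` as stand-ins, so the parser and the `sample` helper are illustrative assumptions rather than Tensor2Tensor code.

```python
import argparse
import random


def build_parser():
    parser = argparse.ArgumentParser()
    # Unset by default: None means "do not fix the seed".
    parser.add_argument("--random_seed", type=int, default=None)
    return parser


def sample(argv):
    """Draw three pseudo-random numbers, honouring --random_seed if given."""
    args = build_parser().parse_args(argv)
    # random.Random(None) seeds itself from system entropy, so runs differ;
    # an integer seed makes the sequence repeatable.
    rng = random.Random(args.random_seed)
    return [rng.random() for _ in range(3)]
```

Two calls such as `sample(["--random_seed", "123"])` return the same values, while calls without the flag generally do not.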


* Lambada and wikitext103 datasets.
* ASR model with Transformer and iPython notebook.
* Many other improvements including RL code, autoencoders, the latent transformer (transformer_vae) and more.




* `--problems` command-line flag renamed to `--problem`
* `hparams.problems` renamed to `hparams.problem_hparams` and `hparams.problem_instances` renamed to `hparams.problem` (and neither is a list now)
* Dropped support for TensorFlow 1.4
* Various additions, fixes, etc.


* Distillation codepath added
* Improved support for serving language models
* New `TransformerScorer` model which returns the log prob of targets on `infer`
* Support for `bfloat16` weights and activations on TPU
* SRU gate added to `common_layers`
* `--checkpoint_path` supported in interactive decoding
* Improved support for multiple outputs
* `VideoProblem` base class
* Various fixes, additions, etc.


* Scalar summary support on TPUs
* New `Squad` and `SquadConcat` problems for question answering (and relevant base class)
* New video problems
* `bfloat16` support for `Transformer` on TPUs
* New `SigmoidClassLabelModality` for binary classification
* Support batch prediction with Cloud ML Engine
* Various fixes, improvements, additions


* Updates to experimental RL codebase
* `ImageTransformer` on TPU
* Various updates, fixes, additions, etc.


* Updates to the RL codebase
* Tests updated to use TensorFlow 1.6
* Various fixes, additions, etc.


* More flexible Cloud ML Engine usage thanks to bbarnes52
* Fixes thanks to stefan-it, wes-turner, deasuke, bwilbertz
* Various other additions, fixes, etc.


**Note**: The `Text2TextProblem` has been refactored so if you have subclassed it you may need to rename some methods. Some vocabulary files may need to be renamed as well.

* `Text2TextProblem`, `Text2ClassProblem` and `Text2SelfProblem` base classes make specifying new text-based problems easy.
* New models and problems, including for image generation and speech-to-text
* Various bug fixes, feature additions, improvements, etc.
* Test model export and serving for Python 2.7 and TensorFlow 1.5
* Update Travis tests to test against TensorFlow version 1.4, 1.5, and 1.6


* TF 1.4 compatibility bug fix for Cloud ML Engine


* Launch training on Cloud TPUs
* Launch training and hyperparameter tuning on Cloud ML Engine
* New `models/research` subdirectory for more experimental models
* Some documentation updates
* Bug fixes


* Cloud ML Engine support added
* New experimental RL module thanks to piotrmilos
* Various bug fixes, improvements, etc.


**Note**: Tensor2Tensor now requires TensorFlow 1.5.

* Working `t2t-bleu` thanks to martinpopel
* Improvements to image models: `resnet`, `revnet`, and `shake_shake`
* Image problems refactor: faster input pipeline, richer ImageNet data preprocessing. Note that `ImageModality.bottom` no longer normalizes images; that's now done in the input pipeline.
* Improvements for running on Google's Cloud TPUs, coming to you soon...
* Various bug fixes, improvements, and additions


* New export method for exporting to TensorFlow Serving
* Script for BLEU evaluation thanks to martinpopel
* Better TensorBoard metrics (what was removed has returned), with options to summarize gradients (`--hparams='summarize_grads=True'`)
* Various bug fixes, doc updates, new features, as usual


* Scripts in `bin/` are now thin and executable
* Main training utility library moved to [``](


* Support for multi-device evaluation
* Support for early stopping in distributed training
* Refactor Librispeech problem to use a new speech recognition base class


This release is a significant refactor of T2T internals.

* `T2TModel` subclasses now have the ability to override the entire Estimator model function with the `estimator_model_fn` method, making them much more flexible. Subclasses can also now override `bottom`, `body`, `top`, `loss`, and `optimize`.
* `Problem` subclasses now have the ability to override the entire Estimator input function with the `input_fn` method, making them much more flexible.
* The key components of the trainer and decoder (`Experiment`, `Estimator`, `RunConfig`, `HParams`) are all much more easily constructed and used by library callers.
* We decided to drop support for MultiModel, i.e. training on multiple problems, because it added too much code complexity for the benefit gained. We will consider adding support back in a way that doesn't overcomplicate things too much if there's sufficient interest.

There are also the usual new models, feature improvements, bug fixes.

* New `image_fashion_mnist` dataset
* New `revnet104` model, implementing a large Reversible Residual Network
* Set `--decode_hparams=write_beam_scores=True` to include beam scores when writing to a file
* Beginnings of a new interactive visualization server at `insights/`


* Small improvements for attention visualization in Colab.


* Improvements for TF Eager compatibility


* **WARNING**: Checkpoints produced with old versions break with this new release due to new variable scoping
* Various changes make T2T models and problems compatible with the new TF Eager mode - we'll have more on that soon
* `tpu_trainer` becoming more fully featured
* Internal refactoring moving towards more flexibility in specifying the Estimator `input_fn` and `model_fn`


* Quick bug fixes


* Batch norm should now work in T2T - fixed the custom variable getters
* Simplified `ImageModality` and removal of `SmallImageModality`
* Simplified `ClassLabelModality` and removal of `ClassLabel1DModality`
* New modality with CTC loss
* New vanilla_gan model that's a good example of a simple GAN
* TPU advances: Xception, Resnet50, and Transformer verified to work, code path uses Experiment, usage doc for Cloud TPU alpha customers
* Various small fixes, improvements, features