Added
- Added support for LHUC in RNN models (David Vilar, "Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models", NAACL 2018).
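LHUC adapts a trained model by learning one extra scaling parameter per hidden unit while the rest of the network stays fixed. A minimal numpy sketch of the idea (function name and shapes are illustrative, not Sockeye's internal API):

```python
import numpy as np

def apply_lhuc(hidden, lhuc_params):
    # Per-unit amplitude in (0, 2): 2 * sigmoid(learned parameter).
    amplitude = 2.0 / (1.0 + np.exp(-lhuc_params))
    return hidden * amplitude

hidden = np.random.randn(4, 8)   # batch of 4 hidden states, 8 units each
params = np.zeros(8)             # amplitude 1.0 -> identity before adaptation
assert np.allclose(apply_lhuc(hidden, params), hidden)
```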
Fixed
- Word-based batching with very small batch sizes.
1.18.6
Fixed
- Fixed a problem with the learning rate scheduler not being properly loaded when resuming training.
1.18.5
Fixed
- Fixed a problem with the trainer not waiting for the last checkpoint decoder (#367).
1.18.4
Added
- Added options to control training length w.r.t. the number of updates/batches or the number of samples: `--min-updates`, `--max-updates`, `--min-samples`, `--max-samples`.
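A hypothetical invocation combining these limits; the data paths and other arguments are placeholders, only the four length-control flags are the ones added here:

```python
import subprocess

subprocess.run([
    "python", "-m", "sockeye.train",
    "--source", "train.src",             # placeholder training data
    "--target", "train.trg",
    "--validation-source", "dev.src",    # placeholder validation data
    "--validation-target", "dev.trg",
    "--output", "model",
    "--min-updates", "10000",            # train for at least 10k updates/batches
    "--max-updates", "100000",           # and stop after at most 100k
], check=True)
```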
1.18.3
Changed
- Training now supports training and validation data that contains empty segments. If a segment is empty, it is skipped during loading and a warning message including the number of empty segments is printed.
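A rough sketch of this loading behavior (illustrative only; Sockeye performs the filtering internally during data preparation):

```python
import logging

logger = logging.getLogger(__name__)

def filter_empty_segments(source_lines, target_lines):
    # Drop sentence pairs where either side is empty and warn about the count.
    kept, skipped = [], 0
    for src, trg in zip(source_lines, target_lines):
        if not src.strip() or not trg.strip():
            skipped += 1
            continue
        kept.append((src, trg))
    if skipped:
        logger.warning("Skipped %d empty segment(s) during loading.", skipped)
    return kept
```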
1.18.2
Changed
- Removed combined linear projection of keys & values in source attention transformer layers for performance improvements.
- The topk operator is performed in a single operation during batch decoding instead of running in a loop over each sentence, bringing speed benefits in batch decoding.
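The topk change replaces a per-sentence loop with one call over the whole batch of hypotheses. A rough numpy illustration of the batched idea (the actual implementation uses MXNet operators on the decoder's score matrix; names and shapes here are illustrative):

```python
import numpy as np

def batched_topk(scores, k):
    # One top-k over all rows of a (batch * beam, vocab) cost matrix,
    # instead of calling top-k once per sentence. Lower cost is better.
    idx = np.argpartition(scores, k, axis=1)[:, :k]          # k best columns per row
    vals = np.take_along_axis(scores, idx, axis=1)
    order = np.argsort(vals, axis=1)                         # sort the k candidates
    return np.take_along_axis(idx, order, axis=1), np.take_along_axis(vals, order, axis=1)

scores = np.random.rand(6, 100)   # e.g. batch of 2 sentences x beam of 3
best_ids, best_costs = batched_topk(scores, k=3)
```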