Added
- Added support for the Translation Error Rate (TER) metric as implemented in sacrebleu==1.4.14. Checkpoint decoder metrics now include TER scores, and early stopping can be based on TER improvements (`--optimized-metric ter`).
3.0.13
Changed
- Use `expand` instead of `repeat` for attention masks to avoid allocating additional memory.
- Avoid repeated `transpose` calls when initializing cached encoder-attention states in the decoder.
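To illustrate why `expand` saves memory compared to `repeat`, here is a minimal PyTorch sketch. The tensor shapes and variable names are hypothetical and are not taken from the Sockeye code; the sketch only shows the general behavior of the two tensor methods.

```python
import torch

# Hypothetical shapes for illustration only: 2 sentences, 4 attention heads,
# 5 source positions. These are not Sockeye's actual mask shapes.
batch, heads, src_len = 2, 4, 5
mask = torch.zeros(batch, 1, src_len, dtype=torch.bool)

# expand returns a view: the size-1 dimension is broadcast with stride 0,
# so no new memory is allocated for the extra heads.
expanded = mask.expand(batch, heads, src_len)

# repeat materializes a new tensor that physically copies the data per head.
repeated = mask.repeat(1, heads, 1)

shares_memory = expanded.data_ptr() == mask.data_ptr()
head_stride = expanded.stride(1)
```

Both tensors hold the same values and have the same shape; only `repeated` owns a separate buffer. Because an expanded tensor is a view onto shared memory, it should not be written to in place.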
3.0.12
Removed
- Removed unused code for Weight Normalization. Minor code cleanups.
3.0.11
Fixed
- Fixed training with a single, fixed learning rate instead of a rate scheduler (`--learning-rate-scheduler none --initial-learning-rate ...`).
3.0.10
Changed
- The `decode_step` of the Sockeye model is now traced end to end. This reduces overhead during decoding and yields a small speedup.
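As a rough illustration of what tracing a decoding step involves, the following sketch traces a toy module with `torch.jit.trace`. It assumes that "trace" in the entry above refers to PyTorch's JIT tracing; `ToyStep` and its layers are hypothetical stand-ins and do not mirror Sockeye's actual `decode_step`.

```python
import torch


class ToyStep(torch.nn.Module):
    """Hypothetical single decoding step: embed the previous word, update the state."""

    def __init__(self, vocab_size: int = 10, hidden: int = 4):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, hidden)
        self.proj = torch.nn.Linear(hidden, vocab_size)

    def forward(self, prev_word: torch.Tensor, state: torch.Tensor):
        new_state = torch.tanh(self.embed(prev_word) + state)
        return self.proj(new_state), new_state


torch.manual_seed(0)
step = ToyStep().eval()
prev_word = torch.tensor([1, 2])  # previous target word ids for a batch of 2
state = torch.zeros(2, 4)         # toy decoder state

# Tracing records the operations run on the example inputs into one graph,
# so later calls skip most of the per-operation Python overhead.
traced_step = torch.jit.trace(step, (prev_word, state))

with torch.no_grad():
    eager_logits, _ = step(prev_word, state)
    traced_logits, _ = traced_step(prev_word, state)
```

The traced module is called like the original one and should return the same values for inputs of the traced types and ranks. Tracing only records the operations executed for the example inputs, so data-dependent Python control flow is not captured.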
3.0.9
Fixed
- Fixed not calling the traced target embedding module during inference.