Sockeye

Latest version: v3.1.34


2.1.3

Changed

- Performance optimizations to beam search inference
- Removed unneeded take ops on encoder states
- Gather input data before sending it to the GPU, rather than sending each batch element individually
- All of beam search can be done in fp16, if specified by the model
- Other small miscellaneous optimizations
- Model states are now a flat list in ensemble inference; the structure of the states is provided by `state_structure()`
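The flat-list state handling can be sketched as follows. This is an illustrative stand-in, not Sockeye's actual implementation: each model reports its state layout via a `state_structure()` analogue, so a single flat list of state tensors can be split back into per-model sublists.

```python
def state_structure(num_layers):
    # Hypothetical layout: one state entry per decoder layer.
    return ["kv"] * num_layers

def split_states(flat_states, structures):
    """Split one flat list of states into per-model sublists."""
    out, i = [], 0
    for structure in structures:
        n = len(structure)
        out.append(flat_states[i:i + n])
        i += n
    return out

# Example: an ensemble of two models with 2 and 3 decoder layers.
structures = [state_structure(2), state_structure(3)]
flat = ["s0", "s1", "s2", "s3", "s4"]
per_model = split_states(flat, structures)
# per_model == [["s0", "s1"], ["s2", "s3", "s4"]]
```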

2.1.2

Changed

- Updated to [MXNet 1.6.0](https://github.com/apache/incubator-mxnet/tree/1.6.0)

Added

- Added support for CUDA 10.2

Removed

- Removed support for CUDA < 9.1 / cuDNN < 7.5

2.1.1

Added
- Ability to set environment variables from training/translate CLIs before MXNet is imported. For example, users can
configure MXNet as such: `--env "OMP_NUM_THREADS=1;MXNET_ENGINE_TYPE=NaiveEngine"`
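A minimal sketch of what such a flag has to do: these variables only take effect if set before MXNet is first imported, so the CLI must parse the `--env` string and apply it to the process environment ahead of any heavy imports. The function name here is illustrative, not Sockeye's actual API:

```python
import os

def apply_env_flag(env_arg):
    """Parse 'KEY1=VAL1;KEY2=VAL2' into os.environ before heavy imports."""
    for pair in env_arg.split(";"):
        key, sep, value = pair.partition("=")
        if sep:  # skip malformed entries with no '='
            os.environ[key.strip()] = value.strip()

apply_env_flag("OMP_NUM_THREADS=1;MXNET_ENGINE_TYPE=NaiveEngine")
# Only after this point would the CLI `import mxnet`,
# so the interpreter picks up the settings.
```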

2.1.0

Changed

- Version bump that should have been included in commit b0461b, which introduced incompatible models.

2.0.1

Changed

- Inference defaults to using the max input length observed in training (versus scaling down based on mean length ratio and standard deviations).

Added

- Additional parameter fixing strategies:
  - `all_except_feed_forward`: Only train feed forward layers.
  - `encoder_and_source_embeddings`: Only train the decoder (decoder layers, output layer, and target embeddings).
  - `encoder_half_and_source_embeddings`: Train the latter half of encoder layers and the decoder.
- Option to specify the number of CPU threads without using an environment variable (`--omp-num-threads`).
- More flexible combination of source factors
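One way a name-based fixing strategy like those above can work is to map each strategy to a predicate over parameter names and exclude matching parameters from gradient updates. The sketch below is illustrative: the parameter names and the helper are hypothetical, not Sockeye's actual internals.

```python
def fixed_param_names(strategy, all_names):
    """Return the parameter names to freeze for a given strategy."""
    if strategy == "all_except_feed_forward":
        # Freeze everything except feed forward layers.
        return [n for n in all_names if "ff" not in n]
    if strategy == "encoder_and_source_embeddings":
        # Freeze the encoder and source embeddings; the decoder trains.
        return [n for n in all_names
                if n.startswith("encoder") or n.startswith("source_embed")]
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical parameter names for a tiny model:
names = ["source_embed_weight", "encoder0_att", "encoder0_ff",
         "decoder0_att", "decoder0_ff", "target_embed_weight"]
frozen = fixed_param_names("encoder_and_source_embeddings", names)
# frozen == ["source_embed_weight", "encoder0_att", "encoder0_ff"]
```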

2.0.0

Changed

- Update to [MXNet 1.5.0](https://github.com/apache/incubator-mxnet/tree/1.5.0)
- Moved `SockeyeModel` implementation and all layers to [Gluon API](http://mxnet.incubator.apache.org/versions/master/gluon/index.html)
- Removed support for Python 3.4.
- Removed image captioning module
- Removed outdated Autopilot module
- Removed unused training options: Eve, Nadam, RMSProp, Nag, Adagrad, and Adadelta optimizers, `fixed-step` and `fixed-rate-inv-t` learning rate schedulers
- Updated and renamed learning rate scheduler `fixed-rate-inv-sqrt-t` -> `inv-sqrt-decay`
- Added script for plotting metrics files: [sockeye_contrib/plot_metrics.py](sockeye_contrib/plot_metrics.py)
- Removed option `--weight-tying`. Weight tying is enabled by default, disable with `--weight-tying-type none`.

Added

- Added distributed training support with Horovod/MPI. Use `horovodrun` and the `--horovod` training flag.
- Added Dockerfiles that build a Sockeye image with all features enabled. See [sockeye_contrib/docker](sockeye_contrib/docker).
- Added `none` learning rate scheduler (use a fixed rate throughout training)
- Added `linear-decay` learning rate scheduler
- Added training option `--learning-rate-t-scale` for time-based decay schedulers
- Added support for MXNet's [Automatic Mixed Precision](https://mxnet.incubator.apache.org/versions/master/tutorials/amp/amp_tutorial.html). Activate with the `--amp` training flag. For best results, make sure as many model dimensions as possible are multiples of 8.
- Added options for making various model dimensions multiples of a given value. For example, use `--pad-vocab-to-multiple-of 8`, `--bucket-width 8 --no-bucket-scaling`, and `--round-batch-sizes-to-multiple-of 8` with AMP training.
- Added [GluonNLP](http://gluon-nlp.mxnet.io/)'s BERTAdam optimizer, an implementation of the Adam variant used by Devlin et al. ([2018](https://arxiv.org/pdf/1810.04805.pdf)). Use `--optimizer bertadam`.
- Added training option `--checkpoint-improvement-threshold` to set the amount of metric improvement required over the window of previous checkpoints to be considered actual model improvement (used with `--max-num-checkpoint-not-improved`).
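The thresholded early-stopping check described in the last bullet can be sketched as follows; a checkpoint only counts as an improvement if its metric beats the best value in the comparison window by at least the threshold. This is an illustrative sketch, not Sockeye's actual code.

```python
def improved(history, window, threshold):
    """Return True if the newest checkpoint improved enough.

    `history` lists metric values where higher is better (e.g. BLEU),
    newest last; the newest value is compared against the best of the
    preceding `window` checkpoints.
    """
    if len(history) <= window:
        return True  # not enough history to judge yet
    current = history[-1]
    best_previous = max(history[-1 - window:-1])
    return current - best_previous >= threshold

# With threshold 0.01, a gain of only 0.005 over the window
# does not count as improvement:
assert improved([0.20, 0.21, 0.215], window=2, threshold=0.01) is False
assert improved([0.20, 0.21, 0.23], window=2, threshold=0.01) is True
```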
