Model Memory Estimator
A new model estimation tool has been added to help calculate how much memory is needed for inference. It does not download the pretrained weights, and uses `init_empty_weights` to stay memory efficient during the calculation.
Usage directions:
```bash
accelerate estimate-memory {model_name} --library {library_name} --dtypes fp16 int8
```
Or:
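```python
from accelerate.commands.estimate import estimate_command_parser, estimate_command, gather_data

# Build the same argument parser the CLI uses
parser = estimate_command_parser()
# Estimate for a single model and dtype
args = parser.parse_args(["bert-base-cased", "--dtypes", "float32"])
# Collect the estimation results programmatically
output = gather_data(args)
```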
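As a rough mental model of why the dtype matters for such an estimate, the memory needed just to hold a model's weights is approximately the parameter count multiplied by the bytes per element. The sketch below is not the tool's implementation: `estimate_weight_memory`, `format_size`, and the parameter count are hypothetical names and values chosen for illustration.

```python
# Bytes per element for a few common dtypes.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "int8": 1}


def estimate_weight_memory(num_parameters: int, dtype: str) -> int:
    """Approximate bytes needed to hold the weights alone (hypothetical helper)."""
    return num_parameters * BYTES_PER_DTYPE[dtype]


def format_size(num_bytes: float) -> str:
    """Render a byte count using binary units (1 KB = 1024 B)."""
    for unit in ("B", "KB", "MB", "GB"):
        if num_bytes < 1024:
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.2f} TB"


# Illustrative parameter count, not taken from any specific model.
num_parameters = 100_000_000
for dtype in BYTES_PER_DTYPE:
    size = estimate_weight_memory(num_parameters, dtype)
    print(f"{dtype}: {format_size(size)}")
```

This only accounts for the weights; activations and any framework overhead would come on top of it.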
🤗 Hub is a first-class citizen
We've made the `huggingface_hub` library a first-class citizen of the framework! While this is mainly for the model estimation tool, it opens the door to further integrations should they be wanted.
`Accelerator` Enhancements:
- `gather_for_metrics` will now also de-dupe non-tensor objects. See #1937
- `mixed_precision="bf16"` support on NPU devices. See #1949
- New `breakpoint` API to help when you need to break out of a condition that is met on a single process. See #1940
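To illustrate what de-duplication means when gathering for metrics, here is a toy, single-process simulation. It assumes that the final batch is padded with repeated samples so every process receives a full batch, and that the repeats are dropped after gathering. The function `simulate_gather_for_metrics` is hypothetical and is not how Accelerate implements this.

```python
def simulate_gather_for_metrics(dataset, num_processes, batch_size):
    """Toy simulation: pad the last global batch, gather, then drop the padding."""
    total = len(dataset)
    global_batch = num_processes * batch_size
    remainder = total % global_batch

    # Pad by cycling from the start so every process gets a full batch.
    padded = list(dataset)
    if remainder:
        padded += [dataset[i % total] for i in range(global_batch - remainder)]

    gathered = []
    for start in range(0, len(padded), global_batch):
        batch = padded[start : start + global_batch]
        is_last = start + global_batch >= len(padded)
        if is_last and remainder:
            # Keep only the real samples from the last batch.
            batch = batch[:remainder]
        gathered.extend(batch)
    return gathered


# Non-tensor objects (strings) with a length not divisible by the global batch.
samples = ["a", "b", "c", "d", "e"]
print(simulate_gather_for_metrics(samples, num_processes=2, batch_size=2))
```

Without the truncation step, metrics computed over the gathered list would count the padded samples twice.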
Notebook Launcher Enhancements:
- The notebook launcher now supports launching across multiple nodes! See #1913
FSDP Enhancements:
- Activation checkpointing is now natively supported in the framework. See https://github.com/huggingface/accelerate/pull/1891
- `torch.compile` support was fixed. See #1919
DeepSpeed Enhancements:
- XPU/ccl support (#1827)
- Easier gradient accumulation support: set `gradient_accumulation_steps` to `"auto"` in your DeepSpeed config, and Accelerate will use the value passed to `Accelerator` instead (#1901)
- Support for custom schedulers and DeepSpeed optimizers (#1909)
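For the gradient accumulation item above, a DeepSpeed config using `"auto"` might look like the following minimal sketch. Only the `gradient_accumulation_steps` entry comes from these notes; the `zero_optimization` stage is an illustrative assumption, and a real config would usually carry more settings.

```python
import json

# Minimal DeepSpeed-style config. "auto" defers the value to the one
# passed to `Accelerator` (for example via its
# `gradient_accumulation_steps` argument).
ds_config = {
    "gradient_accumulation_steps": "auto",
    # Illustrative assumption; choose the ZeRO stage that fits your setup.
    "zero_optimization": {"stage": 2},
}

# Serialize to JSON, the format DeepSpeed config files use.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```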
What's Changed
* Update release instructions by sgugger in https://github.com/huggingface/accelerate/pull/1877
* fix detach_hook by SunMarc in https://github.com/huggingface/accelerate/pull/1880
* Enable power users to bypass device_map="auto" training block by muellerzr in https://github.com/huggingface/accelerate/pull/1881
* Introduce model memory estimator by muellerzr in https://github.com/huggingface/accelerate/pull/1876
* Update with new url for explore by muellerzr in https://github.com/huggingface/accelerate/pull/1884
* Enable a token to be used by muellerzr in https://github.com/huggingface/accelerate/pull/1886
* Add doc on model memory usage by muellerzr in https://github.com/huggingface/accelerate/pull/1887
* Add hub as core dep by muellerzr in https://github.com/huggingface/accelerate/pull/1885
* update import of deepspeed integration from transformers by pacman100 in https://github.com/huggingface/accelerate/pull/1894
* Final nits on model util by muellerzr in https://github.com/huggingface/accelerate/pull/1896
* Fix nb launcher test by muellerzr in https://github.com/huggingface/accelerate/pull/1899
* Add FSDP activation checkpointing feature by arde171 in https://github.com/huggingface/accelerate/pull/1891
* Solve at least one failing test by muellerzr in https://github.com/huggingface/accelerate/pull/1898
* Deepspeed integration for XPU/ccl by abhilash1910 in https://github.com/huggingface/accelerate/pull/1827
* Add PR template by muellerzr in https://github.com/huggingface/accelerate/pull/1906
* deepspeed grad_acc_steps fixes by pacman100 in https://github.com/huggingface/accelerate/pull/1901
* Skip pypi transformers until release by muellerzr in https://github.com/huggingface/accelerate/pull/1911
* Fix docker images by muellerzr in https://github.com/huggingface/accelerate/pull/1910
* Use hosted CI runners for building docker images by muellerzr in https://github.com/huggingface/accelerate/pull/1915
* fix: add debug argument to sagemaker configuration by maximegmd in https://github.com/huggingface/accelerate/pull/1904
* improve help info when run `accelerate config` on npu by statelesshz in https://github.com/huggingface/accelerate/pull/1895
* support logging with mlflow in case of mlflow-skinny installed by ghtaro in https://github.com/huggingface/accelerate/pull/1874
* More CI fun - run all test parts always by muellerzr in https://github.com/huggingface/accelerate/pull/1916
* Expose auto in dataclass by muellerzr in https://github.com/huggingface/accelerate/pull/1914
* Add support for deepspeed optimizer and custom scheduler by pacman100 in https://github.com/huggingface/accelerate/pull/1909
* reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP. by statelesshz in https://github.com/huggingface/accelerate/pull/1926
* Check for invalid keys by muellerzr in https://github.com/huggingface/accelerate/pull/1935
* clean num devices by SunMarc in https://github.com/huggingface/accelerate/pull/1936
* Bring back pypi to runners by muellerzr in https://github.com/huggingface/accelerate/pull/1939
* Support multi-node notebook launching by ggaaooppeenngg in https://github.com/huggingface/accelerate/pull/1913
* fix the fsdp docs by pacman100 in https://github.com/huggingface/accelerate/pull/1947
* Fix docs by ggaaooppeenngg in https://github.com/huggingface/accelerate/pull/1951
* Protect tensorflow dependency by SunMarc in https://github.com/huggingface/accelerate/pull/1959
* fix safetensor saving by SunMarc in https://github.com/huggingface/accelerate/pull/1954
* FIX: patch_environment restores pre-existing environment variables when finished by BenjaminBossan in https://github.com/huggingface/accelerate/pull/1960
* Better guards for slow imports by muellerzr in https://github.com/huggingface/accelerate/pull/1963
* [`Tests`] Finish all todos by younesbelkada in https://github.com/huggingface/accelerate/pull/1957
* Rm strtobool by muellerzr in https://github.com/huggingface/accelerate/pull/1964
* Implementing gather_for_metrics with dedup for non tensor objects by Lorenzobattistela in https://github.com/huggingface/accelerate/pull/1937
* add bf16 mixed precision support for NPU by statelesshz in https://github.com/huggingface/accelerate/pull/1949
* Introduce breakpoint API by muellerzr in https://github.com/huggingface/accelerate/pull/1940
* fix torch compile with FSDP by pacman100 in https://github.com/huggingface/accelerate/pull/1919
* Add `force_hooks` to `dispatch_model` by austinapatel in https://github.com/huggingface/accelerate/pull/1969
* update FSDP and DeepSpeed docs by pacman100 in https://github.com/huggingface/accelerate/pull/1973
* Flex fix patch for accelerate by abhilash1910 in https://github.com/huggingface/accelerate/pull/1972
* Remove checkpoints only on main process by Kepnu4 in https://github.com/huggingface/accelerate/pull/1974
New Contributors
* arde171 made their first contribution in https://github.com/huggingface/accelerate/pull/1891
* maximegmd made their first contribution in https://github.com/huggingface/accelerate/pull/1904
* ghtaro made their first contribution in https://github.com/huggingface/accelerate/pull/1874
* ggaaooppeenngg made their first contribution in https://github.com/huggingface/accelerate/pull/1913
* Lorenzobattistela made their first contribution in https://github.com/huggingface/accelerate/pull/1937
* austinapatel made their first contribution in https://github.com/huggingface/accelerate/pull/1969
* Kepnu4 made their first contribution in https://github.com/huggingface/accelerate/pull/1974
**Full Changelog**: https://github.com/huggingface/accelerate/compare/v0.22.0...v0.23.0