MLX

Latest version: v0.12.0

0.12.0

Highlights

* Faster quantized matmul
* Up to 40% faster QLoRA or prompt processing, [some numbers](https://github.com/ml-explore/mlx/pull/1030#issuecomment-2075606627)

Core

* `mx.synchronize` to wait for computation dispatched with `mx.async_eval`
* `mx.radians` and `mx.degrees`
* `mx.metal.clear_cache` to return the memory MLX holds as an allocation cache back to the OS
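
A minimal sketch of how these additions fit together; shapes and values are only illustrative:

```python
import mlx.core as mx

# Dispatch a computation without blocking, then wait for it explicitly.
x = mx.random.normal((1024, 1024))
y = x @ x
mx.async_eval(y)    # launch evaluation asynchronously
mx.synchronize()    # block until the dispatched work completes

# New element-wise angle conversions.
deg = mx.array([0.0, 90.0, 180.0])
rad = mx.radians(deg)    # -> [0, pi/2, pi]
back = mx.degrees(rad)   # -> [0, 90, 180]

# Return the memory MLX holds as an allocation cache back to the OS.
mx.metal.clear_cache()
```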

Bugfixes

* Fixed quantization of a block with all 0s that produced NaNs

0.11.0

Core

- `mx.block_masked_mm` for block-level sparse matrix multiplication (sketch below)
- Shared events for synchronization and asynchronous evaluation
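
A hedged sketch of block-masked matmul. The keyword names (`block_size`, `mask_out`) and the supported block sizes are assumptions to verify against the mlx docs for your version:

```python
import mlx.core as mx

M, K, N, block = 256, 256, 256, 64

a = mx.random.normal((M, K))
b = mx.random.normal((K, N))

# Boolean mask over the (M/block, N/block) output tiles: only tiles marked
# True are computed; the rest are left as zeros.
out_mask = mx.random.uniform(shape=(M // block, N // block)) > 0.5

c = mx.block_masked_mm(a, b, block_size=block, mask_out=out_mask)
mx.eval(c)
```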

NN

- `nn.QuantizedEmbedding` layer
- `nn.quantize` for quantizing modules (sketch below)
- `gelu_approx` uses tanh for consistency with PyTorch
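
A rough sketch of quantizing a module tree and constructing a quantized embedding directly. The `group_size`/`bits` values and the exact set of layers `nn.quantize` converts are assumptions to check against the docs:

```python
import mlx.core as mx
import mlx.nn as nn

# nn.quantize walks the module tree and swaps supported layers
# (e.g. nn.Linear, nn.Embedding) for their quantized equivalents in place.
model = nn.Sequential(
    nn.Embedding(1000, 64),
    nn.Linear(64, 64),
    nn.Linear(64, 10),
)
nn.quantize(model, group_size=64, bits=4)

# A quantized embedding table can also be built directly.
emb = nn.QuantizedEmbedding(1000, 64, group_size=64, bits=4)
vectors = emb(mx.array([1, 2, 3]))
```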

0.10.0

0.9.0

Highlights
- Fast partial RoPE (used by Phi-2)
- Fast gradients for RoPE, RMSNorm, and LayerNorm
- Up to 7x faster, [benchmarks](https://github.com/ml-explore/mlx/pull/883#issue-2204137982)

Core
- More overhead reductions
- Fast partial RoPE (used by Phi-2)
- Better buffer donation for copy
- Type hierarchy and `mx.issubdtype` (sketch after this list)
- Fast VJPs for RoPE, RMSNorm, and LayerNorm
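
A short sketch of the dtype hierarchy checks; the category names (`mx.floating`, `mx.integer`) mirror NumPy's and are assumptions to verify:

```python
import mlx.core as mx

x = mx.zeros((4,), dtype=mx.float16)

# Dtype hierarchy queries in the spirit of numpy.issubdtype.
assert mx.issubdtype(x.dtype, mx.floating)
assert mx.issubdtype(mx.int8, mx.integer)
assert not mx.issubdtype(mx.float32, mx.integer)
```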

NN
- `Module.set_dtype`
- Chaining in `nn.Module` (`model.freeze().update(…)`)
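
A minimal sketch of the new `Module` conveniences; the layer and the updated parameter are placeholders:

```python
import mlx.core as mx
import mlx.nn as nn

model = nn.Linear(32, 32)

# Cast the module's floating-point parameters.
model.set_dtype(mx.float16)

# Methods such as freeze() now return the module, so calls chain.
model.freeze().update({"bias": mx.zeros((32,), dtype=mx.float16)})
```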

Bugfixes
- Fix set item bugs
- Fix scatter vjp
- Check shape integer overflow on array construction
- Fix bug with module attributes
- Fix two bugs for odd-shaped QMV (quantized matrix-vector multiplication)
- Fix GPU sort for large sizes
- Fix bug in negative padding for convolutions
- Fix bug in multi-stream race condition for graph evaluation
- Fix random normal generation for half precision

0.8.0

Highlights

- More perf!
- `mx.fast.rms_norm` and `mx.fast.layer_norm`
- Switch to Nanobind [substantially reduces overhead](https://github.com/ml-explore/mlx/pull/839#issuecomment-2002659144)
- Up to 4x faster `__setitem__` (e.g. `a[...] = b`)

Core

- `mx.inverse`, CPU only
- vmap over `mx.matmul` and `mx.addmm`
- Switch to nanobind from pybind11
- Faster `__setitem__` indexing
- [Benchmarks](https://github.com/ml-explore/mlx/pull/861#issuecomment-2010791492)
- `mx.fast.rms_norm`, [token generation benchmark](https://github.com/ml-explore/mlx/pull/862)
- `mx.fast.layer_norm`, [token generation benchmark](https://github.com/ml-explore/mlx/pull/870#issuecomment-2013707376) (usage sketch for both fused norms below)
- vmap for inverse and svd
- Faster non-overlapping pooling
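
A sketch of the fused normalization kernels; the argument order (`x, weight, [bias,] eps`) is an assumption to check against the docs:

```python
import mlx.core as mx

x = mx.random.normal((8, 512))
weight = mx.ones((512,))
bias = mx.zeros((512,))

# Fused RMSNorm: normalize by the root-mean-square over the last axis,
# then scale by `weight`.
y = mx.fast.rms_norm(x, weight, eps=1e-5)

# Fused LayerNorm: mean/variance normalization over the last axis,
# followed by scale and shift.
z = mx.fast.layer_norm(x, weight, bias, eps=1e-5)
```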

Optimizers
- Set minimum value in cosine decay scheduler
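
A small sketch of the scheduler change, assuming the new floor is the third argument to `cosine_decay`:

```python
import mlx.optimizers as optim

# Cosine decay from 1e-3 down to a floor of 1e-5 over 10000 steps,
# attached to an optimizer as a learning-rate schedule.
lr_schedule = optim.cosine_decay(1e-3, 10000, 1e-5)
optimizer = optim.Adam(learning_rate=lr_schedule)
```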

Bugfixes
- Fix bug in multi-dimensional reduction

0.7.0

Highlights
- Perf improvements for attention ops:
  - No-copy broadcast matmul ([benchmarks](https://github.com/ml-explore/mlx/pull/801#issuecomment-1989548617))
  - Fewer copies in reshape

Core

- Faster broadcast + gemm
- [benchmarks](https://github.com/ml-explore/mlx/pull/801#issuecomment-1989548617)
- `mx.linalg.svd` (CPU only; sketch below)
- Fewer copies in reshape
- Faster small reductions
- [benchmarks](https://github.com/ml-explore/mlx/pull/826#issue-2182833003)
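
A brief sketch of the CPU-only SVD, assuming it returns the full `(U, S, Vt)` factorization:

```python
import mlx.core as mx

a = mx.random.normal((6, 4))

# SVD currently runs on the CPU, so route it to the CPU stream explicitly.
U, S, Vt = mx.linalg.svd(a, stream=mx.cpu)

# Sanity-check the factorization: A ≈ U[:, :4] @ diag(S) @ Vt.
recon = (U[:, :4] * S) @ Vt
mx.eval(recon)
```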

NN
- `nn.RNN`, `nn.LSTM`, `nn.GRU`
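
A minimal sketch of the recurrent layers; the constructor argument names and return conventions are assumptions to verify against the docs:

```python
import mlx.core as mx
import mlx.nn as nn

x = mx.random.normal((2, 16, 32))   # (batch, sequence, features)

rnn = nn.GRU(input_size=32, hidden_size=64)
h = rnn(x)                          # hidden states for every time step

lstm = nn.LSTM(input_size=32, hidden_size=64)
h, c = lstm(x)                      # hidden and cell states per time step
```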

Bugfixes
- Fix bug in depth traversal ordering
- Fix two edge case bugs in compilation
- Fix bug with modules with dictionaries of weights
- Fix bug with scatter which broke MoE training
- Fix bug with compilation kernel collision
