These are the release notes for v12.0.0b1. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av12.0.0b1) for the complete list of resolved issues and merged PRs.
We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!
Highlights
Support for CUDA 11.8 & NVIDIA H100 GPUs
This release adds support for CUDA 11.8 and the latest NVIDIA H100 GPUs. Note that CUDA 11.8 support is included in the `cupy-cuda11x` wheel.
Support for Python 3.11
Wheels are now available for Python 3.11.
`ufunc` Methods
This release adds `ufunc.reduce`, `ufunc.accumulate`, `ufunc.reduceat`, and `ufunc.at` methods. See the [documentation](https://docs.cupy.dev/en/latest/reference/ufunc.html#methods) for more details.
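As a rough illustration of the four methods, the sketch below applies them to `cupy.add`. It assumes the methods follow the semantics of their NumPy counterparts, and it falls back to NumPy when no usable GPU is found, purely so that the snippet can run anywhere. The `xp` alias and the sample data are illustrative and not part of CuPy. Only `add` is used because not every ufunc necessarily implements every method in this beta.

```python
# Use CuPy when a GPU is usable; otherwise fall back to NumPy (assumption:
# the ufunc methods below behave the same way in both libraries).
try:
    import cupy as xp
    xp.zeros(1)  # fails here if CUDA is not available
except Exception:
    import numpy as xp

x = xp.arange(8, dtype=xp.int32)  # [0, 1, ..., 7]

total = xp.add.reduce(x)        # sum of all elements
running = xp.add.accumulate(x)  # cumulative sum
# Sums over the slices x[0:4] and x[4:8].
segments = xp.add.reduceat(x, xp.array([0, 4]))

counts = xp.zeros(4, dtype=xp.int32)
# Unbuffered in-place add: index 0 appears twice and is incremented twice.
xp.add.at(counts, xp.array([0, 0, 2]), 1)
```

Unlike `counts[idx] += 1`, the `.at` form accumulates every occurrence of a repeated index, which is the behavior `scatter_add` has provided so far.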
Use Thrust in `cupyx.jit` (7054, 7139)
Thrust library device functions can now be called from kernels written with CuPy JIT.
```python
import cupy, cupyx

@cupyx.jit.rawkernel()
def sort_by_key(x, y):
    # Each thread sorts one row of `x` and reorders the matching row of `y`.
    i = cupyx.jit.threadIdx.x
    x_array = x[i]
    y_array = y[i]
    cupyx.jit.thrust.sort_by_key(
        cupyx.jit.thrust.device,
        x_array.begin(),
        x_array.end(),
        y_array.begin(),
    )

# Build two shuffled 256x256 int32 arrays to use as keys and values.
h, w = (256, 256)
x = cupy.arange(h * w, dtype=cupy.int32)
cupy.random.shuffle(x)
x = x.reshape(h, w)
y = cupy.arange(h * w, dtype=cupy.int32)
cupy.random.shuffle(y)
y = y.reshape(h, w)

# Launch one block of 256 threads, one thread per row.
sort_by_key[1, 256](x, y)
```
The Thrust functions currently supported are `count`, `copy`, `find`, `mismatch`, `sort`, and `sort_by_key`.
Acknowledgements: This work was done by Tsutsui Masayoshi (TsutsuiMasayoshi) as a part of the internship program at [Preferred Networks](https://www.preferred.jp/en/).
Changes without compatibility
Deprecates `ndarray.scatter_{add,max,min}` (7097)
`cupy.ndarray.scatter_{add,max,min}` methods are marked as deprecated. Use the corresponding ufunc methods (`cupy.{add,maximum,minimum}.at`) instead.
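A migration might look like the following sketch. It assumes the `.at` methods take the same `(array, indices, values)` arguments as NumPy's `ufunc.at`; the NumPy fallback exists only so that the snippet runs without a GPU, and the deprecated calls appear in comments for comparison. The arrays and the `xp` alias are illustrative.

```python
try:
    import cupy as xp
    xp.zeros(1)  # fails here if CUDA is not available
except Exception:
    import numpy as xp  # stand-in so the sketch runs without a GPU

idx = xp.array([0, 1, 1])                   # index 1 is repeated on purpose
vals = xp.array([4, 2, 9], dtype=xp.int32)

a = xp.array([1, 5, 3, 7], dtype=xp.int32)
# Deprecated: a.scatter_add(idx, vals)
xp.add.at(a, idx, vals)      # a[1] receives both 2 and 9

b = xp.array([1, 5, 3, 7], dtype=xp.int32)
# Deprecated: b.scatter_max(idx, vals)
xp.maximum.at(b, idx, vals)  # element-wise maximum, applied in place

c = xp.array([1, 5, 3, 7], dtype=xp.int32)
# Deprecated: c.scatter_min(idx, vals)
xp.minimum.at(c, idx, vals)  # element-wise minimum, applied in place
```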
CUDA library wrappers now live in `cupyx` (7013)
Previously, CuPy provided high-level wrappers for CUDA libraries as `cupy.cudnn`, `cupy.cusolver`, `cupy.cusparse`, and `cupy.cutensor`. These modules have been moved to `cupyx` as part of the `cupy` namespace cleanup. The old modules are still available but are marked as deprecated. Note that these modules remain undocumented and may be subject to change.
Changes
New Features
- Add `axis` to `cupy.logspace` (6797)
- Support `thrust::count` and `thrust::device` in CuPy JIT (7054)
- Add `cupy.ndarray.searchsorted` (7059)
- Support `add.at`, `maximum.at`, `minimum.at` (7077)
- Add pdist implementation to distance functions (7078)
- Support `subtract.at`, `bitwise_and.at`, `bitwise_or.at`, `bitwise_xor.at` (7099)
- Add `ufunc.reduce` and `ufunc.accumulate` (7105)
- Add `cupy.add.reduceat` (7115)
- Implement `cupy.min_scalar_type` (7136)
- JIT: Support more thrust functions (7139)
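Three of the additions above can be sketched as follows. This is an illustration under the assumption that `cupy.logspace`, `cupy.ndarray.searchsorted`, and `cupy.min_scalar_type` mirror their NumPy counterparts; the NumPy fallback is there only so the snippet runs on a machine without a GPU, and all variable names are illustrative.

```python
try:
    import cupy as xp
    xp.zeros(1)  # fails here if CUDA is not available
except Exception:
    import numpy as xp  # stand-in so the sketch runs without a GPU

# `axis` chooses where the sample dimension goes when start/stop are arrays.
start = xp.array([0.0, 1.0])
stop = xp.array([2.0, 3.0])
grid = xp.logspace(start, stop, num=3, axis=1)  # samples along axis 1

# Method form of searchsorted on an already sorted array.
sorted_vals = xp.array([1, 3, 5, 7])
positions = sorted_vals.searchsorted(xp.array([4, 7]))

# Smallest dtype able to hold the given scalar.
small = xp.min_scalar_type(255)
```

With `axis=1`, each row of `grid` holds the samples for one `(start, stop)` pair; the default `axis=0` would place the samples along the first dimension instead.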
Enhancements
- Move `cupy.cudnn`, `cupy.cusolver`, `cupy.cutensor`, and `cupy.cusparse` to `cupyx` (7013)
- Allow randint to support array bounds (7051)
- Deprecate `ndarray.scatter_{add, max, min}` (7097)
- Support CUDA 11.8 H100 GPUs (7100)
- Support CUDA 11.8 (7117)
- Add CUDA 11.8 to the documentation (7119)
- Fix compile error from `inf`/`nan` in `cupy.fuse` (7122)
- Raise `TypeError` instead of `ValueError` in `cupy.from_dlpack` when a CPU tensor is passed (7133)
- Support NCCL 2.15 (7153)
- Support Python 3.11 (7159)
- Fix indexing sparse matrix with empty index arguments (7143)
Bug Fixes
- Make sure that cupy (array-api) Array objects can be composed using asarray (6874)
- Don't use `__del__` in `TCPStore` (6989)
- JIT: Fix compile error for `op.routine` including `in0_type` (7076)
- Fix `cupy.nansum` in fusing (7102)
- Fix fusion `TypeError` in `cupy._core.fusion._call_ufunc()` (7113)
- Fix a typo (7163)
- JIT: Fix compile error of minmax function (7167)
Code Fixes
- Remove `_ufunc_method` directory (7116)
- Add missing base type to cdef declarations (7170)
Documentation
- Docs: Add missing functions (7103)
- Docs: ufunc methods (7104)
- Improve benchmark documentation (7176)
Installation
- Bump version to v12.0.0b1 (7181)
Tests
- CI: Add ROCm 5.3 (7124)
- CI: Allow `/test jenkins` to trigger Jenkins only (7126)
- Install zlib for CUDA 11.8 Windows CI (7137)
- CI: improve use of cache in GitHub Actions (7141)
- Fix for pytest 7.2 (7147)
- CI: Add support for the latest FlexCI Windows image (7161)
- JIT: Skip HIP `thrust::sort` test (7162)
- CI: use pre-commit in GitHub Actions (7123)
Contributors
The CuPy Team would like to thank all those who contributed to this release!
anaruse andfoy asi1024 Diwakar-Gupta emcastillo IncubatorShokuhou kmaehashi MarcoGorelli takagi TsutsuiMasayoshi