Torchvision

Latest version: v0.18.0


0.16.1

This is a minor release that only contains bug fixes.

Bug Fixes

* [models] Fix download of efficientnet weights (8036)
* [transforms] Fix v2 transforms in spawn multi-processing context (8067)

0.16.0

Highlights

[BETA] Transforms and augmentations

![sphx_glr_plot_transforms_getting_started_004](https://github.com/pytorch/vision/assets/1190450/fc42eabe-d3fe-40c1-8365-2177e389521b)


Major speedups

The new transforms in `torchvision.transforms.v2` support image classification, segmentation, detection, and video tasks. They are now [10%-40% faster](https://github.com/pytorch/vision/issues/7497#issuecomment-1557478635) than before! This is mostly achieved thanks to 2X-4X improvements made to `v2.Resize()`, which now supports native `uint8` tensors for bilinear and bicubic modes. Output results are also now closer to PIL's! Check out our [performance recommendations](https://pytorch.org/vision/stable/transforms.html#performance-considerations) to learn more.
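
For instance, a minimal sketch of resizing a `uint8` image tensor with the v2 API (the tensor shape and target size below are illustrative):

```python
import torch
from torchvision.transforms import v2

# A uint8 image tensor in (C, H, W) layout; with this release, v2.Resize
# runs natively on uint8 for bilinear/bicubic interpolation.
img = torch.randint(0, 256, (3, 720, 1280), dtype=torch.uint8)

resize = v2.Resize(size=(224, 224), antialias=True)
out = resize(img)  # still a uint8 tensor, now 3 x 224 x 224
```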

Additionally, `torchvision` now ships with `libjpeg-turbo` instead of `libjpeg`, which should significantly speed up the JPEG decoding utilities ([`read_image`](https://pytorch.org/vision/stable/generated/torchvision.io.read_image.html#torchvision.io.read_image), [`decode_jpeg`](https://pytorch.org/vision/stable/generated/torchvision.io.read_image.html#torchvision.io.decode_jpeg)) and avoid compatibility issues with PIL.
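
A small sketch of these decoding utilities, assuming a local file named `example.jpg` (the path is a placeholder):

```python
from torchvision.io import ImageReadMode, decode_jpeg, read_file, read_image

# One-step: read and decode a JPEG into a uint8 CHW tensor
# (now backed by libjpeg-turbo).
img = read_image("example.jpg", mode=ImageReadMode.RGB)

# Two-step equivalent: read the raw bytes, then decode them.
data = read_file("example.jpg")
img2 = decode_jpeg(data, mode=ImageReadMode.RGB)
```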

CutMix and MixUp

Long-awaited support for the `CutMix` and `MixUp` augmentations is now here! Check [our tutorial](https://pytorch.org/vision/stable/auto_examples/transforms/plot_cutmix_mixup.html#sphx-glr-auto-examples-transforms-plot-cutmix-mixup-py) to learn how to use them.
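
A minimal sketch of how these can be combined, assuming an illustrative 100-class setup and a random batch of images:

```python
import torch
from torchvision.transforms import v2

NUM_CLASSES = 100  # illustrative number of classes

cutmix = v2.CutMix(num_classes=NUM_CLASSES)
mixup = v2.MixUp(num_classes=NUM_CLASSES)
cutmix_or_mixup = v2.RandomChoice([cutmix, mixup])

# These transforms are meant to be applied on batches, e.g. right after the DataLoader.
images = torch.rand(4, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (4,))
images, labels = cutmix_or_mixup(images, labels)
# labels are now soft labels of shape (4, NUM_CLASSES)
```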

Towards stable V2 transforms

In the [previous release 0.15](https://github.com/pytorch/vision/releases/tag/v0.15.1) we BETA-released a new set of transforms in `torchvision.transforms.v2` with native support for tasks like segmentation, detection, or videos. We have now stabilized the design decisions of these transforms and made further improvements in terms of speedups, usability, new transforms support, etc.

We're keeping the `torchvision.transforms.v2` and `torchvision.tv_tensors` namespaces as BETA until 0.17 out of precaution, but we do not expect disruptive API changes in the future.

Whether you’re new to Torchvision transforms, or you’re already experienced with them, we encourage you to start with [Getting started with transforms v2](https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_getting_started.html#sphx-glr-auto-examples-transforms-plot-transforms-getting-started-py) in order to learn more about what can be done with the new v2 transforms.

Browse our [main docs](https://pytorch.org/vision/stable/transforms.html#) for general information and performance tips. The available transforms and functionals are listed in the [API reference](https://pytorch.org/vision/stable/transforms.html#v2-api-ref). Additional information and tutorials can also be found in our [example gallery](https://pytorch.org/vision/stable/auto_examples/index.html#gallery), e.g. [Transforms v2: End-to-end object detection/segmentation example](https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_e2e.html#sphx-glr-auto-examples-transforms-plot-transforms-e2e-py) or [How to write your own v2 transforms](https://pytorch.org/vision/stable/auto_examples/transforms/plot_custom_transforms.html#sphx-glr-auto-examples-transforms-plot-custom-transforms-py).
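
As a rough sketch of a detection-style pipeline with the v2 transforms and `tv_tensors` (the image size, boxes, and transform choices below are illustrative):

```python
import torch
from torchvision import tv_tensors
from torchvision.transforms import v2

# An illustrative image and two boxes; in practice these come from a detection dataset.
img = tv_tensors.Image(torch.randint(0, 256, (3, 480, 640), dtype=torch.uint8))
boxes = tv_tensors.BoundingBoxes(
    [[10, 10, 100, 100], [200, 150, 350, 300]],
    format="XYXY",
    canvas_size=(480, 640),
)

transforms = v2.Compose([
    v2.RandomHorizontalFlip(p=0.5),
    v2.RandomResizedCrop(size=(224, 224), antialias=True),
])
out_img, out_boxes = transforms(img, boxes)  # boxes stay in sync with the image
```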

[BETA] MPS support

The `nms` and roi-align kernels (`roi_align`, `roi_pool`, `ps_roi_align`, `ps_roi_pool`) now support MPS. Thanks to [Li-Huai (Allan) Lin](https://github.com/qqaatw) for this contribution!
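
A small sketch of running `nms` on the MPS backend, with illustrative boxes and scores (falling back to CPU when MPS is unavailable):

```python
import torch
from torchvision.ops import nms

device = "mps" if torch.backends.mps.is_available() else "cpu"
boxes = torch.tensor(
    [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0], [50.0, 50.0, 60.0, 60.0]],
    device=device,
)
scores = torch.tensor([0.9, 0.8, 0.7], device=device)
keep = nms(boxes, scores, iou_threshold=0.5)  # indices of the boxes to keep
```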


---------


Detailed Changes

Deprecations / Breaking changes

All changes below happened in the `transforms.v2` and `datapoints` namespaces, which were BETA and protected with a warning. **We do not expect other disruptive changes to these APIs moving forward!**

[transforms.v2] `to_grayscale()` is not deprecated anymore (7707)
[transforms.v2] Renaming: `torchvision.datapoints.Datapoint` -> `torchvision.tv_tensors.TVTensor` (7904, 7894)
[transforms.v2] Renaming: `BoundingBox` -> `BoundingBoxes` (7778)
[transforms.v2] Renaming: `BoundingBoxes.spatial_size` -> `BoundingBoxes.canvas_size` (7734)
[transforms.v2] All public methods on `TVTensor` classes (previously: `Datapoint` classes) were removed
[transforms.v2] `transforms.v2.utils` is now private. (7863)
[transforms.v2] Remove `wrap_like` class method and add `tv_tensors.wrap()` function (7832); see the sketch below
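
For reference, a minimal sketch of the new `tv_tensors.wrap()` helper that replaces the old `wrap_like` class method (the boxes and the `+ 5` shift are illustrative):

```python
from torchvision import tv_tensors

boxes = tv_tensors.BoundingBoxes(
    [[0, 0, 10, 10]], format="XYXY", canvas_size=(100, 100)
)
shifted = boxes + 5  # plain torch.Tensor: most operations unwrap TVTensor subclasses
new_boxes = tv_tensors.wrap(shifted, like=boxes)
# new_boxes is a BoundingBoxes again, inheriting format and canvas_size from `boxes`
# (previously: BoundingBox.wrap_like(boxes, shifted))
```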

New Features

[transforms.v2] Add support for `MixUp` and `CutMix` (7731, 7784)
[transforms.v2] Add `PermuteChannels` transform (7624)
[transforms.v2] Add `ToPureTensor` transform (7823)
[ops] Add MPS kernels for `nms` and `roi` ops (7643)

Improvements

[io] Added support for CMYK images in `decode_jpeg` (7741)
[io] Package torchvision with `libjpeg-turbo` instead of `libjpeg` (7672, 7840)
[models] Downloaded weights are now sha256-validated (7219)
[transforms.v2] Massive `Resize` speed-up by adding native `uint8` support for bilinear and bicubic modes (7557, 7668)
[transforms.v2] Enforce pickleability for v2 transforms and wrapped datasets (7860)
[transforms.v2] Allow catch-all "others" key in `fill` dicts. (7779)
[transforms.v2] Allow passthrough for `Resize` (7521)
[transforms.v2] Add `scale` option to `ToDtype` and remove `ConvertDtype` (7759, 7862); see the sketch after this list
[transforms.v2] Improve UX for `Compose` (7758)
[transforms.v2] Allow users to choose whether to return `TVTensor` subclasses or pure `Tensor` (7825)
[transforms.v2] Remove import-time warning for v2 namespaces (7853, 7897)
[transforms.v2] Speedup `hsv2rgb` (7754)
[models] Add `filter` parameters to `list_models()` (7718)
[models] Assert `RAFT` input resolution is 128 x 128 or higher (7339)
[ops] Replaced `gpuAtomicAdd` by `fastAtomicAdd` (7596)
[utils] Add GPU support for `draw_segmentation_masks` (7684)
[ops] Add deterministic, pure-Python `roi_align` implementation (7587)
[tv_tensors] Make `TVTensors` deepcopyable (7701)
[datasets] Only return small set of targets by default from dataset wrapper (7488)
[references] Added support for v2 transforms and `tensors` / `tv_tensors` backends (7732, 7511, 7869, 7665, 7629, 7743, 7724, 7742)
[doc] A lot of documentation improvements (7503, 7843, 7845, 7836, 7830, 7826, 7484, 7795, 7480, 7772, 7847, 7695, 7655, 7906, 7889, 7883, 7881, 7867, 7755, 7870, 7849, 7854, 7858, 7621, 7857, 7864, 7487, 7859, 7877, 7536, 7886, 7679, 7793, 7514, 7789, 7688, 7576, 7600, 7580, 7567, 7459, 7516, 7851, 7730, 7565, 7777)
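
As referenced above, a minimal sketch of the new `scale` option on `ToDtype`, which takes over the role of the removed `ConvertDtype` (the input image is illustrative):

```python
import torch
from torchvision.transforms import v2

img = torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)  # illustrative uint8 image

# scale=True converts the dtype and rescales values, e.g. uint8 [0, 255] -> float32 [0.0, 1.0].
to_float = v2.ToDtype(torch.float32, scale=True)
out = to_float(img)
```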

Bug Fixes

[datasets] Fix `split=None` in `MovingMNIST` (7449)
[io] Fix heap buffer overflow in `decode_png` (7691)
[io] Fix blurry screen in video decoder (7552)
[models] Fix weight download URLs for some models (7898)
[models] Fix `ShuffleNet` ONNX export (7686)
[models] Fix detection models with pytorch 2.0 (7592, 7448)
[ops] Fix segfault in `DeformConv2d` when `mask` is None (7632)
[transforms.v2] Stricter `SanitizeBoundingBoxes` `labels_getter` heuristic (7880)
[transforms.v2] Make sure `RandomPhotometricDistort` transforms all images the same (7442)
[transforms.v2] Fix `v2.Lambda`’s transformed types (7566)
[transforms.v2] Don't call `round()` on float images for `Resize` (7669)
[transforms.v2] Let `SanitizeBoundingBoxes` preserve output type (7446)
[transforms.v2] Fixed int type support for sigma in `GaussianBlur` (7887)
[transforms.v2] Fixed issue with jitted `AutoAugment` transforms (7839)
[transforms] Fix `Resize` pass-through logic (7519)
[utils] Fix color in `draw_segmentation_masks` (7520)



Others

[tests] Various test improvements / fixes (7693, 7816, 7477, 7783, 7716, 7355, 7879, 7874, 7882, 7447, 7856, 7892, 7902, 7884, 7562, 7713, 7708, 7712, 7703, 7641, 7855, 7842, 7717, 7905, 7553, 7678, 7908, 7812, 7646, 7841, 7768, 7828, 7820, 7550, 7546, 7833, 7583, 7810, 7625, 7651)
[CI] Various CI improvements (7485, 7417, 7526, 7834, 7622, 7611, 7872, 7628, 7499, 7616, 7475, 7639, 7498, 7467, 7466, 7441, 7524, 7648, 7640, 7551, 7479, 7634, 7645, 7578, 7572, 7571, 7591, 7470, 7574, 7569, 7435, 7635, 7590, 7589, 7582, 7656, 7900, 7815, 7555, 7694, 7558, 7533, 7547, 7505, 7502, 7540, 7573)
[Code Quality] Various code quality improvements (7559, 7673, 7677, 7771, 7770, 7710, 7709, 7687, 7454, 7464, 7527, 7462, 7662, 7593, 7797, 7805, 7786, 7831, 7829, 7846, 7806, 7814, 7606, 7613, 7608, 7597, 7792, 7781, 7685, 7702, 7500, 7804, 7747, 7835, 7726, 7796)

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following persons have contributed patches for this release:
Adam J. Stewart, Aditya Oke, Andrey Talman, Camilo De La Torre, Christoph Reich, Danylo Baibak, David Chiu, David Garcia, Dennis M. Pöpperl, Dhuige, Duc Mguyen, Edward Z. Yang, Eric Sauser, Fansure Grin, Huy Do, Illia Vysochyn, Johannes, Kai Wana, Kobrin Eli, kurtamohler, Li-Huai (Allan) Lin, Liron Ilouz, Masahiro Hiramori, Mateusz Guzek, Max Chuprov, Minh-Long Luu (刘明龙), Minliang Lin, mpearce25, Nicolas Granger, Nicolas Hug, Nikita Shulga, Omkar Salpekar, Paul Mulders, Philip Meier, ptrblck, puhuk, Radek Bartoň, Richard Barnes, Riza Velioglu, Sahil Goyal, Shu, Sim Sun, SvenDS9, Tommaso Bianconcini, Vadim Zubov, vfdev-5

0.15.2

This is a minor release, which is compatible with [PyTorch 2.0.1](https://github.com/pytorch/pytorch/releases/tag/v2.0.1) and contains some minor bug fixes.

Highlights

Bug Fixes
- Move parameter sampling of v2.RandomPhotometricDistort into _get_params https://github.com/pytorch/vision/pull/7442
- Fix split parameter for MovingMNIST https://github.com/pytorch/vision/pull/7449
- Prevent unwrapping in v2.SanitizeBoundingBoxes https://github.com/pytorch/vision/pull/7446

0.15.1

Highlights
[[BETA](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta)] New transforms API
TorchVision is extending its Transforms API! Here is what’s new:
- You can use them not only for Image Classification but also for Object Detection, Instance & Semantic Segmentation and Video Classification.
- You can use new functional transforms for transforming Videos, Bounding Boxes and Segmentation Masks.

The API is **completely backward compatible** with the previous one, and it remains unchanged to ease migration and adoption. We are now releasing this new API as Beta in the `torchvision.transforms.v2` namespace, and we would love to get early feedback from you to improve its functionality. Please [reach out to us](https://github.com/pytorch/vision/issues/6753) if you have any questions or suggestions.

```py
import torchvision.transforms.v2 as transforms

# Exactly the same interface as V1:
trans = transforms.Compose([
    transforms.ColorJitter(contrast=0.5),
    transforms.RandomRotation(30),
    transforms.CenterCrop(480),
])
imgs, bboxes, masks, labels = trans(imgs, bboxes, masks, labels)
```


You can read more about these new transforms in our [docs](https://pytorch.org/vision/main/transforms.html), and you can also check out our examples:

- [End-to-end object detection example
](https://pytorch.org/vision/stable/auto_examples/plot_transforms_v2_e2e.html#sphx-glr-auto-examples-plot-transforms-v2-e2e-py)
- [Getting started with transforms v2
](https://pytorch.org/vision/stable/auto_examples/plot_transforms_v2.html#sphx-glr-auto-examples-plot-transforms-v2-py)

Note that this API is still Beta. **While we do not expect major breaking changes, some APIs may still change according to user feedback**. Please submit any feedback you may have in https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.

[[BETA](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta)] New Video Swin Transformer

We added a Video Swin Transformer model based on the [Video Swin Transformer](https://arxiv.org/abs/2106.13230) paper.

```py
import torch
from torchvision.models.video import swin3d_t

video = torch.rand(1, 3, 32, 800, 600)
# or swin3d_b, swin3d_s
model = swin3d_t(weights="DEFAULT")
model.eval()
with torch.inference_mode():
    prediction = model(video)
print(prediction)
```


The model has the following accuracies on the Kinetics-400 dataset:

| Model | Acc1 | Acc5 |
| --- | ----------- | --------- |

0.14.1

This is a minor release, which is compatible with [PyTorch 1.13.1](https://github.com/pytorch/pytorch/releases/tag/v1.13.1). There are no new features added.

0.14.0

**Highlights**


[[BETA](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta)] New Model Registration API

Following up on the [multi-weight support API](https://pytorch.org/blog/introducing-torchvision-new-multi-weight-support-api/) that was released on the previous version, we have added a new [model registration API](https://pytorch.org/blog/easily-list-and-initialize-models-with-new-apis-in-torchvision/) to help users retrieve models and weights. There are now 4 new methods under the `torchvision.models` module: `get_model`, `get_model_weights`, `get_weight`, and `list_models`. Here are examples of how we can use them:


```python
import torchvision
from torchvision.models import get_model, get_model_weights, list_models

max_params = 5000000

tiny_models = []
for model_name in list_models(module=torchvision.models):
    weights_enum = get_model_weights(model_name)
    if len([w for w in weights_enum if w.meta["num_params"] <= max_params]) > 0:
        tiny_models.append(model_name)

print(tiny_models)
# ['mnasnet0_5', 'mnasnet0_75', 'mnasnet1_0', 'mobilenet_v2', ...]

model = get_model(tiny_models[0], weights="DEFAULT")
print(sum(x.numel() for x in model.state_dict().values()))
# 2239188
```



As of now, this API is still [beta](https://pytorch.org/blog/pytorch-feature-classification-changes/#beta) and there might be changes in the future in order to improve its usability based on your [feedback](https://github.com/pytorch/vision/issues/6365).


New Architecture and Model Variants


Classification Models

We’ve added the Swin Transformer V2 architecture along with pre-trained weights for its tiny/small/base variants. In addition, we have added support for the MaxViT transformer. Here is an example of how to use the models:


```python
import torch
from torchvision.models import swin_v2_t, maxvit_t

image = torch.rand(1, 3, 224, 224)
model = swin_v2_t(weights="DEFAULT").eval()
model = maxvit_t(weights="DEFAULT").eval()  # or keep the Swin V2 model above
prediction = model(image)
```



Here is a table showing the accuracy of the models tested on the ImageNet1K dataset.


| Model | Acc1 | Acc1 change over V1 | Acc5 | Acc5 change over V1 |
| --- | --- | --- | --- | --- |
| swin_v2_t | | | | |