Torchvision

Latest version: v0.18.0

0.11.1

Users reported issues installing torchvision from PyPI. This release updates the wheel dependencies to point directly to torch==1.10.0.

0.11.0

This release introduces the RegNet and EfficientNet architectures, a new FX-based utility to perform Feature Extraction, new data augmentation techniques such as RandAugment and TrivialAugment, updated training recipes that support EMA, Label Smoothing, Learning-Rate Warmup, Mixup and Cutmix, and many more.

Highlights

New Models

[RegNet](https://arxiv.org/abs/2003.13678) and [EfficientNet](https://arxiv.org/abs/1905.11946) are two popular architectures that can be scaled to different computational budgets. In this release we include 22 pre-trained weights for their classification variants. The models were trained on ImageNet and can be used as follows:

```python
import torch
from torchvision import models

x = torch.rand(1, 3, 224, 224)

regnet = models.regnet_y_400mf(pretrained=True)
regnet.eval()
predictions = regnet(x)

efficientnet = models.efficientnet_b0(pretrained=True)
efficientnet.eval()
predictions = efficientnet(x)
```

The accuracies of the pre-trained models on ImageNet val are shown below (see [4403](https://github.com/pytorch/vision/pull/4403#issuecomment-930381524), [4530](https://github.com/pytorch/vision/pull/4530#issuecomment-933213238) and [4293](https://github.com/pytorch/vision/pull/4293) for more details):

|Model |Acc1 |Acc5 |
|--- |--- |--- |
|regnet_x_400mf |72.834 |90.95 |
|regnet_x_800mf |75.212 |92.348 |
|regnet_x_1_6gf |77.04 |93.44 |
|regnet_x_3_2gf |78.364 |93.992 |
|regnet_x_8gf |79.344 |94.686 |
|regnet_x_16gf |80.058 |94.944 |
|regnet_x_32gf |80.622 |95.248 |
|regnet_y_400mf |74.046 |91.716 |
|regnet_y_800mf |76.42 |93.136 |
|regnet_y_1_6gf |77.95 |93.966 |
|regnet_y_3_2gf |78.948 |94.576 |
|regnet_y_8gf |80.032 |95.048 |
|regnet_y_16gf |80.424 |95.24 |
|regnet_y_32gf |80.878 |95.34 |
|EfficientNet-B0 |77.692 |93.532 |
|EfficientNet-B1 |78.642 |94.186 |
|EfficientNet-B2 |80.608 |95.31 |
|EfficientNet-B3 |82.008 |96.054 |
|EfficientNet-B4 |83.384 |96.594 |
|EfficientNet-B5 |83.444 |96.628 |
|EfficientNet-B6 |84.008 |96.916 |
|EfficientNet-B7 |84.122 |96.908 |

We would like to thank Ross Wightman and Luke Melas-Kyriazi for contributing the weights of the EfficientNet variants.

FX-based Feature Extraction

A new Feature Extraction method has been added to our utilities. It uses PyTorch FX to retrieve the outputs of intermediate layers of a network, which is useful for feature extraction and visualization. Here is an example of how to use the new utility:

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

x = torch.rand(1, 3, 224, 224)

model = resnet50()

return_nodes = {
    "layer4.2.relu_2": "layer4"
}
model2 = create_feature_extractor(model, return_nodes=return_nodes)
intermediate_outputs = model2(x)

print(intermediate_outputs['layer4'].shape)
```

We would like to thank Alexander Soare for developing this utility.

New Data Augmentations

Two new Automatic Augmentation techniques were added: [Rand Augment](https://arxiv.org/abs/1909.13719) and [Trivial Augment](https://arxiv.org/abs/2103.10158). Both methods can be used as drop-in replacements for the AutoAugment technique, as shown below:

```python
from PIL import Image
from torchvision import transforms

image = Image.new("RGB", (320, 320))  # placeholder input image

t = transforms.RandAugment()
# t = transforms.TrivialAugmentWide()
transformed = t(image)

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandAugment(),  # or transforms.TrivialAugmentWide()
    transforms.ToTensor()])
```

We would like to thank Samuel G. Müller for contributing Trivial Augment and for his help on refactoring the AA package.

Updated Training Recipes

We have updated our training reference scripts to add support for Exponential Moving Average, Label Smoothing, Learning-Rate Warmup, [Mixup](https://arxiv.org/abs/1710.09412), [Cutmix](https://arxiv.org/abs/1905.04899) and other [SOTA primitives](https://github.com/pytorch/vision/issues/3911). These changes enabled us to improve the classification Acc1 of some pre-trained models by [over 4 points](https://github.com/pytorch/vision/issues/3995). A major update of the existing pre-trained weights is expected in the next release.
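
To make two of these primitives concrete, here is a small sketch of Mixup followed by label smoothing, written in plain PyTorch. It is not the code used in the reference scripts; the helper names `mixup` and `smooth_labels`, the `alpha` and `smoothing` values, and the toy batch are all assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F


def mixup(images, labels, num_classes, alpha=0.2):
    """Blend each sample with a randomly chosen partner from the same batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    targets = F.one_hot(labels, num_classes).float()
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_targets = lam * targets + (1.0 - lam) * targets[perm]
    return mixed_images, mixed_targets


def smooth_labels(targets, smoothing=0.1):
    """Move a `smoothing` share of the probability mass to a uniform distribution."""
    num_classes = targets.size(1)
    return targets * (1.0 - smoothing) + smoothing / num_classes


images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
mixed_images, soft_targets = mixup(images, labels, num_classes=10)
soft_targets = smooth_labels(soft_targets)

# Cross-entropy against soft targets, written out explicitly.
logits = torch.randn(8, 10)
loss = -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
print(loss.item())
```

Both steps keep every target row a valid probability distribution, which is why they can be combined with a standard cross-entropy loss.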

Backward-incompatible changes

[models] Use torch instead of scipy for random initialization of inception and googlenet weights (4256)

Deprecations

[models] Deprecate the C++ vision::models namespace (4375)

New Features

[datasets] Add iNaturalist dataset (4123)
[datasets] Download and Kinetics 400/600/700 Datasets (3680)
[datasets] Added LFW Dataset (4255)
[models] Add FX feature extraction as an alternative to intermediate_layer_getter (4302) (4418)
[models] Add RegNet Architecture in TorchVision (4403) (4530) (4550)
[ops] Add new masks_to_boxes op (4290) (4469)
[ops] Add StochasticDepth implementation (4301)
[reference scripts] Adding Mixup and Cutmix (4379)
[transforms] Integration of TrivialAugment with the current AutoAugment Code (4221)
[transforms] Adding RandAugment implementation (4348)
[models] Add EfficientNet Architecture in TorchVision (4293)

Improvements

Various documentation improvements (4239) (4251) (4275) (4342) (3894) (4159) (4133) (4138) (4089) (3944) (4349) (3754) (4308) (4352) (4318) (4244) (4362) (3863) (4382) (4484) (4503) (4376) (4457) (4505) (4363) (4361) (4337) (4546) (4553) (4565) (4567) (4574) (4575) (4383) (4390) (3409) (4451) (4340) (3967) (4072) (4028) (4132)
[build] Add CUDA-11.3 builds to torchvision (4248)
[ci, tests] Skip some CPU-only tests on CircleCI machines with GPU (4002) (4025) (4062)
[ci] New issue templates (4299)
[ci] Various CI improvements, in particular putting back GPU testing on windows (4421) (4014) (4053) (4482) (4475) (3998) (4388) (4179) (4394) (4162) (4065) (3928) (4081) (4203) (4011) (4055) (4074) (4419) (4067) (4201) (4200) (4202) (4496) (3925)
[ci] ping maintainers in case a PR was not properly labeled (3993) (4012) (4021) (4501)
[datasets] Add bzip2 file compression support to datasets (4097)
[datasets] Faster dataset indexing (3939)
[datasets] Enable logging of internal dataset instantiations. (4319) (4090)
[datasets] Removed copy=False in torch.from_numpy in MNIST to avoid warning (4184)
[io] Add warning for files with corrupt containers (3961)
[models, tests] Add test to check that classification models are FX-compatible (3662)
[tests] Speedup various tests (3929) (3933) (3936)
[models] Allow custom activation in SqueezeExcitation of EfficientNet (4448)
[models] Allow gradient backpropagation through GeneralizedRCNNTransform to inputs (4327)
[ops, tests] Add JIT tests (4472)
[ops] Make StochasticDepth FX-compatible (4373)
[ops] Added backward pass on CPU and CUDA for interpolation with anti-alias option (4208) (4211)
[ops] Small refactoring to support opt mode for torchvision ops (fb internal specific) (4080) (4095)
[reference scripts] Added Exponential Moving Average support to classification reference script (4381) (4406) (4407)
[reference scripts] Adding label smoothing on classification reference (4335)
[reference scripts] Further enhance Classification Reference (4444)
[reference scripts] Replaced to_tensor() with pil_to_tensor() + convert_image_dtype() (4452)
[reference scripts] Update the metrics output on reference scripts (4408)
[reference scripts] Warmup schedulers in References (4411)
[tests] Add check for fx compatibility on segmentation and video models (4131)
[tests] Mock redirection logic for tests (4197)
[tests] Replace set_deterministic with non-deprecated spelling (4212)
[tests] Skip building torchvision with ffmpeg when python==3.9 (4417)
[tests] [jit] Make operation call accept Stack& instead of Stack* (63414) (4380)
[tests] make tests that involve GDrive more robust (4454)
[tests] remove dependency for dtype getters (4291)
[transforms] Replaced example usage of ToTensor() by PILToTensor() + ConvertImageDtype() (4494)
[transforms] Explicitly copying array in pil_to_tensor (4566) (4573)
[transforms] Make get_image_size and get_image_num_channels public. (4321)
[transforms] adding gray images support for adjust_contrast and adjust_saturation (4477) (4480)
[utils] Support single color in utils.draw_bounding_boxes (4075)
[video, documentation] Port the video_api.ipynb notebook to the example gallery (4241)
[video, io, tests] Added check for invalid input file (3932)
[video, io] remove deprecated function call (3861) (3989)
[video, tests] Removed test_audio_video_sync as it doesn't work as expected (4050)
[video] Build torchvision with ffmpeg only on Linux and ignore ffmpeg on other platforms (4413, 4410, 4041)

Bug Fixes

[build] Conda: Add numpy dependency (4442)
[build] Explicitly exclude PIL 8.3.0 from compatible dependencies (4148)
[build] More robust version check (4285)
[ci] Fix broken clang format test. (4320)
[ci] Remove mentions of conda-forge (4082)
[ci] fixup '*' -> '/.*/' for CI filter (4059)
[datasets] Fix download from google drive which was downloading empty files in some cases (4109)
[datasets] Fix splitting CelebA dataset (4377)
[datasets] Add support for files with periods in name (4099)
[io, tests] Don't check transparency channel for pil >= 8.3 in test_decode_png (4167)
[io] Fix size_t issues across JPEG versions and platforms (4439)
[io] Raise proper error when decoding 16-bits jpegs (4101)
[io] Unpinned the libjpeg version and fixed jpeg_mem_dest's size type Wind… (4288)
[io] deinterlacing PNG images with read_image (4268)
[io] More robust ffmpeg version query in setup.py (4254)
[io] Fixed read_image bug (3948)
[models] Don't download backbone weights if pretrained=True (4283)
[onnx, tests] Do not disable profiling executor in ONNX tests (4324)
[ops, tests] Fix DeformConvTester::test_backward_cuda by setting threads per block to 512 (3942)
[ops] Fix typing issue to make DeformConv2d scriptable (4079)
[ops] Fixes deform_conv issue with large input/output (4351)
[ops] Resolving tracing problem on StochasticDepth iterator. (4372)
[ops] Port quantize_val and dequantize_val into torchvision to avoid at::native and android xplat incompatibility (4311)
[reference scripts] Fix bug on EMA n_averaged estimation. (4544) (4545)
[tests] Avoid cmyk in nvjpeg tests (4246)
[tests] Catch ValueError due to recent change to torch.testing.assert_close (4165)
[tests] Fix failing tests by catching the proper exception from torch.testing (4121)
[tests] Skip test if connection issues on fate (4284)
[transforms] Fix RandAugment and TrivialAugment bugs (4370)
[transforms] [FBcode->GH] [JIT] Add reference semantics to TorchScript classes (44324) (4166)
[utils] Handle grayscale images on draw_bounding_boxes (4043) (4049)
[video, io] Fixed missing audio with video_reader and pyav backend (3934, 4064)

Code Quality

Various typing improvements (4369) (4168) (4169) (4170) (4171) (4224) (4227) (4395) (4409) (4232) (4234) (4236) (4226) (4416)
Renamed the “master” branch into “main” (4306) (4365)
[ci] (fb-internal only) Allow all torchvision test rules to run with RE (4073)
[ci] add pre-commit hooks for convenient formatting checks (4387)
[ci] Import hipify_python only when needed (4031)
[io] Fixed a couple of typos and removed unnecessary bracket (4345)
[io] use from_blob to avoid memcpy (4118)
[models, ops] Moving common layers to ops (4504)
[models, ops] Replace MobileNetV3's SqueezeExcitation with EfficientNet's one (4487)
[models] Explicitly store a distance value that is reused (4341)
[models] Use torch instead of scipy for random initialization of inception and googlenet weights (4256)
[onnx, tests] Use test images from repo rather than internet for ONNX tests (4176)
[onnx] Import ONNX utils from symbolic_opset11 module (4230)
[ops] Fix clang formatting in deform_conv2d_kernel.cu (3943)
[ops] Update gpu atomics include path (4478) (reverted)
[reference scripts] Cleaned-up coco evaluation code (4453)
[reference scripts] remove unused package in coco_eval.py (4404)
[tests] Ported all tests to pytest (3962) (3996) (3950) (3964) (3957) (3959) (3981) (3952) (3977) (3974) (3976) (3983) (3971) (3988) (3990) (3985) (3984) (4030) (3955) (4008) (4010) (4023) (3954) (4026) (3953) (4047) (4185) (3947) (4045) (4036) (4034) (3978) (4046) (3991) (3930) (4038) (4037) (4215) (3972) (3966) (4114) (4177) (4280) (3946) (4233) (4258) (4035) (4040) (4000) (4196) (3922) (4032)
[tests] Prevent tests from leaking their respective RNG (4497) (3926) (4250)
[tests] Remove TestCase dependency for test_models_detection_anchor_utils.py (4207)
[tests] Removed tests executing deprecated F_t.center/five/ten_crop methods (4479)
[tests] Replace set_deterministic with non-deprecated spelling (4212)
[tests] Remove torchvision/test/fakedata_generation.py (4130)
[transforms, reference scripts] Added PILToTensor and ConvertImageDtype classes in reference scripts and used them to replace ToTensor (4495, 4481)
[transforms] Refactor AutoAugment to support more augmentations. (4338)
[transforms] Replace deprecated torch.lstsq with torch.linalg.lstsq (3918)
[video] Drop virtual from private member functions of Decoder class (4027)
[video] Fixed comparison warnings in audio_stream and video_stream (4007)
[video] Fixed some ffmpeg deprecation warnings in decoder (4003)

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following people have contributed patches for this release:

ABD-01, Adam J. Stewart, Aditya Oke, Alex Lin, Alexander Grund, Alexander Soare, Allen Goodman, Amani Kiruga, Anirudh, Beat Buesser, beet, Bert Maher, Bruno Korbar, Camilo De La Torre, cyy, D. Khuê Lê-Huu, David Fan, DevPranjal, dgenzel, dgenzel2, Dmitriy Genzel, Drishti Bhasin, Edward Z. Yang, Eli Uriegas, F-G Fernandez, Francisco Massa, Gary Miguel, Gaurav7888, IgorSusmelj, Ishan Kumar, Ivan Kobzarev, Jiawei Liu, Jithun Nair, Joao Gomes, Joe Early, Julien RIPOCHE, julienripoche, Kai Zhang, kingyiusuen, Loi Ly, Matti Picus, Meghan Lele, Muhammed Abdullah, Nicolas Hug, Nikita Shulga, ORippler, peterbell10, Philip Meier, Prabhat Roy, puhuk, Rajat Jaiswal, S Harish, Sahil Goyal, Samuel Gabriel, Santiago Castro, Saswat Das, Sepehr Sameni, Shengwei An, Shrill Shrestha, Shruti Pulstya, Sugato Ray, tanvimoharir, Vasilis Vryniotis, Vassilis C. Nicodemou, Vassilis Nicodemou, vfdev-5, Vincent Moens, Vivek Kumar, Yi Zhang, Yiwen Song, Yonghye Kwon, Yuchen Huang, Zhengxu Chen, Zhiqiang Wang, Zhongkai Zhu, zzk1st

0.10.1

0.10.0

This release improves support for mobile, with new mobile-friendly detection models based on SSD and SSDlite, CPU kernels for quantized NMS and quantized RoIAlign, pre-compiled binaries for iOS available in cocoapods and an iOS demo app. It also improves image IO by providing JPEG decoding on the GPU, and many more.

Highlights

[BETA] New models for detection

[SSD](https://arxiv.org/abs/1512.02325) and [SSDlite](https://arxiv.org/abs/1801.04381) are two popular object detection architectures which are efficient in terms of speed and provide good results for low resolution pictures. In this release, we provide implementations for the original SSD model with VGG16 backbone and for its mobile-friendly variant SSDlite with MobileNetV3-Large backbone. The models were pre-trained on COCO train2017 and can be used as follows:

```python
import torch
import torchvision

# Original SSD variant
x = [torch.rand(3, 300, 300), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.ssd300_vgg16(pretrained=True)
m_detector.eval()
predictions = m_detector(x)

# Mobile-friendly SSDlite variant
x = [torch.rand(3, 320, 320), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.ssdlite320_mobilenet_v3_large(pretrained=True)
m_detector.eval()
predictions = m_detector(x)
```

The following accuracies can be obtained on COCO val2017 (full results available in 3403 and 3757):

|Model |mAP |mAP50 |mAP75 |
|--- |--- |--- |--- |
|SSD300 VGG16 |25.1 |41.5 |26.2 |
|SSDlite320 MobileNetV3-Large |21.3 |34.3 |22.1 |
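
In eval mode, torchvision detection models return one dictionary per input image with `boxes`, `labels` and `scores` entries. The sketch below filters such an output by confidence. The prediction values are hand-written placeholders rather than real model output, and the `score_threshold` of 0.5 is an assumed cut-off.

```python
import torch

# Hypothetical output for one image, in the list-of-dicts layout that
# torchvision detection models return in eval mode.
predictions = [{
    "boxes": torch.tensor([[10., 20., 110., 220.],
                           [15., 25., 100., 200.],
                           [200., 50., 260., 120.]]),
    "labels": torch.tensor([1, 1, 18]),
    "scores": torch.tensor([0.92, 0.35, 0.71]),
}]

score_threshold = 0.5  # assumed cut-off; tune per application
filtered = []
for pred in predictions:
    keep = pred["scores"] > score_threshold  # boolean mask over detections
    filtered.append({key: value[keep] for key, value in pred.items()})

print(filtered[0]["labels"].tolist())
```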

[STABLE] Quantized kernels for object detection

The forward pass of the nms and roi_align operators now supports tensors with a quantized dtype, which can help lower the memory footprint of object detection models, particularly in mobile environments.

[BETA] JPEG decoding on the GPU

Decoding jpegs is now possible on GPUs with the use of [nvjpeg](https://developer.nvidia.com/nvjpeg), which should be readily available in your CUDA setup. The decoding time of a single image should be about 2 to 3 times faster than with libjpeg on CPU. While the resulting tensor will be stored on the GPU device, the input raw tensor still needs to reside on the host (CPU), because the first stages of the decoding process take place on the host:

```python
from torchvision.io.image import read_file, decode_jpeg

data = read_file('path_to_image.jpg')  # raw data is on CPU
img = decode_jpeg(data, device='cuda')  # decoded image is on GPU
```

[BETA] iOS support

TorchVision 0.10 now provides pre-compiled iOS binaries for its C++ operators, which means you can run Faster R-CNN and Mask R-CNN on iOS. An example app showing how to build a program that leverages those ops can be found [here](https://github.com/pytorch/vision/tree/master/ios/VisionTestApp).

[STABLE] Speed optimizations for Tensor transforms

The resize and flip transforms have been optimized and their runtime improved by up to 5x on the CPU. The corresponding PRs were sent to PyTorch in https://github.com/pytorch/pytorch/pull/51653, https://github.com/pytorch/pytorch/pull/54500 and https://github.com/pytorch/pytorch/pull/56713.

[STABLE] Documentation improvements

Significant improvements were made to the documentation. In particular, a new gallery of examples is available: see [here](https://pytorch.org/vision/master/auto_examples/index.html) for the latest version (the stable version is not released at the time of writing). These examples visually illustrate how each transform acts on an image, and also properly document and illustrate the output of the segmentation models.

The example gallery will be extended in the future to provide more comprehensive examples and serve as a reference for common torchvision tasks.

Backwards Incompatible Changes

* [transforms] Ensure input type of `normalize` is float. (3621)
* [models] Use PyTorch `smooth_l1_loss` and remove private custom implementation (3539)

New Features

* Added iOS binaries and test app (3582)(3629) (3806)
* [datasets] Added KITTI dataset (3640)
* [utils] Added utility to draw segmentation masks (3330, 3824)
* [models] Added the SSD & SSDlite object detection models (3403, 3757, 3766, 3855, 3896, 3818, 3799)
* [transforms] Added `antialias` option to `transforms.functional.resize` (3761, 3810, 3842)
* [transforms] Add new `max_size` parameter to `Resize` (3494)
* [io] Support for decoding jpegs on GPU with `nvjpeg` (3792)
* [ci, rocm] Add ROCm to builds (3840) (3604) (3575)
* [ops, models.quantization] Add quantized version of NMS (3601)
* [ops, models.quantization] Add quantized version of RoIAlign (3624, 3904)

Improvement

* [build] Various build improvements: (3618) (3622) (3399) (3794) (3561)
* [ci] Various CI improvements (3647) (3609) (3635) (3599) (3778) (3636) (3809) (3625) (3764) (3679) (3869) (3871) (3444) (3445) (3480) (3768) (3919) (3641)(3900)
* [datasets] Improve error handling in `make_dataset` (3496)
* [datasets] Remove caching from MNIST and variants (3420)
* [datasets] Make `DatasetFolder.find_classes` public (3628)
* [datasets] Separate extraction and decompression logic in `datasets.utils.extract_archive` (3443)
* [datasets, tests] Improve dataset test coverage and infrastructure (3450) (3457) (3454) (3447) (3489) (3661) (3458) (3705) (3411) (3461) (3465) (3543) (3550) (3665) (3464) (3595) (3466) (3468) (3467) (3486) (3736) (3730) (3731) (3477) (3589) (3503) (3423) (3492) (3578) (3605) (3448) (3864) (3544)
* [datasets, tests] Fix lazy importing for dataset tests (3481)
* [datasets, tests] Fix `test_extract(zip|tar|tar_xz|gzip)` on windows (3542)
* [datasets, tests] Fix `kwargs` forwarding in fake data utility functions (3459)
* [datasets, tests] Properly fix dataset test that passes by accident (3434)
* [documentation] Improve the documentation infrastructure (3868) (3724) (3834) (3689) (3700) (3513) (3671) (3490) (3660) (3594)
* [documentation] Various documentation improvements (3793) (3715) (3727) (3838) (3701) (3923) (3643) (3537) (3691) (3453) (3437) (3732) (3683) (3853) (3684) (3576) (3739) (3530) (3586) (3744) (3645) (3694) (3584) (3615) (3693) (3706) (3646) (3780) (3704) (3774) (3634)(3591)(3807)(3663)
* [documentation, ci] Improve the CI infrastructure for documentation (3734) (3837) (3796) (3711)
* [io] remove deprecated function calls (3859) (3858)
* [documentation, io] Improve IO docs and expose `ImageReadMode` in `torchvision.io` (3812)
* [onnx, models] Replace `reshape` with `flatten` in MobileNetV2 (3462)
* [ops, tests] Added test for `aligned=True` (3540)
* [ops, tests] Add onnx test for `batched_nms` (3483)
* [tests] Various test improvements (3548) (3422) (3435) (3860) (3479) (3721) (3872) (3908) (2916) (3917) (3920) (3579)
* [transforms] add `__repr__` for `transforms.RandomErasing` (3491)
* [transforms, documentation] Adds Documentation for AutoAugmentation (3529)
* [transforms, documentation] Add illustrations of transforms with sphinx-gallery (3652)
* [datasets] Remove pandas dependency for CelebA dataset (3656, 3698)
* [documentation] Add docs for missing datasets (3536)
* [referencescripts] Make reference scripts compatible with `submitit` (3785)
* [referencescripts] Updated `all_gather()` to make use of `all_gather_object()` from PyTorch (3857)
* [datasets] Added dataset download support in fbcode (3823) (3826)

Code quality

* Remove inconsistent FB copyright headers (3741)
* Keep consistency in classes `ConvBNActivation` (3750)
* Removed unused imports (3738, 3740, 3639)
* Fixed `floor_divide` deprecation warnings seen in pytest output (3672)
* Unify onnx and JIT `resize` implementations (3654)
* Cleaned-up imports in test files related to datasets (3720)
* [documentation] Remove old css file (3839)
* [ci] Fix inconsistent version pinning across yaml files (3790)
* [datasets] Remove redundant `path.join` in `Places365` (3545)
* [datasets] Remove imprecise error handling in `PhotoTour` dataset (3488)
* [datasets, tests] Remove obsolete `test_datasets_transforms.py` (3867)
* [models] Making protected params of MobileNetV3 public (3828)
* [models] Make target argument in `transform.py` truly optional (3866)
* [models] Adding some references on MobileNetV3 implementation. (3850)
* [models] Refactored `set_cell_anchors()` in `AnchorGenerator` (3755)
* [ops] Minor cleanup of `roi_align_forward_kernel_impl` (3619)
* [ops] Replace deprecated `AutoNonVariableTypeMode` with `AutoDispatchBelowADInplaceOrView`. (3786, 3897)
* [tests] Port tests to use pytest (3852, 3845, 3697, 3907, 3749)
* [ops, tests] simplify `get_script_fn` (3541)
* [tests] Use torch.testing.assert_close in our test suite (3886) (3885) (3883) (3882) (3881) (3887) (3880) (3878) (3877) (3875) (3888) (3874) (3884) (3876) (3879) (3873)
* [tests] Clean up test accept behaviour (3759)
* [tests] Remove unused `masks` variable in `test_image.py` (3910)
* [transforms] use ternary if in `resize` (3533)
* [transforms] replaced deprecated call to `ByteTensor` with `from_numpy` (3813)
* [transforms] Remove unnecessary casting in `adjust_gamma` (3472)

Bugfixes

* [ci] set empty cxx flags as default (3474)
* [android][test_app] Cleanup duplicate dependency (3428)
* Remove leftover exception (3717)
* Corrected spelling in a `TypeError` (3659)
* Add missing device info. (3651)
* Moving tensors to the right device (3870)
* Proper error message (3725)
* [ci, io] Pin JPEG version to resolve the size_t issue on windows (3787)
* [datasets] Make LSUN OS agnostic (3455)
* [datasets] Update `squeezenet` urls (3581)
* [datasets] Add `.item()` to the `target` variable in `fakedataset.py` (3587)
* [datasets] Fix VOC datasets for 2007 (3572)
* [datasets] Add custom user agent for download_url (3498)
* [datasets] Fix LSUN dataset tests flakiness (3703)
* [datasets] Fix (Fashion|K)MNIST download and MNIST download test (3557)
* [datasets] fix check for exceeded quota on Google Drive (3710)
* [datasets] Fix redirect behavior of datasets.utils.download_url (3564)
* [datasets] Update EMNIST url (3567)
* [datasets] Redirect datasets to correct urls (3574)
* [datasets] Prevent potential bug in `DatasetFolder.make_dataset` (3733)
* [datasets, tests] Fix redirection in download tests (3568)
* [documentation] Correct the size of returned tensor in comments of `ps_roi_pool.py` and `ps_roi_align.py` (3849)
* [io] Fix ternary operator to decide to store an image in Grayscale or RGB (3553)
* [io] Fixed audio-video synchronisation problem in `read_video()` when using `pts` as unit (3791)
* [models] Fix bug on detection backbones when `trainable_layers == 0` (3906)
* [models] Removed caching of anchors from `AnchorGenerator` (3745)
* [models] Update weights of classification models with new serialization format to allow proper unpickling (3620, 3851)
* [onnx, ops] Fix `roi_align` ONNX export (3355)
* [referencescripts] Only sync cuda if cuda available (3674)
* [referencescripts] Add checkpoints used for preemption. (3789)
* [transforms] Fix `to_tensor` for `accimage` backend (3439)
* [transforms] Make `crop` work the same for PIL and Tensor (3770)
* [transforms, models, tests] Fix some tests in fbcode (3686)
* [transforms, tests] Fix `test_random_autocontrast` flakiness (3699)
* [utils] Fix the spacing of labels on `draw_bounding_boxes` (3895)
* [utils, tests] Fix `test_draw_boxes` (3631)

Deprecation

* [transforms] Deprecate `_transforms_video` and `_functional_video` in favor of `transforms` (3441)

Performance

* [ops] Improve performance of `batched_nms` when number of boxes is large (3426)
* [transforms] Speed up `equalize` transform by using `bincount` instead of `histc` (3493)
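
To see why `bincount` can stand in for `histc` when building an 8-bit intensity histogram, the sketch below computes both on a random channel and compares them. It illustrates the idea only and is not the code from the PR; the channel size is an arbitrary assumption.

```python
import torch

channel = torch.randint(0, 256, (64, 64), dtype=torch.uint8)  # one image channel

# Integer counting: one bin per intensity value, no float conversion needed.
hist_bincount = torch.bincount(channel.flatten().long(), minlength=256)

# Float histogram over the same 256 bins, for comparison.
hist_histc = torch.histc(channel.float(), bins=256, min=0, max=255)

print(torch.equal(hist_bincount, hist_histc.long()))
```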

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following people have contributed patches for this release:

Aditya Oke, Akshay Kumar, Alessandro Melis, Avijit Dasgupta, Bruno Korbar, Caroline Chen, chengjuzhou, Edgar Andrés Margffoy Tuay, Eli Uriegas, Francisco Massa, Guillem Orellana Trullols, harishsdev, Ivan Kobzarev, Jaesun Park, James Thewlis, Jeff Daily, Jeff Yang, Jithendra Paruchuri, Jon Janzen, KAI ZHAO, Ksenija Stanojevic, Lewis Patten, Matti Picus, moto, Mustafa Bal, Nicolas Hug, Nikhil Kumar, Nikita Shulga, Philip Meier, Prabhat Roy, Sanket Thakur, scott-vsi, Sofiane Abbar, t-rutten, urmi22, Vasilis Vryniotis, vfdev, Yuchen Huang, Zhengyang Feng, Zhiqiang Wang

Thank you!

0.9.1

Highlights

This minor release bumps the pinned PyTorch version to v1.8.1, and brings a few bugfixes for datasets, including MNIST download not being available.

Bugfixes
- fix VOC datasets for 2007 (3572)
- Update EMNIST url (3567)
- Fix redirect behavior of datasets.utils.download_url (3564)
- Fix MNIST download for minor release (3559)

0.9.0

This release introduces improved support for mobile, with new mobile-friendly models, pre-compiled binaries for Android available in maven and an android demo app. It also improves image IO and provides new data augmentations including AutoAugment.

Highlights

Better mobile support

torchvision 0.9 adds support for the MobileNetV3 architecture with pre-trained weights for Classification, Object Detection and Segmentation tasks.
It also improves C++ operators so that they can be compiled and run on Android, and we are providing pre-compiled torchvision artifacts published to jcenter. An example application showing how to use the torchvision ops in an Android app can be found [here](https://github.com/pytorch/android-demo-app/tree/master/D2Go).

Classification
We provide MobileNetV3 variants (including a quantized version) pre-trained on ImageNet 2012.
```python
import torch
import torchvision

# Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.mobilenet_v3_large(pretrained=True)
# m_classifier = torchvision.models.mobilenet_v3_small(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)

# Quantized Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.quantization.mobilenet_v3_large(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)
```

The pre-trained models have the following accuracies on ImageNet 2012 val:

| Model | Top-1 Acc | Top-5 Acc
| --- | --- | --- |
