GluonCV

Latest version: v0.10.5.post0



0.6.0

More video action recognition models

https://gluon-cv.mxnet.io/model_zoo/action_recognition.html

We now provide state-of-the-art video classification networks, such as I3D, I3D-Nonlocal and SlowFast, with a complete model zoo over several widely adopted video datasets. We provide a general video [dataloader](https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/video_custom/classification.py) that can handle both frame format and raw video format. Users can do training, fine-tuning, prediction and feature extraction without writing complicated code; preparing a text file containing the video information is enough.
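The annotation text file typically lists one video per line: the video path, the number of frames, and the integer class label (the exact columns are configurable in the dataloader). A minimal sketch of writing and parsing such a file, with hypothetical paths:

```python
import tempfile
from pathlib import Path

# Each line: <video path> <number of frames> <label id>
# (column layout is configurable in the custom dataloader)
annotations = [
    ("videos/abseiling/clip_0001.mp4", 300, 0),
    ("videos/zumba/clip_0002.mp4", 250, 399),
]

def write_video_list(path, items):
    """Write one 'path num_frames label' line per video."""
    lines = ["%s %d %d" % it for it in items]
    Path(path).write_text("\n".join(lines) + "\n")

def read_video_list(path):
    """Parse the annotation file back into (path, num_frames, label) tuples."""
    out = []
    for line in Path(path).read_text().splitlines():
        p, n, lab = line.split()
        out.append((p, int(n), int(lab)))
    return out

path = Path(tempfile.mkdtemp()) / "train.txt"
write_video_list(path, annotations)
assert read_video_list(path) == annotations
```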

Below is the table of new models included in this release.


Name | Pretrained | Segments | Clip Length | Top-1 | Hashtag |
-- | -- | -- | -- | -- | -- |
inceptionv1_kinetics400 | ImageNet | 7 | 1 | 69.1 | 6dcdafb1 |
inceptionv3_kinetics400 | ImageNet | 7 | 1 | 72.5 | 8a4a6946 |
resnet18_v1b_kinetics400 | ImageNet | 7 | 1 | 65.5 | 46d5a985 |
resnet34_v1b_kinetics400  | ImageNet | 7 | 1 | 69.1 | 8a8d0d8d |
resnet50_v1b_kinetics400  | ImageNet | 7 | 1 | 69.9 | cc757e5c |
resnet101_v1b_kinetics400  | ImageNet | 7 | 1 | 71.3 | 5bb6098e |
resnet152_v1b_kinetics400  | ImageNet | 7 | 1 | 71.5 | 9bc70c66 |
i3d_inceptionv1_kinetics400  | ImageNet | 1 | 32 (64/2) | 71.8 | 81e0be10 |
i3d_inceptionv3_kinetics400  | ImageNet | 1 | 32 (64/2) | 73.6 | f14f8a99 |
i3d_resnet50_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 74.0 | 568a722e |
i3d_resnet101_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 75.1 | 6b69f655 |
i3d_nl5_resnet50_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 75.2 | 3c0e47ea |
i3d_nl10_resnet50_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 75.3 | bfb58c41 |
i3d_nl5_resnet101_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 76.0 | fbfc1d30 |
i3d_nl10_resnet101_v1_kinetics400  | ImageNet | 1 | 32 (64/2) | 76.1 | 59186c31 |
slowfast_4x16_resnet50_kinetics400  | ImageNet | 1 | 36 (64/1) | 75.3 | 9d650f51 |
slowfast_8x8_resnet50_kinetics400  | ImageNet | 1 | 40 (64/1) | 76.6 | d6b25339 |
slowfast_8x8_resnet101_kinetics400  | ImageNet | 1 | 40 (64/1) | 77.2 | fbde1a7c |
resnet50_v1b_ucf101  | ImageNet | 3 | 1 | 83.7 | d728ecc7 |
i3d_resnet50_v1_ucf101 | ImageNet | 1 | 32 (64/2) | 83.9 | 7afc7286 |
i3d_resnet50_v1_ucf101  | Kinetics400 | 1 | 32 (64/2) | 95.4 | 760d0981 |
resnet50_v1b_hmdb51  | ImageNet | 3 | 1 | 55.2 | 682591e2 |
i3d_resnet50_v1_hmdb51  | ImageNet | 1 | 32 (64/2) | 48.5 | 0d0ad559 |
i3d_resnet50_v1_hmdb51  | Kinetics400 | 1 | 32 (64/2) | 70.9 | 2ec6bf01 |
resnet50_v1b_sthsthv2  | ImageNet | 8 | 1 | 35.5 | 80ee0c6b |
i3d_resnet50_v1_sthsthv2  | ImageNet | 1 | 16 (32/2) | 50.6 | 01961e4c |
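In the table, a clip length written as "32 (64/2)" means the network consumes 32 frames, obtained by taking a 64-frame window and keeping every 2nd frame. (The SlowFast entries sample the 64-frame window at stride 1 and then split frames between the slow and fast pathways, which is why their listed clip lengths differ.) The index arithmetic can be sketched as:

```python
def sample_clip(start, window=64, stride=2):
    """Return the frame indices for one clip: a `window`-frame span
    starting at `start`, subsampled with the given temporal stride."""
    return list(range(start, start + window, stride))

clip = sample_clip(0)           # the "32 (64/2)" setting
assert len(clip) == 32          # 64-frame window / stride 2 -> 32 frames
assert clip[:3] == [0, 2, 4]

dense = sample_clip(0, stride=1)  # a "(64/1)" window keeps all 64 frames
assert len(dense) == 64
```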


We include a tutorial on how to fine-tune a pre-trained model on your own dataset.
https://gluon-cv.mxnet.io/build/examples_action_recognition/finetune_custom.html

We include a tutorial introducing Decord, a new efficient video reader.
https://gluon-cv.mxnet.io/build/examples_action_recognition/decord_loader.html

We include a tutorial on how to extract features from a pre-trained model.
https://gluon-cv.mxnet.io/build/examples_action_recognition/feat_custom.html

We include a tutorial on how to make predictions with a pre-trained model.
https://gluon-cv.mxnet.io/build/examples_action_recognition/demo_custom.html

We include a tutorial on how to perform distributed training of deep video models.
https://gluon-cv.mxnet.io/build/examples_distributed/distributed_slowfast.html

We include tutorials on how to prepare the HMDB51 and Something-something-v2 datasets.
https://gluon-cv.mxnet.io/build/examples_datasets/hmdb51.html
https://gluon-cv.mxnet.io/build/examples_datasets/somethingsomethingv2.html

We will provide Kinetics600 and Kinetics700 pre-trained models in the next release; please stay tuned.

Mobile pose estimation models

https://gluon-cv.mxnet.io/model_zoo/pose.html#mobile-pose-models

|Model | OKS AP | OKS AP (with flip) | Hashtag |
|-- | -- | -- | -- |
|mobile_pose_resnet18_v1b  | 66.2/89.2/74.3 | 67.9/90.3/75.7 | dd6644eb |
|mobile_pose_resnet50_v1b  | 71.1/91.3/78.7 | 72.4/92.3/79.8 | ec8809df |
|mobile_pose_mobilenet1.0  | 64.1/88.1/71.2 | 65.7/89.2/73.4 | b399bac7 |
|mobile_pose_mobilenetv2_1.0  | 63.7/88.1/71.0 | 65.0/89.2/72.3 | 4acdc130 |
|mobile_pose_mobilenetv3_large  | 63.7/88.9/70.8 | 64.5/89.0/72.0 | 1ca004dc |
|mobile_pose_mobilenetv3_small  | 54.3/83.7/59.4 | 55.6/84.7/61.7 | b1b148a9 |
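Each cell reports three numbers under the COCO Object Keypoint Similarity (OKS) metric, in the usual AP / AP@0.50 / AP@0.75 ordering. OKS compares predicted and ground-truth keypoints, down-weighting distance by object scale and a per-keypoint falloff constant; a minimal sketch of the standard COCO definition (the `kappas` values below are illustrative, not the official COCO constants):

```python
import math

def oks(pred, gt, visible, area, kappas):
    """COCO Object Keypoint Similarity: mean over labeled keypoints of
    exp(-d^2 / (2 * s^2 * k^2)), where s^2 is the object area and k is
    a per-keypoint falloff constant."""
    num, cnt = 0.0, 0
    for (px, py), (gx, gy), v, k in zip(pred, gt, visible, kappas):
        if v <= 0:
            continue  # unlabeled keypoints are skipped
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        num += math.exp(-d2 / (2.0 * area * k ** 2))
        cnt += 1
    return num / cnt if cnt else 0.0

# A perfect prediction scores 1.0; localization error lowers it smoothly.
gt = [(10.0, 10.0), (20.0, 15.0)]
assert oks(gt, gt, [2, 2], area=400.0, kappas=[0.025, 0.025]) == 1.0
assert oks([(11.0, 10.0), (20.0, 15.0)], gt, [2, 2], 400.0, [0.025, 0.025]) < 1.0
```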

By replacing the backbone network and using a pixel-shuffle layer instead of deconvolution, we obtain models that are very fast. These models are suitable for edge-device applications; deployment tutorials will come soon.
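A pixel-shuffle layer upsamples by rearranging channels into spatial positions: a (C·r², H, W) feature map becomes (C, H·r, W·r) with no learned deconvolution weights. A minimal pure-Python sketch of the rearrangement (illustrative, not GluonCV's implementation):

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested-list tensor into (C, H*r, W*r).

    out[c, h*r + i, w*r + j] = x[c*r*r + i*r + j, h, w]
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    c_out = C // (r * r)
    out = [[[0] * (W * r) for _ in range(H * r)] for _ in range(c_out)]
    for c in range(C):
        co, offset = divmod(c, r * r)
        i, j = divmod(offset, r)
        for h in range(H):
            for w in range(W):
                out[co][h * r + i][w * r + j] = x[c][h][w]
    return out

# 4 channels of a 2x2 map, each filled with its channel index,
# interleave into one 4x4 map with a repeating 2x2 pattern.
x = [[[c, c], [c, c]] for c in range(4)]
y = pixel_shuffle(x, 2)
assert y[0][0] == [0, 1, 0, 1]
assert y[0][1] == [2, 3, 2, 3]
```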

More Int8 quantized models

https://gluon-cv.mxnet.io/build/examples_deployment/int8_inference.html
The CPU performance below was benchmarked on an AWS EC2 C5.12xlarge instance with 24 physical cores.
Note that you will need a nightly build of MXNet to properly use these new features.

![](https://user-images.githubusercontent.com/34727741/67351790-ecdc7280-f580-11e9-8b44-1b4548cb6031.png)

Model | Dataset | Batch Size | Speedup (INT8/FP32) | FP32 Accuracy | INT8 Accuracy
-- | -- | -- | -- | -- | --
simple_pose_resnet18_v1b | COCO Keypoint | 128 | 2.55 | 66.3 | 65.9
simple_pose_resnet50_v1b | COCO Keypoint | 128 | 3.50 | 71.0 | 70.6
simple_pose_resnet50_v1d | COCO Keypoint | 128 | 5.89 | 71.6 | 71.4
simple_pose_resnet101_v1b | COCO Keypoint | 128 | 4.07 | 72.4 | 72.2
simple_pose_resnet101_v1d | COCO Keypoint | 128 | 5.97 | 73.0 | 72.7
vgg16_ucf101 | UCF101 | 64 | 4.46 | 81.86 | 81.41
inceptionv3_ucf101 | UCF101 | 64 | 5.16 | 86.92 | 86.55
resnet18_v1b_kinetics400 | Kinetics400 | 64 | 5.24 | 63.29 | 63.14
resnet50_v1b_kinetics400 | Kinetics400 | 64 | 6.78 | 68.08 | 68.15
inceptionv3_kinetics400 | Kinetics400 | 64 | 5.29 | 67.93 | 67.92

For pose-estimation models, the accuracy metric is OKS AP without flip. Quantized 2D video action recognition models are calibrated with num-segments=3 (7 for ResNet-based models).
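The small INT8 accuracy drop comes from calibration: a scale is chosen from observed activation ranges so that values map onto the 256 int8 levels with bounded rounding error. A toy simulation of symmetric post-training quantization (illustrative only, not MXNet's calibration code):

```python
def quantize(values, scale):
    """Symmetric int8 quantization: round(v / scale), clamped to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    return [q * scale for q in qvalues]

# "Calibration": derive the scale from the max absolute activation seen.
activations = [0.5, -1.25, 3.0, -2.75, 0.0]
scale = max(abs(v) for v in activations) / 127.0

q = quantize(activations, scale)
restored = dequantize(q, scale)

# Round-trip error is bounded by half a quantization step.
assert all(abs(a - b) <= scale / 2 for a, b in zip(activations, restored))
```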

Bug fixes and Improvements

- Performance of PSPNet with a ResNet101 backbone on Cityscapes (semantic segmentation) is improved from 77.1% to 79.9% mIoU, higher than the number reported in the original paper.
- We will deprecate Python2 support in the next release.
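mIoU, the metric in the PSPNet item above, averages per-class intersection-over-union between the predicted and ground-truth label maps. A minimal sketch over flattened label lists:

```python
def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

gt   = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
# class 0: inter 1, union 2 -> 0.5;  class 1: inter 2, union 3 -> 2/3
assert abs(mean_iou(pred, gt, 2) - (0.5 + 2 / 3) / 2) < 1e-9
```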

0.5

| Model | Metric | 0.5 |
|---------------------------|--------|-----|
| vgg16_ucf101 | UCF101 Top-1 | 83.4 |
| inceptionv3_ucf101 | UCF101 Top-1 | 88.1 |
| inceptionv3_kinetics400 | Kinetics400 Top-1 | 72.5 |
| alpha_pose_resnet101_v1b_coco | OKS AP (with flip) | 76.7/92.6/82.9 |

0.5.0

0.4

| Model | Metric | 0.4 |
|---------------------------|--------|-----|
| simple_pose_resnet152_v1b | OKS AP* | 74.2 |
| simple_pose_resnet50_v1b | OKS AP* | 72.2 |
| ResNext50_32x4d | ImageNet Top-1 | 79.32 |
| ResNext101_64x4d | ImageNet Top-1 | 80.69 |
| SE_ResNext101_32x4d | ImageNet Top-1 | 79.95 |
| SE_ResNext101_64x4d | ImageNet Top-1 | 81.01 |

0.4.0

Highlights

0.3

Highlights

- Added 5 new algorithms and updated 38 pre-trained models with improved accuracy
- Compare 7 selected models

| Model | Metric | 0.2 | 0.3 | Reference |
| ------------------- | --------------------- | ------ | ------ | ------------------------------------------------------------ |
| [ResNet-50](https://gluon-cv.mxnet.io/model_zoo/classification.html#resnet) | top-1 acc on ImageNet | 77.07% | **79.15%** | 75.3% ([Caffe impl](https://github.com/KaimingHe/deep-residual-networks)) |
| [ResNet-101](https://gluon-cv.mxnet.io/model_zoo/classification.html#resnet) | top-1 acc on ImageNet | 78.81% | **80.51%** | 76.4% ([Caffe impl](https://github.com/KaimingHe/deep-residual-networks)) |
| [MobileNet 1.0](https://gluon-cv.mxnet.io/model_zoo/classification.html#mobilenet) | top-1 acc on ImageNet | N/A | **73.28%** | 70.9% ([tensorflow impl)](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md) |
| [Faster-RCNN](https://gluon-cv.mxnet.io/model_zoo/detection.html#id37) | mAP on COCO | N/A | **40.1%** | 39.6% ([Detectron](https://github.com/facebookresearch/Detectron)) |
| [Yolo-v3](https://gluon-cv.mxnet.io/model_zoo/detection.html#id44) | mAP on COCO | N/A | **37.0%** | 33.0% ([paper](https://pjreddie.com/media/files/papers/YOLOv3.pdf)) |

