Towhee

0.7.0

Add 6 video understanding/classification models (usage sketch after the list)

* **Video Swin Transformer**
  * page: [*action-classification/video-swin-transformer*](https://towhee.io/action-classification/video-swin-transformer)
  * paper: [*Video Swin Transformer*](https://arxiv.org/pdf/2106.13230v1.pdf)

* **TSM**
  * page: [*action-classification/tsm*](https://towhee.io/action-classification/tsm)
  * paper: [*TSM: Temporal Shift Module for Efficient Video Understanding*](https://arxiv.org/pdf/1811.08383v3.pdf)

* **UniFormer**
  * page: [*action-classification/uniformer*](https://towhee.io/action-classification/uniformer)
  * paper: [*UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning*](https://arxiv.org/pdf/2201.04676v3.pdf)

* **Omnivore**
  * page: [*action-classification/omnivore*](https://towhee.io/action-classification/omnivore)
  * paper: [*Omnivore: A Single Model for Many Visual Modalities*](https://arxiv.org/pdf/2201.08377.pdf)

* **TimeSformer**
  * page: [*action-classification/timesformer*](https://towhee.io/action-classification/timesformer)
  * paper: [*Is Space-Time Attention All You Need for Video Understanding?*](https://arxiv.org/pdf/2102.05095.pdf)

* **MoViNets**
  * page: [*action-classification/movinet*](https://towhee.io/action-classification/movinet)
  * paper: [*MoViNets: Mobile Video Networks for Efficient Video Recognition*](https://arxiv.org/pdf/2103.11511.pdf)
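
All six operators follow the same hub pattern, so one sketch covers the group. A minimal example in the DataCollection chaining style shown on the operator pages; the video path and `model_name` value are illustrative placeholders, and each operator's page lists the weights it actually supports:

```python
import towhee

# Decode a local video with FFmpeg, then classify the action with the
# new Video Swin Transformer operator pulled from the Towhee hub.
# 'archery.mp4' and the model_name value are placeholders.
(
    towhee.glob('archery.mp4')
          .video_decode.ffmpeg()
          .action_classification.video_swin_transformer(
              model_name='swin_t_k400_1k')
          .show()
)
```

Swapping `video_swin_transformer` for `tsm`, `uniformer`, `omnivore`, `timesformer`, or `movinet` selects one of the other backbones in this list.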

Add 4 video retrieval models (usage sketch after the list)

* **CLIP4Clip**
  * page: [*video-text-embedding/clip4clip*](https://towhee.io/video-text-embedding/clip4clip)
  * paper: [*CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval*](https://arxiv.org/pdf/2104.08860.pdf)

* **DRL**
  * page: [*video-text-embedding/drl*](https://towhee.io/video-text-embedding/drl)
  * paper: [*Disentangled Representation Learning for Text-Video Retrieval*](https://arxiv.org/pdf/2203.07111.pdf)

* **Frozen in Time**
  * page: [*video-text-embedding/frozen-in-time*](https://towhee.io/video-text-embedding/frozen-in-time)
  * paper: [*Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval*](https://arxiv.org/pdf/2104.00650.pdf)

* **MDMMT**
  * page: [*video-text-embedding/mdmmt*](https://towhee.io/video-text-embedding/mdmmt)
  * paper: [*MDMMT: Multidomain Multimodal Transformer for Video Retrieval*](https://arxiv.org/pdf/2103.10699.pdf)
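
These retrieval operators embed either a text query or a sampled video into a shared space, selected by a modality switch. A minimal CLIP4Clip sketch, assuming the `modality` argument and `model_name` value shown on the operator page; the file path, query text, and sampling arguments are illustrative:

```python
import towhee

# Embed a text query into the joint video-text space.
text_emb = (
    towhee.dc(['a person riding a horse'])
          .video_text_embedding.clip4clip(model_name='clip_vit_b32',
                                          modality='text')
)

# Uniformly sample frames from a video and embed them into the same
# space; nearest-neighbour search between the two sides then gives
# text-to-video retrieval. 'demo.mp4' is a placeholder path.
video_emb = (
    towhee.glob('demo.mp4')
          .video_decode.ffmpeg(sample_type='uniform_temporal_subsample',
                               args={'num_samples': 12})
          .runas_op(func=lambda frames: [f for f in frames])
          .video_text_embedding.clip4clip(model_name='clip_vit_b32',
                                          modality='video')
)
```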

0.6.1

Add 3 text-image multimodal models (usage sketch after the list)

* **CLIP**
  * page: [*image-text-embedding/clip*](https://towhee.io/image-text-embedding/clip)
  * paper: [*Learning Transferable Visual Models From Natural Language Supervision*](https://arxiv.org/pdf/2103.00020.pdf)

* **BLIP**
  * page: [*image-text-embedding/blip*](https://towhee.io/image-text-embedding/blip)
  * paper: [*BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation*](https://arxiv.org/pdf/2201.12086.pdf)

* **LightningDOT**
  * page: [*image-text-embedding/lightningdot*](https://towhee.io/image-text-embedding/lightningdot)
  * paper: [*LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval*](https://arxiv.org/pdf/2103.08784.pdf)
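
Like the video retrieval operators, each of these embeds both modalities into one space via a `modality` argument. A minimal CLIP sketch; the image path, caption, and `model_name` value are illustrative placeholders:

```python
import towhee

# Embed an image and a caption with the same CLIP operator; cosine
# similarity between the two vectors scores how well they match.
# 'teddy.jpg' and the caption are placeholders for this sketch.
img_emb = (
    towhee.glob('teddy.jpg')
          .image_decode()
          .image_text_embedding.clip(model_name='clip_vit_b32',
                                     modality='image')
)

txt_emb = (
    towhee.dc(['a teddy bear on a skateboard'])
          .image_text_embedding.clip(model_name='clip_vit_b32',
                                     modality='text')
)
```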

Add 6 video understanding/classification models (usage sketch after the list)

* **I3D** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset*](https://arxiv.org/pdf/1705.07750.pdf)

* **C2D** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*Non-local Neural Networks*](https://arxiv.org/pdf/1711.07971.pdf)

* **Slow** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*SlowFast Networks for Video Recognition*](https://arxiv.org/pdf/1812.03982.pdf)

* **SlowFast** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*SlowFast Networks for Video Recognition*](https://arxiv.org/pdf/1812.03982.pdf)

* **X3D** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*X3D: Expanding Architectures for Efficient Video Recognition*](https://arxiv.org/pdf/2004.04730.pdf)

* **MViT** (from PyTorchVideo)
  * page: [*action-classification/pytorchvideo*](https://towhee.io/action-classification/pytorchvideo)
  * paper: [*Multiscale Vision Transformers*](https://arxiv.org/pdf/2104.11227.pdf)
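
All six backbones ship through the single `action-classification/pytorchvideo` operator and are selected by `model_name`. A minimal sketch in the same chaining style as above; the path and the `model_name` values are illustrative, and the operator page lists the names it actually accepts:

```python
import towhee

# Classify a video with a PyTorchVideo backbone; changing model_name
# (e.g. 'slowfast_r50' or 'x3d_m') switches among the architectures
# listed above. 'archery.mp4' is a placeholder path.
(
    towhee.glob('archery.mp4')
          .video_decode.ffmpeg()
          .action_classification.pytorchvideo(model_name='x3d_m')
          .show()
)
```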
