Pt-datasets

0.3.4

Use z-score normalization on MNIST and CIFAR10 datasets.
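
As a rough illustration of what z-score normalization does, the sketch below rescales a handful of pixel intensities to zero mean and unit standard deviation. The helper name `z_score`, the `eps` guard, and the sample values are assumptions made for this example; the statistics the package actually uses for MNIST and CIFAR10 are not stated here.

```python
from statistics import fmean, pstdev


def z_score(values, eps=1e-8):
    """Rescale values to zero mean and unit standard deviation.

    Hypothetical helper for illustration; not the package's own code.
    """
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    # eps guards against division by zero when all values are identical
    return [(value - mean) / (std + eps) for value in values]


# Made-up pixel intensities scaled to [0, 1]
pixels = [0.0, 0.25, 0.5, 0.75, 1.0]
normalized = z_score(pixels)
```

After normalization the values are centered on zero, which tends to make gradient-based training less sensitive to the original pixel scale.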

0.3.3

Do not import `tsnecuda` by default, only when `use_cuda=True` in `pt_datasets.encode_features`.
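
The deferred-import pattern described here can be sketched as follows. The helper `import_optional` is hypothetical and only illustrates the idea; it is not how `pt_datasets.encode_features` is necessarily written.

```python
import importlib


def import_optional(module_name, enabled):
    """Import `module_name` only when `enabled` is true.

    Hypothetical helper for illustration; returns None when the
    optional dependency is not requested.
    """
    if not enabled:
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"{module_name} must be installed to use this option"
        ) from error


# With the flag off, `tsnecuda` is never touched, so CPU-only installs work.
gpu_tsne = import_optional("tsnecuda", enabled=False)
```

Keeping the import inside the flagged branch means users without a GPU build of t-SNE never hit an `ImportError` at package import time.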

0.3.2


        

0.3.0

Overview
  
  Added the AG News text classification dataset, which can be vectorized using either n-grams or TF-IDF.

  ```python
  >>> from pt_datasets import load_dataset
  >>> train_data, test_data = load_dataset("ag_news")
  >>> train_features = train_data.tensors[0]  # use index 0 to access the features matrix
  >>> train_labels = train_data.tensors[1]  # use index 1 to access the labels matrix
  ```
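
To make the two vectorization options concrete, here is a minimal pure-Python sketch of word n-grams and one common TF-IDF variant (term frequency times `log(N / document frequency)`). The function names and sample headlines are invented for this example, and the package may well rely on a library vectorizer with a different weighting formula.

```python
import math
from collections import Counter


def word_ngrams(text, n=1):
    """Split text into lowercase word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(documents, n=1):
    """Return one {term: weight} dict per document.

    Illustrative variant: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    tokenized = [word_ngrams(doc, n) for doc in documents]
    doc_freq = Counter(term for terms in tokenized for term in set(terms))
    total = len(documents)
    weights = []
    for terms in tokenized:
        counts = Counter(terms)
        weights.append({
            term: (count / len(terms)) * math.log(total / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up headlines standing in for news text
headlines = [
    "stocks rally on earnings",
    "team wins final match",
    "stocks fall on earnings miss",
]
bigrams = word_ngrams(headlines[0], n=2)
weights = tf_idf(headlines)
```

Terms that occur in fewer documents receive larger weights, so `"rally"` outweighs `"stocks"` in the first headline.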

0.2.0

Overview
  
  Added the MalImg (Malware Image) classification dataset from [Nataraj et al., 2011](https://dl.acm.org/doi/10.1145/2016904.2016908).

  ```python
  >>> from pt_datasets import load_dataset
  >>> train_data, test_data = load_dataset("malimg")
  >>> train_features = train_data.tensors[0]  # use index 0 to access features
  >>> train_labels = train_data.tensors[1]  # use index 1 to access labels
  ```
  
  
  Citations
  
  When using the Malware Image classification dataset, please cite the following:

  - BibTeX

  ```
  @article{agarap2017towards,
    title={Towards building an intelligent anti-malware system: a deep learning approach using support vector machine (SVM) for malware classification},
    author={Agarap, Abien Fred},
    journal={arXiv preprint arXiv:1801.00318},
    year={2017}
  }
  ```

  - MLA

  Agarap, Abien Fred. "Towards building an intelligent anti-malware system: a deep learning approach using support vector machine (SVM) for malware classification." arXiv preprint arXiv:1801.00318 (2017).

0.1.3

Install this package from PyPI:

  ```shell
  $ pip install pt-datasets
  ```

  To enable GPU support for t-SNE, run the installation script for `tsnecuda`:

  ```shell
  $ git clone https://github.com/AFAgarap/pt-datasets.git
  $ cd pt-datasets
  $ bash setup/install_tsnecuda
  ```

0.1.0

This repository is meant for easier and faster access to the following image classification datasets: MNIST, Fashion-MNIST, EMNIST-Balanced, CIFAR10, and SVHN. Using this repository, one can load the aforementioned datasets in a ready-to-use fashion for PyTorch models. Additionally, this can be used to load the low-dimensional features of the aforementioned datasets, encoded using PCA, t-SNE, or UMAP.
  
  **Features**
  - Load MNIST, Fashion-MNIST, EMNIST-Balanced, CIFAR10, and SVHN PyTorch datasets.
  - Create a data loader for a dataset, ready for model use.
  - Encode features to lower-dimensional space using PCA, t-SNE, and UMAP.