
Datasets

4.4.0

**API**:
  
  * Add [`PartialDecoding` support](https://www.tensorflow.org/datasets/decode#only_decode_a_sub-set_of_the_features), to decode only a subset of the features (for performance)
  * The catalog now exposes links to [KnowYourData visualisations](https://knowyourdata-tfds.withgoogle.com/)
  * `tfds.as_numpy` supports datasets with `None`
  * Datasets generated with `disable_shuffling=True` are now read in generation order.
  * Loading datasets from files now supports custom `tfds.features.FeatureConnector`
  * `tfds.testing.mock_data` now supports:
    * non-scalar tensors with dtype `tf.string`
    * `builder_from_files` and path-based community datasets
  * File format automatically restored (for datasets generated with `tfds.builder(..., file_format=)`).
  * Many new reinforcement learning datasets
  * Various bug fixes and internal improvements, like:
    * Dynamically set the number of worker threads during extraction
    * Update the progress bar during download even if downloads are cached
  
  **Dataset creation:**
  
  * Add `tfds.features.LabeledImage` for semantic segmentation (like image but with additional `info.features['image_label'].name` label metadata)
  * Add float32 support for `tfds.features.Image` (e.g. for depth map)
  * All FeatureConnectors can now have a `None` dimension anywhere (previously restricted to the first position).
  * `tfds.features.Tensor()` can have an arbitrary number of dynamic dimensions (`Tensor(..., shape=(None, None, 3, None))`)
  * `tfds.features.Tensor` can now be serialised as bytes, instead of float/int values (to allow better compression): `Tensor(..., encoding='zlib')`
  * Add script to add TFDS metadata files to existing TF-record (see [doc](https://www.tensorflow.org/datasets/external_tfrecord)).
  * New guide on [common implementation gotchas](https://www.tensorflow.org/datasets/common_gotchas)
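The `Tensor(..., encoding='zlib')` option above stores a tensor's raw bytes compressed, instead of as individual float/int values. A minimal pure-Python sketch of the idea (the helper names are illustrative, not the TFDS implementation):

```python
import struct
import zlib

def encode_tensor(values, encoding="none"):
    # Pack float32 values as raw bytes, optionally zlib-compressing them.
    raw = struct.pack(f"{len(values)}f", *values)
    return zlib.compress(raw) if encoding == "zlib" else raw

def decode_tensor(data, n, encoding="none"):
    # Reverse of encode_tensor: decompress if needed, then unpack floats.
    raw = zlib.decompress(data) if encoding == "zlib" else data
    return list(struct.unpack(f"{n}f", raw))

values = [1.0, 2.0, 3.0] * 500          # repetitive data -> compresses well
packed = encode_tensor(values, encoding="zlib")
assert decode_tensor(packed, len(values), encoding="zlib") == values
assert len(packed) < len(encode_tensor(values))  # zlib saves space here
```

Compressed storage trades a little decode-time CPU for smaller record files, which is why it can help with large, redundant tensors.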
  
  Thank you all for your support and contribution!

4.3.0

API:
  • Add `dataset.info.splits['train'].num_shards` to expose the number of shards to the user
  • Add `tfds.features.Dataset` to have a field containing sub-datasets (e.g. used in RL datasets)
  • Add dtype and `tf.uint16` support for `tfds.features.Video`
  • Add `DatasetInfo.license` field to add redistribution information
  • Better `tfds.benchmark(ds)` (compatible with any iterator, not just `tf.data`; better Colab representation)
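A benchmark that works with "any iterator, not just `tf.data`" boils down to timing one pass over the iterable. A rough sketch of the idea (a hypothetical stand-in, not the TFDS code):

```python
import time

def benchmark(iterable, batch_size=1):
    # Iterate once over `iterable`, measuring throughput; works with any
    # iterator (lists, generators, tf.data pipelines, ...).
    start = time.perf_counter()
    num_batches = sum(1 for _ in iterable)
    duration = time.perf_counter() - start
    examples = num_batches * batch_size
    return {
        "examples": examples,
        "duration_s": duration,
        "examples_per_s": examples / duration if duration else float("inf"),
    }

assert benchmark(range(100))["examples"] == 100
```

Because it only needs `for _ in iterable`, the same helper measures a plain Python generator and a `tf.data.Dataset` alike.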
  
  Other:
  • Faster `tfds.as_numpy()` (avoids an extra `tf.Tensor` <-> `np.array` copy)
  • Better `tfds.as_dataframe` visualisation (Sequence, ragged tensor, semantic masks with `use_colormap`)
  • (experimental) Community datasets support, to allow dynamically importing datasets defined outside the TFDS repository.
  • (experimental) Add a Hugging Face compatibility wrapper to use Hugging Face datasets directly in TFDS.
  • (experimental) Riegeli format support
  • (experimental) Add `DatasetInfo.disable_shuffling` to force examples to be read in generation order.
  • Add `.copy`, `.format` methods to GPath objects
  • Many bug fixes
  
  Testing:
  • Supports custom `BuilderConfig` in `DatasetBuilderTest`
  • `DatasetBuilderTest` now has a `dummy_data` class property which can be used in `setUpClass`
  • Add `add_tfds_id` and cardinality support to `tfds.testing.mock_data`
  
  And of course, many new datasets and dataset updates.
  
  We would like to thank all the TFDS contributors!

4.2.0

API:
  
  * Add `tfds build` to the CLI. See [documentation](https://www.tensorflow.org/datasets/cli#tfds_build_download_and_prepare_a_dataset).
  * DownloadManager now returns [Pathlib-like](https://docs.python.org/3/library/pathlib.html#basic-use) objects
  * Datasets returned by `tfds.as_numpy` are compatible with `len(ds)`
  * New `tfds.features.Dataset` to represent nested datasets
  * Add `tfds.ReadConfig(add_tfds_id=True)` to add a unique id to the example `ex['tfds_id']` (e.g. `b'train.tfrecord-00012-of-01024__123'`)
  * Add `num_parallel_calls` option to `tfds.ReadConfig` to override the default `AUTOTUNE` option
  * `tfds.ImageFolder` now supports `tfds.decode.SkipDecoding`
  * Add multichannel audio support to `tfds.features.Audio`
  * Better `tfds.as_dataframe` visualization (ffmpeg video if installed, bounding boxes,...)
  * Add `try_gcs` to `tfds.builder(..., try_gcs=True)`
  * Simpler `BuilderConfig` definition: class `VERSION` and `RELEASE_NOTES` are applied to all `BuilderConfig`. Config description is now optional.
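The `tfds_id` added by `tfds.ReadConfig(add_tfds_id=True)` encodes the shard an example came from and its index within that shard, as in the `b'train.tfrecord-00012-of-01024__123'` example above. A sketch of that format (the helper function is hypothetical):

```python
def make_tfds_id(shard_filename: str, example_index: int) -> bytes:
    # "<shard filename>__<index within the shard>", as bytes.
    return f"{shard_filename}__{example_index}".encode()

assert make_tfds_id("train.tfrecord-00012-of-01024", 123) == b"train.tfrecord-00012-of-01024__123"
```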
  
  Breaking compatibility changes:
  
  * Removed configs for all text datasets. Only the plain text version is kept. For example: `multi_nli/plain_text` -> `multi_nli`.
  * To guarantee better determinism, new validations are performed on the keys when creating a dataset (to avoid using filenames as keys (non-deterministic) and to restrict keys to `str`, `bytes`, and `int`). New errors likely indicate an issue in the dataset implementation.
  * `tfds.core.benchmark` now returns a `pd.DataFrame` (instead of a `dict`)
  * `tfds.units` is not visible anymore from the public API
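The key validation described above can be pictured as a small check run on every key yielded by `_generate_examples` (a simplified sketch, not the actual TFDS code):

```python
def validate_key(key):
    # Only str, bytes and int keys shuffle deterministically across
    # platforms; anything else (e.g. a pathlib.Path) is rejected.
    if not isinstance(key, (str, bytes, int)):
        raise TypeError(
            f"Invalid example key type: {type(key).__name__}. "
            "Keys must be str, bytes or int."
        )
    return key

validate_key("img_0001")  # ok
validate_key(42)          # ok
```

Passing, say, a `pathlib.Path` as a key would raise a `TypeError` under this check, which is the kind of new error the note above refers to.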
  
  Bug fixes:
  
  * Support 0-len sequence with images of dynamic shape (Fix 2616)
  * Progression bar correctly updated when copying files.
  * Many bug fixes (GPath consistency with pathlib, S3 compatibility, TQDM visual artifacts, GCS crash on Windows, re-download when checksums are updated,...)
  * Better debugging and error message (e.g. human readable size,...)
  * Allow `max_examples_per_splits=0` in `tfds build --max_examples_per_splits=0` to test `_split_generators` only (without `_generate_examples`).
  
  And of course, many new datasets and dataset updates.
  
  Thank you to the community for the many valuable contributions and for supporting us in this project!

4.1.0

* When generating a dataset, if the download fails for any reason, it is now possible to manually download the data. See [doc](https://www.tensorflow.org/datasets/overview#manual_download_if_download_fails).
  * Simplification of the dataset creation API:
    * We've made it easier to create datasets outside the TFDS repository (see our updated [dataset creation guide](https://www.tensorflow.org/datasets/add_dataset)).
    * `_split_generators` should now return `{'split_name': self._generate_examples(), ...}` (but current datasets are backward compatible).
    * All datasets inherit from `tfds.core.GeneratorBasedBuilder`. Converting a dataset to Beam now only requires changing `_generate_examples` (see [example and doc](https://www.tensorflow.org/datasets/beam_datasets#instructions)).
    * `tfds.core.SplitGenerator` and `tfds.core.BeamBasedBuilder` are deprecated and will be removed in a future version.
  
  * Better `pathlib.Path`, `os.PathLike` compatibility:
    * `dl_manager.manual_dir` now returns a pathlib-like object. Example:

      ```python
      text = (dl_manager.manual_dir / 'downloaded-text.txt').read_text()
      ```

    * Note: Other methods (`dl_manager.download`, `.extract`,...) will return pathlib-like objects in future versions
    * `FeatureConnector`,... and most functions should accept `PathLike` objects. Let us know if some functions you need are missing.
    * Add `tfds.core.as_path` to create `pathlib.Path`-like objects compatible with GCS (e.g. `tfds.core.as_path('gs://my-bucket/labels.csv').read_text()`).
  
  * Other bug fixes and improvements, e.g.:
    * Add a `verify_ssl=` option to `tfds.download.DownloadConfig` to disable SSL certificate validation during download.
    * `BuilderConfig` are now compatible with Beam datasets 2348
    * `--record_checksums` now assumes the new dataset-as-folder model
    * `tfds.features.Image` can accept encoded `bytes` images directly (useful when used with `img_name, img_bytes = dl_manager.iter_archive('images.zip')`).
    * The API docs now show deprecated methods; abstract methods to override are now documented.
    * You can generate `imagenet2012` with only a single split (e.g. only the validation data). Other splits will be skipped if not present.
    * And of course, new datasets
  
  Thank you to all our contributors for improving TFDS!

4.0.1

* Fix `tfds.load` when generation code isn't present
  * Improve GCS compatibility.
  
  Thanks to carlthome for reporting and fixing the issue.

4.0.0

**API changes, new features:**
  
  * Dataset-as-folder: a dataset can now be a self-contained module in a folder with its checksums, dummy data,... This simplifies implementing datasets outside the TFDS repository.
  * `tfds.load` can now load a dataset without using the generation class. So `tfds.load('my_dataset:1.0.0')` can work even if `MyDataset.VERSION == '2.0.0'` (See 2493).
  * Add a new TFDS CLI (see https://www.tensorflow.org/datasets/cli for details)
  * `tfds.testing.mock_data` does not require metadata files anymore!
  * Add `tfds.as_dataframe(ds, ds_info)` with custom visualisation ([example](https://www.tensorflow.org/datasets/overview#tfdsas_dataframe))
  * Add `tfds.even_splits` to generate subsplits (e.g. `tfds.even_splits('train', n=3) == ['train[0%:33%]', 'train[33%:67%]', ...]`)
  * Add new `DatasetBuilder.RELEASE_NOTES` property
  * `tfds.features.Image` now supports 4-channel PNGs
  * `tfds.ImageFolder` now supports custom shape, dtype
  * Downloaded URLs are available through `MyDataset.url_infos`
  * Add `skip_prefetch` option to `tfds.ReadConfig`
  * `as_supervised=True` support for `tfds.show_examples`, `tfds.as_dataframe`
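The `tfds.even_splits` behaviour can be approximated with simple percent arithmetic. A sketch reproducing the string form shown above (the real API may return richer split objects; rounding here is illustrative):

```python
def even_splits(split: str, n: int):
    # Cut `split` into n nearly-equal percent sub-splits.
    bounds = [round(i * 100 / n) for i in range(n + 1)]
    return [f"{split}[{lo}%:{hi}%]" for lo, hi in zip(bounds, bounds[1:])]

assert even_splits("train", 3) == ["train[0%:33%]", "train[33%:67%]", "train[67%:100%]"]
```

This is handy for splitting a dataset evenly across workers when the dataset only ships a single `train` split.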
  
  **Breaking compatibility changes:**
  
  * `tfds.as_numpy()` now returns an iterable which can be iterated multiple times. To migrate `next(ds)` -> `next(iter(ds))`
  * Rename `tfds.features.text.Xyz` -> `tfds.deprecated.text.Xyz`
  * Remove `DatasetBuilder.IN_DEVELOPMENT` property
  * Remove `tfds.core.disallow_positional_args` (should use Py3 `*, ` instead)
  * tfds.features can now be saved/loaded; you may have to override [FeatureConnector.from_json_content](https://www.tensorflow.org/datasets/api_docs/python/tfds/features/FeatureConnector?version=nightly#from_json_content) and `FeatureConnector.to_json_content` to support this feature.
  * Stop testing against TF 1.15. Requires Python 3.6.8+.
  
  **Other bug fixes:**
  
  * Better archive extension detection for `dl_manager.download_and_extract`
  * Fix `tfds.__version__` in TFDS nightly to be PEP440 compliant
  * Fix crash when GCS not available
  * Script to detect dead-urls
  * Improved open-source workflow, contributor guide, documentation
  * Many other internal cleanups, bugs, dead code removal, py2->py3 cleanup, pytype annotations,...
  
  And of course, new datasets and dataset updates.
  
  A gigantic thanks to our community which has helped us debugging issues and with the implementation of many features, especially vijayphoenix for being a major contributor.

3.2.1

* Fix an issue with GCS on Windows.

3.2.0

**Future breaking change:**
  
  * The `tfds.features.text` encoding API is deprecated. Please use [tensorflow_text](https://www.tensorflow.org/tutorials/tensorflow_text/intro) instead.
  
  **New features**
  
  API:
  
  * Add a `tfds.ImageFolder` and `tfds.TranslateFolder` to easily create custom datasets from your own data.
  * Add a `tfds.ReadConfig(input_context=)` to shard dataset, for better multi-worker compatibility (1426).
  * The default `data_dir` can be controlled by the `TFDS_DATA_DIR` environment variable.
  * Better usability when developing datasets outside TFDS:
    * Downloads are always cached
    * Checksums are optional
  * Added a `tfds.show_statistics(ds_info)` to display the [FACETS OVERVIEW](https://pair-code.github.io/facets/). Note: this requires the dataset to have been generated with the statistics.
  * Open source various scripts to help deployment/documentation (Generate catalog documentation, export all metadata files,...)
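The `TFDS_DATA_DIR` lookup can be sketched as an environment-variable override of the conventional default directory (the helper and default path are illustrative, not the exact TFDS resolution logic):

```python
import os

def default_data_dir() -> str:
    # TFDS_DATA_DIR, when set, overrides the default data directory
    # (conventionally ~/tensorflow_datasets).
    return os.environ.get("TFDS_DATA_DIR") or os.path.join(
        os.path.expanduser("~"), "tensorflow_datasets"
    )

os.environ["TFDS_DATA_DIR"] = "/tmp/my_tfds_data"
assert default_data_dir() == "/tmp/my_tfds_data"
```

Setting the variable once in the shell (`export TFDS_DATA_DIR=...`) then applies to every `tfds.load` call that doesn't pass an explicit `data_dir`.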
  
  Documentation:
  
  * Catalog display images ([example](https://www.tensorflow.org/datasets/catalog/sun397#sun397standard-part2-120k))
  * Catalog shows which datasets have been recently added and are only available in `tfds-nightly`
  
  Breaking compatibility change:
  
  * Fix deterministic example order on Windows when a path was used as key (this only impacts a few datasets). Example order should now be the same on all platforms.
  * Remove `tfds.load('image_label_folder')` in favor of the more user-friendly `tfds.ImageFolder`
  
  Other:
  
  * Various performance improvements for both generation and reading (e.g. use `__slots__`, fix a parallelisation bug in `tf.data.TFRecordReader`,...)
  * Various fixes (typo, types annotations, better error messages, fixing dead links, better windows compatibility,...)
  
  Thanks to all our contributors who help improve the state of datasets for the entire research community!

3.1.0

**Breaking compatibility changes:**
  
  * Rename `tfds.core.NamedSplit`, `tfds.core.SplitBase` -> `tfds.Split`. Now `tfds.Split.TRAIN`,... are instances of `tfds.Split`
  * Remove deprecated `num_shards` argument from `tfds.core.SplitGenerator`. This argument was ignored as shards are automatically computed.
  
  **Future breaking compatibility changes:**
  
  * Rename `interleave_parallel_reads` -> `interleave_cycle_length` for `tfds.ReadConfig`.
  * Invert the ds, ds_info argument order for `tfds.show_examples`
  * The `tfds.features.text` encoding API is deprecated. Please use `tensorflow_text` instead.
  
  Other changes:
  
  * Testing: Add support for custom decoders in `tfds.testing.mock_data`
  * Documentation: shows which datasets are only present in `tfds-nightly`
  * Documentation: display images for supported datasets
  * API: Add `tfds.builder_cls(name)` to access a DatasetBuilder class by name
  • API: Add `info.splits['train'].filenames` for access to the tf-record files.
  * API: Add `tfds.core.add_data_dir` to register an additional data dir
  • Remove most `ds.with_options` which were applied by TFDS. Now uses the tf.data defaults.
  * Other bug fixes and improvement (Better error messages, windows compatibility,...)
  
  Thank you all for your contributions, and helping us make TFDS better for everyone!

3.0.0

Breaking changes:
  
  * Legacy mode `tfds.experiment.S3` has been removed
  * New `image_classification` section. Some datasets have been moved there from `images`.
  * `in_memory` argument has been removed from `as_dataset`/`tfds.load` (small datasets are now auto-cached).
  * `DownloadConfig` does not append the dataset name anymore (manual data should be in `<manual_dir>/` instead of `<manual_dir>/<dataset_name>/`)
  * Tests now check that all `dl_manager.download` URLs have registered checksums. To opt-out, add `SKIP_CHECKSUMS = True` to your `DatasetBuilderTestCase`.
  * `tfds.load` now always returns `tf.compat.v2.Dataset`. If you're still using `tf.compat.v1`:
    * Use `tf.compat.v1.data.make_one_shot_iterator(ds)` rather than `ds.make_one_shot_iterator()`
    * Use `isinstance(ds, tf.compat.v2.Dataset)` instead of `isinstance(ds, tf.data.Dataset)`
  * `tfds.Split.ALL` has been removed from the API.
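The new checksum test amounts to verifying that every URL passed to `dl_manager.download` appears in the registered checksums. A simplified sketch of that check (not the actual test code):

```python
def check_urls_have_checksums(downloaded_urls, registered_checksums):
    # Every downloaded URL must have a registered checksum, unless the
    # test opts out with SKIP_CHECKSUMS = True.
    missing = sorted(set(downloaded_urls) - set(registered_checksums))
    if missing:
        raise AssertionError(f"URLs missing registered checksums: {missing}")

check_urls_have_checksums(
    ["https://example.com/data.zip"],
    {"https://example.com/data.zip": "sha256:..."},
)
```

Registered checksums let TFDS detect corrupted or silently-changed downloads, which is why the test enforces them by default.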
  
  Future breaking change:
  
  * The `tfds.features.text` encoding API is deprecated. Please use [tensorflow_text](https://www.tensorflow.org/tutorials/tensorflow_text/intro) instead.
  * `num_shards` argument of `tfds.core.SplitGenerator` is currently ignored and will be removed in the next version.
  
  Features:
  
  * `DownloadManager` is now picklable (can be used inside Beam pipelines)
  * `tfds.features.Audio`:
    * Supports float as returned value
    * Exposes sample_rate through `info.features['audio'].sample_rate`
    * Supports encoding audio features from file objects
  * Various bug fixes, better error messages, documentation improvements
  * More datasets
  
  Thank you to all our contributors for helping us make TFDS better for everyone!

2.1.0

New features:
  
  * Datasets expose `info.dataset_size` and `info.download_size`. Datasets generated with 2.1.0 cannot be loaded with previous versions (previous datasets can still be read with `2.1.0`, however).
  * [Auto-caching small datasets](https://www.tensorflow.org/datasets/performances#auto-caching). `in_memory` argument is deprecated and will be removed in a future version.
  * Datasets expose their cardinality `num_examples = tf.data.experimental.cardinality(ds)` (Requires tf-nightly or TF >= 2.2.0)
  * Get the number of examples in a sub-split with: `info.splits['train[70%:]'].num_examples`
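Sub-split sizes follow directly from the percent boundaries. A sketch of the arithmetic (the boundary rounding here is illustrative; TFDS defines its own rounding rules):

```python
def subsplit_num_examples(total: int, start_pct: int, end_pct: int) -> int:
    # Number of examples falling in split[start_pct%:end_pct%].
    start = round(total * start_pct / 100)
    end = round(total * end_pct / 100)
    return end - start

# e.g. info.splits['train[70%:]'].num_examples for a 1000-example train split
assert subsplit_num_examples(1000, 70, 100) == 300
```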

2.0.0

* This is the last version of TFDS that will support Python 2. Going forward, we'll only support and test against Python 3.
  * The default versions of all datasets are now using the S3 slicing API. See the [guide](https://www.tensorflow.org/datasets/splits) for details.
  * The previous split API is still available, but is deprecated. If you wrote `DatasetBuilder`s outside the TFDS repository, please make sure they do not use `experiments={tfds.core.Experiment.S3: False}`. This will be removed in the next version, as well as the `num_shards` kwargs from `SplitGenerator`.
  * Several new datasets. Thanks to all the [contributors](https://github.com/tensorflow/datasets/graphs/contributors)!
  * API changes and new features:
  * `shuffle_files` defaults to False so that dataset iteration is deterministic by default. You can customize the reading pipeline, including shuffling and interleaving, through the new `read_config` parameter in [`tfds.load`](https://www.tensorflow.org/datasets/api_docs/python/tfds/load).
  * `urls` kwarg renamed to `homepage` in `DatasetInfo`
  * Support for nested `tfds.features.Sequence` and `tf.RaggedTensor`
  * Custom `FeatureConnector`s can override the `decode_batch_example` method for efficient decoding when wrapped inside a `tfds.features.Sequence(my_connector)`
  * Declaring a dataset in Colab won't register it, which allows re-running the cell without having to change the name
  * Beam datasets can use a `tfds.core.BeamMetadataDict` to store additional metadata computed as part of the Beam pipeline.
  * Beam datasets' `_split_generators` accepts an additional `pipeline` kwarg to define a pipeline shared between all splits.
  * Various other bug fixes and performance improvements. Thank you for all the reports and fixes!

1.13.3

Dataset changes
  
  - Fix: Adapt all audio datasets 3081 (patrickvonplaten)
  
  Bug fixes
  
  - Update BibTeX entry 3090 (albertvillanova)
  - Use template column_mapping to transmit_format instead of template features 3088 (mariosasko)
  - Fix Audio feature mp3 resampling 3096 (albertvillanova)

1.13.2

Bug fixes
  
  - Fix error related to huggingface_hub timeout parameter 3082 (albertvillanova)
  - Remove _resampler from Audio fields 3086 (albertvillanova)

1.13.1

Bug fixes
  
  - Fix loading a metric with internal import 3077 (albertvillanova)

1.13.0

Dataset changes
  
  - New: CaSiNo 2867 (kushalchawla)
  - New: Mostly Basic Python Problems 2893 (lvwerra)
  - New: OpenAI's HumanEval 2897 (lvwerra)
  - New: SemEval-2018 Task 1: Affect in Tweets 2745 (maxpel)
  - New: SEDE 2942 (Hazoom)
  - New: Jigsaw unintended Bias 2935 (Iwontbecreative)
  - New: AMI 2853 (cahya-wirawan)
  - New: Math Aptitude Test of Heuristics 2982 3014 (hacobe, albertvillanova)
  - New: SwissJudgmentPrediction 2983 (JoelNiklaus)
  - New: KanHope 2985 (adeepH)
  - New: CommonLanguage 2989 3006 3003 (anton-l, albertvillanova, jimregan)
  - New: SwedMedNER 2940 (bwang482)
  - New: SberQuAD 3039 (Alenush)
  - New: LexGLUE: A Benchmark Dataset for Legal Language Understanding in English 3004 (iliaschalkidis)
  - New: Greek Legal Code 2966 (christospi)
  - New: Story Cloze Test 3067 (zaidalyafeai)
  - Update: SUPERB - add IC, SI, ER tasks 2884 3009 (anton-l, albertvillanova)
  - Update: MENYO-20k - repo has moved, updating URL 2939 (cdleong)
  - Update: TriviaQA - add web and wiki config 2949 (shirte)
  - Update: nq_open - Use standard open-domain validation split 3029 (craffel)
  - Update: MeDAL - Add further description and update download URL 3022 (xhlulu)
  - Update: Biosses - fix column names 3054 (bwang482)
  - Fix: scitldr - fix minor URL format 2948 (albertvillanova)
  - Fix: masakhaner - update JSON metadata 2973 (albertvillanova)
  - Fix: TriviaQA - fix unfiltered subset 2995 (lhoestq)
  - Fix: TriviaQA - set writer batch size 2999 (lhoestq)
  - Fix: LJ Speech - fix Windows paths 3016 (albertvillanova)
  - Fix: MedDialog - update metadata JSON 3046 (albertvillanova)
  
  Metric changes
  
  - Update: meteor - update from nltk update 2946 (lhoestq)
  - Update: accuracy, f1, glue, indic-glue, pearsonr, precision, recall, super_glue - Replace item with float in metrics 3012 3001 (albertvillanova, mariosasko)
  - Fix: f1/precision/recall metrics with None average 3008 2992 (albertvillanova)
  - Fix meteor metric for version >= 3.6.4 3056 (albertvillanova)
  
  Dataset features
  
  - Use with TensorFlow:
    - Adding `to_tf_dataset` method 2731 2931 2951 2974 (Rocketknight1)
  - Better support for ZIP files:
    - Support loading dataset from multiple zipped CSV data files 3021 (albertvillanova)
    - Load private data files + use glob on ZIP archives for json/csv/etc. module inference 3041 (lhoestq)
  - Streaming improvements:
    - Extend support for streaming datasets that use glob.glob 3015 (albertvillanova)
    - Add `remove_columns` to `IterableDataset` 3030 (cccntu)
    - All the above ZIP features also work in streaming mode
  - New utilities:
    - Add `get_dataset_split_names()` to get a dataset config's split names 2906 (severo)
  - Replace script_version with revision 2933 (albertvillanova):
    - The `script_version` parameter in `load_dataset` is now deprecated, in favor of `revision`
  - Experimental - Create Audio feature type 2324 (albertvillanova):
    - It allows automatically decoding audio data (mp3, wav, flac, etc.) when examples are accessed
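`remove_columns` on an `IterableDataset` drops fields on the fly as examples stream by. The core idea can be sketched with a plain generator (illustrative, not the library's implementation):

```python
def remove_columns(examples, columns):
    # Lazily drop the given columns from each streamed example dict.
    columns = set(columns)
    for example in examples:
        yield {k: v for k, v in example.items() if k not in columns}

stream = iter([{"id": 0, "text": "hello", "meta": "x"},
               {"id": 1, "text": "world", "meta": "y"}])
cleaned = list(remove_columns(stream, ["meta"]))
assert cleaned == [{"id": 0, "text": "hello"}, {"id": 1, "text": "world"}]
```

Because the transformation is lazy, nothing is materialized until the stream is consumed, which keeps memory usage flat in streaming mode.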
  
  Dataset cards
  
  - Add arxiv paper in swiss_judgment_prediction dataset card 3026 (JoelNiklaus)
  
  Documentation
  
  - Add tutorial for no-code dataset upload 2925 (stevhliu)
  
  General improvements and bug fixes
  
  - Fix filter leaking 3019 (lhoestq):
    - calling `filter` several times in a row was not returning the right results in 1.12.0 and 1.12.1
  - Update BibTeX entry 2928 (albertvillanova)
  - Fix exception chaining 2911 (albertvillanova)
  - Add regression test for null Sequence 2929 (albertvillanova)
  - Don't use old, incompatible cache for the new `filter` 2947 (lhoestq)
  - Fix fn kwargs in filter 2950 (lhoestq)
  - Use pyarrow.Table.replace_schema_metadata instead of pyarrow.Table.cast 2895 (arsarabi)
  - Check that array is not Float as nan != nan 2936 (Iwontbecreative)
  - Fix missing conda deps 2952 (lhoestq)
  - Update legacy Python image for CI tests in Linux 2955 (albertvillanova)
  - Support pandas 1.3 new `read_csv` parameters 2960 (SBrandeis)
  - Fix CI doc build 2961 (albertvillanova)
  - Run tests in parallel 2954 (albertvillanova)
  - Ignore dummy folder and dataset_infos.json 2975 (Ishan-Kumar2)
  - Take namespace into account in caching 2938 (lhoestq)
  - Make Dataset.map accept list of np.array 2990 (albertvillanova)
  - Fix loading compressed CSV without streaming 2994 (albertvillanova)
  - Fix json loader when conversion not implemented 3000 (lhoestq)
  - Remove all query parameters when extracting protocol 2996 (albertvillanova)
  - Correct a typo 3007 (Yann21)
  - Fix Windows test suite 3025 (albertvillanova)
  - Remove unused parameter in xdirname 3017 (albertvillanova)
  - Properly install ruamel-yaml for windows CI 3028 (lhoestq)
  - Fix typo 3023 (qqaatw)
  - Actual "proper" install of ruamel.yaml in the windows CI 3033 (lhoestq)
  - Use cache folder for lockfile 2887 (Dref360)
  - Fix streaming: catch Timeout error 3050 (borisdayma)
  - Refac module factory + avoid etag requests for hub datasets 2986 (lhoestq)
  - Fix task reloading from cache 3059 (lhoestq)
  - Fix test command after refac 3065 (lhoestq)
  - Fix Windows CI with FileNotFoundError when setting up s3_base fixture 3070 (albertvillanova)
  - Update summary on PyPi beyond NLP 3062 (thomwolf)
  - Remove a reference to the open Arrow file when deleting a TF dataset created with to_tf_dataset 3002 (mariosasko)
  - feat: increase streaming retry config 3068 (borisdayma)
  - Fix pathlib patches for streaming 3072 (lhoestq)
  
  Breaking changes:
  
  - Due to the big refactoring at 2986, the `prepare_module` function doesn't support the `return_resolved_file_path` and `return_associated_base_path` parameters anymore. As an alternative, you may use the `dataset_module_factory` instead.

1.12.1

Bug fixes
  
  - Fix fsspec AbstractFileSystem access 2915 (pierre-godard)
  - Fix unwanted tqdm bar when accessing examples 2920 (lhoestq)
  - Fix conversion of multidim arrays in list to arrow 2922 (lhoestq):
    - this fixes the `ArrowInvalid: Can only convert 1-dimensional array values` errors

1.12.0

New documentation
  - New documentation structure 2718 (stevhliu):
    - New: Tutorials
    - New: How-to guides
    - New: Conceptual guides
    - Update: Reference
  
  See the new documentation [here](https://huggingface.co/docs/datasets/)!
  
  Datasets changes
  - New: VIVOS dataset for Vietnamese ASR 2780 (binh234)
  - New: The Pile books3 2801 (richarddwang)
  - New: The Pile stack exchange 2803 (richarddwang)
  - New: The Pile openwebtext2 2802 (richarddwang)
  - New: Food-101 2804 (nateraw)
  - New: Beans 2809 (nateraw)
  - New: cedr 2796 (naumov-al)
  - New: cats_vs_dogs 2807 (nateraw)
  - New: MultiEURLEX 2865 (iliaschalkidis)
  - New: BIOSSES 2881 (bwang482)
  - Update: TTC4900 - add download URL 2732 (yavuzKomecoglu)
  - Update: Wikihow - Generate metadata JSON for wikihow dataset 2748 (albertvillanova)
  - Update: lm1b - Generate metadata JSON 2752 (albertvillanova)
  - Update: reclor - Generate metadata JSON 2753 (albertvillanova)
  - Update: telugu_books - Generate metadata JSON 2754 (albertvillanova)
  - Update: SUPERB - Add SD task 2661 (albertvillanova)
  - Update: SUPERB - Add KS task 2783 (anton-l)
  - Update: GooAQ - add train/val/test splits 2792 (bhavitvyamalik)
  - Update: Openwebtext - update size 2857 (lhoestq)
  - Update: timit_asr - make the dataset streamable 2835 (lhoestq)
  - Fix: journalists_questions - fix key by recreating metadata JSON 2744 (albertvillanova)
  - Fix: turkish_movie_sentiment - fix metadata JSON 2755 (albertvillanova)
  - Fix: ubuntu_dialogs_corpus - fix metadata JSON 2756 (albertvillanova)
  - Fix: CNN/DailyMail - typo 2791 (omaralsayed)
  - Fix: linnaeus - fix url 2852 (lhoestq)
  - Fix ToTTo - fix data URL 2864 (albertvillanova)
  - Fix: wikicorpus - fix keys 2844 (lhoestq)
  - Fix: COUNTER - fix bad file name 2894 (albertvillanova)
  - Fix: DocRED - fix data URLs and metadata  2883 (albertvillanova)
  
  Datasets features
  - Load Dataset from the Hub (NO DATASET SCRIPT) 2662 (lhoestq)
  - Preserve dtype for numpy/torch/tf/jax arrays 2361 (bhavitvyamalik)
  - add multi-proc in `to_json` 2747 (bhavitvyamalik)
  - Optimize Dataset.filter to only compute the indices to keep 2836 (lhoestq)
  
  Dataset streaming - better support for compression:
  - Fix streaming zip files 2798 (albertvillanova)
  - Support streaming tar files 2800 (albertvillanova)
  - Support streaming compressed files (gzip, bz2, lz4, xz, zst) 2786 (albertvillanova)
  - Fix streaming zip files from canonical datasets 2805 (albertvillanova)
  - Add url prefix convention for many compression formats 2822 (lhoestq)
  - Support streaming datasets that use pathlib 2874 (albertvillanova)
  - Extend support for streaming datasets that use pathlib.Path stem/suffix 2880 (albertvillanova)
  - Extend support for streaming datasets that use pathlib.Path.glob 2876 (albertvillanova)
  
  Metrics changes
  - Update: BERTScore - Add support for fast tokenizer 2770 (mariosasko)
  - Fix: Sacrebleu - Fix sacrebleu tokenizers 2739 2778 2779 (albertvillanova)
  
  Dataset cards
  - Updated dataset description of DaNE 2789 (KennethEnevoldsen)
  - Update ELI5 README.md 2848 (odellus)
  
  General improvements and bug fixes
  - Update release instructions 2740 (albertvillanova)
  - Raise ManualDownloadError when loading a dataset that requires previous manual download 2758 (albertvillanova)
  - Allow PyArrow from source 2769 (patrickvonplaten)
  - fix typo (ShuffingConfig -> ShufflingConfig) 2766 (daleevans)
  - Fix typo in test_dataset_common 2790 (nateraw)
  - Fix type hint for data_files 2793 (albertvillanova)
  - Bump tqdm version 2814 (mariosasko)
  - Use packaging to handle versions 2777 (albertvillanova)
  - Tiny typo fixes of "fo" -> "of" 2815 (aronszanto)
  - Rename The Pile subsets 2817 (lhoestq)
  - Fix IndexError by ignoring empty RecordBatch 2834 (lhoestq)
  - Fix defaults in cache_dir docstring in load.py 2824 (mariosasko)
  - Fix extraction protocol inference from urls with params 2843 (lhoestq)
  - Fix caching when moving script 2854 (lhoestq)
  - Fix windows CI CondaError 2855 (lhoestq)
  - fix: 🐛 remove URL's query string only if it's ?dl=1 2856 (severo)
  - Update `column_names` showed as `:func:` in exploring.st 2851 (ClementRomac)
  - Fix s3fs version in CI 2858 (lhoestq)
  - Fix three typos in two files for documentation 2870 (leny-mi)
  - Move checks from _map_single to map 2660 (mariosasko)
  - fix regex to accept negative timezone 2847 (jadermcs)
  - Prevent .map from using multiprocessing when loading from cache 2774 (thomasw21)
  - Fix null sequence encoding 2900 (lhoestq)
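Several of the fixes above concern inferring the extraction protocol from a URL while ignoring query parameters (e.g. the `?dl=1` case). A sketch of that inference (the extension-to-protocol mapping is illustrative, not the library's exact table):

```python
from urllib.parse import urlparse

COMPRESSION_EXTENSIONS = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip",
                          ".xz": "xz", ".zst": "zstd", ".lz4": "lz4"}

def infer_extraction_protocol(url):
    # Strip query params first, then match on the path's extension, so
    # "https://host/data.json.gz?dl=1" still resolves to "gzip".
    path = urlparse(url).path
    for ext, protocol in COMPRESSION_EXTENSIONS.items():
        if path.endswith(ext):
            return protocol
    return None

assert infer_extraction_protocol("https://host/data.json.gz?dl=1") == "gzip"
assert infer_extraction_protocol("https://host/data.csv") is None
```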

1.11.0

Datasets Changes
  - New: Add Russian SuperGLUE 2668 (slowwavesleep)
  - New: Add Disfl-QA 2473 (bhavitvyamalik)
  - New: Add TimeDial 2476 (bhavitvyamalik)
  - Fix: Enumerate all ner_tags values in WNUT 17 dataset 2713 (albertvillanova)
  - Fix: Update WikiANN data URL 2710 (albertvillanova)
  - Fix: Update PAN-X data URL in XTREME dataset 2715 (albertvillanova)
  - Fix: C4 - en subset by modifying dataset_info with correct validation infos 2723 (thomasw21)
  
  General improvements and bug fixes
  - fix: 🐛 change string format to allow copy/paste to work in bash 2694 (severo)
  - Update BibTeX entry 2706 (albertvillanova)
  - Print absolute local paths in load_dataset error messages 2684 (mariosasko)
  - Add support for disable_progress_bar on Windows 2696 (mariosasko)
  - Ignore empty batch when writing 2698 (pcuenca)
  - Fix shuffle on IterableDataset that disables batching in case any functions were mapped 2717 (amankhandelia)
  - fix: 🐛 fix two typos 2720 (severo)
  - Docs details 2690 (severo)
  - Deal with the bad check in test_load.py 2721 (mariosasko)
  - Pass use_auth_token to request_etags 2725 (albertvillanova)
  - Typo fix `tokenize_exemple` 2726 (shabie)
  - Fix IndexError while loading Arabic Billion Words dataset 2729 (albertvillanova)
  - Add missing parquet known extension 2733 (lhoestq)

1.10.2

The error message to tell which dataset config name to load was not displayed:
  - Fix pick default config name message 2704 (lhoestq)
  
  Docstrings:
  - Fix download_mode docstrings 2701 (albertvillanova)

1.10.1

- Fix minimum tqdm version and import on Colab 2697 (nateraw)
  - Fix OSCAR Esperanto 2693 (lhoestq)

1.10.0

Datasets Features
  - Support remote data files 2616 (albertvillanova)
  This allows passing URLs of remote data files to any dataset loader:

  ```python
  load_dataset("csv", data_files={"train": [url_to_one_csv_file, url_to_another_csv_file...]})
  ```
  
  This works for all these dataset loaders:
  - text
  - csv
  - json
  - parquet
  - pandas
  - Streaming from remote text/json/csv/parquet/pandas files:
  When you pass URLs to a dataset loader, you can enable streaming mode with `streaming=True`. Main contributions:
  - Streaming for the Pandas loader 2636 (lhoestq)
  - Streaming for the CSV loader 2635 (lhoestq)
  - Streaming for the Json loader 2608 (albertvillanova) 2638 (lhoestq)
  - Faster search_batch for ElasticsearchIndex due to threading 2581 (mwrzalik)
  - Delete extracted files when loading dataset 2631 (albertvillanova)
  
  Datasets Changes
  - Fix: C4 - fix expected files list 2682 (lhoestq)
  - Fix: SQuAD - fix misalignment 2586 (albertvillanova)
  - Fix: omp - fix DuplicatedKeysError 2603 (albertvillanova)
  - Fix: wi_locness - potential DuplicatedKeysError 2609 (albertvillanova)
  - Fix: LibriSpeech - potential DuplicatedKeysError 2672 (albertvillanova)
  - Fix: SQuAD - potential DuplicatedKeysError 2673 (albertvillanova)
  - Fix: Blog Authorship Corpus - fix split sizes and text encoding 2685 (albertvillanova)
  
  Dataset Tasks
  - Add speech processing tasks 2620 (lewtun)
  - Update ASR tags 2633 (lewtun)
  - Inject ASR template for lj_speech dataset 2634 (albertvillanova)
  - Add ASR task for SUPERB 2619 (lewtun)
  - add image-classification task template 2632 (nateraw)
  
  Metrics Changes
  - New: wiki_split 2623 (bhadreshpsavani)
  - Update: accuracy,f1,precision,recall - Support multilabel metrics 2589 (albertvillanova)
  - Fix: sacrebleu - fix parameter name 2674 (albertvillanova)
  
  General improvements and bug fixes
  - Fix BibTeX entry 2594 (albertvillanova)
  - Fix test_is_small_dataset 2588 (albertvillanova)
  - Remove import of transformers 2602 (albertvillanova)
  - Make any ClientError trigger retry in streaming mode (e.g. ClientOSError) 2605 (lhoestq)
  - Fix `filter` with multiprocessing in case all samples are discarded 2601 (mxschmdt)
  - Remove redundant prepare_module 2597 (albertvillanova)
  - Create ExtractManager 2295 (albertvillanova)
  - Return Python float instead of numpy.float64 in sklearn metrics 2612 (lewtun)
  - Use ndarray.item instead of ndarray.tolist 2613 (lewtun)
  - Convert numpy scalar to python float in Pearsonr output 2614 (lhoestq)
  - Fix missing EOL issue in to_json for old versions of pandas 2617 (lhoestq)
  - Use correct logger in metrics.py 2626 (mariosasko)
  - Minor fix tests with Windows paths 2627 (albertvillanova)
  - Use ETag of remote data files 2628 (albertvillanova)
  - More consistent naming 2611 (mariosasko)
  - Refactor patching to specific submodule 2639 (albertvillanova)
  - Fix docstrings 2640 (albertvillanova)
  - Fix anchor in README 2647 (mariosasko)
  - Fix logging docstring 2652 (mariosasko)
  - Allow dataset config kwargs to be None 2659 (lhoestq)
  - Use prefix to allow exceeding Windows MAX_PATH 2621 (albertvillanova)
  - Use tqdm from tqdm_utils 2667 (mariosasko)
  - Increase json reader block_size automatically 2676 (lhoestq)
  - Parallelize ETag requests 2675 (lhoestq)
  - Fix bad config ids that name cache directories 2686 (lhoestq)
  - Minor documentation fix 2687 (slowwavesleep)
  
  Dataset Cards
  - Add missing WikiANN language tags 2610 (albertvillanova)
  - feat: 🎸 add paperswithcode id for qasper dataset 2680 (severo)
  
  Docs
  - Update processing.rst with other export formats 2599 (TevenLeScao)

1.9.0

Datasets Changes
  - New: C4 2575 2592 (lhoestq)
  - New: mC4 2576 (lhoestq)
  - New: MasakhaNER 2465 (dadelani)
  - New: Eduge 2492 (enod)
  - Update: xor_tydi_qa - update version 2455 (cccntu)
  - Update: kilt-TriviaQA - original answers  2410 (PaulLerner)
  - Update: udpos - change features structure 2466 (jerryIsHere)
  - Update: WebNLG - update checksums 2558 (lhoestq)
  - Fix: climate fever - adjusting indexing for the labels. 2464 (drugilsberg)
  - Fix: proto_qa - fix download link 2463 (mariosasko)
  - Fix: ProductReviews - fix label parsing 2530 (yavuzKomecoglu)
  - Fix: DROP - fix DuplicatedKeysError 2545 (albertvillanova)
  - Fix: code_search_net - fix keys 2555 (lhoestq)
  - Fix: discofuse - fix link cc 2541 (VictorSanh)
  - Fix: fever - fix keys 2557 (lhoestq)
  
  Datasets Features
  - Dataset Streaming 2375 2582 (lhoestq)
  - Download and process your data on the fly while iterating over your dataset
  - Works with huge datasets like OSCAR, C4, mC4 and hundreds of other datasets
  - JAX integration 2502 (lhoestq)
  - Add Parquet loader + from_parquet and to_parquet 2537 (lhoestq)
  - Implement ClassLabel encoding in JSON loader 2468 (albertvillanova)
  - Set configurable downloaded datasets path 2488 (albertvillanova)
  - Set configurable extracted datasets path 2487 (albertvillanova)
  - Add align_labels_with_mapping function 2457 (lewtun) 2510 (lhoestq)
  - Add interleave_datasets for map-style datasets 2568 (lhoestq)
  - Add load_dataset_builder 2500 (mariosasko)
  - Support Zstandard compressed files 2578 (albertvillanova)
  
  Task templates
  - Add task templates for tydiqa and xquad 2518 (lewtun)
  - Insert text classification template for Emotion dataset 2521 (lewtun)
  - Add summarization template 2529 (lewtun)
  - Add task template for automatic speech recognition 2533 (lewtun)
  - Remove task templates if required features are removed during `Dataset.map` 2540 (lewtun)
  - Inject templates for ASR datasets 2565 (lewtun)
  
  General improvements and bug fixes
  - Allow to use tqdm>=4.50.0 2482 (lhoestq)
  - Use gc.collect only when needed to avoid slow downs 2483 (lhoestq)
  - Allow latest pyarrow version 2490 (albertvillanova)
  - Use default cast for sliced list arrays if pyarrow >= 4 2497 (albertvillanova)
  - Add Zenodo metadata file with license 2501 (albertvillanova)
  - add tensorflow-macos support 2493 (slayerjain)
  - Keep original features order 2453 (albertvillanova)
  - Add course banner 2506 (sgugger)
  - Rearrange JSON field names to match passed features schema field names 2507 (albertvillanova)
  - Fix typo in MatthewsCorrelation class name 2517 (albertvillanova)
  - Use scikit-learn package rather than sklearn in setup.py 2525 (lesteve)
  - Improve performance of pandas arrow extractor 2519 (albertvillanova)
  - Fix fingerprint when moving cache dir 2509 (lhoestq)
  - Replace bad `n>1M` size tag 2527 (lhoestq)
  - Fix dev version 2531 (lhoestq)
  - Sync with transformers disabling NOTSET 2534 (albertvillanova)
  - Fix logging levels 2544 (albertvillanova)
  - Add support for Split.ALL 2259 (mariosasko)
  - Raise FileNotFoundError in WindowsFileLock 2524 (mariosasko)
  - Make numpy arrow extractor faster 2505 (lhoestq)
  - fix Dataset.map when num_procs > num rows 2566 (connor-mccarthy)
  - Add ASR task and new languages to resources 2567 (lewtun)
  - Filter expected warning log from transformers 2571 (albertvillanova)
  - Fix BibTeX entry 2579 (albertvillanova)
  - Fix Counter import 2580 (albertvillanova)
  - Add aiohttp to tests extras require 2587 (albertvillanova)
  - Add language tags 2590 (lewtun)
  - Support pandas 1.3.0 read_csv 2593 (lhoestq)
  
  Dataset cards
  - Updated Dataset Description 2420 (binny-mathew)
  - Update DatasetMetadata and ReadMe 2436 (gchhablani)
  - CRD3 dataset card 2515 (wilsonyhlee)
  - Add license to the Cambridge English Write & Improve + LOCNESS dataset card 2546 (lhoestq)
  - wi_locness: reference latest leaderboard on codalab 2584 (aseifert)
  
  Docs
  - no s at load_datasets  2479 (julien-c)
  - Fix docs custom stable version 2477 (albertvillanova)
  - Improve Features docs 2535 (albertvillanova)
  - Update README.md 2414 (cryoff)
  - Fix FileSystems documentation 2551 (connor-mccarthy)
  - Minor fix in loading metrics docs 2562 (albertvillanova)
  - Minor fix docs format for bertscore 2570 (albertvillanova)
  - Add streaming in load a dataset docs 2574 (lhoestq)

1.8.0

Datasets Changes
  - New: Microsoft CodeXGlue Datasets 2357 (madlag ncoop57)
  - New: KLUE benchmark 2416 (jungwhank)
  - New: HendrycksTest 2370 (andyzoujm)
  - Update: xor_tydi_qa - update url to v1.1 2449 (cccntu)
  - Fix: adversarial_qa - DuplicatedKeysError 2433 (mariosasko)
  - Fix: bn_hate_speech and covid_tweets_japanese - fix broken URLs for  2445 (lewtun)
  - Fix: flores - fix download link 2448 (mariosasko)
  
  Datasets Features
  - Add `desc` parameter in `map` for `DatasetDict` object 2423 (bhavitvyamalik)
  - Support sliced list arrays in cast 2461 (lhoestq)
  - `Dataset.cast` can now change the feature types of Sequence fields
  - Revert default in-memory for small datasets 2460 (albertvillanova). Breaking:
  - the default IN_MEMORY_MAX_SIZE used to be 250MB
  - it is now zero: by default datasets are **loaded from the disk** with memory mapping and **not copied in memory**
  - users can still set `keep_in_memory=True` when loading a dataset to load it in memory
  
  Datasets Cards
  - adds license information for DailyDialog. 2419 (aditya2211)
  - add english language tags for ~100 datasets 2442 (VictorSanh)
  - Add copyright info to MLSUM dataset 2427 (PhilipMay)
  - Add copyright info for wiki_lingua dataset 2428 (PhilipMay)
  - Mention that there are no answers in adversarial_qa test set 2451 (lhoestq)
  
  General improvements and bug fixes
  - Add DOI badge to README 2411 (albertvillanova)
  - Make datasets PEP-561 compliant 2417 (SBrandeis)
  - Fix save_to_disk nested features order in dataset_info.json 2422 (lhoestq)
  - Fix CI six installation on linux 2432 (lhoestq)
  - Fix Docstring Mistake: dataset vs. metric 2425 (PhilipMay)
  - Fix NQ features loading: reorder fields of features to match nested fields order in arrow data 2438 (lhoestq)
  - doc: fix typo HF_MAX_IN_MEMORY_DATASET_SIZE_IN_BYTES 2421 (borisdayma)
  - add utf-8 while reading README 2418 (bhavitvyamalik)
  - Better error message when trying to access elements of a DatasetDict without specifying the split 2439 (lhoestq)
  - Rename config and environment variable for in memory max size 2454 (albertvillanova)
  - Add version-specific BibTeX 2430 (albertvillanova)
  - Fix cross-reference typos in documentation 2456 (albertvillanova)
  - Better error message when using the wrong load_from_disk 2437 (lhoestq)
  
  Experimental and work in progress: Format a dataset for specific tasks
  - Update text classification template labels in DatasetInfo __post_init__ 2392 (lewtun)
  - Insert task templates for text classification 2389 (lewtun)
  - Rename QuestionAnswering template to QuestionAnsweringExtractive 2429 (lewtun)
  - Insert Extractive QA templates for SQuAD-like datasets 2435 (lewtun)

1.7.0

Dataset Changes
  - New: NLU evaluation data 2238 (dkajtoch)
  - New: Add SLR32, SLR52, SLR53 to OpenSLR 2241, 2311 (cahya-wirawan)
  - New: Bbaw egyptian 2290 (phiwi)
  - New: GooAQ 2260 (bhavitvyamalik)
  - New: SubjQA 2302 (lewtun)
  - New: Ascent KB 2341, 2349 (phongnt570)
  - New: HLGD 2325 (tingofurro)
  - New: Qasper 2346 (cceyda)
  - New: ConvQuestions benchmark 2372 (PhilippChr)
  - Update: Wikihow - Clarify how to load wikihow 2240 (albertvillanova)
  - Update multi_woz_v22 - update checksum 2281 (lhoestq)
  - Update: OSCAR - Set encoding in OSCAR dataset 2321 (albertvillanova)
  - Update: XTREME - Enable auto-download for PAN-X / Wikiann domain in XTREME 2326 (lewtun)
  - Update: GEM - the DART file checksums in GEM 2334 (yjernite)
  - Update: web_science - fixed download link 2338 (bhavitvyamalik)
  - Update: SNLI, MNLI- README updated for SNLI, MNLI 2364 (bhavitvyamalik)
  - Update: conll2003 - correct labels 2369 (philschmid)
  - Update: offenseval_dravidian - update citations 2385 (adeepH)
  - Update: ai2_arc - Add dataset tags 2405 (OyvindTafjord)
  - Fix: newsph_nli - test data added, dataset_infos updated 2263 (bhavitvyamalik)
  - Fix: hyperpartisan news detection - Remove getchildren 2367 (ghomasHudson)
  - Fix: indic_glue - Fix number of classes in indic_glue sna.bn dataset 2397 (albertvillanova)
  - Fix: head_qa - Fix keys 2408 (lhoestq)
  
  Dataset Features
  - Implement Dataset add_item 1870 (albertvillanova)
  - Implement Dataset add_column 2145 (albertvillanova)
  - Implement Dataset to JSON 2248, 2352 (albertvillanova)
  - Add `rename_columns` method 2312 (SBrandeis)
  - add `desc` to `tqdm` in `Dataset.map()` 2374 (bhavitvyamalik)
  - Add env variable HF_MAX_IN_MEMORY_DATASET_SIZE_IN_BYTES 2399, 2409 (albertvillanova)
  
  Metric Changes
  - New: CUAD metrics 2273 (bhavitvyamalik)
  - New: Matthews/Pearson/Spearman correlation metrics 2328 (lhoestq)
  - Update: CER - Docs, CER above 1 2342 (borisdayma)
  
  General improvements and bug fixes
  - Update black 2265 (lhoestq)
  - Fix incorrect update_metadata_with_features calls in ArrowDataset 2258 (mariosasko)
  - Faster map w/ input_columns & faster slicing w/ Iterable keys 2246 (norabelrose)
  - Don't use pyarrow 4.0.0 since it segfaults when casting a sliced ListArray of integers 2268 (lhoestq)
  - Fix query table with iterable 2269 (lhoestq)
  - Perform minor refactoring: use config 2253 (albertvillanova)
  - Update format, fingerprint and indices after add_item 2254 (lhoestq)
  - Always update metadata in arrow schema 2274 (lhoestq)
  - Make tests run faster 2266 (lhoestq)
  - Fix metadata validation with config names 2286 (lhoestq)
  - Fixed typo seperate->separate 2292 (laksh9950)
  - Allow collaborators to self-assign issues 2289 (albertvillanova)
  - Mapping in the distributed setting 2298 (TevenLeScao)
  - Fix conda release 2309 (lhoestq)
  - Fix incorrect version specification for the pyarrow package 2317 (cemilcengiz)
  - Set default name in init_dynamic_modules 2320 (albertvillanova)
  - Fix duplicate keys 2333 (lhoestq)
  - Add note about indices mapping in save_to_disk docstring 2332 (lhoestq)
  - Metadata validation 2107 (theo-m)
  - Add Validation For README 2121 (gchhablani)
  - Fix overflow issue in interpolation search 2336 (mariosasko)
  - Datasets cli improvements 2315 (mariosasko)
  - Add `key` type and duplicates verification with hashing 2245 (NikhilBartwal)
  - More consistent copy logic 2340 (mariosasko)
  - Update README validation rules 2353 (gchhablani)
  - normalized TOCs and titles in data cards 2355 (yjernite)
  - simplify faiss index save 2351 (Guitaricet)
  - Allow "other-X" in licenses 2368 (gchhablani)
  - Improve ReadInstruction logic and update docs 2261 (mariosasko)
  - Disallow duplicate keys in yaml tags 2379 (lhoestq)
  - maintain YAML structure reading from README 2380 (bhavitvyamalik)
  - add dataset card title 2381 (bhavitvyamalik)
  - Add tests for dataset cards 2348 (gchhablani)
  - Improve example in rounding docs 2383 (mariosasko)
  - Paperswithcode dataset mapping 2404 (julien-c)
  - Free datasets with cache file in temp dir on exit 2403 (mariosasko)
  
  Experimental and work in progress: Format a dataset for specific tasks
  - Task formatting for text classification & question answering 2255 (SBrandeis)
  - Add check for task templates on dataset load 2390 (lewtun)
  - Add args description to DatasetInfo 2384 (lewtun)
  - Improve task api code quality 2376 (mariosasko)

1.6.2

Fix memory issue: don't copy recordbatches in memory during a table deepcopy 2291 (lhoestq)
  This affected methods like `concatenate_datasets`, multiprocessed `map` and `load_from_disk`.
  
  Breaking change:
  - when using `Dataset.map` with the `input_columns` parameter, the resulting dataset will only have the columns from `input_columns` and the columns added by the map functions. The other columns are discarded.

1.6.1

Fix memory issue in multiprocessing: Don't pickle table index 2264 (lhoestq)

1.6.0

Dataset changes
  - New: MOROCO 2002 (MihaelaGaman)
  - New: CBT dataset 2044 (gchhablani)
  - New: MDD Dataset 2051 (gchhablani)
  - New: Multilingual dIalogAct benchMark (miam) 2047 (eusip)
  - New: bAbI QA tasks 2053 (gchhablani)
  - New: machine translated multilingual STS benchmark dataset 2090 (PhilipMay)
  - New: EURLEX legal NLP dataset 2114 (iliaschalkidis)
  - New: ECtHR legal NLP dataset 2114 (iliaschalkidis)
  - New: EU-REG-IR legal NLP dataset 2114 (iliaschalkidis)
  - New: NorNE dataset for Norwegian POS and NER 2154 (versae)
  - New: banking77 2140 (dkajtoch)
  - New: OpenSLR 2173 2215 2221 (cahya-wirawan)
  - New: CUAD dataset 2219 (bhavitvyamalik)
  - Update: Gem V1.1 + new challenge sets 2142 2186 (yjernite)
  - Update: Wikiann - added spans field 2141 (rabeehk)
  - Update: XTREME - Add tel to xtreme tatoeba 2180 (lhoestq)
  - Update: GLUE MRPC - added real label to test set 2216 (philschmid)
  - Fix: MultiWoz22 - fix dialogue action slot name and value 2136 (adamlin120)
  - Fix: wikiauto - fix link 2171 (mounicam)
  - Fix: wino_bias - use right splits 1930 (JieyuZhao)
  - Fix: lc_quad - update download checksum 2213 (mariosasko)
  - Fix: newsgroup - fix one instance of 'train' to 'test' 2225 (alexwdong)
  - Fix: xnli - fix tuple key 2233 (NikhilBartwal)
  
  Dataset features
  - Allow stateful function in dataset.map 1960 (mariosasko)
  - MIAM dataset - new citation details 2101 (eusip)
  - [Refactor] Use in-memory/memory-mapped/concatenation tables in Dataset 2025 (lhoestq)
  - Allow pickling of big in-memory tables 2150 (lhoestq)
  - updated user permissions based on umask 2086 2157 (bhavitvyamalik)
  - Fast table queries with interpolation search 2122 (lhoestq)
  - Concat only unique fields in DatasetInfo.from_merge 2163 (mariosasko)
  - Implementation of class_encode_column 2184 2227 (SBrandeis)
  - Add support for axis in concatenate datasets 2151 (albertvillanova)
  - Set default in-memory value depending on the dataset size 2182 (albertvillanova)
  
  Metrics changes
  - New: CER metric 2138 (chutaklee)
  - Update: WER - Compute metric iteratively 2111 (albertvillanova)
  - Update: seqeval - configurable options to `seqeval` metric 2204 (marrodion)
  
  Dataset cards
  - REFreSD: Updated card using information from data statement and datasheet 2082 (mcmillanmajora)
  - Winobiais: fix split infos 2152 (JieyuZhao)
  - all: Fix size categories in YAML Tags 2074 (gchhablani)
  - LinCE: Updating citation information on LinCE readme 2205 (gaguilar)
  - Swda: Update README.md 2235 (PierreColombo)
  
  General improvements and bug fixes
  - Refactorize Metric.compute signature to force keyword arguments only 2079 (albertvillanova)
  - Fix max_wait_time in requests 2085 (lhoestq)
  - Fix copy snippet in docs 2091 (mariosasko)
  - Fix deprecated warning message and docstring 2100 (albertvillanova)
  - Move Dataset.to_csv to csv module 2102 (albertvillanova)
  - Fix: Allows a feature to be named "_type" 2093 (dcfidalgo)
  - copy.deepcopy os.environ instead of copy 2119 (NihalHarish)
  - Replace legacy torch.Tensor constructor with torch.tensor 2126 (mariosasko)
  - Implement Dataset as context manager 2113 (albertvillanova)
  - Fix missing infos from concurrent dataset loading 2137 (lhoestq)
  - Pin fsspec lower than 0.9.0 2172 (lhoestq)
  - Replace assertTrue(isinstance with assertIsInstance in tests 2164 (mariosasko)
  - add social thumbnail 2177 (philschmid)
  - Fix s3fs tests for py36 and py37+ 2183 (lhoestq)
  - Fix typo in huggingface hub 2192 (LysandreJik)
  - Update metadata if dataset features are modified 2087 (mariosasko)
  - fix missing indices_files in load_from_disk 2197 (lhoestq)
  - Fix backward compatibility in Dataset.load_from_disk 2199 (albertvillanova)
  - Fix ArrowWriter overwriting features in ArrowBasedBuilder 2201 (lhoestq)
  - Fix incorrect assertion in builder.py 2110 (dreamgonfly)
  - Remove Python2 leftovers 2208 (mariosasko)
  - Revert breaking change in cache_files property 2217 (lhoestq)
  - Set test cache config 2223 (albertvillanova)
  - Fix map when removing columns on a formatted dataset 2231 (lhoestq)
  - Refactorize tests to use Dataset as context manager 2191 (albertvillanova)
  - Preserve split type when reloading dataset 2168 (mariosasko)
  
  
  Docs
  - make documentation more clear to use different cloud storage 2127 (philschmid)
  - Render docstring return type as inline 2147 (albertvillanova)
  - Add table classes to the documentation 2155 (lhoestq)
  - Pin docutils for better doc 2174 (sgugger)
  - Fix docstrings issues 2081 (albertvillanova)
  - Add code of conduct to the project 2209 (albertvillanova)
  - Add classes GenerateMode, DownloadConfig and Version to the documentation 2202 (albertvillanova)
  - Fix bash snippet formatting in ADD_NEW_DATASET.md 2234 (mariosasko)

1.5.0

Datasets changes
  - New: Europarl Bilingual 1874 (lucadiliello)
  - New: Stanford Sentiment Treebank 1961 (patpizio)
  - New: RO-STS 1978 (lorinczb)
  - New: newspop 1871 (frankier)
  - New: FashionMNIST 1999 (gchhablani)
  - New: Common voice 1886 (BirgerMoell), 2063 (patrickvonplaten)
  - New: Cryptonite 2013 (theo-m)
  - New: RoSent 2011 (gchhablani)
  - New: PersiNLU reading-comprehension 2028 (danyaljj)
  - New: conllpp 1991 (ZihanWangKi)
  - New: LaRoSeDa 2004 (MihaelaGaman)
  - Update: unnecessary docstart check in conll-like datasets 2020 (mariosasko)
  - Update: semeval 2020 task 11 - add article_id and process test set template 1979 (hemildesai)
  - Update: Md gender - card update 2018 (mcmillanmajora)
  - Update: XQuAD - add Romanian 2023 (M-Salti)
  - Update: DROP -  all answers 1980 (KaijuML)
  - Fix: TIMIT ASR - Make sure not only the first sample is used 1995 (patrickvonplaten)
  - Fix: Wikipedia - save memory by replacing root.clear with elem.clear 2037 (miyamonz)
  - Fix: Doc2dial update data_infos and data_loaders 2041 (songfeng)
  - Fix: ZEST - update download link 2057 (matt-peters)
  - Fix: ted_talks_iwslt - fix version error 2064 (mariosasko)
  
  Datasets Features
  - Implement Dataset from CSV 1946 (albertvillanova)
  - Implement Dataset from JSON and JSON Lines 1943 (albertvillanova)
  - Implement Dataset from text 2030 (albertvillanova)
  - Optimize int precision for tokenization 1985 (albertvillanova)
  - This allows saving 75%+ of space when tokenizing a dataset
  
  General Bug fixes and improvements
  - Fix ArrowWriter closes stream at exit 1971 (albertvillanova)
  - feat(docs): navigate with left/right arrow keys 1974 (ydcjeff)
  - Fix various typos/grammar in the docs 2008 (mariosasko)
  - Update format columns in Dataset.rename_columns 2027 (mariosasko)
  - Replace print with logging in dataset scripts 2019 (mariosasko)
  - Raise an error for outdated sacrebleu versions 2033 (lhoestq)
  - Not all languages have 2 digit codes. 2016 (asiddhant)
  - Fix arrow memory checks issue in tests 2042 (lhoestq)
  - Support pickle protocol for dataset splits defined as ReadInstruction 2043 (mariosasko)
  - Preserve column ordering in Dataset.rename_column 2045 (mariosasko)
  - Fix text-classification tags 2049 (gchhablani)
  - Fix docstring rendering of Dataset/DatasetDict.from_csv args 2066 (albertvillanova)
  - Fixes check of TF_AVAILABLE and TORCH_AVAILABLE 2073 (philschmid)
  - Add and fix docstring for NamedSplit 2069 (albertvillanova)
  - Bump huggingface_hub version 2077 (SBrandeis)
  - Fix docstring issues 2072 (albertvillanova)

1.4.1

Fix an issue 1981 with WMT downloads 1982 (albertvillanova)

1.4.0

Datasets Changes
  - New: iapp_wiki_qa_squad 1873 (cstorm125)
  - New: Financial PhraseBank 1866 (frankier)
  - New: CoVoST2 1935 (patil-suraj)
  - New: TIMIT 1903 (vrindaprabhu)
  - New: Mlama (multilingual lama) 1931 (pdufter)
  - New: FewRel 1823 (gchhablani)
  - New: CCAligned Multilingual Dataset 1815 (gchhablani)
  - New: Turkish News Category Lite 1967 (yavuzKomecoglu)
  - Update: WMT - use mirror links 1912 for better download speed (lhoestq)
  - Update: multi_nli - add missing fields 1950 (bhavitvyamalik)
  - Fix: ALT - fix duplicated examples in alt-parallel 1899 (lhoestq)
  - Fix: WMT datasets - fix download errors 1901 (YangWang92), 1902 (lhoestq)
  - Fix: QA4MRE - fix download URLs 1918 (M-Salti)
  - Fix: Wiki_dpr - fix when with_embeddings is False or index_name is "no_index" 1925 (lhoestq)
  - Fix: Wiki_dpr - add missing scalar quantizer 1926 (lhoestq)
  - Fix: GEM - fix the URL filtering for bad MLSUM examples in GEM 1970 (yjernite)
  
  Datasets Features
  - Add to_dict and to_pandas for Dataset 1889 (SBrandeis)
  - Add to_csv for Dataset 1887 (SBrandeis)
  - Add keep_linebreaks parameter to text loader 1913 (lhoestq)
  - Add not-in-place implementations for several dataset transforms 1883 (SBrandeis):
  - This introduces new methods for Dataset objects: rename_column, remove_columns, flatten and cast.
  - The old in-place methods rename_column_, remove_columns_, flatten_ and cast_ are now deprecated.
  - Make DownloadManager downloaded/extracted paths accessible 1846 (albertvillanova)
  - Add cross-platform support for datasets-cli 1951 (mariosasko)
  
  Metrics Changes
  - New: sari metric 1875 (ddhruvkr)
  
  Offline loading
  - Handle timeouts 1952 (lhoestq)
  - Add datasets full offline mode with HF_DATASETS_OFFLINE 1976 (lhoestq)
  
  General improvements and bugfixes
  - Replace flatten_nested 1879 (albertvillanova)
  - add missing info on how to add large files 1885 (stas00)
  - Docs for adding new column on formatted dataset 1888 (lhoestq)
  - Fix PandasArrayExtensionArray conversion to native type 1897 (lhoestq)
  - Bugfix for string_to_arrow timestamp[ns] support 1900 (justin-yan)
  - Fix to_pandas for boolean ArrayXD 1904 (lhoestq)
  - Fix logging imports and make all datasets use library logger 1914 (albertvillanova)
  - Standardizing datasets dtypes 1921 (justin-yan)
  - Remove unused py_utils objects 1916 (albertvillanova)
  - Fix save_to_disk with relative path 1923 (lhoestq)
  - Updating old cards 1928 (mcmillanmajora)
  - Improve typing and style and fix some inconsistencies 1929 (mariosasko)
  - Fix builder config creation with data_dir 1932 (lhoestq)
  - Disallow ClassLabel with no names 1938 (lhoestq)
  - Update documentation with not in place transforms and update DatasetDict 1947 (lhoestq)
  - Documentation for to_csv, to_pandas and to_dict 1953 (lhoestq)
  - typos + grammar 1955 (stas00)
  - Fix unused arguments 1962 (mariosasko)
  - Fix metrics collision in separate multiprocessed experiments 1966 (lhoestq)

1.3.0

Bug fixes and performance improvements.

1.2.1

New Features
  - Fast start up (1690): Importing `datasets` is now significantly faster.
  
  Datasets Changes
  - New: MNIST (1730)
  - New: Korean intonation-aided intention identification dataset (1715)
  - New: Switchboard Dialog Act Corpus (1678)
  - Update: Wiki-Auto - Added unfiltered versions of the training data for the GEM simplification task. (1722)
  - Update: Scientific papers - Mirror datasets zip (1721)
  - Update: Update DBRD dataset card and download URL (1699)
  - Fix: Thainer - fix ner_tag bugs (1695)
  - Fix: reuters21578 - metadata parsing errors (1693)
  - Fix: ade_corpus_v2 - fix config names (1689)
  - Fix: DaNE - fix last example (1688)
  
  Datasets tagging
  - rename "part-of-speech-tagging" tag in some dataset cards (1645)
  
  Bug Fixes
  - Fix column list comparison in transmit format (1719)
  - Fix windows path scheme in cached path (1711)
  
  Docs
  - Add information about caching and verifications in "Load a Dataset" docs (1705)
  
  Moreover, many dataset cards of datasets added during the sprint were updated! Thanks to all the contributors :)

1.2.0

Features
  - Add `shuffle_files` argument to `tfds.load` function. The semantic is the same as in `builder.as_dataset` function, which for now means that by default, files will be shuffled for `TRAIN` split, and not for other splits. Default behaviour will change to always be False at next release.
  - Most datasets now support the new S3 API ([documentation](https://github.com/tensorflow/datasets/blob/master/docs/splits.md#two-apis-s3-and-legacy))
  - Support for uint16 PNG images
  
  Misc
  - Fix crash while shuffling on Windows
  - Various documentation improvements
  
  New datasets
  - AFLW2000-3D
  - Amazon_US_Reviews
  - binarized_mnist
  - BinaryAlphaDigits
  - Caltech Birds 2010
  - Coil100
  - DeepWeeds
  - Food101
  - MIT Scene Parse 150
  - RockYou leaked password
  - Stanford Dogs
  - Stanford Online Products
  - Visual Domain Decathlon

1.1.3

Datasets changes
  - New: NLI-Tr (787)
  - New: Amazon Reviews (791)(844)(845)(799)
  - New: ASNQ - answer sentence selection (780)
  - New: OpenBookCorpus (856)
  - New: ASLG-PC12 - sign language translation (731)
  - New: Quail - question answering dataset (747)
  - Update: SNLI: Created dataset card snli.md (663)
  - Update: csv - Use pandas reader in csv (857)
  - Better memory management
  - Breaking: the previous `read_options`, `parse_options` and `convert_options` are replaced with plain parameters, as in `pandas.read_csv`
  - Update: conll2000, conll2003, germeval_14, wnut_17, XTREME PAN-X - Create ClassLabel for labelling tasks datasets (850)
  - Breaking: use of ClassLabel features instead of string features + naming of columns updated for consistency
  - Update: XNLI - Add XNLI train set (781)
  - Update: XSUM - Use full released xsum dataset (754)
  - Update: CompGuessWhat - New version of CompGuessWhat?! with refined annotations (748)
  - Update: CLUE - add OCNLI, a new CLUE dataset (742)
  - Fix: KOR-NLI - Fix csv reader (855)
  - Fix: Discofuse - fix discofuse urls (793)
  - Fix: Emotion - fix description (745)
  - Fix: TREC - update urls (740)
  
  Metrics changes
  - New: accuracy, precision, recall and F1 metrics (825)
  - Fix: squad_v2 (840)
  - Fix: seqeval (810)(738)
  - Fix: Rouge - fix description (774)
  - Fix: GLUE - fix description (734)
  - Fix: BertScore - fix custom baseline (763)
  
  Command line tools
  - add clear_cache parameter in the test command (863)
  
  Dependencies
  - Integrate file_lock inside the lib for better logging control (859)
  
  Dataset features
  - Add writer_batch_size attribute to GeneratorBasedBuilder (828)
  - pretty print dataset objects (725)
  - allow custom split names in text dataset (776)
  
  Tests
  - Testing all configs is now a slow test
  
  Bug fixes
  - Make save function use deterministic global vars order (819)
  - fix type hints pickling in python 3.6 (818)
  - fix metric deletion when attributes are missing (782)
  - Fix custom builder caching (770)
  - Fix metric with cache dir (772)
  - Fix train_test_split output format (719)

1.1.2

Dataset changes
  - Fix: text - use python read instead of pandas reader (715):
  - fix delimiter/overflow issues
  - better memory handling
  
  Bug fixes
  - Fix dataset configuration creation using `data_files` per splits using NamedSplit (706)
  - Fix permission issue on windows - don't use tqdm 4.50.0 (718)

1.1.0

Features
  
  *   Add `in_memory` option to cache small dataset in RAM.
  *   Better sharding, shuffling and sub-splitting
  *   It is now possible to add arbitrary metadata to `tfds.core.DatasetInfo`
  which will be stored/restored with the dataset. See `tfds.core.Metadata`.
  *   Better proxy support, with the possibility to add certificates
  *   Add `decoders` kwargs to override the default feature decoding
  ([guide](https://github.com/tensorflow/datasets/tree/master/docs/decode.md)).
  
  New datasets
  
  More datasets added:
  
  *  [downsampled_imagenet](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#downsampled_imagenet)
  *  [patch_camelyon](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#patch_camelyon)
  *  [coco](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#coco) 2017 (with and without panoptic annotations)
  * uc_merced
  * trivia_qa
  * super_glue
  * so2sat
  * snli
  * resisc45
  * pet_finder
  * mnist_corrupted
  * kitti
  * eurosat
  * definite_pronoun_resolution
  * curated_breast_imaging_ddsm
  * clevr
  * bigearthnet

1.0.2

* Add [Apache Beam support](https://www.tensorflow.org/datasets/beam_datasets)
  * Add direct GCS access for MNIST (with `tfds.load('mnist', try_gcs=True)`)
  * More datasets added
  * Option to turn off tqdm bar (`tfds.disable_progress_bar()`)
  * Subsplits no longer depend on the number of shards (https://github.com/tensorflow/datasets/issues/292)
  * Various bug fixes
  
  Thanks to all external contributors for raising issues, their feedback and their pull requests.

1.0.1

* Fixes bug 52 that was putting the process in Eager mode by default
  * New dataset [`celeb_a_hq`](https://github.com/tensorflow/datasets/blob/master/docs/datasets.md#celeb_a_hq)

1.0.0

*Note that this release had a bug 52 that was putting the process in Eager mode.*
  
  `tensorflow-datasets` is ready-for-use! Please see our [`README`](https://github.com/tensorflow/datasets) and documentation linked there. We've got [25 datasets](https://github.com/tensorflow/datasets/blob/master/docs/datasets.md) currently and are adding more. Please join in and [add](https://github.com/tensorflow/datasets/blob/master/docs/add_dataset.md) (or [request](https://github.com/tensorflow/datasets/issues?q=is%3Aissue+is%3Aopen+label%3A%22dataset+request%22)) a dataset yourself.

0.4.0

Datasets Features
  
  - add from_pandas and from_dict
  - add shard method
  - add rename/remove/cast columns methods
  - faster select method
  - add concatenate datasets
  - add support for taking samples using numpy arrays
  - add export to TFRecords
  - add features parameter when loading from text/json/pandas/csv or when using the map transform
  - add support for nested features for json
  - add DatasetDict object with map/filter/sort/shuffle, that is useful when loading several splits of a dataset
  - add support for post-processing Dataset objects in dataset scripts. This is used in Wiki DPR to attach a FAISS index to the dataset, e.g. to query passages for open-domain QA
  - add indexing using FAISS or ElasticSearch:
  - add add_faiss_index and add_elasticsearch_index methods
  - add get_nearest_examples and get_nearest_examples_batch to query the index and return examples
  - add search and search_batch to query the index and return examples ids
  - add save_faiss_index/load_faiss_index to save/load a serialized faiss index
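
  The index methods can be pictured as nearest-neighbour search over a vector column; a FAISS or ElasticSearch index just does this faster. A minimal brute-force sketch in plain Python (illustrative only; the real API operates on a `Dataset` column):

```python
def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def get_nearest_examples(vectors, examples, query, k=2):
    """Return distances and the k examples whose vectors are closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    top = order[:k]
    return [l2(vectors[i], query) for i in top], [examples[i] for i in top]

vectors = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
passages = ["origin", "east", "far"]
scores, hits = get_nearest_examples(vectors, passages, [0.9, 0.1], k=2)
# hits are the two passages nearest the query vector
```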
  
  Datasets changes
  
  - new: PG19
  - new: ANLI
  - new: WikiSQL
  - new: qa_zre
  - new: MWSC
  - new: AG news
  - new: SQuADShifts
  - new: doc red
  - new: Wiki DPR
  - new: fever
  - new: hyperpartisan news detection
  - new: pandas
  - new: text
  - new: emotion
  - new: quora
  - new: BioMRC
  - new: web questions
  - new: search QA
  - new: LinCE
  - new: TREC
  - new: Style Change Detection
  - new: 20newsgroup
  - new: social bias frames
  - new: Emo
  - new: web of science
  - new: sogou news
  - new: crd3
  - update: xtreme - PAN-X features changed format. Previously each sample was a word/tag pair, and now each sample is a sentence with word/tag pairs.
  - update: xtreme - add PAWS-X.es
  - update: xsum - manual download is no longer required.
  - new processed: Natural Questions
  
  Metrics Features
  
  - add a seed parameter for metrics that do sampling, like rouge
  - better installation messages
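
  The seed matters because sampling-based metrics (e.g. bootstrap resampling, as used for confidence intervals in rouge) are otherwise noisy across runs. A toy stand-in for such a metric:

```python
import random

def bootstrap_metric(values, n_resamples=100, seed=None):
    """Mean of bootstrap-resampled means; a stand-in for a sampling metric."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)

a = bootstrap_metric([1, 2, 3, 4], seed=0)
b = bootstrap_metric([1, 2, 3, 4], seed=0)
# same seed, same score: the metric is reproducible
```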
  
  Metrics changes
  
  - new: bleurt
  - update seqeval: fix entities extraction (more info [here](https://github.com/huggingface/nlp/pull/352))
  
  Bug fixes
  
  - fix bug in map and select that was causing memory issues
  - fix pyarrow version check
  - fix text/json/pandas/csv caching when loading different files in a row
  - fix metrics caching when they have different config names
  - fix cache not being discarded when there's a KeyboardInterrupt during .map
  - fix sacrebleu tokenizer's parameter
  - fix docstrings of metrics when multiple instances are created
  
  More Tests
  
  - add tests for features handling in dataset transforms
  - add tests for dataset builders
  - add tests for metrics loading
  
  Backward compatibility
  
  - because the dataset_info.json file format changed, older versions of the lib (<0.4.0) won't be able to load datasets that have a post-processing field in dataset_info.json

0.3.0

New methods to transform a dataset:
  - `dataset.shuffle`: create a shuffled dataset
  - `dataset.train_test_split`: create a train and a test split (similar to sklearn)
  - `dataset.sort`: create a dataset sorted according to a certain column
  - `dataset.select`: create a dataset with rows selected following the given list of indices
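
  A sketch of the split semantics in plain Python (not the library code): shuffle the row indices with a fixed seed, then cut off the test fraction, much like sklearn's `train_test_split`.

```python
import random

def train_test_split(rows, test_size=0.25, seed=42):
    """Shuffle rows reproducibly and split off the test fraction."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(rows) * test_size)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

train, test = train_test_split(list(range(8)), test_size=0.25)
# the two parts cover all rows with no overlap
```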
  
  Other features:
  - Better instructions for datasets that require manual download
  > Important: if you load datasets that require manual downloads with an older version of `nlp`, instructions won't be shown and an error will be raised
  - Better access to dataset information (for instance `dataset.features['label']` or `dataset.dataset_size`)
  
  Datasets:
  - New: cos_e v1.0
  - New: rotten_tomatoes
  - New: german and italian wikipedia
  
  New docs:
  - documentation about splitting a dataset
  
  Bug fixes:
  - fix metric.compute that couldn't write to file
  - fix squad_v2 imports