Tensorflow-datasets

Latest version: v4.9.4

2.0.0

Added

- Several new datasets. Thanks to all the
[contributors](https://github.com/tensorflow/datasets/graphs/contributors)!
- Support for nested `tfds.features.Sequence` and `tf.RaggedTensor`
- Custom `FeatureConnector`s can override the `decode_batch_example` method
for efficient decoding when wrapped inside a
`tfds.features.Sequence(my_connector)`.
- Beam datasets can use a `tfds.core.BeamMetadataDict` to store additional
metadata computed as part of the Beam pipeline.
- Beam datasets' `_split_generators` accepts an additional `pipeline` kwarg
to define a pipeline shared between all splits.
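
As a rough illustration of the nested `tfds.features.Sequence` support, the sketch below declares a list-of-lists feature and shows the row boundaries a ragged layout would use for such data. The helper names (`row_splits`, `nested_sequence_features`) and the `token_ids` key are hypothetical, and the assumption that nested sequences of unknown length decode to `tf.RaggedTensor` should be checked against the features documentation for your TFDS version.

```python
def row_splits(nested):
    """Return the row boundaries a ragged layout would use for `nested`."""
    splits = [0]
    for row in nested:
        # Each row ends where the previous one ended plus its own length.
        splits.append(splits[-1] + len(row))
    return splits


def nested_sequence_features():
    """Declare a variable-length list of variable-length int lists.

    Hypothetical helper: the imports are lazy so that `row_splits` can be
    used without TensorFlow or TFDS installed.
    """
    import tensorflow as tf
    import tensorflow_datasets as tfds

    return tfds.features.FeaturesDict({
        # Outer Sequence: sentences; inner Sequence: token ids per sentence.
        "token_ids": tfds.features.Sequence(
            tfds.features.Sequence(
                tfds.features.Tensor(shape=(), dtype=tf.int64)
            )
        ),
    })
```

For the rows `[[1, 2], [3], []]`, `row_splits` gives the boundaries that `tf.RaggedTensor.from_row_splits` expects alongside the flattened values `[1, 2, 3]`.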

Changed

- The default versions of all datasets are now using the S3 slicing API. See
the [guide](https://www.tensorflow.org/datasets/splits) for details.
- `shuffle_files` defaults to False so that dataset iteration is deterministic
by default. You can customize the reading pipeline, including shuffling and
interleaving, through the new `read_config` parameter in
[`tfds.load`](https://www.tensorflow.org/datasets/api_docs/python/tfds/load).
- The `urls` kwarg of `DatasetInfo` is renamed to `homepage`.
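
A minimal sketch of how the slicing API and the `read_config` parameter might be combined. The helpers `percent_slice` and `load_deterministic` are hypothetical names, and the use of `tfds.ReadConfig` with `shuffle_seed` assumes a recent TFDS release; the class location and its fields may differ in older versions.

```python
def percent_slice(split, start, end):
    """Build a slicing-API split string such as 'train[10%:20%]'."""
    if not 0 <= start < end <= 100:
        raise ValueError("expected 0 <= start < end <= 100")
    return f"{split}[{start}%:{end}%]"


def load_deterministic(name, split, seed=None):
    """Load a split with explicit shuffling behaviour (hypothetical helper)."""
    # Imported lazily so the string helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    # Files are only shuffled when a seed is given; the seed is forwarded
    # through ReadConfig so that the shuffled order can be reproduced.
    read_config = tfds.ReadConfig(shuffle_seed=seed)
    return tfds.load(
        name,
        split=split,
        shuffle_files=seed is not None,
        read_config=read_config,
    )
```

For example, `load_deterministic("mnist", percent_slice("train", 0, 10))` would request the first 10% of the `train` split without file shuffling.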

Deprecated

- Python 2 support: this is the last version of TFDS that will support
Python 2. Going forward, we'll only support and test against Python 3.
- The previous split API is still available, but is deprecated. If you wrote
`DatasetBuilder`s outside the TFDS repository, please make sure they do not
use `experiments={tfds.core.Experiment.S3: False}`. This option will be removed
in the next version, along with the `num_shards` kwarg of `SplitGenerator`.

Fixed

- Various other bug fixes and performance improvements. Thank you for all the
reports and fixes!

1.3.0

Fixed

- Misc bugs and performance improvements.

1.2.0

Added

Features

- Add a `shuffle_files` argument to the `tfds.load` function. The semantics are
the same as in the `builder.as_dataset` function, which for now means that by
default, files are shuffled for the `TRAIN` split and not for other splits.
The default will change to always be False in the next major release.
- Most datasets now support the new S3 API
([documentation](https://github.com/tensorflow/datasets/blob/master/docs/splits.md#two-apis-s3-and-legacy)).
- Support for uint16 PNG images.

Datasets

- AFLW2000-3D
- Amazon_US_Reviews
- binarized_mnist
- BinaryAlphaDigits
- Caltech Birds 2010
- Coil100
- DeepWeeds
- Food101
- MIT Scene Parse 150
- RockYou leaked password
- Stanford Dogs
- Stanford Online Products
- Visual Domain Decathlon

Fixed

- Crash while shuffling on Windows
- Various documentation improvements

1.1.0

Added

Features

- `in_memory` option to cache small datasets in RAM.
- Better sharding, shuffling and sub-split.
- It is now possible to add arbitrary metadata to `tfds.core.DatasetInfo`
which will be stored/restored with the dataset. See `tfds.core.Metadata`.
- Better proxy support, with the possibility to add a certificate.
- `decoders` kwarg to override the default feature decoding
([guide](https://github.com/tensorflow/datasets/tree/master/docs/decode.md)).
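
A short sketch of the `decoders` kwarg, assuming a dataset with an image feature stored under the key `image`. The helper name `load_with_raw_images` is hypothetical; `tfds.decode.SkipDecoding` is the decoder described in the decoding guide.

```python
def load_with_raw_images(name, split="train", image_key="image"):
    """Load a dataset but leave one image feature undecoded.

    Hypothetical helper; `image_key` is assumed to name an image feature
    of the requested dataset.
    """
    # Imported lazily so that defining the helper does not require TFDS.
    import tensorflow_datasets as tfds

    # SkipDecoding overrides the default decoding for this feature, so the
    # examples carry the stored (still encoded) image bytes.
    return tfds.load(
        name,
        split=split,
        decoders={image_key: tfds.decode.SkipDecoding()},
    )
```

Skipping decoding this way can be useful when images are filtered or cropped before being decoded, so that discarded examples are never decoded at all.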

Datasets

- [downsampled_imagenet](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#downsampled_imagenet).
- [patch_camelyon](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#patch_camelyon).
- [coco](https://github.com/tensorflow/datasets/tree/master/docs/datasets.md#coco)
2017 (with and without panoptic annotations).
- uc_merced.
- trivia_qa.
- super_glue.
- so2sat.
- snli.
- resisc45.
- pet_finder.
- mnist_corrupted.
- kitti.
- eurosat.
- definite_pronoun_resolution.
- curated_breast_imaging_ddsm.
- clevr.
- bigearthnet.

1.0.2

Added

- [Apache Beam support](https://www.tensorflow.org/datasets/beam_datasets).
- Direct GCS access for MNIST (with `tfds.load('mnist', try_gcs=True)`).
- More datasets.
- Option to turn off tqdm bar (`tfds.disable_progress_bar()`).
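
The two calls quoted above can be combined as in the following sketch. The wrapper name `load_mnist_quietly` is hypothetical, and the `split` argument is an added assumption for illustration.

```python
def load_mnist_quietly(split="train"):
    """Load MNIST with GCS access enabled and no progress bar.

    Hypothetical helper wrapping the two calls named in the release notes.
    """
    # Imported lazily so that defining the helper does not require TFDS.
    import tensorflow_datasets as tfds

    # Turn off the tqdm progress bar before any download or preparation.
    tfds.disable_progress_bar()

    # try_gcs=True asks TFDS to try reading the dataset from the public
    # GCS bucket before downloading and preparing it locally.
    return tfds.load("mnist", split=split, try_gcs=True)
```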

Fixed

- Subsplits no longer depend on the number of shards
(https://github.com/tensorflow/datasets/issues/292).
- Various bugs.

1.0.1

Added

- Dataset
[`celeb_a_hq`](https://github.com/tensorflow/datasets/blob/master/docs/datasets.md#celeb_a_hq).

Fixed

- Bug 52, which was putting the process in Eager mode by default.
