Added
- [CLI] `tfds build` to the CLI. See
[documentation](https://www.tensorflow.org/datasets/cli#tfds_build_download_and_prepare_a_dataset).
- [API] `tfds.features.Dataset` to represent nested datasets.
- [API] `tfds.ReadConfig(add_tfds_id=True)` to add a unique id to the example
`ex['tfds_id']` (e.g. `b'train.tfrecord-00012-of-01024__123'`).
- [API] `num_parallel_calls` option to `tfds.ReadConfig` to override the
default `AUTOTUNE` option.
- [API] `tfds.ImageFolder` support for `tfds.decode.SkipDecoder`.
- [API] Multichannel audio support to `tfds.features.Audio`.
- [API] `try_gcs` to `tfds.builder(..., try_gcs=True)`.
- Better `tfds.as_dataframe` visualization (ffmpeg video if installed,
bounding boxes, ...).
- [TESTING] Allow `tfds build --max_examples_per_split=0` to test
`_split_generators` only (without `_generate_examples`).
- New datasets.
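
As an illustration of the `tfds_id` entry above, here is a minimal sketch that splits such an id back into its parts. It assumes every id follows the shape of the single example shown (`<prefix>.tfrecord-<shard>-of-<total>__<index>`); the names `TfdsId` and `parse_tfds_id` are hypothetical and not part of the TFDS API.

```python
import re
from typing import NamedTuple


class TfdsId(NamedTuple):
    """Hypothetical container for the parts of a `tfds_id` value."""
    file_prefix: str    # text before `.tfrecord`, e.g. 'train'
    shard_index: int    # e.g. 12
    num_shards: int     # e.g. 1024
    example_index: int  # position of the example inside the shard


# Assumed layout, inferred only from the example in the release note.
_ID_PATTERN = re.compile(
    rb"^(?P<prefix>.+)\.tfrecord-(?P<shard>\d+)-of-(?P<total>\d+)__(?P<index>\d+)$"
)


def parse_tfds_id(tfds_id: bytes) -> TfdsId:
    """Parse an id such as b'train.tfrecord-00012-of-01024__123'."""
    match = _ID_PATTERN.match(tfds_id)
    if match is None:
        raise ValueError(f"Unrecognized tfds_id: {tfds_id!r}")
    return TfdsId(
        file_prefix=match["prefix"].decode("utf-8"),
        shard_index=int(match["shard"]),
        num_shards=int(match["total"]),
        example_index=int(match["index"]),
    )


parsed = parse_tfds_id(b"train.tfrecord-00012-of-01024__123")
print(parsed)
```

Since the id encodes the shard and the position within it, it can serve as a stable key for tracing an example back to its source file.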
Changed
- [API] DownloadManager now returns
[Pathlib-like](https://docs.python.org/3/library/pathlib.html#basic-use)
objects.
- [API] Simpler `BuilderConfig` definition: class `VERSION` and
`RELEASE_NOTES` are applied to all `BuilderConfig`. Config description is
now optional.
- [API] To better guarantee determinism, new validations are performed on
the keys when creating a dataset, to avoid filenames as keys (which are
non-deterministic) and to restrict keys to `str`, `bytes` and `int`. New
errors likely indicate an issue in the dataset implementation.
- [API] `tfds.core.benchmark` now returns a `pd.DataFrame` (instead of a
`dict`).
- [API] `tfds.units` is no longer visible from the public API.
- Dataset updates.
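
The key validation described above can be pictured with the following sketch. It is not the TFDS implementation: `validate_key` is a hypothetical helper, and it only catches filenames passed as path objects (a filename passed as a plain `str` would not be detected here).

```python
import os
import pathlib

# Key types the release note lists as allowed.
_ALLOWED_KEY_TYPES = (str, bytes, int)


def validate_key(key):
    """Hypothetical check mirroring the rules stated in the release note."""
    # Path objects are rejected first: the note calls filename keys
    # non-deterministic.
    if isinstance(key, os.PathLike):
        raise TypeError(f"Key {key!r} is a path; use a stable str/bytes/int key.")
    if not isinstance(key, _ALLOWED_KEY_TYPES):
        raise TypeError(f"Key {key!r} has unsupported type {type(key).__name__}.")
    return key


for candidate in ["img_0001", b"img_0002", 3, pathlib.PurePath("data/img.png"), 1.5]:
    try:
        validate_key(candidate)
        print("accepted:", repr(candidate))
    except TypeError as err:
        print("rejected:", err)
```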
Deprecated
Removed
- Configs for all text datasets. Only the plain text version is kept. For
example: `multi_nli/plain_text` -> `multi_nli`.
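
Code that still refers to the old names can be migrated mechanically. The sketch below assumes the only change needed is dropping a trailing `/plain_text` config (optionally followed by a `:version` suffix); `strip_plain_text_config` is a hypothetical helper, not a TFDS function.

```python
_OLD_CONFIG_SUFFIX = "/plain_text"


def strip_plain_text_config(name: str) -> str:
    """Map e.g. 'multi_nli/plain_text' to 'multi_nli'; leave other names as-is."""
    # Keep an optional ':version' part untouched.
    base, sep, version = name.partition(":")
    if base.endswith(_OLD_CONFIG_SUFFIX):
        base = base[: -len(_OLD_CONFIG_SUFFIX)]
    return base + sep + version


for old_name in ["multi_nli/plain_text", "multi_nli", "mnist"]:
    print(old_name, "->", strip_plain_text_config(old_name))
```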
Fixed
- [API] Datasets returned by `tfds.as_numpy` are compatible with `len(ds)`.
- Support zero-length sequences with images of dynamic shape (Fix 2616).
- Progress bar correctly updated when copying files.
- Better debugging and error messages (e.g. human-readable size, ...).
- Many bug fixes (GPath consistency with pathlib, S3 compatibility, TQDM
visual artifacts, GCS crash on Windows, re-download when checksums are
updated, ...).