Changelogs » Vak



- add helper function to TestLearncurve that multiple unit tests can use to assert that all outputs
were generated. Now used to make sure the bug fixed in 0.1.0a8 stays fixed.
- error checking in cli that raises `ValueError` when the cli command is `learncurve` and the option
`results_dir_made_by_main_script` is already defined in the `[OUTPUT]` section, since running
`learncurve` would overwrite it.
- `dataset` subpackage that houses `VocalizationDataset` and related classes that facilitate creating data sets for training neural networks from heterogeneous data: audio files, files of arrays containing spectrograms, different annotation types, etc.
- also includes modules for handling each data source
+ e.g. `audio.to_spect` creates spectrograms from audio files
+ `spect.from_files` creates a `VocalizationDataset` from spectrogram files
- `core` sub-package that contains / will contain functions that do heavy lifting: `learning_curve`, `train`, `predict`
+ `learning_curve` is a sub-sub-module that both trains and tests models, instead of having separate `learncurve` and `summary` functions (i.e. train and test). The test data step in this "learning curve" may still confuse some ML/AI people
+ `cli` sub-package calls / will call these functions and handle any command-line-interface specific logic
(e.g. making changes to `config.ini` files)
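
The `learncurve` error check described above could look roughly like the sketch below. It uses the standard-library `configparser` and is not vak's actual code; the helper name `check_learncurve_config` and the example path are assumptions made for illustration.

```python
import configparser


def check_learncurve_config(config, command):
    """Raise ValueError if running `learncurve` would overwrite an existing option.

    Hypothetical helper, not part of vak's API.
    """
    option = "results_dir_made_by_main_script"
    if command == "learncurve" and config.has_option("OUTPUT", option):
        raise ValueError(
            f"option '{option}' is already defined in [OUTPUT] section; "
            "running 'learncurve' would overwrite it"
        )


# Example config in which the option is already defined (path is a placeholder).
config = configparser.ConfigParser()
config.read_string("""
[OUTPUT]
results_dir_made_by_main_script = /some/results/dir
""")
```

With this config, `check_learncurve_config(config, "learncurve")` raises `ValueError`, while any other command passes through silently.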

- change name of `vak.cli.make_data` to `vak.cli.prep`
- structure of `config.ini` file
+ now specify either `audio_format` or `spect_format` in `[DATA]` section
+ and `annot_format` for annotations
- refactor `utils` sub-package
+ move several functions from `data` and `general` into a `labels` module
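
As an illustration of the new `[DATA]` options, the sketch below parses a small config and checks that at least one of `audio_format` or `spect_format` is present. The helper `get_data_formats` is hypothetical, and the values `wav` and `notmat` are placeholders; the changelog does not say whether both format options may appear together, so the sketch does not forbid it.

```python
import configparser

# Placeholder values; only the option names come from the changelog.
EXAMPLE_CONFIG = """
[DATA]
audio_format = wav
annot_format = notmat
"""


def get_data_formats(config):
    """Return (audio_format, spect_format, annot_format) from the [DATA] section.

    Hypothetical helper: requires at least one of the two format options.
    """
    data = config["DATA"]
    audio_format = data.get("audio_format")
    spect_format = data.get("spect_format")
    if audio_format is None and spect_format is None:
        raise ValueError(
            "[DATA] section must specify either audio_format or spect_format"
        )
    return audio_format, spect_format, data.get("annot_format")


config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)
formats = get_data_formats(config)
```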

- remove unused options from command-line interface: `--glob`, `--txt`, `--dataset`
- `skip_files_with_labels_not_in_labelset` option
+ now happens whenever `labelset` is specified; if no `labelset` is given then no filtering is done
- `summary` command-line option, since `learncurve` now trains models and also tests them on a separate data set
- `silent_label_gap` option, because `VocalizationDataset` class determines if a label for unlabeled segments between other segments is needed, and if so automatically assigns this a label of 0 when mapping user labels to consecutive integers
+ this way user does not have to think about it
+ and program doesn't have to keep track of a `labels_mapping` file that saves what user specified
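
The filtering and label-mapping behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not the `VocalizationDataset` implementation: the function names, the `'unlabeled'` key, and the file names are all hypothetical.

```python
def make_labelmap(labelset, needs_unlabeled):
    """Map user labels to consecutive integers.

    If a label for unlabeled segments is needed, it is assigned 0 and the
    user's labels start at 1. The key 'unlabeled' is an assumption.
    """
    if needs_unlabeled:
        labelmap = {"unlabeled": 0}
        start = 1
    else:
        labelmap = {}
        start = 0
    for offset, label in enumerate(sorted(labelset)):
        labelmap[label] = start + offset
    return labelmap


def filter_by_labelset(annotations, labelset=None):
    """Keep only files whose labels all belong to `labelset`.

    `annotations` maps a file name to its list of labels. When no
    `labelset` is given, nothing is filtered.
    """
    if labelset is None:
        return dict(annotations)
    allowed = set(labelset)
    return {
        name: labels
        for name, labels in annotations.items()
        if set(labels) <= allowed
    }


annotations = {"bird1.wav": ["a", "b"], "bird2.wav": ["a", "x"]}
kept = filter_by_labelset(annotations, labelset="ab")
labelmap = make_labelmap("ab", needs_unlabeled=True)
```

Here `bird2.wav` is dropped because it contains the label `x`, which is not in the label set.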


- Fix how main loop in `learncurve` re-loads indices for grabbing subsets of training data after
generating them, and do so in a way that still allows for re-using subsets from previous runs


- `vak.cli.summary` has `save_transformed_data` parameter and `vak.cli` passes value from
`` as the argument when calling `vak.cli.summary`

- `vak.cli.summary` only saves transformed train/test data if `save_transformed_data` is `True`
- move a test from tests/unit_tests/ into tests/unit_tests/test_utils/

- `vak.cli.summary` no longer saves copy of test data in results directory


- add test for

- learncurve gets indices for all train data subsets before starting training
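
One way to get indices for every training subset before any training starts is sketched below. It is a simplified illustration: vak's actual sampling may differ, and the function name, parameters, and seed are assumptions.

```python
import random


def get_subset_indices(n_samples, train_set_sizes, n_replicates, seed=42):
    """Draw indices for every training subset up front.

    Returns a dict mapping (train_set_size, replicate) to a sorted list of
    indices into the full training set. All names here are hypothetical.
    """
    rng = random.Random(seed)
    indices = {}
    for size in train_set_sizes:
        for replicate in range(1, n_replicates + 1):
            # sample without replacement within each subset
            indices[(size, replicate)] = sorted(
                rng.sample(range(n_samples), size)
            )
    return indices


all_indices = get_subset_indices(
    n_samples=100, train_set_sizes=[10, 20], n_replicates=3
)
```

Because all subsets are drawn before the training loop begins, a failure caused by an invalid subset size surfaces immediately rather than partway through a long run.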


- Use `attrs`-based classes to represent sections of config.ini files
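
A minimal sketch of what an `attrs`-based class for one config section might look like follows. The class body and the parsing helper are assumptions for illustration; only the names `DataConfig`, `labelset`, and `save_transformed_data` appear in this changelog, and placing `labelset` in `[DATA]` is itself an assumption.

```python
import configparser

import attr


@attr.s
class DataConfig:
    """Hypothetical stand-in for a class representing the [DATA] section."""

    labelset = attr.ib(validator=attr.validators.instance_of(str))
    save_transformed_data = attr.ib(
        validator=attr.validators.instance_of(bool), default=False
    )


def parse_data_config(config):
    """Build a DataConfig from a parsed config.ini (hypothetical helper)."""
    section = config["DATA"]
    return DataConfig(
        labelset=section["labelset"],
        save_transformed_data=section.getboolean(
            "save_transformed_data", fallback=False
        ),
    )


config = configparser.ConfigParser()
config.read_string("""
[DATA]
labelset = abc
save_transformed_data = Yes
""")
data_config = parse_data_config(config)
```

The validators catch wrongly typed values when the instance is created, so a bad option fails at config-parsing time instead of deep inside training code.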

- rewrite `vak.cli` so it can deal with state of config.ini files
+ e.g. doesn't throw an error if `train_data_path` not declared as an option in [TRAIN] when running `vak prep`
(since training data won't exist yet, doesn't make sense to throw an error).
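
The idea of checking only the options a given command needs can be sketched as below. The table of required options is hypothetical; only the fact that `vak prep` does not need `train_data_path` in `[TRAIN]` comes from the entry above.

```python
import configparser

# Hypothetical table: which (section, option) pairs each command requires.
REQUIRED_OPTIONS = {
    "prep": [],
    "learncurve": [("TRAIN", "train_data_path")],
}


def check_required_options(config, command):
    """Raise ValueError only for options that `command` actually needs."""
    missing = [
        f"[{section}] {option}"
        for section, option in REQUIRED_OPTIONS.get(command, [])
        if not config.has_option(section, option)
    ]
    if missing:
        joined = ", ".join(missing)
        raise ValueError(f"missing options for '{command}': {joined}")


# A config that does not declare train_data_path yet.
config = configparser.ConfigParser()
config.read_string("[TRAIN]\n")
check_required_options(config, "prep")  # passes: prep does not need the path
```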

- remove code about `freq_bins` in a couple of places, since the number of frequency bins
in spectrograms is now just determined programmatically
+ `` no longer has `freq_bins` field in DataConfig namedtuple
+ `make_data` no longer adds `freq_bins` option to [DATA] section after making data sets


- add missing 'save_transformed_data' option to Data config parsing


- checkpoints saved in individual directories by `learncurve` so they are more cleanly segregated,
e.g. if user wants to point to a specific checkpoint when calling `predict`
- calling `vak prep config.ini` will run `vak.cli.make_data` function
+ so to generate a learning curve, the three steps now are:
`vak prep config.ini`, then `vak learncurve config.ini`, then `vak summary config.ini`
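
To illustrate the first item above (checkpoints saved in individual directories), here is a sketch of one possible layout, with one directory per training set size and replicate. The directory names and the helper `checkpoint_dir` are assumptions, not the names vak uses.

```python
import tempfile
from pathlib import Path


def checkpoint_dir(results_dir, train_set_size, replicate):
    """Return a separate checkpoint directory for one training run.

    The naming scheme is hypothetical.
    """
    return (
        Path(results_dir)
        / f"train_set_size_{train_set_size}"
        / f"replicate_{replicate}"
        / "checkpoint"
    )


layout = []
with tempfile.TemporaryDirectory() as results_dir:
    for train_set_size in (30, 60):
        for replicate in (1, 2):
            path = checkpoint_dir(results_dir, train_set_size, replicate)
            path.mkdir(parents=True)
            # record the path relative to the results directory
            layout.append(path.relative_to(results_dir).parts)
```

With a layout like this, a user can point `predict` at exactly one run's checkpoint without picking up files from other runs.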

- `vak.cli.train` runs all the way through, passes basic "does not crash" test
- `vak.cli.predict` runs all the way through, passes basic "does not crash" test


- description in (matches README + Github)
- move command-line interface logic out of, into cli/
- `make_data` and `learncurve` functions use `tqdm` for progress bars

- main() knows to look for `configfile` command-line argument (not `config`)
- `config` module expands user (on Linux/Mac) for (some) directory names
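
Expanding the user's home directory in a directory name is typically done with `os.path.expanduser`. The sketch below shows the idea; the helper name `expand_dir` and the example path are assumptions.

```python
import os


def expand_dir(path):
    """Expand a leading `~` to the user's home directory (hypothetical helper)."""
    return os.path.expanduser(path)


expanded = expand_dir("~/data/results")
```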


First release; still in pre-release
- name change from 'songdeck' to 'vak'