Highlights
torchaudio now includes a new model module (with wav2letter included), new functionals (contrast, cvm, dcshift, overdrive, vad, phaser, flanger, biquad), datasets (GTZAN, CMU), and a new optional sox backend with support for torchscript. torchaudio now also supports Windows, with the soundfile backend.
torchaudio now requires Python 3.6 or later.
Backwards Incompatible Changes
* We reorganized the C++ resources (630) and replaced C++ bindings for sox_effects init/list/shutdown with torch binding (748).
* We removed code specific to python 2 (691), and we no longer test against python 2 (575) and 3.5 (577).
New Features
* We now support Windows. (604, 637, 642, 655, 743)
* We now have a model module which includes wav2letter. (462, 722)
* We added the GTZAN and CMU datasets. (668, 710)
* We added new functionals: contrast (551), cvm (540), dcshift (558), overdrive (569), vad (578, 599), phaser (587, 607, 702), flanger (651, 702), and biquad (661).
* We added a new sox_io backend (718, 728, 734, 727, 763, 752, 731, 732, 726, 780) that is compatible with torchscript, along with a new AudioMetaData class (761).
* MelSpectrogram now has power and normalized parameters (633), and slaney normalization (589, 641).
* lfilter now has a clamp option. (600)
* Griffin-Lim can now have zero momentum. (601)
* sliding_window_cmn now supports batching. (570)
* Downloaded datasets now verify checksums. (499)
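
As an illustration of the checksum item above, here is a minimal sketch of verifying a downloaded file against an expected digest. It uses only the Python standard library and is not torchaudio's actual implementation; the function names `file_checksum` and `verify_download`, the choice of SHA-256, and the throwaway file are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hexdigest, algorithm="sha256"):
    """Raise RuntimeError if the file digest differs from the expected one."""
    actual = file_checksum(path, algorithm)
    if actual != expected_hexdigest:
        raise RuntimeError(
            "checksum mismatch for {}: expected {}, got {}".format(
                path, expected_hexdigest, actual
            )
        )
    return True


# A throwaway file stands in for a downloaded dataset archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"fake archive contents")
    archive_path = tmp.name

expected = hashlib.sha256(b"fake archive contents").hexdigest()
verified = verify_download(archive_path, expected)
os.remove(archive_path)
```

Failing loudly on a mismatch means a truncated or corrupted download is caught before any extraction or training starts.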
Improvements
* We added ogg/vorbis/opus support to the binary distribution (750, 755).
* We replaced the use of torch.norm in spectrogram to improve performance (747).
* We now use fused operations in lfilter for faster computation. (517, 564)
* STFT is now called directly from torchaudio. (531)
* We redesigned the backend mechanism to support torchscript, by restructuring the code (695, 696, 700, 706, 707, 698) and adding dynamic listing (697).
* torchaudio can be built along with sox, or can use external sox. (625, 669, 739)
* We redesigned the sox_effects module. (708)
* We added more details to compilation instructions. (667)
* We updated the README with instructions on changing the backend. (553)
* We now have a version compatibility matrix in README. (685)
* We now use cmake to build third party libraries (753).
* We now use CircleCI instead of travis (576, 584, 598, 603, 636, 738) and we test on GPU (586, 777).
* We run the test suite against nightlies. (538, 678)
* We redesigned our test suite, which now has new helper functions (514, 519, 521, 565, 616, 690, 692, 694), standard pytorch test utilities (513, 640, 643, 645, 646, 652, 650, 712), separate CPU and GPU tests (513, 528, 644), more descriptive names (532), clearer organization (539, 541, 542, 664, 672, 687, 703, 716, 732), standardized naming (559), and backend awareness (719). This is detailed in a new README for testing (566, 759).
* We now support typing, for datasets (511, 522), for backends (527), for init (526), and inline (530), with mypy configuration (524, 544, 590).
Bug Fixes
* We removed in place operations so that Griffin-Lim can be backpropagated through. (730)
* We fixed kaldi MFCC on GPU. (681)
* We removed multiple definitions of SoxEffect in C++. (635)
* We fixed the docstring of masking. (612)
* We replaced view with reshape for batching. (594)
* We fixed the missing conda environment when testing in python 3.8. (582)
* We now ensure that sox is not exposed on Windows. (579)
* We corrected the instructions to install nightlies. (547, 552)
* We fixed the seed of mask_along_iid. (529)
* We correctly report GPU tests as skipped instead of passed. (516)
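
To illustrate the difference between a test that is skipped and one that silently passes, here is a minimal sketch using the standard `unittest` module. It is not taken from the torchaudio test suite; the `HAS_GPU` flag and the `ExampleGpuTest` class are hypothetical names chosen for the example.

```python
import unittest

# Hypothetical flag for the example. A real suite would probe the hardware,
# for instance with torch.cuda.is_available().
HAS_GPU = False


class ExampleGpuTest(unittest.TestCase):
    # skipUnless records the test as skipped rather than letting it pass
    # trivially when the required hardware is missing.
    @unittest.skipUnless(HAS_GPU, "no GPU available")
    def test_runs_on_gpu(self):
        self.assertTrue(HAS_GPU)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleGpuTest)
result = unittest.TestResult()
suite.run(result)

# Each entry in result.skipped is a (test, reason) pair.
skipped_reasons = [reason for _test, reason in result.skipped]
print("skipped:", skipped_reasons)
```

Reporting the skip explicitly keeps the test summary honest: a CPU-only run shows that GPU coverage was not exercised instead of counting it as a success.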
Deprecations
* Since sox_effects is now automatically initialized and shut down (572, 693), we are deprecating the explicit initialization and shutdown functions (709).
* ISTFT is migrating to torch. (523)