Espnet

Latest version: v202402

0.0.0a4

0.0.0a3

0.0.0a2

v.202402
News
We're thrilled to announce that our latest update brings two groundbreaking features to our project: `espnetez` and `ESPnet-SPK`!

New Features
- [**New Features**][**ESPnet2**][**ESPnet1**][**Installation**][**SE**] Add diffusion-based SE model to ESPnet-SE 5572 by LiChenda
- [**New Features**][**ESPnet2**][**ESPnet1**][**CI**][**ASR**] Add Bayes Risk CTC (reworked) 5519 by jctian98
- [**New Features**][**ESPnet2**][**TTS**] TTS evaluation script and monitoring functionality using MOS prediction model 5485 by Takaaki-Saeki
- [**New Features**][**ESPnet2**][**SE**] Add USES model for speech enhancement in diverse conditions 5482 by Emrys365
- [**New Features**][**ESPnet2**][**CI**][**SID**] ESPnet-SPK: major update 5408 by Jungjee
- [**New Features**][**ESPnet2**][**TTS**][**ASR**] Add espnetez 5372 by Masao-Someki

Enhancement
- [**Enhancement**][**ESPnet2**][**OWSM**] Improving OWSM inference interface 5618 by pyf98
- [**Enhancement**][**ESPnet2**][**OWSM**] Add OWSM v3.1 5611 by pyf98
- [**Enhancement**][**ESPnet2**][**CI**] ESPnet-SPK: Additional models, supplement readme 5559 by Jungjee
- [**Enhancement**][**ESPnet2**][**CI**][**SE**] Add PyTorch & GPU support for DNSMOS calculation 5548 by Emrys365
- [**Enhancement**][**ESPnet2**][**TTS**][**SID**] Speaker embedding extractor (with ESPnet pre-trained speaker model) 5579 by ftshijt

Recipe
- [**Recipe**][**ESPnet2**][**Music**] Fix relative setting of train-dev-test 5623 by ftshijt
- [**Recipe**][**ESPnet2**][**SID**] ESPnet-SPK: add Voxblink recipe 5583 by Jungjee
- [**Recipe**][**ESPnet2**][**SID**] ESPnet-SPK: Model upload and result generation 5558 by Jungjee
- [**Recipe**][**ESPnet2**][**Music**] ACE singer recipe fixing 5551 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**] TTS2 Template 5541 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] fix kaldi dependency in asr2 5540 by ftshijt
- [**Recipe**][**ESPnet2**][**CI**][**S2ST**] CI test for s2st 5526 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] Added data.sh to SPRING-INX IITM Recipe 5522 by arjun-gangwar
- [**Recipe**][**ESPnet2**][**ASR**] Add Libriheavy small and medium ASR2 recipes 5512 by akreal
- [**Recipe**][**ESPnet2**][**ASR**] SPRING-INX IITM RECIPE 5505 by arjun-gangwar
- [**Recipe**][**ESPnet2**][**ASR**][**RNNT**] Add transducer conformer configuration to commonvoice recipe 5503 by zuazo
- [**Recipe**][**ESPnet2**][**ESPnet1**] add centralized data preparation for OWSM 5478 by jctian98
- [**Recipe**][**ESPnet1**] Added clean speech results 5649 by linan2
- [**Recipe**][**ESPnet2**][**Installation**][**AV**] AVSR recipe for Easycom Dataset 5630 by ms-dot-k
- [**Recipe**][**ESPnet2**] Update CHiME-7 ASR1 recipe 5555 by popcornell
- [**Recipe**][**ESPnet2**] Add E-Branchformer model checkpoint in OWSM v2 5517 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**] Slue PR configs 5087 by siddhu001

Bugfix
- [**Bugfix**][**ESPnet2**] Fix path dependency in ESPnet tutorial 5645 by siddhu001
- [**Bugfix**][**ESPnet2**] Fix ESPnet tutorial 5644 by siddhu001
- [**Bugfix**] Fix CI 5642 by siddhu001
- [**Bugfix**][**ESPnet2**] Fixed bug by copying missing Kaldi scripts 5636 by VicentCano
- [**Bugfix**][**ESPnet1**][**ASR**] CTC prefix score, fix if blank == eos 5620 by albertz
- [**Bugfix**][**ESPnet2**] Fix minor OWSM data prep bug 5607 by juice500ml
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**CI**] E721 5589 by sw005320
- [**Bugfix**][**ESPnet2**][**ESPnet1**] Make minlenratio effective 5581 by jctian98
- [**Bugfix**][**ESPnet2**] Fix except 5567 by takenori-y
- [**Bugfix**][**ESPnet1**][**Installation**][**CI**] Improve error robustness of unit tests 5535 by Emrys365
- [**Bugfix**][**ESPnet2**][**AV**] Fix bug in lrs3 data preprocessing 5520 by ms-dot-k
- [**Bugfix**][**ESPnet1**] replace old mustc links with new instructions 5516 by brianyan918
- [**Bugfix**][**ESPnet2**][**ST**] Fix s2st HF model uploading 5504 by tjysdsg
- [**Bugfix**][**ESPnet2**][**ESPnet1**] bug fixes for must_c v2 recipe 5640 by jasonmusespresso

Documentation
- [**Documentation**][**ESPnet2**] Add instructions for finetuning owsm 5539 by pyf98
- [**Documentation**] Updated the reference of the accepted JOSS paper 5515 by neillu23

Others
- [**Others**] Update Discord Invitation Link 5578 by Fhrozen
- [**Others**][**ESPnet2**][**CI**] Improve error robustness of unit tests 5523 by Emrys365

Acknowledgements
Special thanks to Emrys365, Fhrozen, Jungjee, LiChenda, Masao-Someki, Takaaki-Saeki, VicentCano, akreal, albertz, arjun-gangwar, brianyan918, ftshijt, jasonmusespresso, jctian98, juice500ml, linan2, ms-dot-k, neillu23, popcornell, pyf98, siddhu001, sw005320, takenori-y, tjysdsg, zuazo.



v.202310
What's Changed
* Support arbitrary language finetune for Whisper models. by pengchengguo in https://github.com/espnet/espnet/pull/5344
* Update Dipco Data URL by Fhrozen in https://github.com/espnet/espnet/pull/5391
* Update readme in TEMPLATE/svs1 by linyueqian in https://github.com/espnet/espnet/pull/5394
* add gramvaani asr recipe by bloodraven66 in https://github.com/espnet/espnet/pull/5366
* ESPnet-SPK: sampler by Jungjee in https://github.com/espnet/espnet/pull/5365
* Adding general data augmentation methods for speech preprocessing by Emrys365 in https://github.com/espnet/espnet/pull/5370
* Update of several SE recipes and some minor fixes by Emrys365 in https://github.com/espnet/espnet/pull/5401
* Reproducing MIMOIRIS by YoshikiMas in https://github.com/espnet/espnet/pull/5409
* Kathbath asr by bloodraven66 in https://github.com/espnet/espnet/pull/5369
* Add pytorch2.0.1 to CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/5413
* [skip ci] Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5417
* In spec_augment.py, check whether an array is writeable before modifying it inplace by mdecerbo in https://github.com/espnet/espnet/pull/5416
* Docker updates for local builds by Fhrozen in https://github.com/espnet/espnet/pull/5406
* fix typo in TEMPLATE/svs1/README.md by linyueqian in https://github.com/espnet/espnet/pull/5426
* Update install_mwerSegmenter.sh by sw005320 in https://github.com/espnet/espnet/pull/5437
* Support Whisper-style training as a new task S2T by pyf98 in https://github.com/espnet/espnet/pull/5120
* fix twice numpy installation issue by kan-bayashi in https://github.com/espnet/espnet/pull/5447
* Add Whisper SOT recipe for Librimix by LiChenda in https://github.com/espnet/espnet/pull/5371
* Update for the JOSS paper editor review by neillu23 in https://github.com/espnet/espnet/pull/5418
* Add the VOiCES recipe for ASR by Emrys365 in https://github.com/espnet/espnet/pull/5448
* Improve diacritic compatibility in data_prep.pl preprocessing scripts by zuazo in https://github.com/espnet/espnet/pull/5445
* [WIP] create recipe for acesinger by linyueqian in https://github.com/espnet/espnet/pull/5431
* Add BibleTTS recipe by wyh2000 in https://github.com/espnet/espnet/pull/5436
* ASR2 CHiME4 & Gigaspeech Recipes by yichen14 in https://github.com/espnet/espnet/pull/5434
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5427
* Simple fix to reduce test_slu_inference time by siddhu001 in https://github.com/espnet/espnet/pull/5460
* Do not use root logger in Beamsearch by vsd-vector in https://github.com/espnet/espnet/pull/5454
* Fix whisper test by siddhu001 in https://github.com/espnet/espnet/pull/5464
* Add doc for OWSM by pyf98 in https://github.com/espnet/espnet/pull/5463
* Speech-to-speech translation Task by ftshijt in https://github.com/espnet/espnet/pull/4859
* AVSR recipes on LRS3 using pre-trained AV-HuBERT model by ms-dot-k in https://github.com/espnet/espnet/pull/5456
* Support LoRA based large model finetuning. by pengchengguo in https://github.com/espnet/espnet/pull/5400
* Multilingual Librispeech (MLS) refactor ASR1 recipe by juice500ml in https://github.com/espnet/espnet/pull/5323
* Add phonemized LibriTTS ASR recipe by akreal in https://github.com/espnet/espnet/pull/5466
* Update the Enh framework to support training with variable numbers of speakers by Emrys365 in https://github.com/espnet/espnet/pull/5414
* speed up TFGridNet code by zqwang7 in https://github.com/espnet/espnet/pull/5395
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5468
* ASR2 recipe on Tedlium3 dataset by kohei0209 in https://github.com/espnet/espnet/pull/5331
* Create README.md in OWSM v1 by pyf98 in https://github.com/espnet/espnet/pull/5489
* Update setup.py by sw005320 in https://github.com/espnet/espnet/pull/5490
* Fix default value in ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5492
* Fix bugs of Whisper SOT. by pengchengguo in https://github.com/espnet/espnet/pull/5494
* Multilingual Librispeech ASR2 + ASR1 baselines by juice500ml in https://github.com/espnet/espnet/pull/5441
* Add a new SE recipe combining five public corpora by Emrys365 in https://github.com/espnet/espnet/pull/5484
* Update .mergify.yml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5502
* update version to 202310 by kan-bayashi in https://github.com/espnet/espnet/pull/5501

New Contributors
* linyueqian made their first contribution in https://github.com/espnet/espnet/pull/5394
* mdecerbo made their first contribution in https://github.com/espnet/espnet/pull/5416
* zuazo made their first contribution in https://github.com/espnet/espnet/pull/5445
* wyh2000 made their first contribution in https://github.com/espnet/espnet/pull/5436
* yichen14 made their first contribution in https://github.com/espnet/espnet/pull/5434
* vsd-vector made their first contribution in https://github.com/espnet/espnet/pull/5454
* ms-dot-k made their first contribution in https://github.com/espnet/espnet/pull/5456
* juice500ml made their first contribution in https://github.com/espnet/espnet/pull/5323
* kohei0209 made their first contribution in https://github.com/espnet/espnet/pull/5331

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202308...v.202310

v.202308
What's Changed
* Update tutorial by ftshijt in https://github.com/espnet/espnet/pull/4648
* Update tutorials by ftshijt in https://github.com/espnet/espnet/pull/4898
* add e-branchformer result for tedlium3 and add checker for text output length by Some-random in https://github.com/espnet/espnet/pull/5130
* Limit the Numpy version (<1.24) to fix CI error temporarily. by simpleoier in https://github.com/espnet/espnet/pull/5162
* [SVS] Add new recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5158
* Update README.md of CHiME-7 DASR: fixing typos by popcornell in https://github.com/espnet/espnet/pull/5166
* Fix typo in CONTRIBUTING.md by eltociear in https://github.com/espnet/espnet/pull/5167
* CHiME-7 DASR: Update install_dependencies.sh, fix lhotse version by popcornell in https://github.com/espnet/espnet/pull/5168
* Update TD-SpeakerBeam by Emrys365 in https://github.com/espnet/espnet/pull/5155
* Add pre-trained causal speech separation model and streaming demo by LiChenda in https://github.com/espnet/espnet/pull/5172
* KSC recipe by khassanoff in https://github.com/espnet/espnet/pull/5171
* [SVS] Add new recipe by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5173
* Update AphasiaBank Recipe by tjysdsg in https://github.com/espnet/espnet/pull/5104
* fix the gradient backward issue when joint training with s3prl frontend by simpleoier in https://github.com/espnet/espnet/pull/5159
* Add installer for ParallelWaveGAN by ftshijt in https://github.com/espnet/espnet/pull/4052
* [GAN SVS] Add VISinger2, UHifiGAN, Avocodo by jerryuhoo in https://github.com/espnet/espnet/pull/5123
* [SVS] Update docs README.md by South-Twilight in https://github.com/espnet/espnet/pull/5178
* Update SVS README.md by jerryuhoo in https://github.com/espnet/espnet/pull/5180
* Adding eendss models by soumimaiti in https://github.com/espnet/espnet/pull/5157
* 2022fall new task tutorial by ftshijt in https://github.com/espnet/espnet/pull/5186
* [SVS] Updates for recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5187
* [GAN SVS] fix phoneme predictor by jerryuhoo in https://github.com/espnet/espnet/pull/5188
* Update generate_librimix_sd.sh by leepeiying in https://github.com/espnet/espnet/pull/5182
* Bug fix for 5195 by YosukeHiguchi in https://github.com/espnet/espnet/pull/5196
* [SVS] Update on recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5197
* Update preprocessor.py by sw005320 in https://github.com/espnet/espnet/pull/5200
* Minor fixes for ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5202
* Quick fix for whisper specaug by siddhu001 in https://github.com/espnet/espnet/pull/5206
* espnet-spk data preparation part by Jungjee in https://github.com/espnet/espnet/pull/5184
* Fix M4singer multi-spk recipe by ftshijt in https://github.com/espnet/espnet/pull/5201
* Update Dataset link for mlsuperb by ftshijt in https://github.com/espnet/espnet/pull/5216
* Fix bug when score_type is set to normal in ml_superb by ftshijt in https://github.com/espnet/espnet/pull/5217
* Add new functions and fix some bugs in SE by Emrys365 in https://github.com/espnet/espnet/pull/5193
* Update import order by ftshijt in https://github.com/espnet/espnet/pull/5229
* Closed CHiME-7 DASR adding evaluation inference + adding support to use diarization baseline "pre-computed" JSONs (new PR) by popcornell in https://github.com/espnet/espnet/pull/5228
* Standalone Transducer v1.1 by b-flo in https://github.com/espnet/espnet/pull/5140
* Small fixes for Transducer by b-flo in https://github.com/espnet/espnet/pull/5247
* add asr2 task and librispeech recipe as an example. by simpleoier in https://github.com/espnet/espnet/pull/5181
* fix norm compatibility in scale discriminator by kan-bayashi in https://github.com/espnet/espnet/pull/5240
* CFSD, SECS metrics for TTS by imdanboy in https://github.com/espnet/espnet/pull/5235
* Add new SE recipes: chime1/enh1, chime2/enh1, reverb/enh1, and wsj0_2mix/tse1 by Emrys365 in https://github.com/espnet/espnet/pull/5246
* Fix bugs in mfa_format.py by G-Thor in https://github.com/espnet/espnet/pull/5223
* New features for SVS by ftshijt in https://github.com/espnet/espnet/pull/5245
* re-fix norm compatibility in scale discriminator by kan-bayashi in https://github.com/espnet/espnet/pull/5249
* add conv1d subsampling 3 and egs2/librispeech/asr2 wavlm_large_21 kmeans (1000/2000) results by simpleoier in https://github.com/espnet/espnet/pull/5252
* Revise the ESPnet-SE++ Joss paper to incorporate the feedback from the reviewer. by neillu23 in https://github.com/espnet/espnet/pull/5212
* Fix a bug in score script for ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5254
* Refactor prep_segments in SVS by jerryuhoo in https://github.com/espnet/espnet/pull/5210
* A minor fix for num_splits_ssl for training by ftshijt in https://github.com/espnet/espnet/pull/5262
* [SVS] add singing tacotron by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5233
* Add script to use speaker averaged xvectors in TTS training by G-Thor in https://github.com/espnet/espnet/pull/5244
* Fix filling of waveform_buffer with samples for streaming inference by espnetUser in https://github.com/espnet/espnet/pull/5267
* Some name update for ml-superb by ftshijt in https://github.com/espnet/espnet/pull/5276
* Add support for K2 pruned transducer loss by b-flo in https://github.com/espnet/espnet/pull/5268
* Fix Transducer doc by b-flo in https://github.com/espnet/espnet/pull/5306
* Update installation.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5291
* Update install_nkf.sh by sw005320 in https://github.com/espnet/espnet/pull/5300
* Fix Cython version to pass the installation of libraries with Cython by kan-bayashi in https://github.com/espnet/espnet/pull/5310
* Update README.md by sw005320 in https://github.com/espnet/espnet/pull/5315
* Update setup.py by sw005320 in https://github.com/espnet/espnet/pull/5316
* Migrate recipe for nit_song070 from Muskit by wwwbxy123 in https://github.com/espnet/espnet/pull/5251
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5294
* A few updates for asr2 and hubert by simpleoier in https://github.com/espnet/espnet/pull/5285
* Add decode_options and hyp_cleaner in evaluate_whisper_inference by pyf98 in https://github.com/espnet/espnet/pull/5272
* update pyworld version by kan-bayashi in https://github.com/espnet/espnet/pull/5319
* fix a data preparation issue for librimix recipe. by LiChenda in https://github.com/espnet/espnet/pull/5322
* Update README.md in egs2/librimix/tse1 and egs2/wsj0_2mix/tse1 by Emrys365 in https://github.com/espnet/espnet/pull/5289
* fix the s3prl frontend gradient backprop bug, ensuring feature_grad_mult=1.0 by simpleoier in https://github.com/espnet/espnet/pull/5297
* ESPnet-SPK part 2 - training by Jungjee in https://github.com/espnet/espnet/pull/5258
* remove some tests in espnet1 integration test by sw005320 in https://github.com/espnet/espnet/pull/5328
* Fix random segments by iamanigeeit in https://github.com/espnet/espnet/pull/5274
* Skip CI for draft PR by ftshijt in https://github.com/espnet/espnet/pull/5333
* Update cancel.yml by kan-bayashi in https://github.com/espnet/espnet/pull/5334
* Update several SE recipes and bash scripts by Emrys365 in https://github.com/espnet/espnet/pull/5327
* Add PULL_REQUEST_TEMPLATE.md by kan-bayashi in https://github.com/espnet/espnet/pull/5340
* ESPnet-SPK part 3 - inference every epoch using EER by Jungjee in https://github.com/espnet/espnet/pull/5314
* Minimize espnet2 integration test by kan-bayashi in https://github.com/espnet/espnet/pull/5324
* PR Labels for CI control by Fhrozen in https://github.com/espnet/espnet/pull/5320
* Split ci into several jobs by kan-bayashi in https://github.com/espnet/espnet/pull/5343
* Update CONTRIBUTING.md by sw005320 in https://github.com/espnet/espnet/pull/5335
* Update Scoring for Speech Summarization from NLG-Eval to Huggingface Evaluate by roshansh-cmu in https://github.com/espnet/espnet/pull/5341
* Fix documentation skip CI by Fhrozen in https://github.com/espnet/espnet/pull/5351
* Update the usage by sw005320 in https://github.com/espnet/espnet/pull/5349
* Docker Update by Fhrozen in https://github.com/espnet/espnet/pull/5321
* Update installation.md by sw005320 in https://github.com/espnet/espnet/pull/5348
* Fix doc condition by kan-bayashi in https://github.com/espnet/espnet/pull/5355
* Update issue templates by sw005320 in https://github.com/espnet/espnet/pull/5357
* Update Contribution.md by Fhrozen in https://github.com/espnet/espnet/pull/5352
* Fix .mergify condition by kan-bayashi in https://github.com/espnet/espnet/pull/5354
* Reduce ffmpeg installation time in ci by kan-bayashi in https://github.com/espnet/espnet/pull/5356
* Update CI table by kan-bayashi in https://github.com/espnet/espnet/pull/5359
* Clean workflow files by kan-bayashi in https://github.com/espnet/espnet/pull/5360
* Couple of tweaks for asr2.sh for the HF hub upload by akreal in https://github.com/espnet/espnet/pull/5362
* Update TEMPLATE_HF_Readme.md (fix bash typo) by akreal in https://github.com/espnet/espnet/pull/5361
* Add discrete-token ASR for LibriSpeech 100h by akreal in https://github.com/espnet/espnet/pull/5350
* Whisper fine-tuning recipes for CHiME-4 and WSJ by YoshikiMas in https://github.com/espnet/espnet/pull/5342
* Fix bug in ngram training in slu.sh by siddhu001 in https://github.com/espnet/espnet/pull/5364
* Add musdb18 recipe for music source separation by Emrys365 in https://github.com/espnet/espnet/pull/5338
* Bugfix: JETS CTCLoss by imdanboy in https://github.com/espnet/espnet/pull/5288
* Check the value of `n_shift` == `upsample_factor` in GAN_TTS by imdanboy in https://github.com/espnet/espnet/pull/5299
* MFA format fix by iamanigeeit in https://github.com/espnet/espnet/pull/5275
* add --num-workers 0 option to enable coverage to track data loader by kan-bayashi in https://github.com/espnet/espnet/pull/5368
* ESPnet-SPK: fix data augment by Jungjee in https://github.com/espnet/espnet/pull/5347
* A few minor fixes for SSL by ftshijt in https://github.com/espnet/espnet/pull/5265
* remove unused file + small typo/style by b-flo in https://github.com/espnet/espnet/pull/5346
* ESPnet-SPK: EER validation efficiency improvement by Jungjee in https://github.com/espnet/espnet/pull/5358
* New Architectures for ST by brianyan918 in https://github.com/espnet/espnet/pull/4815
* [SVS] Add CI test by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5269
* Add causal LM to Hugging Face Transformers Decoder by akreal in https://github.com/espnet/espnet/pull/5313
* Make `make_pad_mask` onnx convertible by Masao-Someki in https://github.com/espnet/espnet/pull/5326
* fix numerical error of parallel wavegan compatibility test in CI by kan-bayashi in https://github.com/espnet/espnet/pull/5380
* Add LibriTTS-R recipe by ShigekiKarita in https://github.com/espnet/espnet/pull/5379
* minor fix: correct wrong comments by imdanboy in https://github.com/espnet/espnet/pull/5378
* Add quotation marks to install_datasets.sh by qmeeus in https://github.com/espnet/espnet/pull/5387

New Contributors
* khassanoff made their first contribution in https://github.com/espnet/espnet/pull/5171
* leepeiying made their first contribution in https://github.com/espnet/espnet/pull/5182
* Jungjee made their first contribution in https://github.com/espnet/espnet/pull/5184
* wwwbxy123 made their first contribution in https://github.com/espnet/espnet/pull/5251

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202304...v.202308

v.202304
What's Changed
* Update collect stats stage so that less memory cost in Utt_mvn by simpleoier in https://github.com/espnet/espnet/pull/4888
* Apply the latest black by kamo-naoyuki in https://github.com/espnet/espnet/pull/4907
* Add pytorch=1.13.1 to CI configuration by kamo-naoyuki in https://github.com/espnet/espnet/pull/4906
* How2 fix README, incorrect url by roshansh-cmu in https://github.com/espnet/espnet/pull/4902
* standardized inference and number of iterations for mSuperb single lang track by DanBerrebbi in https://github.com/espnet/espnet/pull/4905
* Fix typo in lrs/README.md by eltociear in https://github.com/espnet/espnet/pull/4911
* MSUPERB setting update by ftshijt in https://github.com/espnet/espnet/pull/4913
* Update test_import.yaml to install numba by kamo-naoyuki in https://github.com/espnet/espnet/pull/4918
* update pyopenjtalk version to 0.3.0 by kan-bayashi in https://github.com/espnet/espnet/pull/4912
* CHiME-7 Task1 recipe by popcornell in https://github.com/espnet/espnet/pull/4894
* Update CHiME-7 Task 1 README.md by popcornell in https://github.com/espnet/espnet/pull/4920
* Use native CPU version of STFT on newer pytorch versions, fix librosa window size < fft by bmilde in https://github.com/espnet/espnet/pull/4922
* Add few shot subset for mSuperb multilingual setting by guapaQAQ in https://github.com/espnet/espnet/pull/4923
* Fix existing bugs in the TSE task by Emrys365 in https://github.com/espnet/espnet/pull/4915
* IAM OCR recipe updates by kenzheng99 in https://github.com/espnet/espnet/pull/4927
* Fixing some issues with chime7-task1 baseline by popcornell in https://github.com/espnet/espnet/pull/4925
* set default none decoder for ASR by ftshijt in https://github.com/espnet/espnet/pull/4917
* Update inference and training setting for mSuperb multilingual model by guapaQAQ in https://github.com/espnet/espnet/pull/4932
* Add E-Branchformer Transducer results by pyf98 in https://github.com/espnet/espnet/pull/4933
* add tf-gridnet by zqwang7 in https://github.com/espnet/espnet/pull/4864
* Fixes + Channel Selection for CHiME-7 Task by popcornell in https://github.com/espnet/espnet/pull/4934
* fix extracted feature dummy generation by roshansh-cmu in https://github.com/espnet/espnet/pull/4926
* Fix device mismatch error in GPU decoding with PyTorch 1.13 by pyf98 in https://github.com/espnet/espnet/pull/4941
* CHiME-7 DASR MD5 checksum fix for mixer6/train_call by popcornell in https://github.com/espnet/espnet/pull/4942
* Update show_asr_result.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/4943
* CHiME-7 DASR correct development results by popcornell in https://github.com/espnet/espnet/pull/4946
* Fix '__floordiv__ is deprecated' warnings by fujimotos in https://github.com/espnet/espnet/pull/4945
* Added WSLII installation instruction by sw005320 in https://github.com/espnet/espnet/pull/4949
* Update Muskits by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4931
* Set a longer time execution threshold for related failed time-outs CI by ftshijt in https://github.com/espnet/espnet/pull/4962
* Modify data prep for mSUPERB multilingual by guapaQAQ in https://github.com/espnet/espnet/pull/4965
* Add E-Branchformer results in some recipes by pyf98 in https://github.com/espnet/espnet/pull/4958
* Add 'six' as a required Python module by fujimotos in https://github.com/espnet/espnet/pull/4964
* add msuperb linguistic analysis by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4938
* Fix a 'ref_channel'-related issue in espnet2/bin/enh_inference.py by Emrys365 in https://github.com/espnet/espnet/pull/4972
* Add E-Branchformer results in slurp_entity by pyf98 in https://github.com/espnet/espnet/pull/4971
* Add Conformer and E-Branchformer results in fisher_spanish_callhome ASR by pyf98 in https://github.com/espnet/espnet/pull/4976
* [SVS] Add Joint-training by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4977
* Update the chunk iterator for the TSE task by Emrys365 in https://github.com/espnet/espnet/pull/4929
* update msuperb LID scoring script by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4979
* add multilingual+lid lid score generation by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4982
* Add python=3.10 to CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/4627
* LID score v2 by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4983
* Fix ci by kamo-naoyuki in https://github.com/espnet/espnet/pull/4985
* Change to use Ubuntu-latest instead of Ubuntu-18.04 in CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/4986
* Remove six by kamo-naoyuki in https://github.com/espnet/espnet/pull/4988
* Modify format_wav_scp.py to support PCM of uint8, int32, float32, float64, etc. by kamo-naoyuki in https://github.com/espnet/espnet/pull/4997
* Fix Whisper tokenizer CI error by slSeanWU in https://github.com/espnet/espnet/pull/5004
* fix s3prl upstream attribute bug by jwrh in https://github.com/espnet/espnet/pull/5003
* [Recipe] Add iwslt22 low resource speech translation task for egs2 by freddy5566 in https://github.com/espnet/espnet/pull/4994
* Fix typeguard version by silvanocerza in https://github.com/espnet/espnet/pull/5009
* Add .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5011
* Copy Kaldi utils/steps/sid and add a new github action to check the consistency by kamo-naoyuki in https://github.com/espnet/espnet/pull/4998
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5012
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5014
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5015
* [Tuning] iwslt22 low-resource ST decode configuration tuning by freddy5566 in https://github.com/espnet/espnet/pull/5019
* Modify asr.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5020
* [SVS] Improve visinger by jerryuhoo in https://github.com/espnet/espnet/pull/5022
* Use scripts/utils/print_args.sh instead of pyscripts/utils/print_args.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5025
* Add docstring in extra_path.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5028
* Update installation.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5029
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5030
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5031
* Change bc to python by kamo-naoyuki in https://github.com/espnet/espnet/pull/5032
* Update tools/Makefile and path.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5027
* Fix for format_wav_scp.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5038
* Add execute permission to install_ice_g2p.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5040
* Bug fix of 5025 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5039
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5041
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5042
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5043
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5045
* Fix in gen_task1_data.sh from CHiME7 by boeddeker in https://github.com/espnet/espnet/pull/4953
* Update README.md by eml914 in https://github.com/espnet/espnet/pull/5044
* Add installers/install_ffmpeg.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5046
* Fix broken links reported by 5048 by ShigekiKarita in https://github.com/espnet/espnet/pull/5050
* fix: resolve upgrade issues with praatio 6.0; lock praatio version by timmahrt in https://github.com/espnet/espnet/pull/4978
* Add miniconda in gitignore by pyf98 in https://github.com/espnet/espnet/pull/5052
* CHiME-7 DASR fixes from participants feedback by popcornell in https://github.com/espnet/espnet/pull/4999
* Fix the condition for maxlen warning in beam search by pyf98 in https://github.com/espnet/espnet/pull/5055
* Fixed SQLalchemy version for MFA by Fhrozen in https://github.com/espnet/espnet/pull/5059
* Support Multi-Blank Transducer in Espnet2 by jctian98 in https://github.com/espnet/espnet/pull/4876
* Fix chime7 DASR task1 run.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5060
* CHiME-7 DASR recipe, fix display bug for scenario-wide DER and JER by popcornell in https://github.com/espnet/espnet/pull/5061
* Add test_format_wav_scp_sh.bats by kamo-naoyuki in https://github.com/espnet/espnet/pull/5062
* Update documentation by kamo-naoyuki in https://github.com/espnet/espnet/pull/5063
* Support SOT training on LibriMix data. by pengchengguo in https://github.com/espnet/espnet/pull/4861
* Update check_install.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5066
* Tedlium3 recipe by Some-random in https://github.com/espnet/espnet/pull/5068
* Bug Fix: pretrained s3prl-frontend based models loaded with parameters key mismatch error by simpleoier in https://github.com/espnet/espnet/pull/5074
* Mechanism for multi channels input using multi columns wav.scp by kamo-naoyuki in https://github.com/espnet/espnet/pull/5075
* Clean ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5067
* CHiME-7 DASR: first diarization system based on Pyannote. by popcornell in https://github.com/espnet/espnet/pull/5054
* Chime7-task1 diarization (updated results) by popcornell in https://github.com/espnet/espnet/pull/5088
* Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files by tjysdsg in https://github.com/espnet/espnet/pull/5084
* [SVS] Bug fix: sample rate by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5094
* [SVS] Extend SingingGenerate by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5100
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5080
* Add kaldi steps/libs by kamo-naoyuki in https://github.com/espnet/espnet/pull/5106
* Fix sentencepiece version to v0.1.97 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5107
* Drop PyTorch<=1.9 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5111
* Update installers/install_kenlm.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5110
* Merge */{scripts,pyscripts} into asr1/{scripts,pyscripts} by kamo-naoyuki in https://github.com/espnet/espnet/pull/5109
* Update ReazonSpeech training recipe for v1.1.0 by fujimotos in https://github.com/espnet/espnet/pull/5114
* Fix typo in espnet2_format_wav_scp.md by boeddeker in https://github.com/espnet/espnet/pull/5116
* Dtype for Speechbrain by Fhrozen in https://github.com/espnet/espnet/pull/5112
* Add test of soundfile for Makefile by kamo-naoyuki in https://github.com/espnet/espnet/pull/5119
* Add lm_inference for conditional text generation by pyf98 in https://github.com/espnet/espnet/pull/5122
* CHiME-7 diarization (updated README.md) by popcornell in https://github.com/espnet/espnet/pull/5102
* [WIP] Update Docker by Fhrozen in https://github.com/espnet/espnet/pull/5128
* Fix several bugs and improve function design in SE by Emrys365 in https://github.com/espnet/espnet/pull/5103
* [SVS] Update XiaoiceSing by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5124
* Add missing filter_scps scripts and note about kaldi for diarization example of mini_librispeech by toto6038 in https://github.com/espnet/espnet/pull/5139
* Bump up the debian version to 11 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5144
* Bug fixing and improvement in SE functions by Emrys365 in https://github.com/espnet/espnet/pull/5143
* Add data augmentation to ReazonSpeech recipe by fujimotos in https://github.com/espnet/espnet/pull/5127
* Update error calculator for transducer by aky15 in https://github.com/espnet/espnet/pull/5097
* Add streaming speech enhancement inference. by LiChenda in https://github.com/espnet/espnet/pull/5049
* Update README.md about debian by sw005320 in https://github.com/espnet/espnet/pull/5146
* Fix issues in split scps by pyf98 in https://github.com/espnet/espnet/pull/5138
* fix 5148 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5149
* fix format_wav_scp.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5150
* Add more stats to the training log by Emrys365 in https://github.com/espnet/espnet/pull/5147
* update version to 202304 by kan-bayashi in https://github.com/espnet/espnet/pull/5151

New Contributors
* bmilde made their first contribution in https://github.com/espnet/espnet/pull/4922
* guapaQAQ made their first contribution in https://github.com/espnet/espnet/pull/4923
* zqwang7 made their first contribution in https://github.com/espnet/espnet/pull/4864
* hhhaaahhhaa made their first contribution in https://github.com/espnet/espnet/pull/4938
* jwrh made their first contribution in https://github.com/espnet/espnet/pull/5003
* freddy5566 made their first contribution in https://github.com/espnet/espnet/pull/4994
* silvanocerza made their first contribution in https://github.com/espnet/espnet/pull/5009
* pre-commit-ci made their first contribution in https://github.com/espnet/espnet/pull/5041
* boeddeker made their first contribution in https://github.com/espnet/espnet/pull/4953
* timmahrt made their first contribution in https://github.com/espnet/espnet/pull/4978
* Some-random made their first contribution in https://github.com/espnet/espnet/pull/5068
* toto6038 made their first contribution in https://github.com/espnet/espnet/pull/5139

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202301...v.202304

v.202301
What's Changed
* Initialize VISinger branch by ftshijt in https://github.com/espnet/espnet/pull/4683
* Update VISinger branch by ftshijt in https://github.com/espnet/espnet/pull/4705
* Update UASR branch with latest ESPnet functions by ftshijt in https://github.com/espnet/espnet/pull/4752
* Update uasr by ftshijt in https://github.com/espnet/espnet/pull/4770
* Shell scripts for UASR processing by ftshijt in https://github.com/espnet/espnet/pull/4769
* Uasr python scripts by DongjiGao in https://github.com/espnet/espnet/pull/4791
* Update visinger by ftshijt in https://github.com/espnet/espnet/pull/4818
* Update test_custom_transducer.py by sw005320 in https://github.com/espnet/espnet/pull/4826
* Update asr.sh by sw005320 in https://github.com/espnet/espnet/pull/4827
* Fixed pad mode for librosa.stft by Masao-Someki in https://github.com/espnet/espnet/pull/4832
* Add E-Branchformer models in some recipes by pyf98 in https://github.com/espnet/espnet/pull/4833
* Fix data prep in GigaSpeech by pyf98 in https://github.com/espnet/espnet/pull/4836
* time sync decoding for asr by brianyan918 in https://github.com/espnet/espnet/pull/4792
* Remove duplicated VOXFORGE in db.sh (line81 and line157) by pyf98 in https://github.com/espnet/espnet/pull/4840
* Fix argument parsing for non_linguistic_symbols in asr.sh by pyf98 in https://github.com/espnet/espnet/pull/4841
* Add a warning statement when the hypothesis length equals the max output length. by pengchengguo in https://github.com/espnet/espnet/pull/4843
* Add target speaker extraction (TSE) functions by Emrys365 in https://github.com/espnet/espnet/pull/4823
* Multilingual superb by ftshijt in https://github.com/espnet/espnet/pull/4824
* VISinger by jerryuhoo in https://github.com/espnet/espnet/pull/4689
* Update VISinger to latest by ftshijt in https://github.com/espnet/espnet/pull/4849
* VISinger for singing voice synthesis by ftshijt in https://github.com/espnet/espnet/pull/4848
* Reduce word counts for ESPnet-SE++ Joss paper by neillu23 in https://github.com/espnet/espnet/pull/4844
* Add E-Branchformer configs and models in ASR recipes by pyf98 in https://github.com/espnet/espnet/pull/4837
* Address Muskits updates on README by ftshijt in https://github.com/espnet/espnet/pull/4850
* Minor fix for MSUPERB recipe by ftshijt in https://github.com/espnet/espnet/pull/4851
* Update for the latest changes in the draft (minor changes) by neillu23 in https://github.com/espnet/espnet/pull/4852
* Add E-Branchformer results on Librispeech by kkim-asapp in https://github.com/espnet/espnet/pull/4856
* Update hubert implementation. by simpleoier in https://github.com/espnet/espnet/pull/4747
* VISinger unit test by jerryuhoo in https://github.com/espnet/espnet/pull/4855
* Minor fix to commonvoice espnet1 by ftshijt in https://github.com/espnet/espnet/pull/4862
* [WIP] Add S4 decoder in ESPnet2 by m-koichi in https://github.com/espnet/espnet/pull/4845
* Update hubert feature and acknowledge information in related Readmes. by simpleoier in https://github.com/espnet/espnet/pull/4863
* Generating MFA alignments by Fhrozen in https://github.com/espnet/espnet/pull/4803
* [WIP] EURO uasr scripts by DongjiGao in https://github.com/espnet/espnet/pull/4846
* Update README.md related to ASR architecture by m-koichi in https://github.com/espnet/espnet/pull/4865
* Minor fix to librimix diar recipe by ftshijt in https://github.com/espnet/espnet/pull/4867
* Add Full Whisper Model for Finetuning by slSeanWU in https://github.com/espnet/espnet/pull/4793
* Add torchaudio version check for HuBERT pretraining by simpleoier in https://github.com/espnet/espnet/pull/4872
* add k2 decoder related scripts for EURO by DongjiGao in https://github.com/espnet/espnet/pull/4868
* EURO: small fix (temporarily remove support for nbest_rescoring) by DongjiGao in https://github.com/espnet/espnet/pull/4875
* Add description for Whisper ASR in homepage readme by slSeanWU in https://github.com/espnet/espnet/pull/4877
* Update README.md by eltociear in https://github.com/espnet/espnet/pull/4879
* add explanations to text tokenizing related scripts and remove unused script by DongjiGao in https://github.com/espnet/espnet/pull/4880
* update information about source and our modification for k2 related scripts by DongjiGao in https://github.com/espnet/espnet/pull/4881
* AphasiaBank ASR recipe by tjysdsg in https://github.com/espnet/espnet/pull/4860
* Multilingual SUPERB update by ftshijt in https://github.com/espnet/espnet/pull/4878
* ESPnet Unsupervised ASR (EURO project) by ftshijt in https://github.com/espnet/espnet/pull/4774
* Support ProDiff in TTS by Fhrozen in https://github.com/espnet/espnet/pull/4808
* Add E-Branchformer for GigaSpeech by pyf98 in https://github.com/espnet/espnet/pull/4882
* FLEURS - Auxiliary CTC conditioning tasks by wanchichen in https://github.com/espnet/espnet/pull/4756
* Add python 3.8 requirement for Whisper & update tests by slSeanWU in https://github.com/espnet/espnet/pull/4891
* Update some ASR results in the main readme file by pyf98 in https://github.com/espnet/espnet/pull/4883
* Add Conv2dSubsampling1 module and test it in AphasiaBank ASR recipe by tjysdsg in https://github.com/espnet/espnet/pull/4892
* Support x-vector extractor based on RawNet by Takaaki-Saeki in https://github.com/espnet/espnet/pull/4884
* single language track setups by DanBerrebbi in https://github.com/espnet/espnet/pull/4895
* fixing bug deu1 by DanBerrebbi in https://github.com/espnet/espnet/pull/4900
* Fix dataprep issues based on updated data release via Google form by roshansh-cmu in https://github.com/espnet/espnet/pull/4899
* Add a new EGS2 recipe 'reazonspeech' by fujimotos in https://github.com/espnet/espnet/pull/4885
* Update version to 202301 by kan-bayashi in https://github.com/espnet/espnet/pull/4901

New Contributors
* DongjiGao made their first contribution in https://github.com/espnet/espnet/pull/4791
* jerryuhoo made their first contribution in https://github.com/espnet/espnet/pull/4689
* m-koichi made their first contribution in https://github.com/espnet/espnet/pull/4845
* fujimotos made their first contribution in https://github.com/espnet/espnet/pull/4885

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202211...v.202301

v.202211
What's Changed
* Update muskits update by ftshijt in https://github.com/espnet/espnet/pull/4616
* Muskit installation by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4617
* Sync Muskits branch with Master by ftshijt in https://github.com/espnet/espnet/pull/4640
* Updates on Muskit Migration by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4631
* Update Muskits branch by ftshijt in https://github.com/espnet/espnet/pull/4662
* Add stage 5 & stage 6 by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4649
* Muskit: rename & reorganize features by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4668
* Update Muskits branch by ftshijt in https://github.com/espnet/espnet/pull/4671
* Muskits CI fixing by ftshijt in https://github.com/espnet/espnet/pull/4672
* Muskits CI fix by ftshijt in https://github.com/espnet/espnet/pull/4673
* Muskits - apply isort by ftshijt in https://github.com/espnet/espnet/pull/4677
* Muskits CI fix by ftshijt in https://github.com/espnet/espnet/pull/4678
* Muskit: Add tokenizer by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4676
* Muskits - various fix for CI test by ftshijt in https://github.com/espnet/espnet/pull/4679
* Muskit: add recipe ofuton by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4681
* Muskits (CI fix) by ftshijt in https://github.com/espnet/espnet/pull/4682
* Fix CI issue in muskits by ftshijt in https://github.com/espnet/espnet/pull/4687
* Add dns_icassp22 Speech Enhancement Recipe by slSeanWU in https://github.com/espnet/espnet/pull/4657
* Singing Voice Synthesis Task for ESPnet by ftshijt in https://github.com/espnet/espnet/pull/4670
* Documentation of Tutorial and Muskits by ftshijt in https://github.com/espnet/espnet/pull/4692
* Add tests on MacOS and Windows (only installation) by kamo-naoyuki in https://github.com/espnet/espnet/pull/4669
* Add missing entries in readme by ftshijt in https://github.com/espnet/espnet/pull/4699
* Support ST without texts in source language by sophia1488 in https://github.com/espnet/espnet/pull/4688
* Update ConvInput for Transducer by b-flo in https://github.com/espnet/espnet/pull/4720
* Small changes for standalone Transducer by b-flo in https://github.com/espnet/espnet/pull/4722
* Fix input block tutorial documentation for Transducer by b-flo in https://github.com/espnet/espnet/pull/4724
* Fix HF Pytest Errors by siddhu001 in https://github.com/espnet/espnet/pull/4737
* Update to puebla-nahuatl recipe (some minor fixes) by ftshijt in https://github.com/espnet/espnet/pull/4713
* Add espnet2 TTS recipe on M-AILABS by Takaaki-Saeki in https://github.com/espnet/espnet/pull/4701
* Update outdated enh config files by Emrys365 in https://github.com/espnet/espnet/pull/4719
* add src_sos & src_eos for mt task to address the index out of range w… by simpleoier in https://github.com/espnet/espnet/pull/4736
* Add g2pk_explicit_space tokenizer by jonghwanhyeon in https://github.com/espnet/espnet/pull/4718
* Fix JETS inference with GST (4743) by kan-bayashi in https://github.com/espnet/espnet/pull/4744
* Update on Muskit by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4700
* add fleurs conformer+sc-ctc results by wanchichen in https://github.com/espnet/espnet/pull/4746
* Add recipe for OCR task on IAM handwriting dataset by kenzheng99 in https://github.com/espnet/espnet/pull/4707
* Add Talromur2 recipe by G-Thor in https://github.com/espnet/espnet/pull/4680
* Add multi-channel enh_asr for CHiME-4 by YoshikiMas in https://github.com/espnet/espnet/pull/4706
* chunk_mask error by aky15 in https://github.com/espnet/espnet/pull/4751
* fix wav2vec2 encoder mask bug by simpleoier in https://github.com/espnet/espnet/pull/4772
* Add Hugging Face Transformers Decoder, Tokenizer and their example on SLURP by akreal in https://github.com/espnet/espnet/pull/4099
* [Recipe PR] MELD: Multimodal EmotionLines Dataset by realzza in https://github.com/espnet/espnet/pull/4771
* MultiIRIS follow up by YoshikiMas in https://github.com/espnet/espnet/pull/4765
* Add CATSLU results for XLS-R with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4782
* Add MEDIA and PortMEDIA results for XLS-R with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4794
* Add SLUE-VoxPopuli results for WavLM with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4777
* Follow up for SLURP and CATSLU by akreal in https://github.com/espnet/espnet/pull/4796
* Update README in chime4/enh_asr1 by YoshikiMas in https://github.com/espnet/espnet/pull/4795
* fix parsing token_list by imdanboy in https://github.com/espnet/espnet/pull/4778
* Use torchaudio functions for beamforming related operations in torch 1.12.1+ by Emrys365 in https://github.com/espnet/espnet/pull/4638
* PIT E2E multi-speaker ASR and librimix recipe by simpleoier in https://github.com/espnet/espnet/pull/4753
* Fix an audio format issue in some enh recipes by YoshikiMas in https://github.com/espnet/espnet/pull/4799
* Fixing How2-2000h Data preparation and Seq Length Assert for Longformer Encoder by roshansh-cmu in https://github.com/espnet/espnet/pull/4805
* Adding MFA scripts for LJSpeech by iamanigeeit in https://github.com/espnet/espnet/pull/4801
* fix typo in espnet2_tutorial.md by eltociear in https://github.com/espnet/espnet/pull/4811
* [WIP] E-Branchformer Encoder in ESPnet2 by kkim-asapp in https://github.com/espnet/espnet/pull/4812
* Muskit update by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4783

New Contributors
* A-Quarter-Mile made their first contribution in https://github.com/espnet/espnet/pull/4617
* sophia1488 made their first contribution in https://github.com/espnet/espnet/pull/4688
* kenzheng99 made their first contribution in https://github.com/espnet/espnet/pull/4707
* realzza made their first contribution in https://github.com/espnet/espnet/pull/4771
* iamanigeeit made their first contribution in https://github.com/espnet/espnet/pull/4801
* eltociear made their first contribution in https://github.com/espnet/espnet/pull/4811
* kkim-asapp made their first contribution in https://github.com/espnet/espnet/pull/4812

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202209...v.202211

v.202209
What's Changed
* Add dynamic mixing in the speech separation task. by LiChenda in https://github.com/espnet/espnet/pull/4387
* Added test script and usage for calculate_rtf.py script to ESPnet2 tutorial page by espnetUser in https://github.com/espnet/espnet/pull/4560
* Offline/Online (standalone) ESPnet2 Transducer by b-flo in https://github.com/espnet/espnet/pull/4479
* Unfix matplotlib version by kamo-naoyuki in https://github.com/espnet/espnet/pull/4576
* use torch.finfo for dtype other than float by wenzhe-nrv in https://github.com/espnet/espnet/pull/4584
* Update recipe for slurp-entity by ftshijt in https://github.com/espnet/espnet/pull/4585
* Egs2 aesrc by brianyan918 in https://github.com/espnet/espnet/pull/4592
* update checks for bias in initialization by LiChenda in https://github.com/espnet/espnet/pull/4574
* [WIP] Update to fit the recent update in s3prl. by simpleoier in https://github.com/espnet/espnet/pull/4593
* Unfix numpy version by kamo-naoyuki in https://github.com/espnet/espnet/pull/4598
* Update to fit the recent update in s3prl. by simpleoier in https://github.com/espnet/espnet/pull/4600
* Add improved results on FLEURS dataset by wanchichen in https://github.com/espnet/espnet/pull/4596
* Update mp4_to_wav.sh by jaehyun-ko in https://github.com/espnet/espnet/pull/4605
* Pass output_dir as str to wandb.init() by jonghwanhyeon in https://github.com/espnet/espnet/pull/4607
* Support enh_s2t joint training on multi-speaker data by Emrys365 in https://github.com/espnet/espnet/pull/4566
* Add ASR results for commonvoice zh_TW by slSeanWU in https://github.com/espnet/espnet/pull/4612
* Fix both utt2sid and utt2lid when removing long/short data by jonghwanhyeon in https://github.com/espnet/espnet/pull/4609
* recipe config update by ftshijt in https://github.com/espnet/espnet/pull/4621
* Add pytorch=1.12.1 to CI configurations by kamo-naoyuki in https://github.com/espnet/espnet/pull/4604
* New SLU task by siddhu001 in https://github.com/espnet/espnet/pull/4569
* Joss paper: Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing by neillu23 in https://github.com/espnet/espnet/pull/4620
* Update conformer result of AMI corpus by teinhonglo in https://github.com/espnet/espnet/pull/4629
* Offline/Online Branchformer Transducer by b-flo in https://github.com/espnet/espnet/pull/4582
* Change to install numba using pip instead of conda by kamo-naoyuki in https://github.com/espnet/espnet/pull/4637
* Add MixIT support. It is unsupervised only. Semi-supervised config is not available for now. by simpleoier in https://github.com/espnet/espnet/pull/4619
* Add 2-pass SLU code for FSC Challenge by siddhu001 in https://github.com/espnet/espnet/pull/4636
* CI fix and some other minor recipe fixes by ftshijt in https://github.com/espnet/espnet/pull/4656
* Update the title of plots to be y-label vs x-label by pyf98 in https://github.com/espnet/espnet/pull/4647
* Update VIVOS download link by hieuthi in https://github.com/espnet/espnet/pull/4644
* Add ASR recipe of MAGICDATA mandarin read speech by tjysdsg in https://github.com/espnet/espnet/pull/4635
* Amend to CI fix by ftshijt in https://github.com/espnet/espnet/pull/4663
* qasr update by massabaali7 in https://github.com/espnet/espnet/pull/4642
* Open_li110 for large-scale multilingual speech by ftshijt in https://github.com/espnet/espnet/pull/4408
* Fix the path of calculate_rtf.py by sw005320 in https://github.com/espnet/espnet/pull/4660
* Fix importlib-metadata version by kan-bayashi in https://github.com/espnet/espnet/pull/4686
* Cmu arctic tts pretrain finetune by soumimaiti in https://github.com/espnet/espnet/pull/4456
* updated version to 202209 by kan-bayashi in https://github.com/espnet/espnet/pull/4685

New Contributors
* wenzhe-nrv made their first contribution in https://github.com/espnet/espnet/pull/4584
* jaehyun-ko made their first contribution in https://github.com/espnet/espnet/pull/4605
* jonghwanhyeon made their first contribution in https://github.com/espnet/espnet/pull/4607
* slSeanWU made their first contribution in https://github.com/espnet/espnet/pull/4612
* massabaali7 made their first contribution in https://github.com/espnet/espnet/pull/4642
* soumimaiti made their first contribution in https://github.com/espnet/espnet/pull/4456

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202207...v.202209

v.202207
New Features
- [**New Features**][**ESPnet1**][**ASR**] Add DDP support for v1 ASR training. 4430 by lazykyama
- [**New Features**][**ESPnet2**] Support tensorboard graph 4418 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Branchformer Encoder in ESPnet2 4400 by pyf98
- [**New Features**][**ESPnet2**][**Diarization**][**SE**] enh_diar joint model 4339 by YushiUeda
- [**New Features**][**ESPnet2**][**ESPnet1**] Calculate RTF and latency in espnet2 4382 by espnetUser
- [**New Features**][**ESPnet2**][**ESPnet1**][**SE**] Add EnhPreprocessor for Speech Enhancement 4321 by Emrys365
- [**New Features**][**ESPnet2**][**SE**] Add DPTNet and WarmupStepLR scheduler 4449 by Emrys365
- [**New Features**][**ESPnet2**][**SE**] Add support for calculating losses on noise and dereverberated signals 4476 by Emrys365

Recipe
- [**Recipe**][**ESPnet2**] Aishell-2 GPU info 4501 by jctian98
- [**Recipe**][**ESPnet2**] Fix librispeech default path to signify auto download 4517 by karthik19967829
- [**Recipe**][**ESPnet2**] Recipe fix for PueblaNahuatl Recipe 4522 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Aishell-2 ASR Recipe for Espnet2 4451 by jctian98
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add AmericasNLP 2022 baselines 4428 by akreal
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**Installation**] FLEURS ASR Recipe for ESPnet2 4455 by wanchichen
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**README**] tedx_spanish_corpus egs2 recipe 4523 by jessicah25
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**SE**] Adding L3DAS22 Task1 model to ESPnet-SE 3994 by popcornell
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ST**] Must_C v1 and v2 in egs2 4306 by brianyan918
- [**Recipe**][**ESPnet2**][**README**] Dcase task1 Baseline 4317 by siddhu001
- [**Recipe**][**ESPnet2**][**README**] Report Aishell-2 Transducer results 4489 by jctian98
- [**Recipe**][**ESPnet2**][**README**] Update language codes in AmericasNLP 2022 baseline 4441 by akreal
- [**Recipe**][**ESPnet2**][**README**] Vox populi baseline 4478 by siddhu001
- [**Recipe**][**ESPnet2**][**SE**] L3DAS22 enhancement recipe 4269 by neillu23
- [**Recipe**][**ESPnet2**][**SE**] Update notes in the recipes for DNS challenges 4433 by YoshikiMas
- [**Recipe**][**ESPnet2**][**SE**][**SLU**][**ST**] LT-Spatialized and SLURP-Spatialized combined enhancement recipe 4268 by neillu23
- [**Recipe**][**ESPnet2**][**ST**] Add moses check for ST recipes 4417 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**] Add talromur recipe 4379 by G-Thor
- [**Recipe**][**ESPnet2**][**TTS**] Fix for issue 4401 4402 by G-Thor
- [**Recipe**][**ESPnet2**][**TTS**] add pre-trained model jets in the recipe of ljspeech, kss 4406 by imdanboy

Bugfix
- [**Bugfix**][**ESPnet1**] fix the corrupted pretrained model 4490 by wentaoxandry
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix an4 URL 4427 by pyf98
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**RNNT**] Fix mAES with big vocab size 4312 by b-flo
- [**Bugfix**][**ESPnet2**] Adding __init__.py to espnet2/diar/layers and espnet2/diar/separator 4470 by cycentum
- [**Bugfix**][**ESPnet2**] Fix tensorboard-graph creation for multi gpu mode 4431 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Update char_tokenizer.py 4499 by xiabingquan
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**ASR**][**LM**][**MT**][**TTS**] Fix Transducer LM fusion and add Logging for Transducer inference 4327 by chintu619
- [**Bugfix**][**ESPnet2**][**SE**] Fix a bug in enh unit test 4435 by Emrys365

Enhancement
- [**Enhancement**][**ESPnet2**] Optionize graph creation 4551 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**Installation**][**TTS**] Add icelandic g2p 4384 by G-Thor
- [**Enhancement**][**ESPnet2**][**SE**] Add support of test-only criterions after each epoch 4381 by Emrys365
- [**Enhancement**][**ESPnet2**][**SSL**] raise more useful error in espnet2/asr/frontend/s3prl.py if s3prl is not installed 4480 by popcornell
- [**Enhancement**][**ESPnet2**][**TTS**] Add JETS AlignmentModule in calculate_all_attentions.py 4446 by seastar105

Refactoring
- [**Refactoring**][**ESPnet1**] Refactoring 'is_prefix' function 4530 by jhlee9010
- [**Refactoring**][**ESPnet2**][**ASR**] Zero_infinity option for ctc loss 4415 by kamo-naoyuki

Others
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**] Remove the version restriction for numpy 4419 by kamo-naoyuki
- [**CI**][**ESPnet2**] Changed to install espnet from wheel in the test_import CI test 4471 by kamo-naoyuki
- [**CI**][**Installation**] Temporarily fixed numpy version 4464 by kamo-naoyuki
- [**Documentation**] Add notes on batch size and num of GPUs in ESPnet2 documentation 4436 by pyf98
- [**Documentation**][**ESPnet1**] Update decoder.py 4322 by sw005320
- [**Documentation**][**ESPnet2**] Add a note to follow the installation instructions 4477 by akreal

Acknowledgements
Special thanks to Emrys365, G-Thor, YoshikiMas, YushiUeda, akreal, b-flo, brianyan918, chintu619, cycentum, espnetUser, ftshijt, imdanboy, jctian98, jessicah25, jhlee9010, kamo-naoyuki, kan-bayashi, karthik19967829, lazykyama, neillu23, popcornell, pyf98, seastar105, siddhu001, sw005320, wanchichen, wentaoxandry, xiabingquan.

v.202205

New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**] Add quantization in ESPnet2 for asr inference 4349 by pyf98
- [**New Features**][**ESPnet2**][**SE**] Add svoice recipe for wsj0-2mix speech separation 4257 by nateanl
- [**New Features**][**ESPnet2**][**SE**] Merge Deep Clustering and Deep Attractor Network to enh separator 4110 by earthmanylf
- [**New Features**][**ESPnet2**][**SE**] Some improvements to current enh functions 4251 by Emrys365
- [**New Features**][**ESPnet2**][**SE**][**Installation**] Import fast_bss_eval and update some time-domain losses for enh task 4256 by LiChenda
- [**New Features**][**ESPnet2**][**TTS**] add e2e tts model: JETS 4364 by imdanboy

Bugfix
- [**Bugfix**][**ESPnet1**] Fix minimum input length for Conv2dSubsampling2 in check_short_utt 4378 by akreal
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Minor fixes for the intermediate loss usage and Mask-CTC decoding 4374 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**] Fix 4396 4398 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix a bug in utterance_mvn 4304 by Emrys365
- [**Bugfix**][**ESPnet2**] Minor fix for Mask-CTC forward function 4347 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**] Wandb Minor Fix for Model Resume 4329 by roshansh-cmu
- [**Bugfix**][**ESPnet2**] fix the enh_s2t_task argument in espnet2/bin/st_inference.py 4323 by simpleoier
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] fix bug in mt/st templates for having separate token lists 4149 by brianyan918
- [**Bugfix**][**ESPnet2**][**Recipe**] Fix aishell3 data preparation script 4277 by LanceaKing
- [**Bugfix**][**ESPnet2**][**SE**] Fix a bug in stats aggregation when PITSolver is used 4343 by Emrys365
- [**Bugfix**][**ESPnet2**][**SE**] fix for enhancement model loading compatibility 4259 by LiChenda
- [**Bugfix**][**ESPnet2**][**ST**] bug fixes in ST recipes 4341 by chintu619
- [**Bugfix**][**ESPnet2**][**TTS**] Fix optional data names for TTS 4355 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] fix a bug in Mandarin pypinyin_g2p_phone 4206 by WeiGodHorse
- [**Bugfix**][**ESPnet2**][**TTS**] fix loss = NaN in VITS with mixed precision 4356 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**streaming**] Add unit test to streaming ASR inference 4352 by espnetUser
- [**Bugfix**][**Installation**] fix s3prl install by using legacy version. Temporary solution. 4399 by simpleoier
- [**Bugfix**][**README**] Fix typo 4338 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**][**SE**][**SLU**][**ST**] enh_s2t joint model 4226 by simpleoier
- [**Enhancement**][**ESPnet2**] Add progress bar to phonemization 4320 by G-Thor
- [**Enhancement**][**ESPnet2**][**MT**] Update show_translation_result.sh to show all decoding results under the given exp directory 4330 by pyf98

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Accented English Speech Recognition Challenge 2020 recipe (AESRC2020) 3898 by brianyan918
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] Add MediaSpeech ASR recipe 4183 by AshibaWu
- [**Recipe**][**ESPnet2**][**ASR**][**README**] recipe for Microsoft speech corpus for Indian Languages 4191 by navya-yarrabelly
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Accented French Openslr57 ASR recipe (ESPnet2) (part of Homework3 MNLP) 4280 by DanBerrebbi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Mask-CTC results 4180 by YosukeHiguchi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add ml_openslr63 ASR recipe 4173 by bharaniuk
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Adding new recipe for Burmese (OpenSLR80) 4182 by JainSameer06
- [**Recipe**][**ESPnet2**][**ASR**][**README**] add chime6 recipe 4332 by simpleoier
- [**Recipe**][**ESPnet2**][**ASR**][**SE**][**README**] add egs2/chime4/enh_asr1 recipe and results 4316 by simpleoier
- [**Recipe**][**ESPnet2**][**README**][**RNNT**] updated librispeech-asr with rnn-t results 4281 by chintu619
- [**Recipe**][**ESPnet2**][**README**][**SE**] 2021 Clarity Challenge recipe 4210 by popcornell
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add AISHELL-4 ENH recipe 4249 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add ConferencingSpeech 2021 recipe to egs2 4192 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add ICASSP2021 DNS Challenge 2 recipe 4253 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add INTERSPEECH 2021 DNS Challenge 3 recipe 4238 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add results of ICASSP2021 DNS Challenge 2 recipe 4309 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Rename egs2/clarity21/enh_2021 to egs2/clarity21/enh1 4328 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] add convtasnet recipe for dns_ins20 4314 by muqiaoy
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Harpervalley recipe 4315 by YushiUeda
- [**Recipe**][**ESPnet2**][**README**][**SLU**] SLUE Voxpopuli base recipe 4262 by siddhu001
- [**Recipe**][**ESPnet2**][**README**][**ST**] CoVOST2 recipes 4300 by ftshijt
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Update SLU results for ICASSP 4283 by siddhu001

Others
- [**CI**][**Docker**] Github Action Trigger Docker Build 4295 by Fhrozen
- [**CI**][**Docker**] Github Action for Docker build 4219 by Fhrozen
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**README**] Add isort checking to the CI tests 4372 by kamo-naoyuki
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**README**][**mergify**] Add pytorch=1.10.2 and 1.11.0 to ci configurations 4348 by kamo-naoyuki
- [**CI**][**ESPnet2**][**ASR**][**SE**] add integration test and fix the decoding in enh_asr and enh_st 4310 by simpleoier
- [**CI**][**ESPnet2**][**New Features**][**SLU**][**ST**][**streaming**] Add streaming ST/SLU 4243 by D-Keqi
- [**CI**][**ESPnet2**][**ST**] Add Test Functions for ST Train and Inference 4324 by ftshijt
- [**CI**][**Installation**] update install_pesq.sh 4265 by LiChenda
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Minor update for JETS 4369 by kan-bayashi
- [**Documentation**][**README**] Change the order of README 4289 by ftshijt
- [**Documentation**][**README**] Update README.md 4284 by sw005320

Acknowledgements
Special thanks to AshibaWu, D-Keqi, DanBerrebbi, Emrys365, Fhrozen, G-Thor, JainSameer06, LanceaKing, LiChenda, WeiGodHorse, YoshikiMas, YosukeHiguchi, YushiUeda, akreal, bharaniuk, brianyan918, chintu619, earthmanylf, espnetUser, ftshijt, imdanboy, kamo-naoyuki, kan-bayashi, muqiaoy, nateanl, navya-yarrabelly, popcornell, pyf98, roshansh-cmu, siddhu001, simpleoier, sw005320.

v.202204
News
From this version, we decided to use date-based versioning, e.g., `v.202204`.
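
As an illustration of the date-based scheme, the sketch below parses a tag such as `v.202204` into a year and month. The helper name `parse_date_tag` and the regular expression are assumptions made for this example and are not part of ESPnet; the pattern also accepts the dot-less spelling (`v202402`) that appears on package indexes.

```python
import re
from datetime import date

# Hypothetical pattern for date-based tags: "v." (dot optional) followed by YYYYMM.
_DATE_TAG = re.compile(r"^v\.?(\d{4})(\d{2})$")


def parse_date_tag(tag):
    """Return the first day of the release month, or None for non-date tags."""
    match = _DATE_TAG.match(tag)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return date(year, month, 1)


if __name__ == "__main__":
    for tag in ("v.202204", "v202402", "v.0.10.6"):
        print(tag, "->", parse_date_tag(tag))
```

Older semantic-style tags such as `v.0.10.6` do not match the pattern, so the helper returns `None` for them and callers can fall back to ordinary version comparison.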

New Features
- [**New Features**][**ESPnet1**] added learnable fourier features 4029 by popcornell
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**] Restricted Self Attention for E2E Speech Summarization 4071 by roshansh-cmu
- [**New Features**][**ESPnet1**][**Installation**][**README**] add lrs avsr recipe 4104 by wentaoxandry
- [**New Features**][**ESPnet1**][**README**] add lip reading sentences dataset code 4074 by wentaoxandry
- [**New Features**][**ESPnet2**][**ASR**] [ESPnet2] Intermediate/Self-conditioned CTC 4084 by YosukeHiguchi
- [**New Features**][**ESPnet2**][**ASR**] [WIP] [ESPnet2] Mask-CTC 4158 by YosukeHiguchi
- [**New Features**][**ESPnet2**][**ASR**][**README**] Add stochastic depth to conformer and share results on LibriSpeech 960h 4142 by pyf98
- [**New Features**][**ESPnet2**][**MT**] MT task for espnet2 with IWSLT14 recipe 4111 by siddalmia
- [**New Features**][**ESPnet2**][**README**][**SE**] Add DC-CRN complex masking and spectral mapping approach for speech enhancement 4127 by Emrys365
- [**New Features**][**ESPnet2**][**README**][**SE**] Add DCCRN separator 4097 by Johnson-Lsx
- [**New Features**][**ESPnet2**][**README**][**SE**] Add a new separator for speech enhancement/separation tasks 4062 by LiChenda
- [**New Features**][**ESPnet2**][**README**][**SE**] Add iFaSNet for enhancement/separation tasks. 4130 by LiChenda
- [**New Features**][**ESPnet2**][**SE**] Refactor DNN_Beamformer in espnet2 and add new beamformers 4082 by Emrys365


Enhancement
- [**Enhancement**][**ESPnet2**] Add an optional suffix to the averaged model file name 4067 by pyf98
- [**Enhancement**][**ESPnet2**] Update perturb_data_dir_speed.sh 4091 by AmirHussein96
- [**Enhancement**][**ESPnet2**][**ASR**] Add tests for Intermediate/Self-conditioned CTC 4117 by YosukeHiguchi
- [**Enhancement**][**ESPnet2**][**TTS**] Add option to use norm. feats over denorm. 4250 by G-Thor

Recipe
- [**Recipe**][**ESPnet1**][**RNNT**] [ESPNET1] Add the results of conformer-transducer for Librispeech 4080 by eesungkim
- [**Recipe**][**ESPnet2**][**ASR**] Add ASR recipe for VCTK dataset based on TTS's dataprep. 4088 by kashikashi
- [**Recipe**][**ESPnet2**][**ASR**] Add new conformer config with hop length 160 for LibriSpeech 960h 4162 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**] Add new zh_openslr38 ASR recipe 4181 by cuichenx
- [**Recipe**][**ESPnet2**][**ASR**] Add transformer results for LibriSpeech 100h 4089 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**] Added Marathi OpenSLR 64 recipe 4179 by SujaySKumar
- [**Recipe**][**ESPnet2**][**ASR**] Added recipe for Microsoft Speech Corpus (Indian languages) 4194 by chintu619
- [**Recipe**][**ESPnet2**][**ASR**] Automatic lyric recognition Recipe 4129 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] ESPNET - LRS3 Recipe 4101 by gdebayan
- [**Recipe**][**ESPnet2**][**ASR**] bengali asr model with no finetuning 4047 by dzeinali
- [**Recipe**][**ESPnet2**][**MT**] IWSLT'14 Results using ESPnet2-MT 4132 by pyf98
- [**Recipe**][**ESPnet2**][**README**] Mandarin ISO id should be CMN instead of ZHO 4125 by xinjli
- [**Recipe**][**ESPnet2**][**README**] Update README.md 4037 by dzeinali
- [**Recipe**][**ESPnet2**][**README**] Update README.md 4121 by dzeinali
- [**Recipe**][**ESPnet2**][**README**] Update README.md for How2 2000h ASR,SUM 4155 by roshansh-cmu
- [**Recipe**][**ESPnet2**][**RNNT**] Create decode_rnnt_conformer.yaml 4058 by sw005320
- [**Recipe**][**ESPnet2**][**RNNT**] Create train_rnnt_conformer.yaml 4057 by sw005320
- [**Recipe**][**ESPnet2**][**SLU**] Add IEMOCAP results and configs 4100 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**] Add new config and support for computing WER in SLUE-VoxCeleb 4152 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**] Add sentiment data preparation for IEMOCAP 4065 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**] ESPnet2 swbd_sentiment recipe 4134 by YushiUeda
- [**Recipe**][**ESPnet2**][**ST**] egs2/iwslt22_dialect 4013 by brianyan918

Bugfix
- [**Bugfix**][**CI**][**ESPnet2**] Fix CI test failures related to torch_complex 0.4.0 4112 by Emrys365
- [**Bugfix**][**CI**][**Installation**] fix doc ci by pinning jinja version 4239 by xinjli
- [**Bugfix**][**ESPnet2**] Fix n-gram decoding 4168 by sw005320
- [**Bugfix**][**ESPnet2**] bug fixes and efficient train/dev split in data prep of Microsoft Indian Languages recipe 4196 by chintu619
- [**Bugfix**][**ESPnet2**] fix errors in configs of librispeech ssl frontends 4098 by simpleoier
- [**Bugfix**][**ESPnet2**][**ASR**][**ST**] [bug patch] egs2/iwslt22_dialect 4049 by brianyan918
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] Fix joint tokenization in st.sh 4143 by pyf98
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] scoring fixes MT and ST 4146 by siddalmia
- [**Bugfix**][**ESPnet2**][**TTS**] Fix speaker normalization 4229 by LanceaKing
- [**Bugfix**][**Installation**] set gtn version 4122 by brianyan918
- [**Bugfix**][**ESPnet1**][**ESPnet2**] minor fixes in ST in espnet2 4056 by siddalmia

Others
- [**CI**] Simplify vocoder compatibility test 4061 by kan-bayashi
- [**CI**][**Documentation**] Fix notebook in the official doc. 4171 by ShigekiKarita
- [**Docker**] Docker Updates 4064 by Fhrozen
- [**Documentation**] Add a checklist for PRs on recipe 4053 by ftshijt
- [**Documentation**] README Update for E2E Speech Summarization 4071 4150 by roshansh-cmu
- [**Documentation**] Update the example PyTorch version in Installation doc 4116 by pyf98
- [**Documentation**] [documentation] fix minor typo in installation.md 4164 by JDongian
- [**Documentation**][**ESPnet1**] fix typo 4044 by ooyamatakehisa
- [**Documentation**][**ESPnet1**][**ESPnet2**][**ASR**] Add Huggingface-cli usage 4027 by karthik19967829

Acknowledgements
Special thanks to AmirHussein96, Emrys365, Fhrozen, G-Thor, JDongian, Johnson-Lsx, LanceaKing, LiChenda, ShigekiKarita, SujaySKumar, YosukeHiguchi, YushiUeda, brianyan918, chintu619, cuichenx, dzeinali, eesungkim, ftshijt, gdebayan, kan-bayashi, karthik19967829, kashikashi, ooyamatakehisa, popcornell, pyf98, roshansh-cmu, siddalmia, siddhu001, simpleoier, sw005320, wentaoxandry, xinjli.

v.0.10.6
New Features
- [**New Features**][**ESPnet2**][**TTS**][**Installation**][**README**] [TTS] Support python-based toolkit for xvector extractors 4016 by Fhrozen
- [**New Features**][**ESPnet2**] Add SpecAug2 which supports variable maximum width in time masking 3902 by pyf98

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Add librispeech-100h recipe 3997 by YosukeHiguchi
- [**Recipe**][**ESPnet1**][**ASR**] Update egs/librispeech_100 4036 by YosukeHiguchi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Scoring Mandarin / English separately for the SEAME corpus 3976 by vectominist
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update LibriSpeech Pretrained models with SSLRs: results and huggingf… 3979 by simpleoier
- [**Recipe**][**ESPnet2**][**ASR**][**README**][**ST**] Speech translation framework (merging into master) 3987 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**TTS**] Update two recipes (googlei18n and hub4_spanish) 3895 by ftshijt
- [**Recipe**][**ESPnet2**][**SLU**][**README**] updated the results of Slue voxceleb 3929 by siddhu001
- [**Recipe**][**ESPnet2**][**ST**] Update the default setting for st 3993 by ftshijt

Bugfix
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix bug for Conformer-T 4020 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**][**Diarization**] Diarization: fix for convolutional input layer in the encoder 3957 by alumae
- [**Bugfix**][**ESPnet2**][**Diarization**] Two fixes to diarization evaluation scripts 3938 by alumae
- [**Bugfix**][**ESPnet2**][**Diarization**][**Recipe**] Fix issues in EEND-EDA & add Librimix_diar recipe 3900 by YushiUeda
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**ASR**][**streaming**] streaming conformer bugfix 4025 by jeon30c
- [**Bugfix**][**ESPnet2**][**LM**] Bugfix for espnet2 ngram 4002 by yaochie
- [**Bugfix**][**ESPnet2**][**RNNT**] espnet2 asr inference bugfix for transducer 3943 by jeon30c
- [**Bugfix**][**ESPnet2**][**ST**] Bugfix for ST scoring 3972 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet2**] cleaned tensorboard and stats logging for espnet2 3910 by siddalmia
- [**Enhancement**][**ESPnet2**][**Diarization**] Add test codes for diarization 3953 by YushiUeda
- [**Enhancement**][**ESPnet2**][**streaming**] Add reference for streaming ASR 4014 by D-Keqi

Others
- [**CI**] remove the support of pytorch 1.3.1 4038 by sw005320
- [**CI**][**ESPnet1**][**ESPnet2**] fix ci for librosa update 4043 by ftshijt
- [**CI**][**Installation**] Fix numpy version 3965 by kan-bayashi
- [**CI**][**Installation**] temporary fixed pypinyin version 3995 by kan-bayashi
- [**Documentation**][**ESPnet1**][**ESPnet2**][**README**][**SLU**] Add Sinhala E2E SLU Recipe 3890 by karthik19967829
- [**Documentation**][**README**] Update README.md 4039 by sw005320
- [**ESPnet2**][**README**] Update README.md 3931 by sw005320
- [**ESPnet2**][**README**][**TTS**][**Typo**] Fix typo in README.md 4024 by kan-bayashi

Acknowledgements
Special thanks to D-Keqi, Fhrozen, YosukeHiguchi, YushiUeda, alumae, ftshijt, jeon30c, kan-bayashi, karthik19967829, pyf98, siddalmia, siddhu001, simpleoier, sw005320, vectominist, yaochie.

Full Changelog
https://github.com/espnet/espnet/compare/v.0.10.5...v.0.10.6

v.0.10.5
New Features
- [**New Features**][**ESPnet1**][**ASR**] Implement self-conditioned CTC 3856 by komatta-san
- [**New Features**][**ESPnet2**][**ASR**][**CI**][**Installation**] GTN CTC for ESPnet2 3778 by brianyan918
- [**New Features**][**ESPnet2**][**ASR**][**Refactoring**] [ESPnet2] Transducer 2533 by b-flo
- [**New Features**][**ESPnet2**][**README**][**Recipe**] Frontends fusion (any type, any number, linear fusion only for now) for ASR in espnet2 3824 by DanBerrebbi
- [**New Features**][**ESPnet2**][**SE**] Refactor loss computation in enhancement tasks. 3838 by LiChenda

Recipe
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] updated the results of aidatatang_200zh 3925 by sw005320
- [**Recipe**][**ESPnet1**][**VC**] Various fixes of voice conversion recipes 3800 by unilight
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Expanding egs2 of Tedlium2 3795 by D-Keqi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update an4 config 3913 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**][**README**] aidatatang_200zh recipe 3892 by sw005320
- [**Recipe**][**ESPnet2**][**README**] Update README.md 3881 by daisylab
- [**Recipe**][**ESPnet2**][**README**] Update egs2/TEMPLATE/README.md 3793 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**README**] fix readme 3827 by seastar105
- [**Recipe**][**ESPnet2**][**README**][**Recipe**] Add ASR Recipe: Primewords_Chinese 3903 by pyf98
- [**Recipe**][**ESPnet2**][**README**][**Recipe**] Update MISP challenge ASR baseline and add AVSR baseline 3819 by neillu23
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Fsc Maseeval scripts 3769 by siddhu001
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Update Google Speechcommands (SLU recipe) 3915 by pyf98
- [**Recipe**][**ESPnet2**][**README**][**TTS**] ESPnet2 ARCTIC TTS 3791 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**README**][**TTS**] Update README and add missing config 3917 by kan-bayashi
- [**Recipe**][**ESPnet2**][**Recipe**][**SLU**] Slue voxceleb Sentiment Analysis 3894 by siddhu001
- [**Recipe**][**ESPnet2**][**SE**] modified data type in enh.sh 3768 by simpleoier

Bugfix
- [**Bugfix**][**ESPnet1**][**README**][**RNNT**] Fix cache for Transducer search strategies + doc 3869 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix recombine_hyps 3908 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] fix rnn-t ALSD beam search index bug 3794 by maxwellzh
- [**Bugfix**][**ESPnet1**][**RNNT**] fix the sort order in select_k_expansions() 3864 by freewym
- [**Bugfix**][**ESPnet2**] Bug fix for .gitignore and db fill up for CMU cluster 3891 by siddalmia
- [**Bugfix**][**ESPnet2**] Fix 3716 3849 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Merging asr_streaming.sh into asr.sh for laborotv egs2 3868 by D-Keqi
- [**Bugfix**][**ESPnet2**] add `__init__.py` 3928 by sw005320
- [**Bugfix**][**ESPnet2**] fix small problem that used before defined in step 12 3871 by simpleoier
- [**Bugfix**][**ESPnet2**] fix stft olens when win_lengths is not equal to n_fft 3812 by IceCreamWW
- [**Bugfix**][**ESPnet2**] update s3prl frontend w.r.t. recent modification in s3prl interface 3839 by simpleoier
- [**Bugfix**][**ESPnet2**][**TTS**] bugfix lang2lid in tts.sh 3906 by imdanboy
- [**Bugfix**][**Installation**] Fix 3783 3786 by kamo-naoyuki

Others
- [**CI**] Fix G2P test failure in CI due to the dict update 3848 by kan-bayashi
- [**CI**][**Documentation**][**ESPnet1**][**ESPnet2**] Fixing issues about streaming Transformer/Conformer training 3880 by D-Keqi
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**New Features**][**README**] nbest rescoring with k2 3567 by glynpu
- [**Documentation**][**README**] Update README.md 3893 by sw005320
- [**Documentation**][**README**][**SSL**] Add more docs about s3prl frontend 3796 by simpleoier
- [**Documentation**][**README**][**streaming**] Updating main README.md about streaming transformer 3855 by D-Keqi
- [**ESPnet1**][**RNNT**] Add exception for conformer decoder 3801 by b-flo
- [**ESPnet2**][**README**][**Typo**] Fix typo in README.md 3852 by kan-bayashi
- [**ESPnet2**][**SE**] add eps in beam-forming reference channel selection 3904 by LiChenda
- [**ESPnet2**][**SLU**] Add unit test for score_intent.py 3759 by siddhu001
- [**ESPnet2**][**ST**] Speech Translation Update 3860 by ftshijt
- [**ESPnet2**][**TTS**][**Installation**][**Refactoring**] Refactor Phonemizer-based G2P 3916 by kan-bayashi

Acknowledgements
Special thanks to D-Keqi, DanBerrebbi, IceCreamWW, LiChenda, b-flo, brianyan918, daisylab, freewym, ftshijt, glynpu, imdanboy, kamo-naoyuki, kan-bayashi, komatta-san, maxwellzh, neillu23, peter-yh-wu, pyf98, seastar105, siddalmia, siddhu001, simpleoier, sw005320, unilight.

v.0.10.4
New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] The code for Emiru's real streaming Transformer 3614 by D-Keqi
- [**New Features**][**ESPnet1**][**MT**][**ST**][**Installation**] Support sacreBLEU 3698 by hirofumi0810
- [**New Features**][**ESPnet2**][**ST**] ESPNet2 speech translation 3587 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet1**][**ASR**] Fix e2e_asr_maskctc.py to make RTF computable 3634 by eddiewng
- [**Enhancement**][**ESPnet2**][**Installation**][**README**] HuggingFace Upload support for ESPnet2 tasks [cont.] 3677 by Fhrozen
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**] Add korean_jaso tokenizer and korean_cleaner 3588 by windtoker

Bugfix
- [**Bugfix**][**ESPnet1**][**ASR**][**RNNT**] Fix quantization for Transducer 3616 by b-flo
- [**Bugfix**][**ESPnet2**][**ASR**][**Recipe**] added download test set, small modifications for path of aishell 3663 by teinhonglo
- [**Bugfix**][**ESPnet2**] Do stft with librosa when neither MKL nor CUDA is available. 3668 by CTinRay
- [**Bugfix**][**ESPnet2**] [bug fixed] allow adding noise independently of rir, bug fixed in 3692 by ranchlai
- [**Bugfix**][**ESPnet2**][**Recipe**] Create Symlinks for 1-channel/2-channel tracks in chime4 3699 by neillu23
- [**Bugfix**][**ESPnet2**][**Recipe**] Fix SWBD Data Prep Bug 3742 by brianyan918

Recipe
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**] Add CoVoST2 recipe 3720 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**][**README**] MISP2021 E2E ASR Baseline 3738 by neillu23
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Wenetspeech 3686 by pengchengguo
- [**Recipe**][**ESPnet2**][**SLU**] Add snips hubert feature training 3619 by yuekaizhang
- [**Recipe**][**ESPnet2**][**SLU**] Make scoring part more general 3715 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add ESPnet-SLU Recipe: Google Speech Commands 3693 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add an ESPnet2 recipe for the Grabo SLU dataset 3669 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**][**README**] CATSLU-MAPS: Added recipe 3685 by SujaySKumar
- [**Recipe**][**ESPnet2**][**SLU**][**README**] ESPnet2 Japanese dialogue act classification recipe 3667 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Slurp SLU with bpe encoded transcripts 3674 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Slurp entity classification 3739 by siddhu001
- [**Recipe**][**ESPnet2**][**SSL**] Add eps in acc computation of HuBERT model 3713 by simpleoier
- [**Recipe**][**ESPnet2**][**TTS**] Change the timing of srctexts creation 3734 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update kss recipe with VITS configuration 3660 by windtoker

Others
- [**CI**][**ESPnet2**][**Installation**] Fix tests in CI 3700 by kan-bayashi
- [**CI**][**ESPnet2**][**SLU**][**README**] Add Hubert pretrained ASR in FSC SLU 3653 by siddhu001
- [**CI**][**Installation**] Minor update for CI 3656 by kan-bayashi
- [**Documentation**][**ESPnet1**][**README**][**RNNT**][**Refactoring**] Refactor custom Transducer build 3697 by b-flo
- [**Documentation**][**ESPnet2**][**README**] Hugging Face support - Doc [cont.] 3709 by Fhrozen
- [**Installation**] Update pyopenjtalk version 3733 by kan-bayashi
- [**README**] Huggingface spaces ESPnet2-TTS web demo 3673 by AK391
- [**README**][**ESPnet2**] Add Huggingface model documentation 3714 by siddhu001
- [**README**][**ESPnet2**] Fix readme 3750 by takenori-y


Acknowledgements
Special thanks to AK391, CTinRay, D-Keqi, Fhrozen, SujaySKumar, YushiUeda, b-flo, brianyan918, eddiewng, ftshijt, hirofumi0810, kan-bayashi, neillu23, pengchengguo, pyf98, ranchlai, siddhu001, simpleoier, takenori-y, teinhonglo, windtoker, yuekaizhang.

v.0.10.3
New Features
- [**New Features**][**ESPnet1**][**RNNT**][**Installation**][**README**] FastEmit support 3591 by b-flo
- [**New Features**][**ESPnet2**][**ASR**] Add ASR portable evaluation script 3569 by kan-bayashi
- [**New Features**][**ESPnet2**][**README**] EEND-EDA model for diarization task 3621 by YushiUeda

Bugfix
- [**Bugfix**][**ESPnet1**] Fix /usr/bin/env bash -e 3651 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**] ctc loss using dropout layer since .eval() will not work for F.dropout 3539 by zh794390558
- [**Bugfix**][**ESPnet2**] Minor fix of `evaluate_asr.sh` 3596 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**ASR**] wav2vec2_encoder bug fix 3545 by simpleoier
- [**Bugfix**][**ESPnet2**][**README**][**SSL**] Fix some issues of 3512 and add README.md to librispeech/ssl1 recipe. 3572 by Jzmo
- [**Bugfix**][**ESPnet2**][**TTS**] Bug fix the attribute registration in VITS generator 3573 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Fix pyopenjtalk_g2p_accent(_with_pause) 3555 by zzxiang

Recipe
- [**Recipe**][**ESPnet1**][**ASR**][**RNNT**] Update Transducer recipes 3465 by b-flo
- [**Recipe**][**ESPnet1**][**ST**] Clean libri-trans 3540 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Dan aishell4 branch 3585 by DanBerrebbi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update pretrained models of librispeech using hubert/wav2vec2 3568 by simpleoier
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add slu snips data recipe 3407 by yuekaizhang
- [**Recipe**][**ESPnet2**][**TTS**] Update GAN-TTS based configurations 3570 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add initial VITS results for JSUT 3550 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add つくよみちゃんコーパス recipe 3552 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] IndicSpeech TTS Scripts 3435 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update ESPnet2-TTS results 3578 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT and JVS results 3553 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update LJSpeech and CSMSC results 3560 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update TTS results 3615 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update TTS results 3648 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update VCTK results 3581 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update pre-trained model for TTS recipes 3590 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update kss recipe with new result. 3589 by windtoker
- [**Recipe**][**ESPnet2**][**TTS**][**Typo**] Fix typo `egs2/jtubespeech/tts1` 3564 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**Typo**] Update JVS README 3554 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**][**SE**][**Refactoring**] Add PyTorch Builtin Complex Support in the Speech Enhancement Task 3355 by Emrys365
- [**Enhancement**][**ESPnet2**][**TTS**] Hindi g2p 3579 by peter-yh-wu
- [**Enhancement**][**ESPnet2**][**TTS**] Unify spks / lids / spk_embed_dim type 3551 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update `evaluate_mcd.py` script 3566 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**] Add the installer of tdmelodic pyopenjtalk 3561 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**][**README**] Update TTS objective eval scripts 3650 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**README**] Add a new Japanese G2P for TTS 3558 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**README**] Add a new english G2P 3597 by kan-bayashi

Others
- [**CI**] Add codecov config and flags. 3603 by ShigekiKarita
- [**CI**] Omit tools/ from code coverage. 3600 by ShigekiKarita
- [**CI**] Split test_integration.sh 3599 by ShigekiKarita
- [**CI**][**ESPnet2**][**Installation**][**Refactoring**] Make the installation of transformers optional 3622 by kan-bayashi
- [**CI**][**Installation**] Add no-check-certificate option in PESQ installation 3649 by kan-bayashi
- [**CI**][**Installation**][**README**][**mergify**] Change setup.py for pytorch1.9.1 3636 by kamo-naoyuki
- [**Documentation**][**ESPnet1**][**RNNT**] Fix/improve doc(string)s related to Transducer model 3623 by b-flo
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update README of ESPnet2-TTS 3546 by kan-bayashi
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update TTS README 3565 by kan-bayashi
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update TTS fine-tuning README 3549 by kan-bayashi
- [**Typo**][**ESPnet2**] Minor bug in format_wav_scp.py 3575 by ftshijt
- [**Typo**][**ESPnet2**][**TTS**] update mismatch help info for tts 3602 by ftshijt


Acknowledgements
Special thanks to DanBerrebbi, Emrys365, Jzmo, ShigekiKarita, YushiUeda, b-flo, ftshijt, hirofumi0810, kamo-naoyuki, kan-bayashi, peter-yh-wu, simpleoier, windtoker, yuekaizhang, zh794390558, zzxiang.

v.0.10.2
News

- Hubert training is now available!
  - Try with `egs2/librispeech/ssl1`
- GAN-based TTS model is now available!
  - Joint text2mel and vocoder training
  - End-to-end text-to-wave model (VITS) training
  - Try with `egs2/ljspeech/tts1`
- Support `from_pretrained` function! For example, in Python:
from espnet2.bin.asr_inference import Speech2Text
asr = Speech2Text.from_pretrained("model_tag")

from espnet2.bin.tts_inference import Text2Speech
tts = Text2Speech.from_pretrained("model_tag")

from espnet2.bin.enh_inference import SeparateSpeech
enh = SeparateSpeech.from_pretrained("model_tag")

from espnet2.bin.diar_inference import DiarizeSpeech
diar = DiarizeSpeech.from_pretrained("model_tag")

Please check the available pretrained models in [espnet_model_zoo](https://github.com/espnet/espnet_model_zoo)!
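
The four `from_pretrained` entry points listed above can be collected behind one small dispatcher, as sketched below. The module paths and class names come from the release note. The task keys, the `PRETRAINED_ENTRY_POINTS` table, and the `load_pretrained` helper are hypothetical names introduced for this example and are not part of ESPnet. The import is deferred until the call, so the table can be inspected without ESPnet installed. Loading a model presumably also requires `espnet_model_zoo` and a valid model tag.

```python
import importlib

# Task key -> (module path, class name), as named in the release note.
# The task keys themselves are an assumption made for this sketch.
PRETRAINED_ENTRY_POINTS = {
    "asr": ("espnet2.bin.asr_inference", "Speech2Text"),
    "tts": ("espnet2.bin.tts_inference", "Text2Speech"),
    "enh": ("espnet2.bin.enh_inference", "SeparateSpeech"),
    "diar": ("espnet2.bin.diar_inference", "DiarizeSpeech"),
}


def load_pretrained(task, model_tag):
    """Import the inference class for `task` lazily and call from_pretrained."""
    if task not in PRETRAINED_ENTRY_POINTS:
        known = ", ".join(sorted(PRETRAINED_ENTRY_POINTS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}")
    module_name, class_name = PRETRAINED_ENTRY_POINTS[task]
    # ESPnet is imported only here, so the lookup table works without it.
    module = importlib.import_module(module_name)
    inference_class = getattr(module, class_name)
    return inference_class.from_pretrained(model_tag)
```

With ESPnet installed, `load_pretrained("asr", "model_tag")` would be equivalent to the `Speech2Text.from_pretrained("model_tag")` call shown above, where `"model_tag"` stands for a real tag from `espnet_model_zoo`.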

New Features
- [**New Features**][**ESPnet1**] Intermediate CTC + Stochastic depth 3274 by jaesong
- [**New Features**][**ESPnet2**] Add new trainer for GAN-based training 3436 by kan-bayashi
- [**New Features**][**ESPnet2**][**ASR**] Add Hubert model in Espnet2/Refactor from 3458 3512 by Jzmo
- [**New Features**][**ESPnet2**][**ASR**] batch decode with k2 ctc 3433 by glynpu
- [**New Features**][**ESPnet2**][**ASR**][**SE**] Support `from_pretrained` for ASR and ENH 3535 by kan-bayashi
- [**New Features**][**ESPnet2**][**DIAR**] Support `from_pretrained` for DIAR 3537 by YushiUeda
- [**New Features**][**ESPnet2**][**SE**] Adding portable speech enhancement scripts for other tasks 3487 by Emrys365
- [**New Features**][**ESPnet2**][**TTS**] Add GAN-TTS task with VITS 3449 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support SID and LID inputs for TTS models 3490 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support `from_pretrained` function in `Text2Speech` 3532 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support `parallel_wavegan` vocoders in `tts_inference.py` 3513 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support joint training of text2mel and vocoder 3501 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support language ID input for espnet2 TTS 3489 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support speaker id input for TTS models 3452 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**][**CTC segmentation**][**README**] Fix CTC Segmentation 3500 by shirayu
- [**Enhancement**][**ESPnet2**][**TTS**] Add VITS-related modules 3448 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add cython code for VITS 3483 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add joint training config example 3508 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add melgan module for joint training 3516 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add parallel wavegan module for joint training 3515 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add style melgan module for joint training 3517 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add vocoder modules related to VITS 3439 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Change Text2Speech class output format 3437 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Follow up of the support speaker id input 3453 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support cleaner option in phn converter util 3450 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support language id in VITS 3499 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support linear spectrogram 3438 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support new g2p functions for various languages 3463 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update the TTS inference 3498 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**SLU**][**README**] Add support for intent classification on SLURP dataset 3482 by siddhu001
- [**Enhancement**][**ESPnet2**][**SLU**][**README**] Add NLU post-encoder using Hugging Face Transformers 3410 by akreal

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Mucs21 subtask1 3376 by sanket0211
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Swahili ASR recipe 3485 by akreal
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Rename `swahili` recipe to `iwslt21_low_resource` 3522 by akreal
- [**Recipe**][**ESPnet2**][**DIAR**][**README**] Modify ESPnet2 diarization recipe 3524 by YushiUeda
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**] Espnet2 mucs_subtask2 3415 by bloodraven66
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**] mucs subtask1 3417 by bloodraven66
- [**Recipe**][**ESPnet2**][**SE**] Add Voicebank (vctk_noisy) script 3486 by neillu23
- [**Recipe**][**ESPnet2**][**TTS**] Add missing configs for LibriTTS recipe 3455 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update VITS config comments and settings 3528 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] aishell3 dataset preparation 3505 by actboy
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add CSS10 recipe for ESPnet2-TTS 3464 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add JtubeSpeech Recipe 3459 by Takaaki-Saeki
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add SIWIS recipe 3460 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**][**README**] TTS recipe for J-KAC corpus 3468 by TanUkkii007
- [**Recipe**][**ESPnet2**][**TTS**][**README**] TTS recipes for thchs30 and aishell3 3470 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JMD README 3531 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update SIWIS README 3509 by takenori-y
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Predict ASR transcript along with Intent for SLU 3480 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Update SWBD DA configuration 3425 by akreal

Bugfix
- [**Bugfix**][**ESPnet2**] Add return_complex=False for stft 3476 by D-X-Y
- [**Bugfix**][**ESPnet2**] Dynamic import for the ngram function 3420 by ftshijt
- [**Bugfix**][**ESPnet2**][**README**][**Recipe**] Add the GigaSpeech normalization and fix the WER 3519 by chaisz19
- [**Bugfix**][**ESPnet2**][**TTS**] Add duration and focus_rate in output dict 3469 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Add missing symlink to trim_silence.py for ESPnet2 3467 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Fix wrong arguments in pretrained vocoder wrapper 3525 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Revert wrongly removed lines in `tts.sh` 3503 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**][**Typo**] Fix typo in hifigan 3504 by kan-bayashi

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**][**RNNT**][**README**] Transducer v5 3217 by b-flo
- [**Refactoring**][**ESPnet2**][**SE**][**DIAR**] Remove prefix `enh_` and `diar_` 3538 by kan-bayashi
- [**Refactoring**][**ESPnet2**][**TTS**] Refactor TTS modules in ESPnet2 3497 by kan-bayashi
- [**Refactoring**][**ESPnet2**][**TTS**] Remove the support of feats_type=fbank/stft in ESPnet2-TTS 3514 by kan-bayashi

Others
- [**CI**] Fix k2 version in CI using conda 3493 by kan-bayashi
- [**CI**] Fix test condition 3527 by kan-bayashi
- [**CI**][**Installation**] Update Sentencepiece and add python 3.9 to CI 3422 by shirayu
- [**Docker**] Docker Updates 3393 by Fhrozen
- [**Documentation**] Update the tutorial about maxlenratio usage 3523 by akreal
- [**Documentation**][**ESPnet2**][**TTS**] Update README.md 3502 by kan-bayashi
- [**Installation**][**README**] Added a link and a classifier for Python 3.9 3440 by shirayu
- [**Typo**] Fix typos in "egs" 3447 by shirayu
- [**Typo**][**Documentation**] Fix typos in "doc" 3441 by shirayu
- [**Typo**][**Documentation**] Fix typos in "utils" 3442 by shirayu
- [**Typo**][**ESPnet1**][**MT**] Fix typos in "espnet" 3444 by shirayu
- [**Typo**][**ESPnet2**] Fix typos in "espnet2" 3443 by shirayu
- [**Typo**][**ESPnet2**][**README**] Fix typos in "egs2" 3445 by shirayu


Acknowledgements

Special thanks to D-X-Y, Emrys365, Fhrozen, Jzmo, Takaaki-Saeki, TanUkkii007, YushiUeda, actboy, akreal, b-flo, bloodraven66, chaisz19, ftshijt, glynpu, jaesong, kan-bayashi, neillu23, sanket0211, shirayu, siddhu001, takenori-y.

v.0.10.1
New Features
- [**New Features**][**ESPnet2**] Porting existing pre-trained models to hugging face 3321 by siddhu001
- [**New Features**][**ESPnet2**][**ASR**][**CI**][**Installation**] k2_and_espnet2 3358 by glynpu
- [**New Features**][**ESPnet2**][**ASR**][**LM**][**CI**] espnet2 ngram 3345 by qmpzzpmq
- [**New Features**][**ESPnet2**][**Installation**] add s3prl frontend 3187 by simpleoier

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix the iconv error in hkust data prep 3397 by sw005320
- [**Recipe**][**ESPnet1**][**ASR**] mucs subtask2 baseline recipes (e2e and kaldi) 3362 by bloodraven66
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**] JTubeSpeech recipe and hkust espnet1 3406 by sw005320
- [**Recipe**][**ESPnet1**][**TTS**] CMU INDIC TTS 3347 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**] ESPnet2 Recipe for Ksponspeech 3387 by YushiUeda
- [**Recipe**][**ESPnet2**][**ASR**] Fix gigaspeech pre-trained model link 3317 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] LRS2 lipreading recipe 3346 by LiChenda
- [**Recipe**][**ESPnet2**][**ASR**] OpenSLR Sundanese ASR 3344 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**] Recipe of JTubeSpeech 3311 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] fix path error in local/score.sh in swbd 3349 by wonkyuml
- [**Recipe**][**ESPnet2**][**ASR**] updated javanese and sundanese readmes 3369 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**][**Installation**] OpenSLR Javanese ASR 2960 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**SLU**] Add initial Switchboard Dialogue Act classification recipe 3395 by akreal
- [**Recipe**][**ESPnet2**][**SLU**] FSC ESPnet2 data preparation 3352 by siddhu001
- [**Recipe**][**ESPnet2**][**TTS**] Add HUI-audio-corpus-german recipe for ESPnet2-TTS 3375 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add JMD recipe 3394 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**] Add RUSLAN recipe for ESPnet2-TTS 3378 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Support KSS dataset recipe for ESPnet2-TTS 3383 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update HUI audio corpus german recipe 3381 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update HUI-audio-corpus-german recipe results of ESPnet2-TTS 3391 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update KSS dataset recipe results of ESPnet2-TTS 3400 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update RUSLAN recipe results of ESPnet2-TTS 3390 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] indic tts without pretrained model 3401 by peter-yh-wu

Enhancement
- [**Enhancement**][**ESPnet2**] Update wav2vec2_encoder.py 3312 by brotheroak
- [**Enhancement**][**ESPnet2**][**TTS**] Add trim_silence for ESPnet2-TTS 3380 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Allow override default 'speed_control_alpha' parameter 3316 by airenas
- [**Enhancement**][**ESPnet2**][**TTS**] Support French G2P 3372 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support German G2P 3371 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Korean G2P 3382 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Russian G2P 3377 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Spanish G2P 3373 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update README about G2P 3374 by kan-bayashi

Bugfix
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix a type error of swbd data preparation. 3324 by pengchengguo
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**TTS**] Fixed label modification in Taco2 or Transformer-TTS with R > 1 3392 by kan-bayashi
- [**Bugfix**][**ESPnet2**] fix a bug in OneCycleLR and CyclicLR 3319 by sw005320

Others
- [**Typo**][**ESPnet1**] Update batch_beam_search_online_sim.py 3367 by aky15
- [**Typo**][**ESPnet2**] Fixed typo in model name 3364 by kan-bayashi
- [**Typo**][**ESPnet2**] Update contextual_block_transformer_encoder.py 3354 by aky15

Acknowledgements
Special thanks to LiChenda, YushiUeda, airenas, akreal, aky15, bloodraven66, brotheroak, glynpu, kan-bayashi, pengchengguo, peter-yh-wu, qmpzzpmq, siddhu001, simpleoier, sw005320, takenori-y, wonkyuml.

v.0.10.0
From v.0.10.x, we drop support for PyTorch < 1.3.
For more information, see https://github.com/espnet/espnet/issues/3300

New Features and Enhancement
- [**New Features**][**ESPnet1**][**ASR**][**CI**] Dynamic quantization for decoding 3210 by xu-gaopeng
- [**New Features**][**ESPnet1**] Add quantize args 3280 by xu-gaopeng
- [**Enhancement**][**ESPnet2**][**README**] Update W&B integration 3278 by AyushExel
- [**Enhancement**][**ESPnet2**][**README**] Change the default value of use_wandb to False 3287 by kamo-naoyuki

Bugfix
- [**Bugfix**][**ESPnet1**] Fix some bugs in xml2stm.py 3252 by AshrafMahdhi
- [**Bugfix**][**ESPnet1**][**Recipe**] fix the required number of arguments 3249 by AshrafMahdhi
- [**Bugfix**][**ESPnet2**] Bug fix of accum_grad when grad-nan 3283 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix 3255 3257 by tjysdsg
- [**Bugfix**][**ESPnet2**] Fix bug when "--field -5" is passed to espnet2.bin.tokenize_text 3262 by tjysdsg
- [**Bugfix**][**ESPnet2**] Fix typo in asr.sh (espnet2) that might cause bug 3264 by tjysdsg
- [**Bugfix**][**ESPnet2**] Warn ignore_nan_grad with warpctc instead of error. 3298 by ShigekiKarita
- [**Bugfix**][**ESPnet2**][**TTS**] Fix a bug in the TTS transformer initialization 3251 by sw005320

Recipe
- [**Recipe**][**ESPnet1**][**ST**] Minor fix of Fisher-Callhome recipe 3305 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**] ESPnet2 Recipe for swbd 3269 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**][**README**] SWBD Result Update 3308 by roshansh-cmu
- [**Recipe**][**ESPnet2**][**SE**] Add scripts for DNS Interspeech 2020 in ESPnet-SE 3259 by neillu23
- [**Recipe**][**ESPnet2**][**SE**][**README**] Pretrained model for vctk noisy reverberant recipe 3273 by LiChenda
- [**Recipe**][**ESPnet2**][**SE**][**README**] dns_ins20: Add README.md and real_recording testing data. 3281 by neillu23

Refactoring
- [**Refactoring**][**ESPnet2**][**ASR**] Update ctc.py 3292 by 200987299
- [**Refactoring**][**ESPnet1**][**ASR**][**MT**][**CI**][**README**] Delete old pytorch dispatch in espnet1 3301 by ShigekiKarita
- [**Refactoring**][**CI**][**Documentation**][**Installation**][**README**] Remove travis and add .github/workflows/doc.yml to deploy doc 3294 by ShigekiKarita
- [**Refactoring**][**CI**][**Installation**][**README**] Add pytorch 1.9.0 support and remove 1.0.1, 1.1.0, and 1.2.0 3299 by ShigekiKarita

Others
- [**Documentation**][**ESPnet2**] Add a comment for disabling the attention plot 3258 by sw005320
- [**ESPnet2**][**Installation**][**mergify**] Follow up for 3299, about pytorch1.9.0 in ci 3310 by kamo-naoyuki

Acknowledgements
Special thanks to 200987299, AshrafMahdhi, AyushExel, LiChenda, ShigekiKarita, hirofumi0810, kamo-naoyuki, neillu23, roshansh-cmu, sw005320, tjysdsg, xu-gaopeng, yuekaizhang.

v.0.9.10
New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**Installation**][**README**] CTC Segmentation for ESPnet 2 3087 by lumaku

Bugfix
- [**Bugfix**][**ESPnet1**] Fix merge_short_segments.py 3171 by hirofumi0810
- [**Bugfix**][**ESPnet1**] update layer norm to reflect the dimension variable 3193 by sw005320
- [**Bugfix**][**ESPnet1**][**ASR**] Fix a bug about variable spelling errors 3208 by lzm0706
- [**Bugfix**][**ESPnet1**][**ST**] Fix ST-TED data preparation 3167 by hirofumi0810
- [**Bugfix**][**ESPnet2**] Fix a bug of adding noise to the training data. 3220 by pengchengguo
- [**Bugfix**][**ESPnet2**] fix a bug in the CTC mode 3190 by sw005320
- [**Bugfix**][**ESPnet2**] fix typo for AdapterForSoundScpReader 3096 by deciding
- [**Bugfix**][**ESPnet2**] remove find_unused_parameters from DataParallel 3149 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**ASR**] Changed to include nlsyms.txt in the pretrained model 3236 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**ASR**] Fix missing nlsyms.txt for pretrained models 3234 by lumaku
- [**Bugfix**][**ESPnet2**][**ASR**] Workaround for missing nlsyms.txt 3235 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ASR**][**Installation**] GTN CTC bug fix, unit test, and installer 3199 by brianyan918
- [**Bugfix**][**ESPnet2**][**README**] Update README.md, edit wrong file link. 3164 by xxjjvxb

Enhancement
- [**Enhancement**] Added "trans_type" to utils/remove_longshortdata.sh and utils/update_json.sh 3148 by teinhonglo
- [**Enhancement**][**ESPnet2**][**SE**][**README**] Update the readme file for the SE demo page. 3225 by LiChenda
- [**Enhancement**][**ESPnet2**][**ASR**][**README**] update asr demo 3192 by ftshijt

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix segmentation in IWSLT21 ASR 3169 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Fix tokenization on TEDLIUM2 in IWSLT21 ASR recipe 3142 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] fix add_to_datadir.py in mgb2 recipe 3238 by AshrafMahdhi
- [**Recipe**][**ESPnet1**][**ASR**] fix recipe bug for swbd 3174 by yuekaizhang
- [**Recipe**][**ESPnet1**][**ASR**][**RNNT**] Transducer configs & results for AISHELL-1 3240 by yusshino
- [**Recipe**][**ESPnet1**][**ASR**][**ST**] Fix IWSLT21 recipe for test set evaluation 3155 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ESPnet2**][**README**] endangered language recognition espnet2 recipe 3214 by ftshijt
- [**Recipe**][**ESPnet1**][**MT**] Add IWSLT21 MT recipe 3140 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Add IWSLT21 ST recipe 3150 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Fix IWSLT evaluation data preparation 3168 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] IWSLT21 punctuation restoration recipe 3145 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Merge short segments in IWSLT test sets 3162 by hirofumi0810
- [**Recipe**][**ESPnet1**][**TTS**] Fix misspelling in ./egs/jsut/tts1/local/download.sh 3227 by muramasa2
- [**Recipe**][**ESPnet2**][**ASR**] Normalization for Open_li52 3215 by ftshijt
- [**Recipe**][**ESPnet2**][**SE**] ESPnet-SE Recipe for noisy reverberant dataset 3243 by LiChenda
- [**Recipe**][**ESPnet2**][**SE**][**README**] Update recipes for speech enhancement task 3153 by LiChenda

Acknowledgements
Special thanks to AshrafMahdhi, LiChenda, brianyan918, deciding, ftshijt, hirofumi0810, kamo-naoyuki, lumaku, lzm0706, muramasa2, pengchengguo, sw005320, teinhonglo, xxjjvxb, yuekaizhang, yusshino.

v.0.9.9
New Features

- [**New Features**][**ESPnet2**] Speaker diarization implementation in ESPnet 2939 by ftshijt
- [**New Features**][**ESPnet2**] Adding gpu_max_cached_mem_GB in reporter's stats 3057 by kamo-naoyuki
- [**New Features**][**ESPnet2**] add --detect_anomaly option 3035 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Further update to speech enhancement task 2929 by shincling

Bugfix

- [**Bugfix**][**ESPnet1**] Fix a typo in the aishell config 3089 by sw005320
- [**Bugfix**][**ESPnet1**] Fix utils/speed_perturb.sh 3062 by hirofumi0810
- [**Bugfix**][**ESPnet1**] fix 3017 3022 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix+update RNN encoder 3048 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] Minor fix for NSC 3030 by b-flo
- [**Bugfix**][**ESPnet2**] Fix 3072 3073 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix ESPnet2-TTS conformer backward compatibility 3108 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix a bug when use_amp=True without fairscale 3029 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix logging for pytorch>=1.8 3056 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fixed backward compatibility issue of new conformer definition 3068 by hfujihara
- [**Bugfix**][**Installation**] Fix a bug of uninstalling typing 3058 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix setup.py to install filelock 3074 by kamo-naoyuki
- [**Bugfix**][**Installation**] fix the condition to install fairscale 3050 by kamo-naoyuki
- [**Bugfix**][**Recipe**][**ESPnet1**] Typo fixed for nahuatl recipe 3044 by ftshijt
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] Bugfix for download_and_untar for nahuatl 3049 by ftshijt
- [**Bugfix**][**Recipe**][**ESPnet1**][**ESPnet2**][**TTS**] Fix CSMSC download script 3109 by kan-bayashi
- [**Bugfix**][**Recipe**][**ESPnet2**][**TTS**][**README**] fixed typo 3121 3123 by kan-bayashi

Enhancement

- [**Enhancement**][**ASR**][**ESPnet1**][**RNNT**] Update loss report 3110 by b-flo
- [**Enhancement**][**ESPnet1**][**RNNT**] Fix related to custom encoder and aux task 3045 by b-flo
- [**Enhancement**][**ESPnet2**][**Documentation**][**Installation**][**README**] modification of freezing option for Wav2Vec encoder, add documents 3036 by simpleoier

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] added results and uploaded models 3063 by sw005320
- [**Recipe**][**ESPnet1**][**ASR**][**ST**] fix download for puebla-nahuatl 3039 by ftshijt
- [**Recipe**][**ESPnet1**][**MT**] Update IWSLT18 MT recipe 3071 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] IWSLT21-low-resource recipe 3023 by ftshijt
- [**Recipe**][**ESPnet1**][**ST**] Nahuatl Speech Translation 3034 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Added spgispeech recipe in espnet2 2986 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update librispeech result 3082 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Updated ami ihm result 3091 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] added a bpe10000 model and result 3060 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] gigaspeech 3077 by sw005320

Refactoring

- [**Refactoring**][**ESPnet1**] Refactor layer selection in Transformer 3024 by hirofumi0810
- [**Refactoring**][**ESPnet1**][**MT**][**ST**] Unify divide_lang.sh 3066 by hirofumi0810
- [**Refactoring**][**ESPnet2**] Make batch bins sampler faster 3106 by kamo-naoyuki
- [**Refactoring**][**Installation**] Use new pyopenjtalk version 3107 by kan-bayashi
- [**Refactoring**][**ESPnet1**][**ESPnet2**][**Installation**][**Docker**][**Documentation**] Change '#!/bin/bash' to '#!/usr/bin/env bash' 3059 by kamo-naoyuki

Other

- [**CI**][**Installation**][**README**][**mergify**] Using torch=1.8.1 in ci tests 3122 by kamo-naoyuki
- [**CI**][**Installation**][**README**][**mergify**] Adding pytorch=1.8.0 to the ci 3046 by kamo-naoyuki

Acknowledgements
Special thanks to b-flo, ftshijt, hfujihara, hirofumi0810, kamo-naoyuki, kan-bayashi, shincling, simpleoier, sw005320.

v.0.9.8
New Features
- [**New Features**][**ESPnet1**][**ASR**][**RNNT**] Auxiliary task 2951 by b-flo
- [**New Features**][**ESPnet1**][**Recipe**] RTF calculation 2942 by hirofumi0810
- [**New Features**][**ESPnet2**] Supporting multiple optimizers in the default trainer 3014 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Streaming Transformer ASR 2907 by eml914
- [**New Features**][**ESPnet2**][**ASR**][**Installation**] add wav2vec_encoder 2889 by simpleoier
- [**New Features**][**ESPnet2**][**Documentation**][**Installation**][**README**] Support sharded training of fairscale 2980 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Add SeparateSpeech API in espnet2/bin/enh_inference.py 2878 by Emrys365
- [**New Features**][**ESPnet2**][**TTS**][**Installation**][**README**] Support phonemizer for various language G2P 2959 by kan-bayashi

Bugfix
- [**Bugfix**][**CI**][**Installation**] Install warp-ctc using pip>=21.0 2999 by ysk24ok
- [**Bugfix**][**ESPnet1**] Integration testing for asr_mix was using the wrong config. 3006 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**] Fix model averaging 2910 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**] bug fixed for streaming transformer ASR 2981 by eml914
- [**Bugfix**][**ESPnet1**][**ASR**] builtin ctc modification 3001 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**][**CI**] Fix transfer learning w/ pre-trained LM + finetuning tutorial 2967 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**][**RNNT**] Fix a condition in TSD 2965 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**][**Recipe**] fix egs/ljspeech/asr1 2865 2884 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**ASR**][**Recipe**][**ST**] Fix bug in How2 recipe 2933 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**ASR**][**Refactoring**] Fix data sorting in attention/CTC visualization 2883 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**Docker**] Fix docker error caused by BeamSearchTransducer 2973 by b-flo
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix bugs of our Conformer implementation. 2816 by pengchengguo
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**Refactoring**] Fix arguments in dynamic and lightweight conv 3004 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**RNNT**] fix out_dim definition 2915 by b-flo
- [**Bugfix**][**ESPnet1**][**TTS**] Fix attention plot bug 2984 2985 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**mergify**] swbd run.sh is including dev data in the training set 2977 by brianyan918
- [**Bugfix**][**ESPnet2**] Fix sharded_ddp mode 3015 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] bug fix for Wav2Vec encoder 2997 by simpleoier
- [**Bugfix**][**ESPnet2**][**Documentation**] Fix for sharded training with amp 2993 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Documentation**] Fix sharded training for multiple nodes 2994 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] quick fix for librimix (SE) data preparation 2982 by LiChenda

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix dev set in IWSLT21 ASR recipe 3000 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] IWSLT'21 ASR recipe 2934 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Update IWSLT21 ASR recipe 2987 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Update the pre-trained Conformer model link of Aishell-1 corpus. 2924 by pengchengguo
- [**Recipe**][**ESPnet1**][**ASR**] Update transformer training results on Common Voice dataset 2927 by wenjie-p
- [**Recipe**][**ESPnet1**][**ASR**][**CI**][**Installation**][**Refactoring**] Update IWSLT18 (ST-TED) ASR recipe 2916 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**README**] Must-C v2 recipe 2963 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor Fisher-CallHome recipe 2904 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor How2 recipe 2906 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor Must-C recipe 2901 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor libri-trans recipe 2903 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**ST**][**Refactoring**] Update IWSLT'19 recipe 2940 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**][**CI**][**Refactoring**] Refactor ST recipes 2975 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**][**Refactoring**] Refactor Mboshi-French corpus 2911 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**] Open-li52 (add language ID scoring & text case alignment for test set) 2938 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Russian open STT recipe for ESPnet2 2972 by akreal
- [**Recipe**][**ESPnet2**][**ASR**][**README**] MLS (multi-lingual librispeech) recipe 2869 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update espnet2 librispeech result 2966 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] added nsc results 2937 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] fix librispeech model url 2976 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] minor fix of li52 and nsc recipes 2936 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update the results of open li52 recipe 2974 by sw005320
- [**Recipe**][**ESPnet2**][**SE**] Librimix separation results for Conv-Tasnet, 8k, min 2928 by anogkongda
- [**Recipe**][**ESPnet2**][**SE**][**README**] ESPnet-SE, Speech enhancement recipes 2888 by LiChenda

Enhancement
- [**Enhancement**][**ESPnet1**][**ASR**] Auto Resampling to 16 kHz for pretrained models 2969 by siddalmia
- [**Enhancement**][**ESPnet1**][**ASR**][**RNNT**] Minor refactoring 2932 by b-flo
- [**Enhancement**][**ESPnet1**][**ASR**][**RNNT**][**README**][**CI**][**Documentation**] Refactoring RNNT 2887 by b-flo
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**][**LM**][**MT**][**TTS**] Print total params and trainable params. 2996 by siddalmia
- [**Enhancement**][**ESPnet1**][**LM**] Add LM options like embedding dropout and tie weights 3010 by siddalmia
- [**Enhancement**][**ESPnet1**][**ST**][**Refactoring**] Add the latest RPE implementation to the ST task. 3005 by pengchengguo

Other
- [**CI**][**README**][**mergify**] Stop circle ci 2978 by kamo-naoyuki
- [**Documentation**] Update docs for ESPnet contributing (especially for recipes part) 2905 by ftshijt
- [**Documentation**] fix a typo 3016 by Huang17
- [**Installation**] Uninstall typing 2979 by kamo-naoyuki

Acknowledgements
Special thanks to Emrys365, Huang17, LiChenda, akreal, anogkongda, b-flo, brianyan918, eml914, ftshijt, hirofumi0810, kamo-naoyuki, kan-bayashi, pengchengguo, siddalmia, simpleoier, sw005320, wenjie-p, ysk24ok.

v.0.9.7
New Features

- [**New Features**][**ESPnet1**][**ASR**] Option for GTN CTC mode 2866 by brianyan918
- [**New Features**][**ESPnet2**][**SE**][**README**] Update to speech enhancement task 2649 by LiChenda
- [**New Features**][**ESPnet2**][**ASR**][**README**] Lightweight Sinc Convolutions for ESPnet2 2768 by lumaku
- [**New Features**][**ESPnet2**][**Documentation**] --freeze_param option 2787 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**TTS**][**README**] Add a new G2P `pyopenjtalk_accent_with_pause` 2843 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**][**README**] Add pyopenjtalk_accent g2p for ESPnet2 TTS 2781 by ota
- [**New Features**][**ESPnet2**][**TTS**][**README**] Support X-vector based multi-speaker TTS model in ESPnet2 2800 by kan-bayashi

Enhancement

- [**Enhancement**][**ESPnet1**][**ESPnet2**] Add version info in args 2841 by kan-bayashi
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**] AMI Recipe (Short UTT checker) 2802 by ftshijt
- [**Enhancement**][**Installation**] add default activate_python.sh 2788 by kamo-naoyuki
- [**Enhancement**][**Installation**] modified: check_install.py 2834 by kamo-naoyuki
- [**Enhancement**][**Installation**][**Documentation**][**ESPnet1**][**ESPnet2**] Change version info location 2840 by kan-bayashi

Bugfix

- [**Bugfix**][**ESPnet1**][**ASR**] fix greedy decoding 2812 by b-flo
- [**Bugfix**][**ESPnet2**][**ASR**] Fix the compatibility of the pretrained ASR model 2794 by kan-bayashi
- [**Bugfix**][**Installation**] Fix 2799 2830 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix HTS engine installation 2825 by kan-bayashi
- [**Bugfix**][**Installation**] fix the incorrect $PATH setting in tools/extra_path.sh 2833 by jumon
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] Minor fixes in CSJ 2837 by YosukeHiguchi
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] fix recipe bug for librispeech 2735 by yuekaizhang
- [**Bugfix**][**Recipe**][**ESPnet2**][**ASR**] fix a config name 2729 by sw005320
- [**Bugfix**][**Recipe**][**ESPnet2**][**ASR**][**README**] Fix dirha_wsj recipe 2747 by kamo-naoyuki
- [**Bugfix**][**Recipe**][**ESPnet2**][**TTS**] Add missing decoding configs in LibriTTS recipe 2827 by kan-bayashi

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] Add LibriSpeech Conformer results for LibriCSS 2861 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update Commonvoice Recipe with Conformer Settings 2739 by ftshijt
- [**Recipe**][**ESPnet1**][**ASR**] Update Russian open STT recipe for v1.01 of the dataset 2776 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update models and results of Conformer. 2765 by pengchengguo
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] ESPnet2 recipe for commonvoice 2793 by hchung12
- [**Recipe**][**ESPnet1**][**VC**][**README**] VCC2020 database 2754 by unilight
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update Dirha WSJ result 2756 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 hkust recipe 2863 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update the AMI result in espnet2 2817 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] updated the laborotv result 2750 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update reverb result 2876 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Minor fix of laborotv recipe 2877 by hfujihara
- [**Recipe**][**ESPnet2**][**TTS**] Fix total number of iterations 2813 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add libritts recipe for ESPnet2 2807 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add x-vector based configs for VCTK 2808 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Minor update TTS README 2818 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT TTS results 2792 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT results 2809 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT results 2871 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update LibriTTS results 2842 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update VCTK results 2814 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update libritts results 2828 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update latest CSMSC link address 2777 by meowtech

Other

- [**CI**][**Documentation**][**Installation**] Change warp-ctc and warp-transducer to extra 2748 by kamo-naoyuki
- [**CI**][**README**] Update ci setting 2848 by kan-bayashi
- [**ASR**][**Documentation**][**ESPnet2**] Sinc Convolutions - add documentation for plot_sinc_filters.py 2782 by lumaku
- [**Documentation**][**ESPnet1**] fixed some typos 2855 by jumon
- [**Documentation**][**Installation**] Update documentation 2757 by kamo-naoyuki
- [**Installation**][**Refactoring**] Move the dependencies coming from recipes 2740 by kamo-naoyuki

Acknowledgements

Special thanks to AdolfVonKleist, LiChenda, YosukeHiguchi, akreal, b-flo, brianyan918, ftshijt, hchung12, hfujihara, jumon, kamo-naoyuki, kan-bayashi, lumaku, meowtech, ota, pengchengguo, sw005320, unilight, yuekaizhang.



v.0.9.6
New Features
- [**New Features**][**ESPnet2**] Wandb integration 2707 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Add ignore_nan_grad option for CTC 2699 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Touching common modules before the main Enh PR 2705 by LiChenda

Bugfix
- [**Bugfix**][**ESPnet1**] bug fix for pytorch1.7 2656 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**TTS**] Use `nkf` in CSMSC data prep 2726 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix flooring for global_mvn.py 2623 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix small bug of tensorboard part 2702 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix wandb mode with multi gpus 2709 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix token-averaged feature in the case when r > 1 2704 by kan-bayashi

Recipe
- [**Recipe**][**ESPnet1**] Extend model averaging condition in run scripts 2613 by b-flo
- [**Recipe**][**ESPnet1**][**ASR**] Enable multi-thread processing of json files. 2681 by Peidong-Wang
- [**Recipe**][**ESPnet1**][**ASR**] Update KsponSpeech conformer results 2624 by jubang0219
- [**Recipe**][**ESPnet1**][**ASR**] Update Voxforge with Conformer results 2642 by YosukeHiguchi
- [**Recipe**][**ESPnet1**][**ASR**] lang was being used before being parsed for user input 2654 by siddalmia
- [**Recipe**][**ESPnet1**][**ASR**][**ESPnet2**][**Installation**][**README**] espnet2 reverb recipe 2691 by kamo-naoyuki
- [**Recipe**][**ESPnet1**][**ASR**][**README**] Update Switchboard with conformer results 2697 by Emrys365
- [**Recipe**][**ESPnet1**][**ASR**][**README**] add librispeech conformer w/ speed perturbation + specaug 2617 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**] ASR template recipe: --srctexts -> --lm_train_text, --bpe_train_text 2660 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Add $token_type to asr_tag and lm_tag 2625 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**Installation**][**README**][**Recipe**] Laborotv recipe 2703 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add AISHELL w/o LM result 2718 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] ESPnet2 recipe for TIMIT 2568 by sknadig
- [**Recipe**][**ESPnet2**][**ASR**][**README**] JSUT conformer recipe achieving 12.0/13.9 CER(%) for dev/eval1 2720 by hchung12
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update README.md 2659 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update WSJ result 2628 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 librispeech with conformer 2687 by sw005320
- [**Recipe**][**ESPnet2**][**README**] Corpus README in egs2 2713 by sw005320
- [**Recipe**][**ESPnet2**][**README**] update egs2/README.md 2719 by Emrys365

Enhancement
- [**Enhancement**][**Documentation**][**ESPnet2**] Add --init_param option 2680 by kamo-naoyuki
- [**Enhancement**][**ESPnet1**][**ASR**] Save model snapshot at every epoch even if save_interval_iters > 0 - for model averaging 2637 by sknadig
- [**Enhancement**][**ESPnet2**] Update wandb part 2708 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**ASR**] Add *_stats_dir options in asr.sh 2724 by kan-bayashi


Documentation
- [**Documentation**][**ESPnet2**][**README**] Update egs2 README 2723 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update README about fine-tuning 2685 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update TTS README.md 2650 by kan-bayashi

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**][**README**] Refactor Mask CTC non-autoregressive ASR 2223 by YosukeHiguchi
- [**Refactoring**][**ESPnet2**] Added unicode support for generated configs 2672 by Piteryo

Others
- [**Installation**] python setup.py install -> pip install -e 2619 by kamo-naoyuki
- [**Installation**][**Refactoring**] modify for zsh: tools/extra_path.sh 2696 by kamo-naoyuki
- [**Docker**] Docker flags for extra libraries (VC) 2622 by Fhrozen

Acknowledgements
Special thanks to Emrys365, Fhrozen, LiChenda, Peidong-Wang, Piteryo, YosukeHiguchi, b-flo, hchung12, jubang0219, kamo-naoyuki, kan-bayashi, siddalmia, sknadig, sw005320, yuekaizhang.

v.0.9.5
New Features
- [**New Features**][**ESPnet2**][**TTS**] Support `g2p=none` for text with phonemes 2551 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Add MCD evaluation script for ESPnet2-TTS 2554 by kan-bayashi
- [**New Features**][**ESPnet1**][**ST**] Conformer End-to-End Speech Translation 2523 by hirofumi0810

Bugfix
- [**Bugfix**][**ESPnet1**] CTC segmentation - package update 2566 by lumaku
- [**Bugfix**][**ASR**][**ESPnet1**] fix bug about att_ws in multi-enc case 2549 by lzm0706
- [**Bugfix**][**ESPnet1**] Conformer averaging model support for pytorch 1.6 2604 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**] Set built-in CTC for asr_recog 2588 by lumaku
- [**Bugfix**][**ESPnet1**][**ASR**][**Installation**] Transducer float16 loss bug fix 2496 by GNroy

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**] Refactor BeamSearchTransducer and ErrorCalculatorTrans 2538 by b-flo

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Alignment recipe for CSJ. 2531 by jnishi
- [**Recipe**][**ESPnet1**][**ASR**] New Recipe for KsponSpeech (Korean spontaneous speech; 969 hours) 2555 by jubang0219
- [**Recipe**][**ESPnet1**][**ASR**] Update TedLium3 conformer results 2600 by LiChenda
- [**Recipe**][**ESPnet1**][**ASR**] Update VIVOS models 2574 by b-flo
- [**Recipe**][**ESPnet1**][**ASR**] Update model link in Puebla-Nahuatl 2607 by ftshijt
- [**Recipe**][**ESPnet1**][**ASR**] Update tedlium2 with conformer results 2599 by Emrys365
- [**Recipe**][**ESPnet1**][**ASR**] update the JSUT recipe with conformer 2546 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] Add CSJ conformer config 2560 by kan-bayashi
- [**Recipe**][**ESPnet2**][**ASR**] Add CSJ conformer results 2552 by kan-bayashi
- [**Recipe**][**ESPnet2**][**ASR**] Small changes for aishell config 2586 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Update espnet2 AISHELL results 2580 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] update JSUT espnet2 with pre-trained models 2563 by sw005320
- [**Recipe**][**ESPnet2**][**TTS**] Add JSSS recipe for ESPnet2-TTS 2558 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update ESPnet2 TTS result 2542 by kan-bayashi

CI
- [**CI**][**Documentation**] Support espnet2/bin in sphinx doc. 2544 by ShigekiKarita
- [**CI**][**Installation**][**README**] Add pytorch1.7.0 ci test 2605 by kamo-naoyuki

Other
- [**Installation**] Install warpctc-pytorch wheel when torch version is 1.1 - 1.6 2547 by ysk24ok
- [**Installation**] Modified requirements: "dataclasses; python_version < '3.7'" 2541 by kamo-naoyuki
- [**Installation**] Remove pip3 check in setup_python.sh 2567 by ShigekiKarita

Acknowledgements
Special thanks to Emrys365, GNroy, LiChenda, ShigekiKarita, b-flo, ftshijt, hirofumi0810, jnishi, jubang0219, kamo-naoyuki, kan-bayashi, lumaku, lzm0706, siddalmia, sw005320, ysk24ok.

v.0.9.4
New Features

- [**New Features**][**ESPnet1**][**ASR**] Transducer v4 2444 by b-flo
- [**New Features**][**ESPnet2**] Support audio_format=flac.ark, wav.ark 2451 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Support conformer encoder in ESPnet2 ASR 2515 by kan-bayashi

Bugfix

- [**Bugfix**][**ESPnet1**] Fixed IndexError in BatchBeamSearch.post_process() (2483) 2484 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**LM**] fix multigpu bug if pytorch>=1.5 2492 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] remove cleaner 2529 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix TTS inference bug for GST + Fastspeech2 2498 by kan-bayashi

Documentation

- [**Documentation**] Update espnet2_tutorial.md 2528 by kamo-naoyuki
- [**Documentation**] Update espnet2_tutorial.md 2532 by kamo-naoyuki
- [**Documentation**] Update espnet2_tutorial.md 2534 by kamo-naoyuki
- [**Documentation**] Update notebook submodule 2499 by kan-bayashi
- [**Documentation**][**ESPnet1**] Small fixes for transducer 2514 by b-flo
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update ESPnet2 TTS README 2516 by kan-bayashi
- [**Documentation**][**README**] Update README 2504 by kan-bayashi
- [**Documentation**][**README**][**ESPnet1**] CTC segmentation - checks for blank chars and RNN models 2535 by lumaku

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] add conformer results for librispeech 2510 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**] Update ESPnet2 CSJ Transformer results 2497 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add results for ESPnet2 TTS 2503 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update Transformer-TTS config 2494 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update Transformer-TTS configs 2502 by kan-bayashi

Refactoring

- [**Refactoring**] Modify uttid to "${spkid}-${uttid}" for trn files 2527 by kamo-naoyuki
- [**Refactoring**][**ESPnet1**][**ASR**][**LM**] Remove all __future__ lines 2481 by ShigekiKarita
- [**Refactoring**][**ESPnet1**][**ASR**][**MT**][**ST**] Unify arguments 2506 by hirofumi0810
- [**Refactoring**][**ESPnet1**][**ESPnet2**][**TTS**] Refactor length regulator to improve the speed 2482 by kan-bayashi
- [**Refactoring**][**ESPnet1**][**MT**][**ST**] Refactor decoding for translation tasks 2501 by hirofumi0810
- [**Refactoring**][**ESPnet2**] Change add_scalars to add_scalar for tensorboard SummaryWriter 2525 by kamo-naoyuki

CI

- [**CI**][**ASR**] Make test_e2e_asr.py faster 2488 by ShigekiKarita
- [**CI**][**ASR**] Make test_e2e_asr_maskctc.py faster. 2493 by ShigekiKarita
- [**CI**][**ASR**] Make test_recog.py faster 2486 by ShigekiKarita
- [**CI**][**ESPnet1**][**ASR**] make test_e2e_asr_mulenc.py faster 2480 by ruizhilijhu
- [**CI**][**ESPnet1**][**Installation**] Update shellcheck url. 2500 by ShigekiKarita
- [**CI**][**ESPnet2**][**Installation**] Limit test execution time to 2.0 sec 2520 by ShigekiKarita
- [**CI**][**SE**] Make test_beamformer_net.py faster 2489 by ShigekiKarita
- [**CI**][**SE**] shorten test time for tasnet 2491 by LiChenda

Other

- [**Installation**] Update h5py version to avoid errors in Python3.8 2519 by shigabeev
- [**Docker**] Docker Updates 2509 by Fhrozen

Acknowledgements

Special thanks to Fhrozen, LiChenda, ShigekiKarita, b-flo, hirofumi0810, kamo-naoyuki, kan-bayashi, lumaku, ruizhilijhu, shigabeev, yuekaizhang.

v.0.9.3
New Features

- [**New Features**][**ESPnet2**] Implement --grad_clip_type 2399 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Implement batch_score() method for ASR decoder and LM 2377 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**README**][**TTS**] Support Conformer-based FastSpeech / FastSpeech2 2413 by kan-bayashi

Bugfix

- [**Bugfix**][**CI**][**ESPnet1**][**ESPnet2**] make sure chainer independent 2411 by kamo-naoyuki
- [**Bugfix**][**CI**][**ESPnet1**][**Installation**] Revert ctc seg installation 2392 by kan-bayashi
- [**Bugfix**][**CI**][**Installation**] Fix the installation error in CI 2476 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**ASR**] Lazy import chainer in asr_utils.py 2407 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ASR**] asr: Fix recog issue on Transformer CTC model 2394 by jaesong
- [**Bugfix**][**ESPnet1**][**MT**][**ST**] Fix score_bleu.sh 2400 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**README**][**Typo**] fixed typo in egs/README.md 2473 by mrazizi
- [**Bugfix**][**ESPnet1**][**TTS**] lazy import chainer: espnet/nets/tts_interface.py 2409 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Add missing database in db.sh 2427 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix the CommonPreprocessor_multi missing issue 2460 by LiChenda
- [**Bugfix**][**ESPnet2**] Minor fix of egs2/commonvoice/asr1/local/data.sh 2438 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix the directory for init_file_prefix 2412 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix typo of log_level choices 2472 by glynpu
- [**Bugfix**][**ESPnet2**][**ASR**] Add grep -H option 2388 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix wrong sum axis in energy extraction 2469 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**Typo**] Fix typo in help comment and docstrings 2470 by kan-bayashi
- [**Bugfix**][**Installation**] add warpctc_pytorch version==0.1.2 2403 by kamo-naoyuki

Documentation

- [**Documentation**] Add bug report template 2396 by sw005320
- [**Documentation**] Add installation issue template 2397 by sw005320
- [**Documentation**] Update espnet2_distributed.md 2418 by kamo-naoyuki
- [**Documentation**] Update espnet2_distributed.md 2419 by kamo-naoyuki
- [**Documentation**] Update espnet2_training_option.md 2421 by kamo-naoyuki
- [**Documentation**] Update faq.md 2431 by kamo-naoyuki
- [**Documentation**] Update parallelization.md 2428 by kamo-naoyuki
- [**Documentation**][**ESPnet2**][**README**] Update README.md 2430 by kamo-naoyuki

Enhancement

- [**Enhancement**][**ESPnet1**][**ESPnet2**] Add -c option for multi GPUs mode for slurm.conf 2406 by kamo-naoyuki
- [**Enhancement**][**ESPnet1**][**Installation**] Install warpctc-pytorch wheel when torch version is 1.1, 1.2 or 1.3 2453 by ysk24ok
- [**Enhancement**][**ESPnet1**][**README**] Add CSJ RNN pretrained model 2452 by jnishi
- [**Enhancement**][**ESPnet2**] Update db.sh 2426 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**TTS**] Update ESPnet2 TTS config 2468 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update and add fastspeech2 configs 2429 by kan-bayashi
- [**Enhancement**][**Installation**] Add sanity check for setup_cuda_env.sh 2389 by kamo-naoyuki
- [**Enhancement**][**Installation**] Change cudatoolkit to cuda if cuda_version=8.0 2405 by kamo-naoyuki
- [**Enhancement**][**Installation**] Change to refer https://anaconda.org/pytorch/pytorch/files #2404 by kamo-naoyuki
- [**Enhancement**][**Installation**] Workaround for soundfile issue 2437 by kamo-naoyuki

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] Add LibriCSS recipe 2246 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update for the Official Split of YM Recipe 2435 by ftshijt
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**] Update CommonVoice for Latest Version 2455 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] [zeroth korean] Not to use pipe format if feats_type=raw 2402 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 zeroth_korean recipe changing feats_type from fbank_pitch to raw. 2393 by hchung12
- [**Recipe**][**ESPnet2**][**README**][**TTS**] Add ESPnet2 TTS finetuning example recipe (JVS) 2465 by kan-bayashi

CI

- [**CI**] Add codecov actions. 2467 by ShigekiKarita
- [**CI**] Fix hangup of unittests 2424 by kamo-naoyuki
- [**CI**] Make espnet2 tts test faster 2461 by kan-bayashi
- [**CI**] Make test_e2e_{asr,st,mt}_{transformer,conformer}.py faster. 2464 by ShigekiKarita
- [**CI**] Update .gitignore 2434 by kan-bayashi
- [**CI**][**ESPnet1**] Make test_(batch_)beam_search.py faster. 2462 by ShigekiKarita
- [**CI**][**ESPnet1**] Support Debian9 and CentOS7 in Github Actions 2457 by ShigekiKarita
- [**CI**][**ESPnet1**][**Installation**] Fix HKUST recipe 2440 by kamo-naoyuki

Acknowledgements
Special thanks to LiChenda, ShigekiKarita, akreal, ftshijt, glynpu, hchung12, hirofumi0810, jaesong, jnishi, kamo-naoyuki, kan-bayashi, mrazizi, sw005320, ysk24ok.

v.0.9.2
New Features
- [**New Features**][**ESPnet1**] CTC segmentation 2301 by lumaku
- [**New Features**][**ESPnet2**] Support multiple averaged nbest models 2353 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support recursive add in pack_funcs and add images to packed model 2367 by kamo-naoyuki

Bugfix
- [**Bugfix**][**ASR**][**ESPnet1**] remove ff_scale from conformer constructor arguments 2356 by koji-okabe-hub
- [**Bugfix**][**ASR**][**ESPnet2**] use lm_exp instead of lm_tag for inference_tag 2352 by kamo-naoyuki
- [**Bugfix**][**CI**][**ESPnet1**][**Installation**] Remove ctc_segmentation temporarily 2385 by kan-bayashi
- [**Bugfix**][**ESPnet1**] Fix import error of conformer module 2384 by kan-bayashi
- [**Bugfix**][**ESPnet1**] Fix issue https://github.com/espnet/espnet/issues/2211 #2219 by Emrys365
- [**Bugfix**][**ESPnet2**] Add missing __init__.py 2326 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix --out_filename option: format_wav_scp.sh 2348 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix amp 2362 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] add egs2/an4/asr1/local/path.sh 2343 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix recursive add: espnet2/main_funcs/pack_funcs.py 2369 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] remove unused import 2331 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Installation**][**Typo**] fix typo 2344 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**README**] Fix typo 2372 by Piteryo
- [**Bugfix**][**ESPnet2**][**TTS**] make vietnamese_cleaner an option 2341 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix python version check for chainer 2342 by kamo-naoyuki
- [**Bugfix**][**Installation**] add undefined variable: check_pytorch_cuda_compatibility.py 2361 by kamo-naoyuki
- [**Bugfix**][**TTS**] Fix device allocation error in guided attention loss 2282 2317 by kan-bayashi

Documentation
- [**Documentation**] updated comment on the documentation 2351 by GauravPandey892
- [**Documentation**][**ESPnet2**] Update TTS README 2316 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**] Update ESPnet2 TTS README 2376 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update README 2330 by kan-bayashi
- [**Documentation**][**Installation**] Divide setup_python.sh into setup_venv.sh and setup_python.sh 2382 by kamo-naoyuki
- [**Documentation**][**Installation**] add a description about check install. 2360 by sw005320
- [**Documentation**][**README**] CTC segmentation - Demo 2347 by lumaku
- [**Documentation**][**README**] Update README.md 2379 by kamo-naoyuki

Enhancement
- [**Enhancement**][**ESPnet2**] Change the default inference model to averaged model instead of the best 2346 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**TTS**] Add pitch and energy stats in packing 2350 by kan-bayashi
- [**Enhancement**][**Installation**] Add checking for pytorch-cuda compatibility in Makefile 2334 by kamo-naoyuki
- [**Enhancement**][**Installation**] Show raw error message when failed to import packages 2374 by kamo-naoyuki

Refactoring
- [**Refactoring**] Apply new version black 2366 by kamo-naoyuki
- [**Refactoring**][**ASR**][**ESPnet2**] Not to add _sp to $asr_exp if --asr_exp option is specified 2368 by kamo-naoyuki
- [**Refactoring**][**CI**][**ESPnet1**][**ESPnet2**][**Installation**] Add installers for sctk and sph2pipe and create tools/extra_path.sh 2332 by kamo-naoyuki
- [**Refactoring**][**ESPnet1**][**Recipe**] Disable preparation for lm in wsj recipe 2373 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Update Task design 2345 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**][**SE**] Remove unused option from enh.sh:--feats_normalize 2325 by kamo-naoyuki

Recipe
- [**Recipe**][**ASR**][**ESPnet1**] MGB-2 2289 by AmirHussein96
- [**Recipe**][**ASR**][**ESPnet1**] Remove duplicated class definition of Conformer and update some new results of Aishell1 and Switchboard. 2364 by pengchengguo
- [**Recipe**][**ASR**][**ESPnet2**][**README**] ASR WSJ RESULT update: Tuning LM 2355 by kamo-naoyuki
- [**Recipe**][**ASR**][**ESPnet2**][**README**] add pretrained model link 2378 by kamo-naoyuki

CI
- [**CI**][**README**] Update ubuntu images in circle ci 2349 by ShigekiKarita
- [**CI**][**mergify**] Update .mergify.yml 2333 by kamo-naoyuki
- [**CI**][**mergify**] Update .mergify.yml 2354 by kamo-naoyuki

Acknowledgements
Special thanks to AmirHussein96, Emrys365, GauravPandey892, Piteryo, ShigekiKarita, kamo-naoyuki, kan-bayashi, koji-okabe-hub, lumaku, pengchengguo, sw005320.

v.0.9.1
New Features
- [**New Features**] Add metric option to checkpoint averaging for Transformer 2259 by hirofumi0810
- [**New Features**][**ESPnet2**] Generate run.sh in the experiment dir for resuming 2284 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support larger num_iters_per_epoch than the number of batches in small corpus 2255 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support torch native automatic mixed precision for espnet2 2257 by kamo-naoyuki

Documentation
- [**Documentation**] Update comments in MultiHeadAttention 2266 by placebokkk
- [**Documentation**][**ESPnet2**] append comment in reporter.py 2267 by kamo-naoyuki
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Add ESPnet2 TTS recipe document 2312 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**] Tensorboard stats between iterations 2252 by kamo-naoyuki

Refactoring
- [**Refactoring**][**ESPnet2**] Add some new features and a new recipe for the enhancement task 2238 by Emrys365
- [**Refactoring**][**Documentation**] Remove installation part of Python from Makefile 2245 by kamo-naoyuki

Recipe
- [**Recipe**][**ASR**] aidatatang conformer ESPnet1 recipe 2269 by nzhoward
- [**Recipe**][**ESPnet2**] espnet2 zeroth_korean recipe 2279 by hchung12

Bugfix
- [**Bugfix**] Fix 2295 2311 by kan-bayashi
- [**Bugfix**] Minor fix for Makefile 2268 by kamo-naoyuki
- [**Bugfix**] Not to install cupy-cuda* for python>=3.8 2277 by kamo-naoyuki
- [**Bugfix**] Remove channel: setup_anaconda.sh 2303 by kamo-naoyuki
- [**Bugfix**][**ASR**] ngram single decoding bug fix 2299 by qmpzzpmq
- [**Bugfix**][**ASR**][**ESPnet2**] Add missing __init__.py 2292 by kamo-naoyuki
- [**Bugfix**][**ASR**][**ESPnet2**] decode -> inference 2276 by kamo-naoyuki
- [**Bugfix**][**ASR**][**ESPnet2**] remove chainer dependency from show_asr_result.sh 2281 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Avoid illegal summary name for tensorboard 2294 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix average_nbest_models for pytorch=1.6 2283 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix decode config extension in ESPnet2 CSJ recipe 2258 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix for queue-freegpu.pl 2274 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix samplers about min_batch_size 2305 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Workaround for SGE jobname issue 2253 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] add missing shebang 2306 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix bug of reporter 2263 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Recipe**] Update zeroth_korean 2308 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] add --spk-num 1 2285 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**distributed**] Not to save config.yaml if rank!=0 2287 by kamo-naoyuki

Others
- [**CI**] Remove unnecessary installation when CI 2307 by kamo-naoyuki
- [**CI**] Take integration tests into coverage 2254 by ShigekiKarita
- [**CI**][**ESPnet2**] Add coverage measure for espnet2 integration test 2256 by kamo-naoyuki
- [**CI**][**Installation**] Install wheel 2304 by kamo-naoyuki

Acknowledgements
Special thanks to Emrys365, ShigekiKarita, hchung12, hirofumi0810, kamo-naoyuki, kan-bayashi, nzhoward, placebokkk, qmpzzpmq.

v.0.9.0
New Features
- [**New Features**][**ASR**] Non-autoregressive ASR with Mask CTC 2070 by YosukeHiguchi
- [**New Features**][**ASR**] Support Conformer model. 2144 by pengchengguo
- [**New Features**][**ASR**][**ST**] CTC posterior visualization during training 2221 by hirofumi0810
- [**New Features**][**ESPnet2**] Implement espnet2.bin.zenodo_upload 2168 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Python API for inference 2092 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support TTS-Transformer in ESPnet2 2134 by kan-bayashi
- [**New Features**][**ESPnet2**][**ASR**] Enable batch joint decoding with CTC in recog API v2 2197 by takaaki-hori
- [**New Features**][**ESPnet2**][**SE**] Speech Enhancement Frontend for ESPNet2 Phase 1 2124 by LiChenda
- [**New Features**][**ESPnet2**][**TTS**] Support FastSpeech for ESPnet2 TTS 2149 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support FastSpeech2 (+FastPitch) 2218 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support GST in ESPnet2 TTS 2139 by kan-bayashi
- [**New Features**][**README**][**ASR**] CTC forced alignment in E2E ASR Transformer model 2095 by simpleoier
- [**New Features**][**VC**] Voice Transformer Network 2064 by unilight

Enhancement
- [**Enhancement**] Fix error when downloading large files using `download_from_google_drive.sh` 2074 by unilight
- [**Enhancement**][**ASR**] added more beam search info 2130 by sw005320
- [**Enhancement**][**ESPnet2**] Change packed file of espnet2 to zip format 2161 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Make read_text faster 2114 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] RESULTS.md -> README.md 2077 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Remove long wave in template recipe 2075 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Update ESPnet2 JSUT TTS recipe and TTS template 2110 by kan-bayashi
- [**Enhancement**][**MT**][**ST**] Fix ST/MT models for compatibility with ASR 2179 by hirofumi0810
- [**Enhancement**][**ST**] Add source case information to json files in ST task 2208 by hirofumi0810
- [**Enhancement**][**ST**] Refactor multi-task learning in ST 2202 by hirofumi0810

Recipe
- [**Recipe**][**ASR**] Add aidatatang_200zh recipe 2122 by nzhoward
- [**Recipe**][**ASR**] Add chime6 info 2250 by sw005320
- [**Recipe**][**ASR**] CHiME-6 recipe 2171 by GNroy
- [**Recipe**][**ASR**] Fix a bug in espnet wsj recipe. 2145 by houwenxin
- [**Recipe**][**ASR**] New Recipe for Yoloxóchitl-Mixtec (SLR89) 2085 by ftshijt
- [**Recipe**][**ASR**] Support averaging model for Conformer. 2244 by pengchengguo
- [**Recipe**][**ASR**] Updated model after tuning aidatatang_200zh recipe 2204 by nzhoward
- [**Recipe**][**ASR**] created a recipe to run asr on ljspeech 1996 by ibkuroyagi
- [**Recipe**][**ASR**] update model link (add pre-trained bpe model and lm model) 2101 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] espnet2 librispeech recipe 2109 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] espnet2 librispeech v2 2189 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] update espnet2 aishell results 2150 by Cescfangs
- [**Recipe**][**ESPnet2**][**ASR**][**TTS**] fix dev_set/eval_sets issues 2142 by sw005320
- [**Recipe**][**ESPnet2**][**TTS**] Add ESPnet2 CSMSC TTS recipe 2129 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add ESPnet2 LJSpeech recipe 2117 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add VCTK recipe for ESPnet2 TTS 2165 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Create espnet2 jsut/tts recipe 2047 by kamo-naoyuki

Refactoring
- [**Refactoring**][**ESPnet2**] Change stats_dir naming not to overwrite 2111 by kan-bayashi
- [**Refactoring**][**ESPnet2**] Move modules 2086 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Remove $KALDI_ROOT/tools/env.sh from path.sh 2242 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Several update for pretrain model 2212 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Update Makefile 2225 by kamo-naoyuki

Documentation
- [**README**] Fix URL in README 2090 by kan-bayashi
- [**README**] Update README about TTS 2079 by kan-bayashi
- [**README**] Update README.md 2046 by kamo-naoyuki
- [**README**] Update README.md 2067 by kamo-naoyuki
- [**README**] Update README.md 2243 by kamo-naoyuki
- [**README**] Update citation 2206 by hirofumi0810
- [**README**] Update installation.md 2233 by kamo-naoyuki
- [**README**][**ESPnet2**] Update egs2/TEMPLATE/README.md 2098 by kamo-naoyuki

Bugfix
- [**Bugfix**] Add cupy.done in make python 2091 by kan-bayashi
- [**Bugfix**] Append a missing space in cmd-line args in utils/dump_pcm.sh 2209 by yistLin
- [**Bugfix**] Fix Makefile 2097 by kamo-naoyuki
- [**Bugfix**] Fix minor bug of Makefile 2055 by kamo-naoyuki
- [**Bugfix**] Fix old model compatibility 2048 2060 2063 by kan-bayashi
- [**Bugfix**] Fix pretrained model 2053 2069 by kan-bayashi
- [**Bugfix**] Fix pyopenjtalk installation 2108 by kan-bayashi
- [**Bugfix**] Fix typo in run.sh of TTS recipes 2216 by hirofumi0810
- [**Bugfix**] Update Makefile to disable cupy for cuda=10.2 or later 2230 by kamo-naoyuki
- [**Bugfix**] fix path of PESQ 2058 by kamo-naoyuki
- [**Bugfix**] scorerinterface warning English correction 2076 by qmpzzpmq
- [**Bugfix**][**CI**] Fix bug in attention plotting 2185 by hirofumi0810
- [**Bugfix**][**CI**] Freeze the matplotlib version with 3.1.0 2181 by sw005320
- [**Bugfix**][**CI**] fix integration_test_ctc_align_wav.bats with a small model 2170 by simpleoier
- [**Bugfix**][**CI**] temporarily disable subsample 6 and 8 tests 2205 by sw005320
- [**Bugfix**][**CI**][**MT**][**ST**] Add integration test for ST/MT tasks 2210 by hirofumi0810
- [**Bugfix**][**ESPnet2**] Add missing path.sh in egs2/vctk/tts1 2167 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix TTS inference 2222 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix `tts_inference` when `feats_extract` is None 2176 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix bug for feats_type=extracted 2087 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix bug of iterable dataset when num_workers>=1 2081 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix bug when espnet2/bin/tokenize_text.py --cutoff or --vocabulary_size is used 2158 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix log: benchmark -> deterministic 2080 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Implement configargparse in espnet2 2157 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Select torchaudio version according to torch version 2214 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] avoid UnboundLocalError when lm is not loaded 2227 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix 2050 2051 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix 2198: PhonemeTokenizer can't perform with multiprocessing 2201 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix best_model_criterion: wsj/asr1/conf/tuning/train_lm.yaml 2153 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix bug of lm.py 2056 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix the stage number: enh.sh 2220 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix: decode_config -> inference_config 2239 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Recipe**] Not removing short/long utterances for eval_sets 2112 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] Fix bugs in espnet2/enh and format related directory structures 2215 by Emrys365
- [**Bugfix**][**ESPnet2**][**TTS**] Fix feature extractor of TTS for compatibility 2102 by kamo-naoyuki

Acknowledgements

Special thanks to Cescfangs, Emrys365, GNroy, LiChenda, YosukeHiguchi, ftshijt, hirofumi0810, houwenxin, ibkuroyagi, kamo-naoyuki, kan-bayashi, nzhoward, pengchengguo, qmpzzpmq, simpleoier, sw005320, takaaki-hori, unilight, yistLin.

v.0.8.0
ESPnet2
- [**ESPnet2**] Solve memory issue with super large corpus training 1972 by kamo-naoyuki
- [**ESPnet2**] Added model parameter count to trainer 1867 by SeanNaren
- [**ESPnet2**] Refactoring espnet2/utils/fileio.py -> espnet2/fileio 1807 by kamo-naoyuki

New Features
- [**New Features**] Lightweight and Dynamic Convolutions. 1599 by yuyfujit
- [**New Features**] Implement Ngram scorer 1946 by qmpzzpmq
- [**New Features**] resampling in utils/compute-fbank-feats.py and utils/compute-stft-feats.py 2035 by kamo-naoyuki

Enhancement
- [**Enhancement**] Ngram scorer update 1992 by qmpzzpmq

Documentation
- [**Documentation**] fix a typo for the decoder add_argument_group 2030 by sw005320
- [**Documentation**] Update multiple GPU descriptions. 2016 by sw005320
- [**Documentation**] Finetuning doc + freezing parameters option 1897 by b-flo

Bugfix
- [**Bugfix**] Fix memory issue when resuming 2040 by kamo-naoyuki
- [**Bugfix**] fixed typo in cmvn.py 1988 by gullyboy007
- [**Bugfix**] update notebook 1986 by ShigekiKarita
- [**Bugfix**] Fix freezing modules (when using multi-gpu) 1983 by atozto9
- [**Bugfix**] Fix BLEU/PPL calculation during training 2009 by hirofumi0810
- [**Bugfix**] Fix download file extension 2042 by takenori-y
- [**Bugfix**] fix tedlium2/3 model link 2032 by sw005320
- [**Bugfix**] Fix bug for pure Transformer-CTC 2023 by hirofumi0810
- [**Bugfix**] li42 recipe: add li42 results; fix bug in adding language id "zh_TW" 1950 by houwenxin

CI
- [**CI**] Add espnet2 in ci/doc.sh 1976 by ShigekiKarita
- [**CI**] Add test for pytorch1.5 1881 by kamo-naoyuki

Acknowledgements
Special thanks to SeanNaren, ShigekiKarita, atozto9, b-flo, gullyboy007, hirofumi0810, houwenxin, kamo-naoyuki, qmpzzpmq, sw005320, takenori-y, yuyfujit.

v.0.7.0
The ESPnet project is moving on to a new endeavor! We launched [espnet2](https://github.com/espnet/espnet/pull/1372), which aims to improve modularity (chainer-free, kaldi-free), use a more customizable trainer, support distributed training, and achieve scalability. This effort has been led mainly by kamo-naoyuki, with great effort and leadership. The project is one of the outcomes of our ESPnet hackathon in Tokyo 2019, which involved a lot of discussion about the design, new features, and community contributions. espnet2 currently supports the main ASR recipes (with a well-designed recipe template) and a limited set of TTS recipes. We maintain both espnet1 and espnet2, but will gradually move our development to espnet2. The ESPnet project is further accelerated!

ESPnet2
- [**ESPnet2**] keep the latest model 1769 by kamo-naoyuki
- [**ESPnet2**] Remove "E2E" from all comments 1766 by kamo-naoyuki
- [**ESPnet2**] Refactoring for ESPnetDataset 1758 by kamo-naoyuki
- [**ESPnet2**] Implement SpecAug for ESPnet2 1746 by kamo-naoyuki
- [**ESPnet2**] Implement BatchBinSampler 1742 by kamo-naoyuki
- [**ESPnet2**] Support torch_optimizer 1739 by kamo-naoyuki
- [**ESPnet2**] Log rotation for launch.py 1737 by kamo-naoyuki
- [**ESPnet2**] Change the type of --chunk_length to str_or_int 1733 by kamo-naoyuki
- [**ESPnet2**] Change cudnn deterministic mode to default 1732 by kamo-naoyuki
- [**ESPnet2**] Add wsj results for espnet2 1724 by kamo-naoyuki
- [**ESPnet2**] Show estimated time to finish 1717 by kamo-naoyuki
- [**ESPnet2**] Add --name option for training job 1714 by kamo-naoyuki
- [**ESPnet2**] Show the log file when training process is failed: espnet2.bin.launch.py 1713 by kamo-naoyuki
- [**ESPnet2**] --max_length -> --fold_length 1712 by kamo-naoyuki
- [**ESPnet2**] Double quoter for NCCL_SOCKET_IFNAME 1706 by kamo-naoyuki
- [**ESPnet2**] Save apex state in checkpoint and support apex optimizer 1705 by kamo-naoyuki
- [**ESPnet2**] Update asr.sh 1694 by zh794390558
- [**ESPnet2**] Update ctc.py 1688 by zh794390558
- [**ESPnet2**] Update launch.py 1681 by zh794390558
- [**ESPnet2**] Update asr.sh 1678 by zh794390558
- [**ESPnet2**] --keep_n_best_checkpoints -> --keep_nbest_models 1647 by kamo-naoyuki
- [**ESPnet2**] Avoid deprecated warning: reduction="none" 1510 by kamo-naoyuki
- [**ESPnet2**] Minor change for speed perturbation 1627 by kamo-naoyuki
- [**ESPnet2**] Fix how2 recipe 1620 by kamo-naoyuki
- [**ESPnet2**] Fix recipes 1617 by kamo-naoyuki
- [**ESPnet2**] Renaming 1610 by kamo-naoyuki
- [**ESPnet2**] Implement chunk iterator 1608 by kamo-naoyuki
- [**ESPnet2**] Update voxforge RESULTS 1601 by kamo-naoyuki
- [**ESPnet2**] vivos recipe: --audio_format wav 1592 by kamo-naoyuki
- [**ESPnet2**] Lower python requirements to 3.6 1565 by kamo-naoyuki
- [**ESPnet2**] dirha_wsj recipe for espnet2 1556 by yuekaizhang
- [**ESPnet2**] Update AISHELL ASR Recipe 1549 by Emrys365
- [**ESPnet2**] Remove short data 1531 by kamo-naoyuki
- [**ESPnet2**] [WIP] Update JSUT ASR Recipe 1529 by YosukeHiguchi
- [**ESPnet2**] Update HOW2 recipe 1522 by b-flo
- [**ESPnet2**] [WIP] Update CSJ ASR Recipe 1520 by YosukeHiguchi
- [**ESPnet2**] Change NoamLR to deprecated and implement WarmupLR 1519 by kamo-naoyuki
- [**ESPnet2**] Implement --max_cache_size option 1509 by kamo-naoyuki
- [**ESPnet2**] distributed training 1506 by kamo-naoyuki
- [**ESPnet2**] ESPNet2 Recipe Update -- commonvoice, babel, ami 1504 by ftshijt
- [**ESPnet2**] Refactoring 1494 by kamo-naoyuki
- [**ESPnet2**] Fix ci of flake8 part 1491 by kamo-naoyuki
- [**ESPnet2**] Tensorboard, --num_iters_per_epoch, etc. 1487 by kamo-naoyuki
- [**ESPnet2**] Fix espnet2.bin.pack 1486 by kamo-naoyuki
- [**ESPnet2**] show_result.sh 1478 by kamo-naoyuki
- [**ESPnet2**] Pack and Unpack model 1477 by kamo-naoyuki
- [**ESPnet2**] collect-stats mode, trainer class, etc. 1462 by kamo-naoyuki
- [**ESPnet2**] add test codes for asr decoders 1445 by kamo-naoyuki
- [**ESPnet2**] Integrate Griffin-Lim with tts_decode() 1442 by kan-bayashi
- [**ESPnet2**] Update ASR recipe 1439 by kan-bayashi
- [**ESPnet2**] Update TTS recipes 1430 by kan-bayashi
- [**ESPnet2**] Disable wer/cer calculation when training 1547 by kamo-naoyuki
- [**ESPnet2**] Change CTC default to builtin 1546 by kamo-naoyuki
- [**ESPnet2**] Update chime4 asr1 Recipe 1570 by yuekaizhang
- [**ESPnet2**] Create documentation for espnet2 1710 by kamo-naoyuki
- [**ESPnet2**] shellcheck for local/data.sh 1524 by kamo-naoyuki
- [**ESPnet2**] commonvoice: RESULTS.md -> README.md 1797 by kamo-naoyuki

Bugfix
- [**Bugfix**] % -> percent: espnet2/tasks/abs_task.py 1767 by kamo-naoyuki
- [**Bugfix**] Fix gpu mode for tts_inference.py 1755 by kamo-naoyuki
- [**Bugfix**] Fix SubReporter 1748 by kamo-naoyuki
- [**Bugfix**] Fix calculate_all_attentions for espnet2 1747 by kamo-naoyuki
- [**Bugfix**] Not to create the averaged model if --keep_nbest_models=1 1744 by kamo-naoyuki
- [**Bugfix**] Fix --best_model_criterions 1743 by kamo-naoyuki
- [**Bugfix**] Fix the gpu device when resuming 1731 by kamo-naoyuki
- [**Bugfix**] Fix error log for espnet2/bin/launch.py 1730 by kamo-naoyuki
- [**Bugfix**] Disable CUDNN deterministic for CTC: espnet2/asr/ctc.py 1720 by kamo-naoyuki
- [**Bugfix**] Update default.py 1698 by zh794390558
- [**Bugfix**] Fix chunk iterator and refactoring for distributed training 1685 by kamo-naoyuki
- [**Bugfix**] Update vgg_rnn_encoder.py 1676 by zh794390558
- [**Bugfix**] [ESPnet2] chmod +x: run.sh for JSUT 1628 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Remove nlsyms when word scoring 1614 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix setup.sh 1596 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix launch.py for slurm 1588 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix ci for local/data.sh 1572 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix nj of scripts/audio/format_wav_scp.sh 1550 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Use load_scp_sequential in format_wav_scp.py 1541 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Minor fix for CSJ recipe 1540 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix transformer 1539 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] fix rnn_type when bidirectional is used 1533 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix format_wav_scp.py 1532 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix bug of using GPU even if CPU mode 1526 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix --accum_grad 1525 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix voxforge config 1511 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Bug fix of splitting files for collect_stats mode 1505 by kamo-naoyuki
- [**Bugfix**] fix to use queue.conf 1431 by sw005320
- [**Bugfix**] [ESPnet2] Fix a bug in TTS 1428 by kan-bayashi
- [**Bugfix**] [ESPnet2] Refactor Encoder and Decoder and bug fix 1427 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix bug of text-chars converter 1426 by kamo-naoyuki
- [**Bugfix**] Optionize trans_type in egs/ljspeech/tts2 1789 by kan-bayashi
- [**Bugfix**] bugfix in ljspeech/tts2 1783 by beckgom
- [**Bugfix**] missing argument for local/data_prep.sh added 1782 by beckgom
- [**Bugfix**] avoid sentencepiece==0.1.90 1923 by kamo-naoyuki
- [**Bugfix**] FIX E523,E541,E741 1918 by kamo-naoyuki
- [**Bugfix**] fix reverse option for cmvn 1906 by magictron
- [**Bugfix**] Error handling for Transformer with CTC-based VAD 1875 by takenori-y
- [**Bugfix**] Revert deletion of init files 1842 by Fhrozen
- [**Bugfix**] fix the missing link of tedlium3 1841 by sw005320
- [**Bugfix**] Add test for torch>1.1 1840 by kamo-naoyuki
- [**Bugfix**] Fix 1808: change the argument order of --batch_type for collect stat… 1810 by kamo-naoyuki
- [**Bugfix**] Change to configargparse>=1.2.1 1803 by kamo-naoyuki
- [**Bugfix**] typo fixed for attention type 1793 by beckgom
- [**Bugfix**] fix https://github.com/espnet/espnet/issues/1780 1784 by qmeeus
- [**Bugfix**] Fix bug of espnet2 asr_inference.py 1952 by kamo-naoyuki
- [**Bugfix**] Minor fix of import place and comments 1959 by kan-bayashi

New Features
- [**New Features**] Add utils/translate_wav.sh 1530 by ShigekiKarita
- [**New Features**] Batch beam search V2 for Transformer (no CTC) 1402 by ShigekiKarita

Enhancement
- [**Enhancement**] Support multiple sentences in synth_wav.sh 1788 by kan-bayashi
- [**Enhancement**] fix+update transducer 1760 by b-flo

Documentation
- [**Documentation**] Update notebook 1963 by kan-bayashi
- [**Documentation**] Update installation manual 1960 by kan-bayashi
- [**Documentation**] Update installation.md 1957 by kamo-naoyuki
- [**Documentation**] Add note in synth_wav.sh 1785 by kan-bayashi
- [**Documentation**] Update docs 1954 1955 by kamo-naoyuki
- [**Documentation**] Update docs 1938 by kamo-naoyuki
- [**Documentation**] docs: added fbank link to the experiment readme 1910 by kdubovikov

Recipe
- [**Recipe**] Added some TIMIT results 1819 by sknadig
- [**Recipe**] add recipe for French Polyphone: ELRA-S0030_02 1711 by AdolfVonKleist
- [**Recipe**] Use espnet_tts_frontend 1794 by kamo-naoyuki

CI
- [**CI**] Use cache in actions 1917 by ShigekiKarita
- [**CI**] Apply black 1850 by kamo-naoyuki
- [**CI**] Create .mergify.yml 1813 by kamo-naoyuki

Acknowledgements
Special thanks to AdolfVonKleist, Emrys365, Fhrozen, ShigekiKarita, YosukeHiguchi, beckgom, b-flo, ftshijt, kamo-naoyuki, kan-bayashi, kdubovikov, magictron, qmeeus, sknadig, sw005320, takenori-y, yuekaizhang, zh794390558

v.202402
News
We're thrilled to announce that our latest update brings two groundbreaking features to our project: `espnetez` and `ESPnet-SPK`!

New Features
- [**New Features**][**ESPnet2**][**ESPnet1**][**Installation**][**SE**] Add diffusion-based SE model to ESPnet-SE 5572 by LiChenda
- [**New Features**][**ESPnet2**][**ESPnet1**][**CI**][**ASR**] Add Bayes Risk CTC (reworked) 5519 by jctian98
- [**New Features**][**ESPnet2**][**TTS**] TTS evaluation script and monitoring functionality using MOS prediction model 5485 by Takaaki-Saeki
- [**New Features**][**ESPnet2**][**SE**] Add USES model for speech enhancement in diverse conditions 5482 by Emrys365
- [**New Features**][**ESPnet2**][**CI**][**SID**] ESPnet-SPK: major update 5408 by Jungjee
- [**New Features**][**ESPnet2**][**TTS**][**ASR**] Add espnetez 5372 by Masao-Someki

Enhancement
- [**Enhancement**][**ESPnet2**][**OWSM**] Improving OWSM inference interface 5618 by pyf98
- [**Enhancement**][**ESPnet2**][**OWSM**] Add OWSM v3.1 5611 by pyf98
- [**Enhancement**][**ESPnet2**][**CI**] ESPnet-SPK: Additional models, supplement readme 5559 by Jungjee
- [**Enhancement**][**ESPnet2**][**CI**][**SE**] Add PyTorch & GPU support for DNSMOS calculation 5548 by Emrys365
- [**Enhancement**][**ESPnet2**][**TTS**][**SID**] Speaker embedding extractor (with ESPnet pre-trained speaker model) 5579 by ftshijt

Recipe
- [**Recipe**][**ESPnet2**][**Music**] Fix relative setting of train-dev-test 5623 by ftshijt
- [**Recipe**][**ESPnet2**][**SID**] ESPnet-SPK: add Voxblink recipe 5583 by Jungjee
- [**Recipe**][**ESPnet2**][**SID**] ESPnet-SPK: Model upload and result generation 5558 by Jungjee
- [**Recipe**][**ESPnet2**][**Music**] ACE singer recipe fixing 5551 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**] TTS2 Template 5541 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] fix kaldi dependency in asr2 5540 by ftshijt
- [**Recipe**][**ESPnet2**][**CI**][**S2ST**] CI test for s2st 5526 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] Added data.sh to SPRING-INX IITM Recipe 5522 by arjun-gangwar
- [**Recipe**][**ESPnet2**][**ASR**] Add Libriheavy small and medium ASR2 recipes 5512 by akreal
- [**Recipe**][**ESPnet2**][**ASR**] SPRING-INX IITM RECIPE 5505 by arjun-gangwar
- [**Recipe**][**ESPnet2**][**ASR**][**RNNT**] Add transducer conformer configuration to commonvoice recipe 5503 by zuazo
- [**Recipe**][**ESPnet2**][**ESPnet1**] add centralized data preparation for OWSM 5478 by jctian98
- [**Recipe**][**ESPnet1**] Added clean speech results 5649 by linan2
- [**Recipe**][**ESPnet2**][**Installation**][**AV**] AVSR recipe for Easycom Dataset 5630 by ms-dot-k
- [**Recipe**][**ESPnet2**] Update CHiME-7 ASR1 recipe 5555 by popcornell
- [**Recipe**][**ESPnet2**] Add E-Branchformer model checkpoint in OWSM v2 5517 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**] Slue PR configs 5087 by siddhu001

Bugfix
- [**Bugfix**][**ESPnet2**] Fix path dependency in ESPnet tutorial 5645 by siddhu001
- [**Bugfix**][**ESPnet2**] Fix ESPnet tutorial 5644 by siddhu001
- [**Bugfix**] Fix CI 5642 by siddhu001
- [**Bugfix**][**ESPnet2**] Fixed bug by copying missing Kaldi scripts 5636 by VicentCano
- [**Bugfix**][**ESPnet1**][**ASR**] CTC prefix score, fix if blank == eos 5620 by albertz
- [**Bugfix**][**ESPnet2**] Fix minor OWSM data prep bug 5607 by juice500ml
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**CI**] E721 5589 by sw005320
- [**Bugfix**][**ESPnet2**][**ESPnet1**] Make minlenratio effective 5581 by jctian98
- [**Bugfix**][**ESPnet2**] Fix except 5567 by takenori-y
- [**Bugfix**][**ESPnet1**][**Installation**][**CI**] Improve error robustness of unit tests 5535 by Emrys365
- [**Bugfix**][**ESPnet2**][**AV**] Fix bug in lrs3 data preprocessing 5520 by ms-dot-k
- [**Bugfix**][**ESPnet1**] replace old mustc links with new instructions 5516 by brianyan918
- [**Bugfix**][**ESPnet2**][**ST**] Fix s2st HF model uploading 5504 by tjysdsg
- [**Bugfix**][**ESPnet2**][**ESPnet1**] bug fixes for must_c v2 recipe 5640 by jasonmusespresso

Documentation
- [**Documentation**][**ESPnet2**] Add instructions for finetuning owsm 5539 by pyf98
- [**Documentation**] Updated the reference of the accepted JOSS paper 5515 by neillu23

Others
- [**Others**] Update Discord Invitation Link 5578 by Fhrozen
- [**Others**][**ESPnet2**][**CI**] Improve error robustness of unit tests 5523 by Emrys365

Acknowledgements
Special thanks to Emrys365, Fhrozen, Jungjee, LiChenda, Masao-Someki, Takaaki-Saeki, VicentCano, akreal, albertz, arjun-gangwar, brianyan918, ftshijt, jasonmusespresso, jctian98, juice500ml, linan2, ms-dot-k, neillu23, popcornell, pyf98, siddhu001, sw005320, takenori-y, tjysdsg, zuazo.



v.202310
What's Changed
* Support arbitrary language finetune for Whisper models. by pengchengguo in https://github.com/espnet/espnet/pull/5344
* Update Dipco Data URL by Fhrozen in https://github.com/espnet/espnet/pull/5391
* Update readme in TEMPLATE/svs1 by linyueqian in https://github.com/espnet/espnet/pull/5394
* add gramvaani asr recipe by bloodraven66 in https://github.com/espnet/espnet/pull/5366
* ESPnet-SPK: sampler by Jungjee in https://github.com/espnet/espnet/pull/5365
* Adding general data augmentation methods for speech preprocessing by Emrys365 in https://github.com/espnet/espnet/pull/5370
* Update of several SE recipes and some minor fixes by Emrys365 in https://github.com/espnet/espnet/pull/5401
* Reproducing MIMOIRIS by YoshikiMas in https://github.com/espnet/espnet/pull/5409
* Kathbath asr by bloodraven66 in https://github.com/espnet/espnet/pull/5369
* Add pytorch2.0.1 to CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/5413
* [skip ci] Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5417
* In spec_augment.py, check whether an array is writeable before modifying it inplace by mdecerbo in https://github.com/espnet/espnet/pull/5416
* Docker updates for local builds by Fhrozen in https://github.com/espnet/espnet/pull/5406
* fix typo in TEMPLATE/svs1/README.md by linyueqian in https://github.com/espnet/espnet/pull/5426
* Update install_mwerSegmenter.sh by sw005320 in https://github.com/espnet/espnet/pull/5437
* Support Whisper-style training as a new task S2T by pyf98 in https://github.com/espnet/espnet/pull/5120
* fix twice numpy installation issue by kan-bayashi in https://github.com/espnet/espnet/pull/5447
* Add Whisper SOT recipe for Librimix by LiChenda in https://github.com/espnet/espnet/pull/5371
* Update for the JOSS paper editor review by neillu23 in https://github.com/espnet/espnet/pull/5418
* Add the VOiCES recipe for ASR by Emrys365 in https://github.com/espnet/espnet/pull/5448
* Improve diacritic compatibility in data_prep.pl preprocessing scripts by zuazo in https://github.com/espnet/espnet/pull/5445
* [WIP] create recipe for acesinger by linyueqian in https://github.com/espnet/espnet/pull/5431
* Add BibleTTS recipe by wyh2000 in https://github.com/espnet/espnet/pull/5436
* ASR2 CHiME4 & Gigaspeech Recipes by yichen14 in https://github.com/espnet/espnet/pull/5434
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5427
* Simple fix to reduce test_slu_inference time by siddhu001 in https://github.com/espnet/espnet/pull/5460
* Do not use root logger in Beamsearch by vsd-vector in https://github.com/espnet/espnet/pull/5454
* Fix whisper test by siddhu001 in https://github.com/espnet/espnet/pull/5464
* Add doc for OWSM by pyf98 in https://github.com/espnet/espnet/pull/5463
* Speech-to-speech translation Task by ftshijt in https://github.com/espnet/espnet/pull/4859
* AVSR recipes on LRS3 using pre-trained AV-HuBERT model by ms-dot-k in https://github.com/espnet/espnet/pull/5456
* Support LoRA based large model finetuning. by pengchengguo in https://github.com/espnet/espnet/pull/5400
* Multilingual Librispeech (MLS) refactor ASR1 recipe by juice500ml in https://github.com/espnet/espnet/pull/5323
* Add phonemized LibriTTS ASR recipe by akreal in https://github.com/espnet/espnet/pull/5466
* Update the Enh framework to support training with variable numbers of speakers by Emrys365 in https://github.com/espnet/espnet/pull/5414
* speed up TFGridNet code by zqwang7 in https://github.com/espnet/espnet/pull/5395
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5468
* ASR2 recipe on Tedlium3 dataset by kohei0209 in https://github.com/espnet/espnet/pull/5331
* Create README.md in OWSM v1 by pyf98 in https://github.com/espnet/espnet/pull/5489
* Update setup.py by sw005320 in https://github.com/espnet/espnet/pull/5490
* Fix default value in ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5492
* Fix bugs of Whisper SOT. by pengchengguo in https://github.com/espnet/espnet/pull/5494
* Multilingual Librispeech ASR2 + ASR1 baselines by juice500ml in https://github.com/espnet/espnet/pull/5441
* Add a new SE recipe combining five public corpora by Emrys365 in https://github.com/espnet/espnet/pull/5484
* Update .mergify.yml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5502
* update version to 202310 by kan-bayashi in https://github.com/espnet/espnet/pull/5501

New Contributors
* linyueqian made their first contribution in https://github.com/espnet/espnet/pull/5394
* mdecerbo made their first contribution in https://github.com/espnet/espnet/pull/5416
* zuazo made their first contribution in https://github.com/espnet/espnet/pull/5445
* wyh2000 made their first contribution in https://github.com/espnet/espnet/pull/5436
* yichen14 made their first contribution in https://github.com/espnet/espnet/pull/5434
* vsd-vector made their first contribution in https://github.com/espnet/espnet/pull/5454
* ms-dot-k made their first contribution in https://github.com/espnet/espnet/pull/5456
* juice500ml made their first contribution in https://github.com/espnet/espnet/pull/5323
* kohei0209 made their first contribution in https://github.com/espnet/espnet/pull/5331

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202308...v.202310

v.202308
What's Changed
* Update tutorial by ftshijt in https://github.com/espnet/espnet/pull/4648
* Update tutorials by ftshijt in https://github.com/espnet/espnet/pull/4898
* add e-branchformer result for tedlium3 and add checker for text output length by Some-random in https://github.com/espnet/espnet/pull/5130
* Limit the Numpy version (<1.24) to fix CI error temporarily. by simpleoier in https://github.com/espnet/espnet/pull/5162
* [SVS] Add new recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5158
* Update README.md of CHiME-7 DASR: fixing typos by popcornell in https://github.com/espnet/espnet/pull/5166
* Fix typo in CONTRIBUTING.md by eltociear in https://github.com/espnet/espnet/pull/5167
* CHiME-7 DASR: Update install_dependencies.sh, fix lhotse version by popcornell in https://github.com/espnet/espnet/pull/5168
* Update TD-SpeakerBeam by Emrys365 in https://github.com/espnet/espnet/pull/5155
* Add pre-trained causal speech separation model and streaming demo by LiChenda in https://github.com/espnet/espnet/pull/5172
* KSC recipe by khassanoff in https://github.com/espnet/espnet/pull/5171
* [SVS] Add new recipe by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5173
* Update AphasiaBank Recipe by tjysdsg in https://github.com/espnet/espnet/pull/5104
* fix the gradient backward issue when joint training with s3prl frontend by simpleoier in https://github.com/espnet/espnet/pull/5159
* Add installer for ParallelWaveGAN by ftshijt in https://github.com/espnet/espnet/pull/4052
* [GAN SVS] Add VISinger2, UHifiGAN, Avocodo by jerryuhoo in https://github.com/espnet/espnet/pull/5123
* [SVS] Update docs README.md by South-Twilight in https://github.com/espnet/espnet/pull/5178
* Update SVS README.md by jerryuhoo in https://github.com/espnet/espnet/pull/5180
* Adding eendss models by soumimaiti in https://github.com/espnet/espnet/pull/5157
* 2022fall new task tutorial by ftshijt in https://github.com/espnet/espnet/pull/5186
* [SVS] Updates for recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5187
* [GAN SVS] fix phoneme predictor by jerryuhoo in https://github.com/espnet/espnet/pull/5188
* Update generate_librimix_sd.sh by leepeiying in https://github.com/espnet/espnet/pull/5182
* Bug fix for 5195 by YosukeHiguchi in https://github.com/espnet/espnet/pull/5196
* [SVS] Update on recipes by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5197
* Update preprocessor.py by sw005320 in https://github.com/espnet/espnet/pull/5200
* Minor fixes for ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5202
* Quick fix for whisper specaug by siddhu001 in https://github.com/espnet/espnet/pull/5206
* espnet-spk data preparation part by Jungjee in https://github.com/espnet/espnet/pull/5184
* Fix M4singer multi-spk recipe by ftshijt in https://github.com/espnet/espnet/pull/5201
* Update Dataset link for mlsuperb by ftshijt in https://github.com/espnet/espnet/pull/5216
* Fix bug when score_type is set to normal in ml_superb by ftshijt in https://github.com/espnet/espnet/pull/5217
* Add new functions and fix some bugs in SE by Emrys365 in https://github.com/espnet/espnet/pull/5193
* Update import order by ftshijt in https://github.com/espnet/espnet/pull/5229
* Closed CHiME-7 DASR adding evaluation inference + adding support to use diarization baseline "pre-computed" JSONs (new PR) by popcornell in https://github.com/espnet/espnet/pull/5228
* Standalone Transducer v1.1 by b-flo in https://github.com/espnet/espnet/pull/5140
* Small fixes for Transducer by b-flo in https://github.com/espnet/espnet/pull/5247
* add asr2 task and librispeech recipe as an example. by simpleoier in https://github.com/espnet/espnet/pull/5181
* fix norm compatibility in scale discriminator by kan-bayashi in https://github.com/espnet/espnet/pull/5240
* CFSD, SECS metrics for TTS by imdanboy in https://github.com/espnet/espnet/pull/5235
* Add new SE recipes: chime1/enh1, chime2/enh1, reverb/enh1, and wsj0_2mix/tse1 by Emrys365 in https://github.com/espnet/espnet/pull/5246
* Fix bugs in mfa_format.py by G-Thor in https://github.com/espnet/espnet/pull/5223
* New features for SVS by ftshijt in https://github.com/espnet/espnet/pull/5245
* re-fix norm compatibility in scale discriminator by kan-bayashi in https://github.com/espnet/espnet/pull/5249
* add conv1d subsampling 3 and egs2/librispeech/asr2 wavlm_large_21 kmeans (1000/2000) results by simpleoier in https://github.com/espnet/espnet/pull/5252
* Revise the ESPnet-SE++ JOSS paper to incorporate the feedback from the reviewer. by neillu23 in https://github.com/espnet/espnet/pull/5212
* Fix a bug in score script for ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5254
* Refactor prep_segments in SVS by jerryuhoo in https://github.com/espnet/espnet/pull/5210
* A minor fix for num_splits_ssl for training by ftshijt in https://github.com/espnet/espnet/pull/5262
* [SVS] add singing tacotron by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5233
* Add script to use speaker averaged xvectors in TTS training by G-Thor in https://github.com/espnet/espnet/pull/5244
* Fix filling of waveform_buffer with samples for streaming inference by espnetUser in https://github.com/espnet/espnet/pull/5267
* Some name update for ml-superb by ftshijt in https://github.com/espnet/espnet/pull/5276
* Add support for K2 pruned transducer loss by b-flo in https://github.com/espnet/espnet/pull/5268
* Fix Transducer doc by b-flo in https://github.com/espnet/espnet/pull/5306
* Update installation.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5291
* Update install_nkf.sh by sw005320 in https://github.com/espnet/espnet/pull/5300
* Fix Cython version to pass the installation of libraries with Cython by kan-bayashi in https://github.com/espnet/espnet/pull/5310
* Update README.md by sw005320 in https://github.com/espnet/espnet/pull/5315
* Update setup.py by sw005320 in https://github.com/espnet/espnet/pull/5316
* Migrate recipe for nit_song070 from Muskit by wwwbxy123 in https://github.com/espnet/espnet/pull/5251
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5294
* A few updates for asr2 and hubert by simpleoier in https://github.com/espnet/espnet/pull/5285
* Add decode_options and hyp_cleaner in evaluate_whisper_inference by pyf98 in https://github.com/espnet/espnet/pull/5272
* update pyworld version by kan-bayashi in https://github.com/espnet/espnet/pull/5319
* fix a data preparation issue for librimix recipe. by LiChenda in https://github.com/espnet/espnet/pull/5322
* Update README.md in egs2/librimix/tse1 and egs2/wsj0_2mix/tse1 by Emrys365 in https://github.com/espnet/espnet/pull/5289
* fix the s3prl frontend gradient backprop bug, ensuring feature_grad_mult=1.0 by simpleoier in https://github.com/espnet/espnet/pull/5297
* ESPnet-SPK part 2 - training by Jungjee in https://github.com/espnet/espnet/pull/5258
* remove some tests in espnet1 integration test by sw005320 in https://github.com/espnet/espnet/pull/5328
* Fix random segments by iamanigeeit in https://github.com/espnet/espnet/pull/5274
* Skip CI for draft PR by ftshijt in https://github.com/espnet/espnet/pull/5333
* Update cancel.yml by kan-bayashi in https://github.com/espnet/espnet/pull/5334
* Update several SE recipes and bash scripts by Emrys365 in https://github.com/espnet/espnet/pull/5327
* Add PULL_REQUEST_TEMPLATE.md by kan-bayashi in https://github.com/espnet/espnet/pull/5340
* ESPnet-SPK part 3 - inference every epoch using EER by Jungjee in https://github.com/espnet/espnet/pull/5314
* Minimize espnet2 integration test by kan-bayashi in https://github.com/espnet/espnet/pull/5324
* PR Labels for CI control by Fhrozen in https://github.com/espnet/espnet/pull/5320
* Split ci into several jobs by kan-bayashi in https://github.com/espnet/espnet/pull/5343
* Update CONTRIBUTING.md by sw005320 in https://github.com/espnet/espnet/pull/5335
* Update Scoring for Speech Summarization from NLG-Eval to Huggingface Evaluate by roshansh-cmu in https://github.com/espnet/espnet/pull/5341
* Fix documentation skip CI by Fhrozen in https://github.com/espnet/espnet/pull/5351
* Update the usage by sw005320 in https://github.com/espnet/espnet/pull/5349
* Docker Update by Fhrozen in https://github.com/espnet/espnet/pull/5321
* Update installation.md by sw005320 in https://github.com/espnet/espnet/pull/5348
* Fix doc condition by kan-bayashi in https://github.com/espnet/espnet/pull/5355
* Update issue templates by sw005320 in https://github.com/espnet/espnet/pull/5357
* Update Contribution.md by Fhrozen in https://github.com/espnet/espnet/pull/5352
* Fix .mergify condition by kan-bayashi in https://github.com/espnet/espnet/pull/5354
* Reduce ffmpeg installation time in ci by kan-bayashi in https://github.com/espnet/espnet/pull/5356
* Update CI table by kan-bayashi in https://github.com/espnet/espnet/pull/5359
* Clean workflow files by kan-bayashi in https://github.com/espnet/espnet/pull/5360
* Couple of tweaks for asr2.sh for the HF hub upload by akreal in https://github.com/espnet/espnet/pull/5362
* Update TEMPLATE_HF_Readme.md (fix bash typo) by akreal in https://github.com/espnet/espnet/pull/5361
* Add discrete-token ASR for LibriSpeech 100h by akreal in https://github.com/espnet/espnet/pull/5350
* Whisper fine-tuning recipes for CHiME-4 and WSJ by YoshikiMas in https://github.com/espnet/espnet/pull/5342
* Fix bug in ngram training in slu.sh by siddhu001 in https://github.com/espnet/espnet/pull/5364
* Add musdb18 recipe for music source separation by Emrys365 in https://github.com/espnet/espnet/pull/5338
* Bugfix: JETS CTCLoss by imdanboy in https://github.com/espnet/espnet/pull/5288
* Check the value of `n_shift` == `upsample_factor` in GAN_TTS by imdanboy in https://github.com/espnet/espnet/pull/5299
* MFA format fix by iamanigeeit in https://github.com/espnet/espnet/pull/5275
* add --num-workers 0 option to enable coverage to track data loader by kan-bayashi in https://github.com/espnet/espnet/pull/5368
* ESPnet-SPK: fix data augment by Jungjee in https://github.com/espnet/espnet/pull/5347
* A few minor fixes for SSL by ftshijt in https://github.com/espnet/espnet/pull/5265
* remove unused file + small typo/style by b-flo in https://github.com/espnet/espnet/pull/5346
* ESPnet-SPK: EER validation efficiency improvement by Jungjee in https://github.com/espnet/espnet/pull/5358
* New Architectures for ST by brianyan918 in https://github.com/espnet/espnet/pull/4815
* [SVS] Add CI test by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5269
* Add causal LM to Hugging Face Transformers Decoder by akreal in https://github.com/espnet/espnet/pull/5313
* Make `make_pad_mask` onnx convertible by Masao-Someki in https://github.com/espnet/espnet/pull/5326
* fix numerical error of parallel wavegan compatibility test in CI by kan-bayashi in https://github.com/espnet/espnet/pull/5380
* Add LibriTTS-R recipe by ShigekiKarita in https://github.com/espnet/espnet/pull/5379
* minor fix: correct wrong comments by imdanboy in https://github.com/espnet/espnet/pull/5378
* Add quotation marks to install_datasets.sh by qmeeus in https://github.com/espnet/espnet/pull/5387

New Contributors
* khassanoff made their first contribution in https://github.com/espnet/espnet/pull/5171
* leepeiying made their first contribution in https://github.com/espnet/espnet/pull/5182
* Jungjee made their first contribution in https://github.com/espnet/espnet/pull/5184
* wwwbxy123 made their first contribution in https://github.com/espnet/espnet/pull/5251

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202304...v.202308

v.202304
What's Changed
* Update collect stats stage to reduce memory cost in Utt_mvn by simpleoier in https://github.com/espnet/espnet/pull/4888
* Apply the latest black by kamo-naoyuki in https://github.com/espnet/espnet/pull/4907
* Add pytorch=1.13.1 to CI configuration by kamo-naoyuki in https://github.com/espnet/espnet/pull/4906
* How2 fix README, incorrect url by roshansh-cmu in https://github.com/espnet/espnet/pull/4902
* standardized inference and number of iterations for mSuperb single lang track by DanBerrebbi in https://github.com/espnet/espnet/pull/4905
* Fix typo in lrs/README.md by eltociear in https://github.com/espnet/espnet/pull/4911
* MSUPERB setting update by ftshijt in https://github.com/espnet/espnet/pull/4913
* Update test_import.yaml to install numba by kamo-naoyuki in https://github.com/espnet/espnet/pull/4918
* update pyopenjtalk version to 0.3.0 by kan-bayashi in https://github.com/espnet/espnet/pull/4912
* CHiME-7 Task1 recipe by popcornell in https://github.com/espnet/espnet/pull/4894
* Update CHiME-7 Task 1 README.md by popcornell in https://github.com/espnet/espnet/pull/4920
* Use native CPU version of STFT on newer pytorch versions, fix librosa window size < fft by bmilde in https://github.com/espnet/espnet/pull/4922
* Add few shot subset for mSuperb multilingual setting by guapaQAQ in https://github.com/espnet/espnet/pull/4923
* Fix existing bugs in the TSE task by Emrys365 in https://github.com/espnet/espnet/pull/4915
* IAM OCR recipe updates by kenzheng99 in https://github.com/espnet/espnet/pull/4927
* Fixing some issues with chime7-task1 baseline by popcornell in https://github.com/espnet/espnet/pull/4925
* set default none decoder for ASR by ftshijt in https://github.com/espnet/espnet/pull/4917
* Update inference and training setting for mSuperb multilingual model by guapaQAQ in https://github.com/espnet/espnet/pull/4932
* Add E-Branchformer Transducer results by pyf98 in https://github.com/espnet/espnet/pull/4933
* add tf-gridnet by zqwang7 in https://github.com/espnet/espnet/pull/4864
* Fixes + Channel Selection for CHiME-7 Task by popcornell in https://github.com/espnet/espnet/pull/4934
* fix extracted feature dummy generation by roshansh-cmu in https://github.com/espnet/espnet/pull/4926
* Fix device mismatch error in GPU decoding with PyTorch 1.13 by pyf98 in https://github.com/espnet/espnet/pull/4941
* CHiME-7 DASR MD5 checksum fix for mixer6/train_call by popcornell in https://github.com/espnet/espnet/pull/4942
* Update show_asr_result.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/4943
* CHiME-7 DASR correct development results by popcornell in https://github.com/espnet/espnet/pull/4946
* Fix '__floordiv__ is deprecated' warnings by fujimotos in https://github.com/espnet/espnet/pull/4945
* Added WSLII installation instruction by sw005320 in https://github.com/espnet/espnet/pull/4949
* Update Muskits by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4931
* Set a longer execution time threshold for CI tests that failed with time-outs by ftshijt in https://github.com/espnet/espnet/pull/4962
* Modify data prep for mSUPERB multilingual by guapaQAQ in https://github.com/espnet/espnet/pull/4965
* Add E-Branchformer results in some recipes by pyf98 in https://github.com/espnet/espnet/pull/4958
* Add 'six' as a required Python module by fujimotos in https://github.com/espnet/espnet/pull/4964
* add msuperb linguistic analysis by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4938
* Fix a 'ref_channel'-related issue in espnet2/bin/enh_inference.py by Emrys365 in https://github.com/espnet/espnet/pull/4972
* Add E-Branchformer results in slurp_entity by pyf98 in https://github.com/espnet/espnet/pull/4971
* Add Conformer and E-Branchformer results in fisher_spanish_callhome ASR by pyf98 in https://github.com/espnet/espnet/pull/4976
* [SVS] Add Joint-training by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4977
* Update the chunk iterator for the TSE task by Emrys365 in https://github.com/espnet/espnet/pull/4929
* update msuperb LID scoring script by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4979
* add multilingual+lid lid score generation by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4982
* Add python=3.10 to CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/4627
* LID score v2 by hhhaaahhhaa in https://github.com/espnet/espnet/pull/4983
* Fix ci by kamo-naoyuki in https://github.com/espnet/espnet/pull/4985
* Change to use Ubuntu-latest instead of Ubuntu-18.04 in CI by kamo-naoyuki in https://github.com/espnet/espnet/pull/4986
* Remove six by kamo-naoyuki in https://github.com/espnet/espnet/pull/4988
* Modify format_wav_scp.py to support PCM of uint8, int32, float32, float64, etc. by kamo-naoyuki in https://github.com/espnet/espnet/pull/4997
* Fix Whisper tokenizer CI error by slSeanWU in https://github.com/espnet/espnet/pull/5004
* fix s3prl upstream attribute bug by jwrh in https://github.com/espnet/espnet/pull/5003
* [Recipe] Add iwslt22 low resource speech translation task for egs2 by freddy5566 in https://github.com/espnet/espnet/pull/4994
* Fix typeguard version by silvanocerza in https://github.com/espnet/espnet/pull/5009
* Add .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5011
* Copy Kaldi utils/steps/sid and add a new github action to check the consistency by kamo-naoyuki in https://github.com/espnet/espnet/pull/4998
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5012
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5014
* Modify .pre-commit-config.yaml by kamo-naoyuki in https://github.com/espnet/espnet/pull/5015
* [Tuning] iwslt22 low-resource ST decode configuration tuning by freddy5566 in https://github.com/espnet/espnet/pull/5019
* Modify asr.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5020
* [SVS] Improve visinger by jerryuhoo in https://github.com/espnet/espnet/pull/5022
* Use scripts/utils/print_args.sh instead of pyscripts/utils/print_args.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5025
* Add docstring in extra_path.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5028
* Update installation.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5029
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5030
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5031
* Change bc to python by kamo-naoyuki in https://github.com/espnet/espnet/pull/5032
* Update tools/Makefile and path.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5027
* Fix for format_wav_scp.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5038
* Add execute permission to install_ice_g2p.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5040
* Bug fix of 5025 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5039
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5041
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5042
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5043
* Update README.md by kamo-naoyuki in https://github.com/espnet/espnet/pull/5045
* Fix in gen_task1_data.sh from CHiME7 by boeddeker in https://github.com/espnet/espnet/pull/4953
* Update README.md by eml914 in https://github.com/espnet/espnet/pull/5044
* Add installers/install_ffmpeg.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5046
* Fix broken links reported by 5048 by ShigekiKarita in https://github.com/espnet/espnet/pull/5050
* fix: resolve upgrade issues with praatio 6.0; lock praatio version by timmahrt in https://github.com/espnet/espnet/pull/4978
* Add miniconda in gitignore by pyf98 in https://github.com/espnet/espnet/pull/5052
* CHiME-7 DASR fixes from participants feedback by popcornell in https://github.com/espnet/espnet/pull/4999
* Fix the condition for maxlen warning in beam search by pyf98 in https://github.com/espnet/espnet/pull/5055
* Fixed SQLalchemy version for MFA by Fhrozen in https://github.com/espnet/espnet/pull/5059
* Support Multi-Blank Transducer in Espnet2 by jctian98 in https://github.com/espnet/espnet/pull/4876
* Fix chime7 DASR task1 run.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5060
* CHiME-7 DASR recipe, fix display bug for scenario-wide DER and JER by popcornell in https://github.com/espnet/espnet/pull/5061
* Add test_format_wav_scp_sh.bats by kamo-naoyuki in https://github.com/espnet/espnet/pull/5062
* Update documentation by kamo-naoyuki in https://github.com/espnet/espnet/pull/5063
* Support SOT training on LibriMix data. by pengchengguo in https://github.com/espnet/espnet/pull/4861
* Update check_install.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5066
* Tedlium3 recipe by Some-random in https://github.com/espnet/espnet/pull/5068
* Bug Fix: pretrained s3prl-frontend based models loaded with parameters key mismatch error by simpleoier in https://github.com/espnet/espnet/pull/5074
* Mechanism for multi channels input using multi columns wav.scp by kamo-naoyuki in https://github.com/espnet/espnet/pull/5075
* Clean ML-SUPERB by ftshijt in https://github.com/espnet/espnet/pull/5067
* CHiME-7 DASR: first diarization system based on Pyannote. by popcornell in https://github.com/espnet/espnet/pull/5054
* Chime7-task1 diarization (updated results) by popcornell in https://github.com/espnet/espnet/pull/5088
* Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files by tjysdsg in https://github.com/espnet/espnet/pull/5084
* [SVS] Bug fix: sample rate by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5094
* [SVS] Extend SingingGenerate by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5100
* [pre-commit.ci] pre-commit autoupdate by pre-commit-ci in https://github.com/espnet/espnet/pull/5080
* Add kaldi steps/libs by kamo-naoyuki in https://github.com/espnet/espnet/pull/5106
* Fix sentencepiece version to v0.1.97 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5107
* Drop PyTorch<=1.9 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5111
* Update installers/install_kenlm.sh by kamo-naoyuki in https://github.com/espnet/espnet/pull/5110
* Merge */{scripts,pyscripts} into asr1/{scripts,pyscripts} by kamo-naoyuki in https://github.com/espnet/espnet/pull/5109
* Update ReazonSpeech training recipe for v1.1.0 by fujimotos in https://github.com/espnet/espnet/pull/5114
* Fix typo in espnet2_format_wav_scp.md by boeddeker in https://github.com/espnet/espnet/pull/5116
* Dtype for Speechbrain by Fhrozen in https://github.com/espnet/espnet/pull/5112
* Add test of soundfile for Makefile by kamo-naoyuki in https://github.com/espnet/espnet/pull/5119
* Add lm_inference for conditional text generation by pyf98 in https://github.com/espnet/espnet/pull/5122
* CHiME-7 diarization (updated README.md) by popcornell in https://github.com/espnet/espnet/pull/5102
* [WIP] Update Docker by Fhrozen in https://github.com/espnet/espnet/pull/5128
* Fix several bugs and improve function design in SE by Emrys365 in https://github.com/espnet/espnet/pull/5103
* [SVS] Update XiaoiceSing by A-Quarter-Mile in https://github.com/espnet/espnet/pull/5124
* Add missing filter_scps scripts and note about kaldi for diarization example of mini_librispeech by toto6038 in https://github.com/espnet/espnet/pull/5139
* Bump up the debian version to 11 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5144
* Bug fixing and improvement in SE functions by Emrys365 in https://github.com/espnet/espnet/pull/5143
* Add data augmentation to ReazonSpeech recipe by fujimotos in https://github.com/espnet/espnet/pull/5127
* Update error calculator for transducer by aky15 in https://github.com/espnet/espnet/pull/5097
* Add streaming speech enhancement inference. by LiChenda in https://github.com/espnet/espnet/pull/5049
* Update README.md about debian by sw005320 in https://github.com/espnet/espnet/pull/5146
* Fix issues in split scps by pyf98 in https://github.com/espnet/espnet/pull/5138
* fix 5148 by kamo-naoyuki in https://github.com/espnet/espnet/pull/5149
* fix format_wav_scp.py by kamo-naoyuki in https://github.com/espnet/espnet/pull/5150
* Add more stats to the training log by Emrys365 in https://github.com/espnet/espnet/pull/5147
* update version to 202304 by kan-bayashi in https://github.com/espnet/espnet/pull/5151

New Contributors
* bmilde made their first contribution in https://github.com/espnet/espnet/pull/4922
* guapaQAQ made their first contribution in https://github.com/espnet/espnet/pull/4923
* zqwang7 made their first contribution in https://github.com/espnet/espnet/pull/4864
* hhhaaahhhaa made their first contribution in https://github.com/espnet/espnet/pull/4938
* jwrh made their first contribution in https://github.com/espnet/espnet/pull/5003
* freddy5566 made their first contribution in https://github.com/espnet/espnet/pull/4994
* silvanocerza made their first contribution in https://github.com/espnet/espnet/pull/5009
* pre-commit-ci made their first contribution in https://github.com/espnet/espnet/pull/5041
* boeddeker made their first contribution in https://github.com/espnet/espnet/pull/4953
* timmahrt made their first contribution in https://github.com/espnet/espnet/pull/4978
* Some-random made their first contribution in https://github.com/espnet/espnet/pull/5068
* toto6038 made their first contribution in https://github.com/espnet/espnet/pull/5139

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202301...v.202304

v.202301
What's Changed
* Initialize VISinger branch by ftshijt in https://github.com/espnet/espnet/pull/4683
* Update VISInger branch by ftshijt in https://github.com/espnet/espnet/pull/4705
* Update UASR branch with latest ESPnet functions by ftshijt in https://github.com/espnet/espnet/pull/4752
* Update uasr by ftshijt in https://github.com/espnet/espnet/pull/4770
* Shell scripts for UASR processing by ftshijt in https://github.com/espnet/espnet/pull/4769
* Uasr python scripts by DongjiGao in https://github.com/espnet/espnet/pull/4791
* Update visinger by ftshijt in https://github.com/espnet/espnet/pull/4818
* Update test_custom_transducer.py by sw005320 in https://github.com/espnet/espnet/pull/4826
* Update asr.sh by sw005320 in https://github.com/espnet/espnet/pull/4827
* Fixed pad mode for librosa.stft by Masao-Someki in https://github.com/espnet/espnet/pull/4832
* Add E-Branchformer models in some recipes by pyf98 in https://github.com/espnet/espnet/pull/4833
* Fix data prep in GigaSpeech by pyf98 in https://github.com/espnet/espnet/pull/4836
* time sync decoding for asr by brianyan918 in https://github.com/espnet/espnet/pull/4792
* Remove duplicated VOXFORGE in db.sh (line81 and line157) by pyf98 in https://github.com/espnet/espnet/pull/4840
* Fix argument parsing for non_linguistic_symbols in asr.sh by pyf98 in https://github.com/espnet/espnet/pull/4841
* Add a warning statement when the hypo length equals the max out length. by pengchengguo in https://github.com/espnet/espnet/pull/4843
* Add target speaker extraction (TSE) functions by Emrys365 in https://github.com/espnet/espnet/pull/4823
* Multilingual superb by ftshijt in https://github.com/espnet/espnet/pull/4824
* VISinger by jerryuhoo in https://github.com/espnet/espnet/pull/4689
* Update VISInger to latest by ftshijt in https://github.com/espnet/espnet/pull/4849
* VISinger for singing voice synthesis by ftshijt in https://github.com/espnet/espnet/pull/4848
* Reduce word counts for ESPnet-SE++ Joss paper by neillu23 in https://github.com/espnet/espnet/pull/4844
* Add E-Branchformer configs and models in ASR recipes by pyf98 in https://github.com/espnet/espnet/pull/4837
* Address Muskits updates on README by ftshijt in https://github.com/espnet/espnet/pull/4850
* Minor fix for MSUPERB recipe by ftshijt in https://github.com/espnet/espnet/pull/4851
* Update for the latest changes in the draft (minor changes) by neillu23 in https://github.com/espnet/espnet/pull/4852
* Add E-Branchformer results on Librispeech by kkim-asapp in https://github.com/espnet/espnet/pull/4856
* Update hubert implementation. by simpleoier in https://github.com/espnet/espnet/pull/4747
* VISinger unit test by jerryuhoo in https://github.com/espnet/espnet/pull/4855
* Minor fix to commonvoice espnet1 by ftshijt in https://github.com/espnet/espnet/pull/4862
* [WIP] Add S4 decoder in ESPnet2 by m-koichi in https://github.com/espnet/espnet/pull/4845
* Update hubert feature and acknowledge information in related Readmes. by simpleoier in https://github.com/espnet/espnet/pull/4863
* Generating MFA alignments by Fhrozen in https://github.com/espnet/espnet/pull/4803
* [WIP] EURO uasr scripts by DongjiGao in https://github.com/espnet/espnet/pull/4846
* Update README.md related to ASR architecture by m-koichi in https://github.com/espnet/espnet/pull/4865
* Minor fix to librimix diar recipe by ftshijt in https://github.com/espnet/espnet/pull/4867
* Add Full Whisper Model for Finetuning by slSeanWU in https://github.com/espnet/espnet/pull/4793
* Add torchaudio version check for HuBERT pretraining by simpleoier in https://github.com/espnet/espnet/pull/4872
* add k2 decoder related scripts for EURO by DongjiGao in https://github.com/espnet/espnet/pull/4868
* EURO: small fix (temporarily remove support for nbest_rescoring) by DongjiGao in https://github.com/espnet/espnet/pull/4875
* Add description for Whisper ASR in homepage readme by slSeanWU in https://github.com/espnet/espnet/pull/4877
* Update README.md by eltociear in https://github.com/espnet/espnet/pull/4879
* add explanations to text tokenizing related scripts and remove unused script by DongjiGao in https://github.com/espnet/espnet/pull/4880
* update information about source and our modification for k2 related scripts by DongjiGao in https://github.com/espnet/espnet/pull/4881
* AphasiaBank ASR recipe by tjysdsg in https://github.com/espnet/espnet/pull/4860
* Multilingual SUPERB update by ftshijt in https://github.com/espnet/espnet/pull/4878
* ESPnet Unsupervised ASR (EURO project) by ftshijt in https://github.com/espnet/espnet/pull/4774
* Support ProDiff in TTS by Fhrozen in https://github.com/espnet/espnet/pull/4808
* Add E-Branchformer for GigaSpeech by pyf98 in https://github.com/espnet/espnet/pull/4882
* FLEURS - Auxiliary CTC conditioning tasks by wanchichen in https://github.com/espnet/espnet/pull/4756
* Add python 3.8 requirement for Whisper & update tests by slSeanWU in https://github.com/espnet/espnet/pull/4891
* Update some ASR results in the main readme file by pyf98 in https://github.com/espnet/espnet/pull/4883
* Add Conv2dSubsampling1 module and test it in AphasiaBank ASR recipe by tjysdsg in https://github.com/espnet/espnet/pull/4892
* Support x-vector extractor based on RawNet by Takaaki-Saeki in https://github.com/espnet/espnet/pull/4884
* single language track setups by DanBerrebbi in https://github.com/espnet/espnet/pull/4895
* fixing bug deu1 by DanBerrebbi in https://github.com/espnet/espnet/pull/4900
* Fix dataprep issues based on updated data release via Google form by roshansh-cmu in https://github.com/espnet/espnet/pull/4899
* Add a new EGS2 recipe 'reazonspeech' by fujimotos in https://github.com/espnet/espnet/pull/4885
* Update version to 202301 by kan-bayashi in https://github.com/espnet/espnet/pull/4901

New Contributors
* DongjiGao made their first contribution in https://github.com/espnet/espnet/pull/4791
* jerryuhoo made their first contribution in https://github.com/espnet/espnet/pull/4689
* m-koichi made their first contribution in https://github.com/espnet/espnet/pull/4845
* fujimotos made their first contribution in https://github.com/espnet/espnet/pull/4885

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202211...v.202301

v.202211
What's Changed
* Update muskits update by ftshijt in https://github.com/espnet/espnet/pull/4616
* Muskit installation by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4617
* Sync Muskits branch with Master by ftshijt in https://github.com/espnet/espnet/pull/4640
* Updates on Muskit Migration by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4631
* Update Muskits branch by ftshijt in https://github.com/espnet/espnet/pull/4662
* Add stage 5 & stage 6 by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4649
* Muskit: rename & reorganize features by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4668
* Update Muskits branch by ftshijt in https://github.com/espnet/espnet/pull/4671
* Muskits CI fixing by ftshijt in https://github.com/espnet/espnet/pull/4672
* Muskits CI fix by ftshijt in https://github.com/espnet/espnet/pull/4673
* Muskits - apply isort by ftshijt in https://github.com/espnet/espnet/pull/4677
* Muskits CI fix by ftshijt in https://github.com/espnet/espnet/pull/4678
* Muskit: Add tokenizer by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4676
* Muskits - various fix for CI test by ftshijt in https://github.com/espnet/espnet/pull/4679
* Muskit: add recipe ofuton by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4681
* Muskits (CI fix) by ftshijt in https://github.com/espnet/espnet/pull/4682
* Fix CI issue in muskits by ftshijt in https://github.com/espnet/espnet/pull/4687
* Add dns_icassp22 Speech Enhancement Recipe by slSeanWU in https://github.com/espnet/espnet/pull/4657
* Singing Voice Synthesis Task for ESPnet by ftshijt in https://github.com/espnet/espnet/pull/4670
* Documentation of Tutorial and Muskits by ftshijt in https://github.com/espnet/espnet/pull/4692
* Add tests on MacOS and Windows (only installation) by kamo-naoyuki in https://github.com/espnet/espnet/pull/4669
* Add missing entries in readme by ftshijt in https://github.com/espnet/espnet/pull/4699
* Support ST without texts in source language by sophia1488 in https://github.com/espnet/espnet/pull/4688
* Update ConvInput for Transducer by b-flo in https://github.com/espnet/espnet/pull/4720
* Small changes for standalone Transducer by b-flo in https://github.com/espnet/espnet/pull/4722
* Fix input block tutorial documentation for Transducer by b-flo in https://github.com/espnet/espnet/pull/4724
* Fix HF Pytest Errors by siddhu001 in https://github.com/espnet/espnet/pull/4737
* Update to puebla-nahuatl recipe (some minor fixes) by ftshijt in https://github.com/espnet/espnet/pull/4713
* Add espnet2 TTS recipe on M-AILABS by Takaaki-Saeki in https://github.com/espnet/espnet/pull/4701
* Update outdated enh config files by Emrys365 in https://github.com/espnet/espnet/pull/4719
* add src_sos & src_eos for mt task to address the index out of range w… by simpleoier in https://github.com/espnet/espnet/pull/4736
* Add g2pk_explicit_space tokenizer by jonghwanhyeon in https://github.com/espnet/espnet/pull/4718
* Fix JETS inference with GST (4743) by kan-bayashi in https://github.com/espnet/espnet/pull/4744
* Update on Muskit by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4700
* add fleurs conformer+sc-ctc results by wanchichen in https://github.com/espnet/espnet/pull/4746
* Add recipe for OCR task on IAM handwriting dataset by kenzheng99 in https://github.com/espnet/espnet/pull/4707
* Add Talromur2 recipe by G-Thor in https://github.com/espnet/espnet/pull/4680
* Add multi-channel enh_asr for CHiME-4 by YoshikiMas in https://github.com/espnet/espnet/pull/4706
* chunk_mask error by aky15 in https://github.com/espnet/espnet/pull/4751
* fix wav2vec2 encoder mask bug by simpleoier in https://github.com/espnet/espnet/pull/4772
* Add Hugging Face Transformers Decoder, Tokenizer and their example on SLURP by akreal in https://github.com/espnet/espnet/pull/4099
* [Recipe PR] MELD: Multimodal EmotionLines Dataset by realzza in https://github.com/espnet/espnet/pull/4771
* MultiIRIS follow up by YoshikiMas in https://github.com/espnet/espnet/pull/4765
* Add CATSLU results for XLS-R with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4782
* Add MEDIA and PortMEDIA results for XLS-R with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4794
* Add SLUE-VoxPopuli results for WavLM with mBART-50 by akreal in https://github.com/espnet/espnet/pull/4777
* Follow up for SLURP and CATSLU by akreal in https://github.com/espnet/espnet/pull/4796
* Update README in chime4/enh_asr1 by YoshikiMas in https://github.com/espnet/espnet/pull/4795
* fix parsing token_list by imdanboy in https://github.com/espnet/espnet/pull/4778
* Use torchaudio functions for beamforming related operations in torch 1.12.1+ by Emrys365 in https://github.com/espnet/espnet/pull/4638
* PIT E2E multi-speaker ASR and librimix recipe by simpleoier in https://github.com/espnet/espnet/pull/4753
* Fix an audio format issue in some enh recipes by YoshikiMas in https://github.com/espnet/espnet/pull/4799
* Fixing How2-2000h Data preparation and Seq Length Assert for Longformer Encoder by roshansh-cmu in https://github.com/espnet/espnet/pull/4805
* Adding MFA scripts for LJSpeech by iamanigeeit in https://github.com/espnet/espnet/pull/4801
* fix typo in espnet2_tutorial.md by eltociear in https://github.com/espnet/espnet/pull/4811
* [WIP] E-Branchformer Encoder in ESPnet2 by kkim-asapp in https://github.com/espnet/espnet/pull/4812
* Muskit update by A-Quarter-Mile in https://github.com/espnet/espnet/pull/4783

New Contributors
* A-Quarter-Mile made their first contribution in https://github.com/espnet/espnet/pull/4617
* sophia1488 made their first contribution in https://github.com/espnet/espnet/pull/4688
* kenzheng99 made their first contribution in https://github.com/espnet/espnet/pull/4707
* realzza made their first contribution in https://github.com/espnet/espnet/pull/4771
* iamanigeeit made their first contribution in https://github.com/espnet/espnet/pull/4801
* eltociear made their first contribution in https://github.com/espnet/espnet/pull/4811
* kkim-asapp made their first contribution in https://github.com/espnet/espnet/pull/4812

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202209...v.202211

v.202209
What's Changed
* Add dynamic mixing in the speech separation task. by LiChenda in https://github.com/espnet/espnet/pull/4387
* Added test script and usage for calculate_rtf.py script to ESPnet2 tutorial page by espnetUser in https://github.com/espnet/espnet/pull/4560
* Offline/Online (standalone) ESPnet2 Transducer by b-flo in https://github.com/espnet/espnet/pull/4479
* Unfix matplotlib version by kamo-naoyuki in https://github.com/espnet/espnet/pull/4576
* use torch.finfo for dtype other than float by wenzhe-nrv in https://github.com/espnet/espnet/pull/4584
* Update recipe for slurp-entity by ftshijt in https://github.com/espnet/espnet/pull/4585
* Egs2 aesrc by brianyan918 in https://github.com/espnet/espnet/pull/4592
* update checks for bias in initialization by LiChenda in https://github.com/espnet/espnet/pull/4574
* [WIP] Update to fit the recent update in s3prl. by simpleoier in https://github.com/espnet/espnet/pull/4593
* Unfix numpy version by kamo-naoyuki in https://github.com/espnet/espnet/pull/4598
* Update to fit the recent update in s3prl. by simpleoier in https://github.com/espnet/espnet/pull/4600
* Add improved results on FLEURS dataset by wanchichen in https://github.com/espnet/espnet/pull/4596
* Update mp4_to_wav.sh by jaehyun-ko in https://github.com/espnet/espnet/pull/4605
* Pass output_dir as str to wandb.init() by jonghwanhyeon in https://github.com/espnet/espnet/pull/4607
* Support enh_s2t joint training on multi-speaker data by Emrys365 in https://github.com/espnet/espnet/pull/4566
* Add ASR results for commonvoice zh_TW by slSeanWU in https://github.com/espnet/espnet/pull/4612
* Fix both utt2sid and utt2lid when removing long/short data by jonghwanhyeon in https://github.com/espnet/espnet/pull/4609
* recipe config update by ftshijt in https://github.com/espnet/espnet/pull/4621
* Add pytorch=1.12.1 to CI configurations by kamo-naoyuki in https://github.com/espnet/espnet/pull/4604
* New SLU task by siddhu001 in https://github.com/espnet/espnet/pull/4569
* Joss paper: Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing by neillu23 in https://github.com/espnet/espnet/pull/4620
* Update conformer result of AMI corpus by teinhonglo in https://github.com/espnet/espnet/pull/4629
* Offline/Online Branchformer Transducer by b-flo in https://github.com/espnet/espnet/pull/4582
* Change to install numba using pip instead of conda by kamo-naoyuki in https://github.com/espnet/espnet/pull/4637
* Add MixIT support. It is unsupervised only. Semi-supervised config is not available for now. by simpleoier in https://github.com/espnet/espnet/pull/4619
* Add 2-pass SLU code for FSC Challenge by siddhu001 in https://github.com/espnet/espnet/pull/4636
* CI fix and some other minor recipe fixes by ftshijt in https://github.com/espnet/espnet/pull/4656
* Update the title of plots to be y-label vs x-label by pyf98 in https://github.com/espnet/espnet/pull/4647
* Update VIVOS download link by hieuthi in https://github.com/espnet/espnet/pull/4644
* Add ASR recipe of MAGICDATA mandarin read speech by tjysdsg in https://github.com/espnet/espnet/pull/4635
* Amend to CI fix by ftshijt in https://github.com/espnet/espnet/pull/4663
* qasr update by massabaali7 in https://github.com/espnet/espnet/pull/4642
* Open_li110 for large-scale multilingual speech by ftshijt in https://github.com/espnet/espnet/pull/4408
* Fix the path of calculate_rtf.py by sw005320 in https://github.com/espnet/espnet/pull/4660
* Fix importlib-metadata version by kan-bayashi in https://github.com/espnet/espnet/pull/4686
* Cmu arctic tts pretrain finetune by soumimaiti in https://github.com/espnet/espnet/pull/4456
* updated version to 202209 by kan-bayashi in https://github.com/espnet/espnet/pull/4685

New Contributors
* wenzhe-nrv made their first contribution in https://github.com/espnet/espnet/pull/4584
* jaehyun-ko made their first contribution in https://github.com/espnet/espnet/pull/4605
* jonghwanhyeon made their first contribution in https://github.com/espnet/espnet/pull/4607
* slSeanWU made their first contribution in https://github.com/espnet/espnet/pull/4612
* massabaali7 made their first contribution in https://github.com/espnet/espnet/pull/4642
* soumimaiti made their first contribution in https://github.com/espnet/espnet/pull/4456

**Full Changelog**: https://github.com/espnet/espnet/compare/v.202207...v.202209

v.202207
New Features
- [**New Features**][**ESPnet1**][**ASR**] Add DDP support for v1 ASR training. 4430 by lazykyama
- [**New Features**][**ESPnet2**] Support tensorboard graph 4418 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Branchformer Encoder in ESPnet2 4400 by pyf98
- [**New Features**][**ESPnet2**][**Diarization**][**SE**] enh_diar joint model 4339 by YushiUeda
- [**New Features**][**ESPnet2**][**ESPnet1**] Calculate RTF and latency in espnet2 4382 by espnetUser
- [**New Features**][**ESPnet2**][**ESPnet1**][**SE**] Add EnhPreprocessor for Speech Enhancement 4321 by Emrys365
- [**New Features**][**ESPnet2**][**SE**] Add DPTNet and WarmupStepLR scheduler 4449 by Emrys365
- [**New Features**][**ESPnet2**][**SE**] Add support for calculating losses on noise and dereverberated signals 4476 by Emrys365

Recipe
- [**Recipe**][**ESPnet2**] Aishell-2 GPU info 4501 by jctian98
- [**Recipe**][**ESPnet2**] Fix librispeech default path to signify auto download 4517 by karthik19967829
- [**Recipe**][**ESPnet2**] Recipe fix for PueblaNahuatl Recipe 4522 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Aishell-2 ASR Recipe for Espnet2 4451 by jctian98
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add AmericasNLP 2022 baselines 4428 by akreal
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**Installation**] FLEURS ASR Recipe for ESPnet2 4455 by wanchichen
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**README**] tedx_spanish_corpus egs2 recipe 4523 by jessicah25
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**][**SE**] Adding L3DAS22 Task1 model to ESPnet-SE 3994 by popcornell
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ST**] Must_C v1 and v2 in egs2 4306 by brianyan918
- [**Recipe**][**ESPnet2**][**README**] Dcase task1 Baseline 4317 by siddhu001
- [**Recipe**][**ESPnet2**][**README**] Report Aishell-2 Transducer results 4489 by jctian98
- [**Recipe**][**ESPnet2**][**README**] Update language codes in AmericasNLP 2022 baseline 4441 by akreal
- [**Recipe**][**ESPnet2**][**README**] Vox populi baseline 4478 by siddhu001
- [**Recipe**][**ESPnet2**][**SE**] L3DAS22 enhancement recipe 4269 by neillu23
- [**Recipe**][**ESPnet2**][**SE**] Update notes in the recipes for DNS challenges 4433 by YoshikiMas
- [**Recipe**][**ESPnet2**][**SE**][**SLU**][**ST**] LT-Spatialized and SLURP-Spatialized combined enhancement recipe 4268 by neillu23
- [**Recipe**][**ESPnet2**][**ST**] Add moses check for ST recipes 4417 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**] Add talromur recipe 4379 by G-Thor
- [**Recipe**][**ESPnet2**][**TTS**] Fix for issue 4401 4402 by G-Thor
- [**Recipe**][**ESPnet2**][**TTS**] add pre-trained model jets in the recipe of ljspeech, kss 4406 by imdanboy

Bugfix
- [**Bugfix**][**ESPnet1**] fix the corrupted pretrained model 4490 by wentaoxandry
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix an4 URL 4427 by pyf98
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**RNNT**] Fix mAES with big vocab size 4312 by b-flo
- [**Bugfix**][**ESPnet2**] Adding __init__.py to espnet2/diar/layers and espnet2/diar/separator 4470 by cycentum
- [**Bugfix**][**ESPnet2**] Fix tensorboard-graph creation for multi gpu mode 4431 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Update char_tokenizer.py 4499 by xiabingquan
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**ASR**][**LM**][**MT**][**TTS**] Fix Transducer LM fusion and add Logging for Transducer inference 4327 by chintu619
- [**Bugfix**][**ESPnet2**][**SE**] Fix a bug in enh unit test 4435 by Emrys365

Enhancement
- [**Enhancement**][**ESPnet2**] Optionize graph creation 4551 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**Installation**][**TTS**] Add icelandic g2p 4384 by G-Thor
- [**Enhancement**][**ESPnet2**][**SE**] Add support of test-only criterions after each epoch 4381 by Emrys365
- [**Enhancement**][**ESPnet2**][**SSL**] raise more useful error in espnet2/asr/frontend/s3prl.py if s3prl is not installed 4480 by popcornell
- [**Enhancement**][**ESPnet2**][**TTS**] Add JETS AlignmentModule in calculate_all_attentions.py 4446 by seastar105

Refactoring
- [**Refactoring**][**ESPnet1**] Refactoring 'is_prefix' function 4530 by jhlee9010
- [**Refactoring**][**ESPnet2**][**ASR**] Zero_infinity option for ctc loss 4415 by kamo-naoyuki

Others
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**] Remove the version restriction for numpy 4419 by kamo-naoyuki
- [**CI**][**ESPnet2**] Changed to install espnet from wheel in the test_import CI test 4471 by kamo-naoyuki
- [**CI**][**Installation**] Temporarily fixed numpy version 4464 by kamo-naoyuki
- [**Documentation**] Add notes on batch size and num of GPUs in ESPnet2 documentation 4436 by pyf98
- [**Documentation**][**ESPnet1**] Update decoder.py 4322 by sw005320
- [**Documentation**][**ESPnet2**] Add a note to follow the installation instructions 4477 by akreal

Acknowledgements
Special thanks to Emrys365, G-Thor, YoshikiMas, YushiUeda, akreal, b-flo, brianyan918, chintu619, cycentum, espnetUser, ftshijt, imdanboy, jctian98, jessicah25, jhlee9010, kamo-naoyuki, kan-bayashi, karthik19967829, lazykyama, neillu23, popcornell, pyf98, seastar105, siddhu001, sw005320, wanchichen, wentaoxandry, xiabingquan.

v.202205

New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**] Add quantization in ESPnet2 for asr inference 4349 by pyf98
- [**New Features**][**ESPnet2**][**SE**] Add svoice recipe for wsj0-2mix speech separation 4257 by nateanl
- [**New Features**][**ESPnet2**][**SE**] Merge Deep Clustering and Deep Attractor Network to enh separator 4110 by earthmanylf
- [**New Features**][**ESPnet2**][**SE**] Some improvements to current enh functions 4251 by Emrys365
- [**New Features**][**ESPnet2**][**SE**][**Installation**] Import fast_bss_eval and update some time-domain losses for enh task 4256 by LiChenda
- [**New Features**][**ESPnet2**][**TTS**] add e2e tts model: JETS 4364 by imdanboy

Bugfix
- [**Bugfix**][**ESPnet1**] Fix minimum input length for Conv2dSubsampling2 in check_short_utt 4378 by akreal
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Minor fixes for the intermediate loss usage and Mask-CTC decoding 4374 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**] Fix 4396 4398 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix a bug in utterance_mvn 4304 by Emrys365
- [**Bugfix**][**ESPnet2**] Minor fix for Mask-CTC forward function 4347 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**] Wandb Minor Fix for Model Resume 4329 by roshansh-cmu
- [**Bugfix**][**ESPnet2**] fix the enh_s2t_task argument in espnet2/bin/st_inference.py 4323 by simpleoier
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] fix bug in mt/st templates for having separate token lists 4149 by brianyan918
- [**Bugfix**][**ESPnet2**][**Recipe**] Fix aishell3 data preparation script 4277 by LanceaKing
- [**Bugfix**][**ESPnet2**][**SE**] Fix a bug in stats aggregation when PITSolver is used 4343 by Emrys365
- [**Bugfix**][**ESPnet2**][**SE**] fix for enhancement model loading compatibility 4259 by LiChenda
- [**Bugfix**][**ESPnet2**][**ST**] bug fixes in ST recipes 4341 by chintu619
- [**Bugfix**][**ESPnet2**][**TTS**] Fix optional data names for TTS 4355 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] fix a bug in Mandarin pypinyin_g2p_phone 4206 by WeiGodHorse
- [**Bugfix**][**ESPnet2**][**TTS**] fix loss = NaN in VITS with mixed precision 4356 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**streaming**] Add unit test to streaming ASR inference 4352 by espnetUser
- [**Bugfix**][**Installation**] fix s3prl install by using legacy version. Temporary solution. 4399 by simpleoier
- [**Bugfix**][**README**] Fix typo 4338 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**][**SE**][**SLU**][**ST**] enh_s2t joint model 4226 by simpleoier
- [**Enhancement**][**ESPnet2**] Add progress bar to phonemization 4320 by G-Thor
- [**Enhancement**][**ESPnet2**][**MT**] Update show_translation_result.sh to show all decoding results under the given exp directory 4330 by pyf98

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Accented English Speech Recognition Challenge 2020 recipe (AESRC2020) 3898 by brianyan918
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**][**Recipe**] Add MediaSpeech ASR recipe 4183 by AshibaWu
- [**Recipe**][**ESPnet2**][**ASR**][**README**] recipe for Microsoft speech corpus for Indian Languages 4191 by navya-yarrabelly
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Accented French Openslr57 ASR recipe (ESPnet2) (part of Homework3 MNLP) 4280 by DanBerrebbi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Mask-CTC results 4180 by YosukeHiguchi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add ml_openslr63 ASR recipe 4173 by bharaniuk
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Adding new recipe for Burmese (OpenSLR80) 4182 by JainSameer06
- [**Recipe**][**ESPnet2**][**ASR**][**README**] add chime6 recipe 4332 by simpleoier
- [**Recipe**][**ESPnet2**][**ASR**][**SE**][**README**] add egs2/chime4/enh_asr1 recipe and results 4316 by simpleoier
- [**Recipe**][**ESPnet2**][**README**][**RNNT**] updated librispeech-asr with rnn-t results 4281 by chintu619
- [**Recipe**][**ESPnet2**][**README**][**SE**] 2021 Clarity Challenge recipe 4210 by popcornell
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add AISHELL-4 ENH recipe 4249 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add ConferencingSpeech 2021 recipe to egs2 4192 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add ICASSP2021 DNS Challenge 2 recipe 4253 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add INTERSPEECH 2021 DNS Challenge 3 recipe 4238 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Add results of ICASSP2021 DNS Challenge 2 recipe 4309 by YoshikiMas
- [**Recipe**][**ESPnet2**][**README**][**SE**] Rename egs2/clarity21/enh_2021 to egs2/clarity21/enh1 4328 by Emrys365
- [**Recipe**][**ESPnet2**][**README**][**SE**] add convtasnet recipe for dns_ins20 4314 by muqiaoy
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Harpervalley recipe 4315 by YushiUeda
- [**Recipe**][**ESPnet2**][**README**][**SLU**] SLUE Voxpopuli base recipe 4262 by siddhu001
- [**Recipe**][**ESPnet2**][**README**][**ST**] CoVOST2 recipes 4300 by ftshijt
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Update SLU results for ICASSP 4283 by siddhu001

Others
- [**CI**][**Docker**] Github Action Trigger Docker Build 4295 by Fhrozen
- [**CI**][**Docker**] Github Action for Docker build 4219 by Fhrozen
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**README**] Add isort checking to the CI tests 4372 by kamo-naoyuki
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**README**][**mergify**] Add pytorch=1.10.2 and 1.11.0 to ci configurations 4348 by kamo-naoyuki
- [**CI**][**ESPnet2**][**ASR**][**SE**] add integration test and fix the decoding in enh_asr and enh_st 4310 by simpleoier
- [**CI**][**ESPnet2**][**New Features**][**SLU**][**ST**][**streaming**] Add streaming ST/SLU 4243 by D-Keqi
- [**CI**][**ESPnet2**][**ST**] Add Test Functions for ST Train and Inference 4324 by ftshijt
- [**CI**][**Installation**] update install_pesq.sh 4265 by LiChenda
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Minor update for JETS 4369 by kan-bayashi
- [**Documentation**][**README**] Change the order of README 4289 by ftshijt
- [**Documentation**][**README**] Update README.md 4284 by sw005320

Acknowledgements
Special thanks to AshibaWu, D-Keqi, DanBerrebbi, Emrys365, Fhrozen, G-Thor, JainSameer06, LanceaKing, LiChenda, WeiGodHorse, YoshikiMas, YosukeHiguchi, YushiUeda, akreal, bharaniuk, brianyan918, chintu619, earthmanylf, espnetUser, ftshijt, imdanboy, kamo-naoyuki, kan-bayashi, muqiaoy, nateanl, navya-yarrabelly, popcornell, pyf98, roshansh-cmu, siddhu001, simpleoier, sw005320.

v.202204
News
From this version, we decided to use date-based versioning, e.g., `v.202204`.
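
Because older releases keep tags such as `v.0.10.6`, tooling that orders releases has to handle both schemes. The following is a minimal sketch of one way to do that, not part of ESPnet: `release_key` is a hypothetical helper, and it assumes that every date-based tag is newer than every `v.0.x.y` tag and that the dot after `v` is optional.

```python
import re

# Hypothetical helper, not part of ESPnet: orders old and new release tags.
_DATE_TAG = re.compile(r"^v\.?(\d{4})(\d{2})$")            # e.g. v.202204
_NUMERIC_TAG = re.compile(r"^v\.?(\d+)\.(\d+)\.(\d+)$")    # e.g. v.0.10.6


def release_key(tag):
    """Return a tuple that sorts release tags chronologically.

    Assumption: date-based tags always come after numeric tags,
    so they get a larger leading element.
    """
    match = _DATE_TAG.match(tag)
    if match:
        year, month = int(match.group(1)), int(match.group(2))
        if not 1 <= month <= 12:
            raise ValueError(f"invalid month in release tag: {tag!r}")
        return (1, year, month, 0)
    match = _NUMERIC_TAG.match(tag)
    if match:
        return (0,) + tuple(int(part) for part in match.groups())
    raise ValueError(f"unrecognized release tag: {tag!r}")


tags = ["v.202204", "v.0.10.6", "v.0.10.5", "v.202402"]
ordered = sorted(tags, key=release_key)
```

Sorting with `key=release_key` places `v.0.10.5` and `v.0.10.6` before `v.202204`, which matches the order in which the releases were published.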

New Features
- [**New Features**][**ESPnet1**] added learnable fourier features 4029 by popcornell
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**] Restricted Self Attention for E2E Speech Summarization 4071 by roshansh-cmu
- [**New Features**][**ESPnet1**][**Installation**][**README**] add lrs avsr recipe 4104 by wentaoxandry
- [**New Features**][**ESPnet1**][**README**] add lip reading sentences dataset code 4074 by wentaoxandry
- [**New Features**][**ESPnet2**][**ASR**] [ESPnet2] Intermediate/Self-conditioned CTC 4084 by YosukeHiguchi
- [**New Features**][**ESPnet2**][**ASR**] [WIP] [ESPnet2] Mask-CTC 4158 by YosukeHiguchi
- [**New Features**][**ESPnet2**][**ASR**][**README**] Add stochastic depth to conformer and share results on LibriSpeech 960h 4142 by pyf98
- [**New Features**][**ESPnet2**][**MT**] MT task for espnet2 with IWSLT14 recipe 4111 by siddalmia
- [**New Features**][**ESPnet2**][**README**][**SE**] Add DC-CRN complex masking and spectral mapping approach for speech enhancement 4127 by Emrys365
- [**New Features**][**ESPnet2**][**README**][**SE**] Add DCCRN separator 4097 by Johnson-Lsx
- [**New Features**][**ESPnet2**][**README**][**SE**] Add a new separator for speech enhancement/separation tasks 4062 by LiChenda
- [**New Features**][**ESPnet2**][**README**][**SE**] Add iFaSNet for enhancement/separation tasks. 4130 by LiChenda
- [**New Features**][**ESPnet2**][**SE**] Refactor DNN_Beamformer in espnet2 and add new beamformers 4082 by Emrys365


Enhancement
- [**Enhancement**][**ESPnet2**] Add an optional suffix to the averaged model file name 4067 by pyf98
- [**Enhancement**][**ESPnet2**] Update perturb_data_dir_speed.sh 4091 by AmirHussein96
- [**Enhancement**][**ESPnet2**][**ASR**] Add tests for Intermediate/Self-conditioned CTC 4117 by YosukeHiguchi
- [**Enhancement**][**ESPnet2**][**TTS**] Add option to use norm. feats over denorm. 4250 by G-Thor

Recipe
- [**Recipe**][**ESPnet1**][**RNNT**] [ESPNET1] Add the results of conformer-transducer for Librispeech 4080 by eesungkim
- [**Recipe**][**ESPnet2**][**ASR**] Add ASR recipe for VCTK dataset based on TTS's dataprep. 4088 by kashikashi
- [**Recipe**][**ESPnet2**][**ASR**] Add new conformer config with hop length 160 for LibriSpeech 960h 4162 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**] Add new zh_openslr38 ASR recipe 4181 by cuichenx
- [**Recipe**][**ESPnet2**][**ASR**] Add transformer results for LibriSpeech 100h 4089 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**] Added Marathi OpenSLR 64 recipe 4179 by SujaySKumar
- [**Recipe**][**ESPnet2**][**ASR**] Added recipe for Microsoft Speech Corpus (Indian languages) 4194 by chintu619
- [**Recipe**][**ESPnet2**][**ASR**] Automatic lyric recognition Recipe 4129 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] ESPNET - LRS3 Recipe 4101 by gdebayan
- [**Recipe**][**ESPnet2**][**ASR**] bengali asr model with no finetuning 4047 by dzeinali
- [**Recipe**][**ESPnet2**][**MT**] IWSLT'14 Results using ESPnet2-MT 4132 by pyf98
- [**Recipe**][**ESPnet2**][**README**] Mandarin ISO id should be CMN instead of ZHO 4125 by xinjli
- [**Recipe**][**ESPnet2**][**README**] Update README.md 4037 by dzeinali
- [**Recipe**][**ESPnet2**][**README**] Update README.md 4121 by dzeinali
- [**Recipe**][**ESPnet2**][**README**] Update README.md for How2 2000h ASR,SUM 4155 by roshansh-cmu
- [**Recipe**][**ESPnet2**][**RNNT**] Create decode_rnnt_conformer.yaml 4058 by sw005320
- [**Recipe**][**ESPnet2**][**RNNT**] Create train_rnnt_conformer.yaml 4057 by sw005320
- [**Recipe**][**ESPnet2**][**SLU**] Add IEMOCAP results and configs 4100 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**] Add new config and support for computing WER in SLUE-VoxCeleb 4152 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**] Add sentiment data preparation for IEMOCAP 4065 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**] ESPnet2 swbd_sentiment recipe 4134 by YushiUeda
- [**Recipe**][**ESPnet2**][**ST**] egs2/iwslt22_dialect 4013 by brianyan918

Bugfix
- [**Bugfix**][**CI**][**ESPnet2**] Fix CI test failures related to torch_complex 0.4.0 4112 by Emrys365
- [**Bugfix**][**CI**][**Installation**] fix doc ci by pinning jinja version 4239 by xinjli
- [**Bugfix**][**ESPnet2**] Fix n-gram decoding 4168 by sw005320
- [**Bugfix**][**ESPnet2**] bug fixes and efficient train/dev split in data prep of Microsoft Indian Languages recipe 4196 by chintu619
- [**Bugfix**][**ESPnet2**] fix errors in configs of librispeech ssl frontends 4098 by simpleoier
- [**Bugfix**][**ESPnet2**][**ASR**][**ST**] [bug patch] egs2/iwslt22_dialect 4049 by brianyan918
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] Fix joint tokenization in st.sh 4143 by pyf98
- [**Bugfix**][**ESPnet2**][**MT**][**ST**] scoring fixes MT and ST 4146 by siddalmia
- [**Bugfix**][**ESPnet2**][**TTS**] Fix speaker normalization 4229 by LanceaKing
- [**Bugfix**][**Installation**] set gtn version 4122 by brianyan918
- [**Bugfix**][**ESPnet1**][**ESPnet2**] minor fixes in ST in espnet2 4056 by siddalmia

Others
- [**CI**] Simplify vocoder compatibility test 4061 by kan-bayashi
- [**CI**][**Documentation**] Fix notebook in the official doc. 4171 by ShigekiKarita
- [**Docker**] Docker Updates 4064 by Fhrozen
- [**Documentation**] Add a checklist for PRs on recipe 4053 by ftshijt
- [**Documentation**] README Update for E2E Speech Summarization 4071 4150 by roshansh-cmu
- [**Documentation**] Update the example PyTorch version in Installation doc 4116 by pyf98
- [**Documentation**] [documentation] fix minor typo in installation.md 4164 by JDongian
- [**Documentation**][**ESPnet1**] fix typo 4044 by ooyamatakehisa
- [**Documentation**][**ESPnet1**][**ESPnet2**][**ASR**] Add Huggingface-cli usage 4027 by karthik19967829

Acknowledgements
Special thanks to AmirHussein96, Emrys365, Fhrozen, G-Thor, JDongian, Johnson-Lsx, LanceaKing, LiChenda, ShigekiKarita, SujaySKumar, YosukeHiguchi, YushiUeda, brianyan918, chintu619, cuichenx, dzeinali, eesungkim, ftshijt, gdebayan, kan-bayashi, karthik19967829, kashikashi, ooyamatakehisa, popcornell, pyf98, roshansh-cmu, siddalmia, siddhu001, simpleoier, sw005320, wentaoxandry, xinjli.

v.0.10.6
New Features
- [**New Features**][**ESPnet2**][**TTS**][**Installation**][**README**] [TTS] Support python-based toolkit for xvector extractors 4016 by Fhrozen
- [**New Features**][**ESPnet2**] Add SpecAug2 which supports variable maximum width in time masking 3902 by pyf98

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Add librispeech-100h recipe 3997 by YosukeHiguchi
- [**Recipe**][**ESPnet1**][**ASR**] Update egs/librispeech_100 4036 by YosukeHiguchi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Scoring Mandarin / English separately for the SEAME corpus 3976 by vectominist
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update LibriSpeech Pretrained models with SSLRs: results and huggingf… 3979 by simpleoier
- [**Recipe**][**ESPnet2**][**ASR**][**README**][**ST**] Speech translation framework (merging into master) 3987 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**TTS**] Update two recipes (googlei18n and hub4_spanish) 3895 by ftshijt
- [**Recipe**][**ESPnet2**][**SLU**][**README**] updated the results of Slue voxceleb 3929 by siddhu001
- [**Recipe**][**ESPnet2**][**ST**] Update the default setting for st 3993 by ftshijt

Bugfix
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix bug for Conformer-T 4020 by YosukeHiguchi
- [**Bugfix**][**ESPnet2**][**Diarization**] Diarization: fix for convolutional input layer in the encoder 3957 by alumae
- [**Bugfix**][**ESPnet2**][**Diarization**] Two fixes to diarization evaluation scripts 3938 by alumae
- [**Bugfix**][**ESPnet2**][**Diarization**][**Recipe**] Fix issues in EEND-EDA & add Librimix_diar recipe 3900 by YushiUeda
- [**Bugfix**][**ESPnet2**][**ESPnet1**][**ASR**][**streaming**] streaming conformer bugfix 4025 by jeon30c
- [**Bugfix**][**ESPnet2**][**LM**] Bugfix for espnet2 ngram 4002 by yaochie
- [**Bugfix**][**ESPnet2**][**RNNT**] espnet2 asr inference bugfix for transducer 3943 by jeon30c
- [**Bugfix**][**ESPnet2**][**ST**] Bugfix for ST scoring 3972 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet2**] cleaned tensorboard and stats logging for espnet2 3910 by siddalmia
- [**Enhancement**][**ESPnet2**][**Diarization**] Add test codes for diarization 3953 by YushiUeda
- [**Enhancement**][**ESPnet2**][**streaming**] Add reference for streaming ASR 4014 by D-Keqi

Others
- [**CI**] remove the support of pytorch 1.3.1 4038 by sw005320
- [**CI**][**ESPnet1**][**ESPnet2**] fix ci for librosa update 4043 by ftshijt
- [**CI**][**Installation**] Fix numpy version 3965 by kan-bayashi
- [**CI**][**Installation**] temporary fixed pypinyin version 3995 by kan-bayashi
- [**Documentation**][**ESPnet1**][**ESPnet2**][**README**][**SLU**] Add Sinhala E2E SLU Recipe 3890 by karthik19967829
- [**Documentation**][**README**] Update README.md 4039 by sw005320
- [**ESPnet2**][**README**] Update README.md 3931 by sw005320
- [**ESPnet2**][**README**][**TTS**][**Typo**] Fix typo in README.md 4024 by kan-bayashi

Acknowledgements
Special thanks to D-Keqi, Fhrozen, YosukeHiguchi, YushiUeda, alumae, ftshijt, jeon30c, kan-bayashi, karthik19967829, pyf98, siddalmia, siddhu001, simpleoier, sw005320, vectominist, yaochie.

Full Changelog
https://github.com/espnet/espnet/compare/v.0.10.5...v.0.10.6

v.0.10.5
New Features
- [**New Features**][**ESPnet1**][**ASR**] Implement self-conditioned CTC 3856 by komatta-san
- [**New Features**][**ESPnet2**][**ASR**][**CI**][**Installation**] GTN CTC for ESPnet2 3778 by brianyan918
- [**New Features**][**ESPnet2**][**ASR**][**Refactoring**] [ESPnet2] Transducer 2533 by b-flo
- [**New Features**][**ESPnet2**][**README**][**Recipe**] Frontends fusion (any type, any number, linear fusion only for now) for ASR in espnet2 3824 by DanBerrebbi
- [**New Features**][**ESPnet2**][**SE**] Refactor loss computation in enhancement tasks. 3838 by LiChenda

Recipe
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] updated the results of aidatatang_200zh 3925 by sw005320
- [**Recipe**][**ESPnet1**][**VC**] Various fixes of voice conversion recipes 3800 by unilight
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Expanding egs2 of Tedlium2 3795 by D-Keqi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update an4 config 3913 by pyf98
- [**Recipe**][**ESPnet2**][**ASR**][**README**] aidatatang_200zh recipe 3892 by sw005320
- [**Recipe**][**ESPnet2**][**README**] Update README.md 3881 by daisylab
- [**Recipe**][**ESPnet2**][**README**] Update egs2/TEMPLATE/README.md 3793 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**README**] fix readme 3827 by seastar105
- [**Recipe**][**ESPnet2**][**README**][**Recipe**] Add ASR Recipe: Primewords_Chinese 3903 by pyf98
- [**Recipe**][**ESPnet2**][**README**][**Recipe**] Update MISP challenge ASR baseline and add AVSR baseline 3819 by neillu23
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Fsc Maseeval scripts 3769 by siddhu001
- [**Recipe**][**ESPnet2**][**README**][**SLU**] Update Google Speechcommands (SLU recipe) 3915 by pyf98
- [**Recipe**][**ESPnet2**][**README**][**TTS**] ESPnet2 ARCTIC TTS 3791 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**README**][**TTS**] Update README and add missing config 3917 by kan-bayashi
- [**Recipe**][**ESPnet2**][**Recipe**][**SLU**] Slue voxceleb Sentiment Analysis 3894 by siddhu001
- [**Recipe**][**ESPnet2**][**SE**] modified data type in enh.sh 3768 by simpleoier

Bugfix
- [**Bugfix**][**ESPnet1**][**README**][**RNNT**] Fix cache for Transducer search strategies + doc 3869 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix recombine_hyps 3908 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] fix rnn-t ALSD beam search index bug 3794 by maxwellzh
- [**Bugfix**][**ESPnet1**][**RNNT**] fix the sort order in select_k_expansions() 3864 by freewym
- [**Bugfix**][**ESPnet2**] Bug fix for .gitignore and db fill up for CMU cluster 3891 by siddalmia
- [**Bugfix**][**ESPnet2**] Fix 3716 3849 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Merging asr_streaming.sh into asr.sh for laborotv egs2 3868 by D-Keqi
- [**Bugfix**][**ESPnet2**] add init.py 3928 by sw005320
- [**Bugfix**][**ESPnet2**] fix small problem that used before defined in step 12 3871 by simpleoier
- [**Bugfix**][**ESPnet2**] fix stft olens when win_lengths is not equal to n_fft 3812 by IceCreamWW
- [**Bugfix**][**ESPnet2**] update s3prl frontend w.r.t. recent modification in s3prl interface 3839 by simpleoier
- [**Bugfix**][**ESPnet2**][**TTS**] bugfix lang2lid in tts.sh 3906 by imdanboy
- [**Bugfix**][**Installation**] Fix 3783 3786 by kamo-naoyuki

Others
- [**CI**] Fix G2P test failure in CI due to the dict update 3848 by kan-bayashi
- [**CI**][**Documentation**][**ESPnet1**][**ESPnet2**] Fixing issues about streaming Transformer/Conformer training 3880 by D-Keqi
- [**CI**][**ESPnet1**][**ESPnet2**][**Installation**][**New Features**][**README**] nbest rescoring with k2 3567 by glynpu
- [**Documentation**][**README**] Update README.md 3893 by sw005320
- [**Documentation**][**README**][**SSL**] Add more docs about s3prl frontend 3796 by simpleoier
- [**Documentation**][**README**][**streaming**] Updating main README.md about streaming transformer 3855 by D-Keqi
- [**ESPnet1**][**RNNT**] Add exception for conformer decoder 3801 by b-flo
- [**ESPnet2**][**README**][**Typo**] Fix typo in README.md 3852 by kan-bayashi
- [**ESPnet2**][**SE**] add eps in beam-forming reference channel selection 3904 by LiChenda
- [**ESPnet2**][**SLU**] Add unit test for score_intent.py 3759 by siddhu001
- [**ESPnet2**][**ST**] Speech Translation Update 3860 by ftshijt
- [**ESPnet2**][**TTS**][**Installation**][**Refactoring**] Refactor Phonemizer-based G2P 3916 by kan-bayashi

Acknowledgements
Special thanks to D-Keqi, DanBerrebbi, IceCreamWW, LiChenda, b-flo, brianyan918, daisylab, freewym, ftshijt, glynpu, imdanboy, kamo-naoyuki, kan-bayashi, komatta-san, maxwellzh, neillu23, peter-yh-wu, pyf98, seastar105, siddalmia, siddhu001, simpleoier, sw005320, unilight.

v.0.10.4
New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] The code for Emiru's real streaming Transformer 3614 by D-Keqi
- [**New Features**][**ESPnet1**][**MT**][**ST**][**Installation**] Support sacreBLEU 3698 by hirofumi0810
- [**New Features**][**ESPnet2**][**ST**] ESPNet2 speech translation 3587 by ftshijt

Enhancement
- [**Enhancement**][**ESPnet1**][**ASR**] Fix e2e_asr_maskctc.py to make RTF computable 3634 by eddiewng
- [**Enhancement**][**ESPnet2**][**Installation**][**README**] HuggingFace Upload support for ESPnet2 tasks [cont.] 3677 by Fhrozen
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**] Add korean_jaso tokenizer and korean_cleaner 3588 by windtoker

Bugfix
- [**Bugfix**][**ESPnet1**][**ASR**][**RNNT**] Fix quantization for Transducer 3616 by b-flo
- [**Bugfix**][**ESPnet2**][**ASR**][**Recipe**] added download test set, small modifications for path of aishell 3663 by teinhonglo
- [**Bugfix**][**ESPnet2**] Do stft with librosa when neither MKL nor CUDA is available. 3668 by CTinRay
- [**Bugfix**][**ESPnet2**] [bug fixed] allow adding noise independently of rir, bug fixed in 3692 by ranchlai
- [**Bugfix**][**ESPnet2**][**Recipe**] Create Symlinks for 1-channel/2-channel tracks in chime4 3699 by neillu23
- [**Bugfix**][**ESPnet2**][**Recipe**] Fix SWBD Data Prep Bug 3742 by brianyan918

Recipe
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**] Add CoVoST2 recipe 3720 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**][**README**] MISP2021 E2E ASR Baseline 3738 by neillu23
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Wenetspeech 3686 by pengchengguo
- [**Recipe**][**ESPnet2**][**SLU**] Add snips hubert feature training 3619 by yuekaizhang
- [**Recipe**][**ESPnet2**][**SLU**] Make scoring part more general 3715 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add ESPnet-SLU Recipe: Google Speech Commands 3693 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add an ESPnet2 recipe for the Grabo SLU dataset 3669 by pyf98
- [**Recipe**][**ESPnet2**][**SLU**][**README**] CATSLU-MAPS: Added recipe 3685 by SujaySKumar
- [**Recipe**][**ESPnet2**][**SLU**][**README**] ESPnet2 Japanese dialogue act classification recipe 3667 by YushiUeda
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Slurp SLU with bpe encoded transcripts 3674 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Slurp entity classification 3739 by siddhu001
- [**Recipe**][**ESPnet2**][**SSL**] Add eps in acc computation of HuBERT model 3713 by simpleoier
- [**Recipe**][**ESPnet2**][**TTS**] Change the timing of srctexts creation 3734 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update kss recipe with VITS configuration 3660 by windtoker

Others
- [**CI**][**ESPnet2**][**Installation**] Fix tests in CI 3700 by kan-bayashi
- [**CI**][**ESPnet2**][**SLU**][**README**] Add Hubert pretrained ASR in FSC SLU 3653 by siddhu001
- [**CI**][**Installation**] Minor update for CI 3656 by kan-bayashi
- [**Documentation**][**ESPnet1**][**README**][**RNNT**][**Refactoring**] Refactor custom Transducer build 3697 by b-flo
- [**Documentation**][**ESPnet2**][**README**] Hugging Face support - Doc [cont.] 3709 by Fhrozen
- [**Installation**] Update pyopenjtalk version 3733 by kan-bayashi
- [**README**] Huggingface spaces ESPnet2-TTS web demo 3673 by AK391
- [**README**][**ESPnet2**] Add Huggingface model documentation 3714 by siddhu001
- [**README**][**ESPnet2**] Fix readme 3750 by takenori-y


Acknowledgements
Special thanks to AK391, CTinRay, D-Keqi, Fhrozen, SujaySKumar, YushiUeda, b-flo, brianyan918, eddiewng, ftshijt, hirofumi0810, kan-bayashi, neillu23, pengchengguo, pyf98, ranchlai, siddhu001, simpleoier, takenori-y, teinhonglo, windtoker, yuekaizhang.

v.0.10.3
New Features
- [**New Features**][**ESPnet1**][**RNNT**][**Installation**][**README**] FastEmit support 3591 by b-flo
- [**New Features**][**ESPnet2**][**ASR**] Add ASR portable evaluation script 3569 by kan-bayashi
- [**New Features**][**ESPnet2**][**README**] EEND-EDA model for diarization task 3621 by YushiUeda

Bugfix
- [**Bugfix**][**ESPnet1**] Fix /usr/bin/env bash -e 3651 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**] ctc loss using dropout layer since .eval() will not work for F.dropout 3539 by zh794390558
- [**Bugfix**][**ESPnet2**] Minor fix of `evaluate_asr.sh` 3596 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**ASR**] wav2vec2_encoder bug fix 3545 by simpleoier
- [**Bugfix**][**ESPnet2**][**README**][**SSL**] Fix some issues of 3512 and add README.md to librispeech/ssl1 recipe. 3572 by Jzmo
- [**Bugfix**][**ESPnet2**][**TTS**] Bug fix the attribute registration in VITS generator 3573 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Fix pyopenjtalk_g2p_accent(_with_pause) 3555 by zzxiang

Recipe
- [**Recipe**][**ESPnet1**][**ASR**][**RNNT**] Update Transducer recipes 3465 by b-flo
- [**Recipe**][**ESPnet1**][**ST**] Clean libri-trans 3540 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Dan aishell4 branch 3585 by DanBerrebbi
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update pretrained models of librispeech using hubert/wav2vec2 3568 by simpleoier
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Add slu snips data recipe 3407 by yuekaizhang
- [**Recipe**][**ESPnet2**][**TTS**] Update GAN-TTS based configurations 3570 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add initial VITS results for JSUT 3550 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add つくよみちゃんコーパス recipe 3552 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] IndicSpeech TTS Scripts 3435 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update ESPnet2-TTS results 3578 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT and JVS results 3553 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update LJSpeech and CSMSC results 3560 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update TTS results 3615 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update TTS results 3648 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update VCTK results 3581 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update pre-trained model for TTS recipes 3590 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update kss recipe with new result. 3589 by windtoker
- [**Recipe**][**ESPnet2**][**TTS**][**Typo**] Fix typo `egs2/jtubespeech/tts1` 3564 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**Typo**] Update JVS README 3554 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**][**SE**][**Refactoring**] Add PyTorch Builtin Complex Support in the Speech Enhancement Task 3355 by Emrys365
- [**Enhancement**][**ESPnet2**][**TTS**] Hindi g2p 3579 by peter-yh-wu
- [**Enhancement**][**ESPnet2**][**TTS**] Unify spks / lids / spk_embed_dim type 3551 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update `evaluate_mcd.py` script 3566 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**] Add the installer of tdmelodic pyopenjtalk 3561 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**Installation**][**README**] Update TTS objective eval scripts 3650 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**README**] Add a new Japanese G2P for TTS 3558 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**][**README**] Add a new english G2P 3597 by kan-bayashi

Others
- [**CI**] Add codecov config and flags. 3603 by ShigekiKarita
- [**CI**] Omit tools/ from code coverage. 3600 by ShigekiKarita
- [**CI**] Split test_integration.sh 3599 by ShigekiKarita
- [**CI**][**ESPnet2**][**Installation**][**Refactoring**] Make the installation of transformers optional 3622 by kan-bayashi
- [**CI**][**Installation**] Add no-check-certificate option in PESQ installation 3649 by kan-bayashi
- [**CI**][**Installation**][**README**][**mergify**] Change setup.py for pytorch1.9.1 3636 by kamo-naoyuki
- [**Documentation**][**ESPnet1**][**RNNT**] Fix/improve doc(string)s related to Transducer model 3623 by b-flo
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update README of ESPnet2-TTS 3546 by kan-bayashi
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update TTS README 3565 by kan-bayashi
- [**Documentation**][**ESPnet2**][**TTS**][**README**] Update TTS fine-tuning README 3549 by kan-bayashi
- [**Typo**][**ESPnet2**] Minor bug in format_wav_scp.py 3575 by ftshijt
- [**Typo**][**ESPnet2**][**TTS**] update mismatch help info for tts 3602 by ftshijt


Acknowledgements
Special thanks to DanBerrebbi, Emrys365, Jzmo, ShigekiKarita, YushiUeda, b-flo, ftshijt, hirofumi0810, kamo-naoyuki, kan-bayashi, peter-yh-wu, simpleoier, windtoker, yuekaizhang, zh794390558, zzxiang.

v.0.10.2
News

- Hubert training is now available!
    - Try with `egs2/librispeech/ssl1`
- GAN-based TTS model is now available!
    - Joint text2mel and vocoder training
    - End-to-end text-to-wave model (VITS) training
    - Try with `egs2/ljspeech/tts1`
- Support `from_pretrained` function!

```python
# e.g.
from espnet2.bin.asr_inference import Speech2Text
asr = Speech2Text.from_pretrained("model_tag")

from espnet2.bin.tts_inference import Text2Speech
tts = Text2Speech.from_pretrained("model_tag")

from espnet2.bin.enh_inference import SeparateSpeech
enh = SeparateSpeech.from_pretrained("model_tag")

from espnet2.bin.diar_inference import DiarizeSpeech
diar = DiarizeSpeech.from_pretrained("model_tag")
```

Please check the available pretrained models in [espnet_model_zoo](https://github.com/espnet/espnet_model_zoo)!
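
Since the four inference classes share the same `from_pretrained` entry point, a small dispatch helper is one way to select the class by task name. This is a minimal sketch, not part of ESPnet: `INFERENCE_CLASSES` and `load_pretrained` are hypothetical names, the module and class paths are copied from the examples above, and `"model_tag"` remains a placeholder for a tag listed in espnet_model_zoo. Actually loading a model assumes that `espnet` and `espnet_model_zoo` are installed.

```python
import importlib

# Task name -> (module, class), copied from the import lines in the notes above.
# The dictionary and the helper below are illustrative, not ESPnet APIs.
INFERENCE_CLASSES = {
    "asr": ("espnet2.bin.asr_inference", "Speech2Text"),
    "tts": ("espnet2.bin.tts_inference", "Text2Speech"),
    "enh": ("espnet2.bin.enh_inference", "SeparateSpeech"),
    "diar": ("espnet2.bin.diar_inference", "DiarizeSpeech"),
}


def load_pretrained(task, model_tag):
    """Load a pretrained inference object for the given task.

    The ESPnet module is imported only when the helper is called,
    so defining it does not require ESPnet to be installed.
    """
    if task not in INFERENCE_CLASSES:
        known = ", ".join(sorted(INFERENCE_CLASSES))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}")
    module_name, class_name = INFERENCE_CLASSES[task]
    module = importlib.import_module(module_name)  # needs espnet installed
    inference_class = getattr(module, class_name)
    return inference_class.from_pretrained(model_tag)
```

With ESPnet installed, `load_pretrained("asr", "model_tag")` would be equivalent to the `Speech2Text.from_pretrained("model_tag")` call shown earlier, once the placeholder is replaced by a real model tag.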

New Features
- [**New Features**][**ESPnet1**] Intermediate CTC + Stochastic depth 3274 by jaesong
- [**New Features**][**ESPnet2**] Add new trainer for GAN-based training 3436 by kan-bayashi
- [**New Features**][**ESPnet2**][**ASR**] Add Hubert model in Espnet2/Refactor from 3458 3512 by Jzmo
- [**New Features**][**ESPnet2**][**ASR**] batch decode with k2 ctc 3433 by glynpu
- [**New Features**][**ESPnet2**][**ASR**][**SE**] Support `from_pretrained` for ASR and ENH 3535 by kan-bayashi
- [**New Features**][**ESPnet2**][**DIAR**] Support `from_pretrained` for DIAR 3537 by YushiUeda
- [**New Features**][**ESPnet2**][**SE**] Adding portable speech enhancement scripts for other tasks 3487 by Emrys365
- [**New Features**][**ESPnet2**][**TTS**] Add GAN-TTS task with VITS 3449 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support SID and LID inputs for TTS models 3490 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support `from_pretrained` function in `Text2Speech` 3532 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support `parallel_wavegan` vocoders in `tts_inference.py` 3513 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support joint training of text2mel and vocoder 3501 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support language ID input for espnet2 TTS 3489 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support speaker id input for TTS models 3452 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**][**CTC segmentation**][**README**] Fix CTC Segmentation 3500 by shirayu
- [**Enhancement**][**ESPnet2**][**TTS**] Add VITS-related modules 3448 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add cython code for VITS 3483 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add joint training config example 3508 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add melgan module for joint training 3516 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add parallel wavegan module for joint training 3515 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add style melgan module for joint training 3517 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Add vocoder modules related to VITS 3439 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Change Text2Speech class output format 3437 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Follow up of the support speaker id input 3453 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support cleaner option in phn converter util 3450 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support language id in VITS 3499 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support linear spectrogram 3438 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support new g2p functions for various languages 3463 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update the TTS inference 3498 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**SLU**][**README**] Add support for intent classification on SLURP dataset 3482 by siddhu001
- [**Enhancement**][**ESPnet2**][**SLU**][**README**] Add NLU post-encoder using Hugging Face Transformers 3410 by akreal

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Mucs21 subtask1 3376 by sanket0211
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Swahili ASR recipe 3485 by akreal
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Rename `swahili` recipe to `iwslt21_low_resource` 3522 by akreal
- [**Recipe**][**ESPnet2**][**DIAR**][**README**] Modify ESPnet2 diarization recipe 3524 by YushiUeda
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**] Espnet2 mucs_subtask2 3415 by bloodraven66
- [**Recipe**][**ESPnet2**][**ESPnet1**][**ASR**] mucs subtask1 3417 by bloodraven66
- [**Recipe**][**ESPnet2**][**SE**] Add Voicebank (vctk_noisy) script 3486 by neillu23
- [**Recipe**][**ESPnet2**][**TTS**] Add missing configs for LibriTTS recipe 3455 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update VITS config comments and settings 3528 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] aishell3 dataset preparation 3505 by actboy
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add CSS10 recipe for ESPnet2-TTS 3464 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add JtubeSpeech Recipe 3459 by Takaaki-Saeki
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add SIWIS recipe 3460 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**][**README**] TTS recipe for J-KAC corpus 3468 by TanUkkii007
- [**Recipe**][**ESPnet2**][**TTS**][**README**] TTS recipes for thchs30 and aishell3 3470 by ftshijt
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JMD README 3531 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update SIWIS README 3509 by takenori-y
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Predict ASR transcript along with Intent for SLU 3480 by siddhu001
- [**Recipe**][**ESPnet2**][**SLU**][**README**] Update SWBD DA configuration 3425 by akreal

Bugfix
- [**Bugfix**][**ESPnet2**] Add return_complex=False for stft 3476 by D-X-Y
- [**Bugfix**][**ESPnet2**] Dynamic import for the ngram function 3420 by ftshijt
- [**Bugfix**][**ESPnet2**][**README**][**Recipe**] Add the GigaSpeech normalization and fix the WER 3519 by chaisz19
- [**Bugfix**][**ESPnet2**][**TTS**] Add duration and focus_rate in output dict 3469 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Add missing symlink to trim_silence.py for ESPnet2 3467 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Fix wrong arguments in pretrained vocoder wrapper 3525 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**] Revert wrongly removed lines in `tts.sh` 3503 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**TTS**][**Typo**] Fix typo in hifigan 3504 by kan-bayashi

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**][**RNNT**][**README**] Transducer v5 3217 by b-flo
- [**Refactoring**][**ESPnet2**][**SE**][**DIAR**] Remove prefix `enh_` and `diar_` 3538 by kan-bayashi
- [**Refactoring**][**ESPnet2**][**TTS**] Refactor TTS modules in ESPnet2 3497 by kan-bayashi
- [**Refactoring**][**ESPnet2**][**TTS**] Remove the support of feats_type=fbank/stft in ESPnet2-TTS 3514 by kan-bayashi

Others
- [**CI**] Fix k2 version in CI using conda 3493 by kan-bayashi
- [**CI**] Fix test condition 3527 by kan-bayashi
- [**CI**][**Installation**] Update Sentencepiece and add python 3.9 to CI 3422 by shirayu
- [**Docker**] Docker Updates 3393 by Fhrozen
- [**Documentation**] Update the tutorial about maxlenratio usage 3523 by akreal
- [**Documentation**][**ESPnet2**][**TTS**] Update README.md 3502 by kan-bayashi
- [**Installation**][**README**] Added a link and a classifier for Python 3.9 3440 by shirayu
- [**Typo**] Fix typos in "egs" 3447 by shirayu
- [**Typo**][**Documentation**] Fix typos in "doc" 3441 by shirayu
- [**Typo**][**Documentation**] Fix typos in "utils" 3442 by shirayu
- [**Typo**][**ESPnet1**][**MT**] Fix typos in "espnet" 3444 by shirayu
- [**Typo**][**ESPnet2**] Fix typos in "espnet2" 3443 by shirayu
- [**Typo**][**ESPnet2**][**README**] Fix typos in "egs2" 3445 by shirayu


Acknowledgements

Special thanks to D-X-Y, Emrys365, Fhrozen, Jzmo, Takaaki-Saeki, TanUkkii007, YushiUeda, actboy, akreal, b-flo, bloodraven66, chaisz19, ftshijt, glynpu, jaesong, kan-bayashi, neillu23, sanket0211, shirayu, siddhu001, takenori-y.

v.0.10.1
New Features
- [**New Features**][**ESPnet2**] Porting existing pre-trained models to hugging face 3321 by siddhu001
- [**New Features**][**ESPnet2**][**ASR**][**CI**][**Installation**] k2_and_espnet2 3358 by glynpu
- [**New Features**][**ESPnet2**][**ASR**][**LM**][**CI**] espnet2 ngram 3345 by qmpzzpmq
- [**New Features**][**ESPnet2**][**Installation**] add s3prl frontend 3187 by simpleoier

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix the iconv error in hkust data prep 3397 by sw005320
- [**Recipe**][**ESPnet1**][**ASR**] mucs subtask2 baseline recipes (e2e and kaldi) 3362 by bloodraven66
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**] JTubeSpeech recipe and hkust espnet1 3406 by sw005320
- [**Recipe**][**ESPnet1**][**TTS**] CMU INDIC TTS 3347 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**] ESPnet2 Recipe for Ksponspeech 3387 by YushiUeda
- [**Recipe**][**ESPnet2**][**ASR**] Fix gigaspeech pre-trained model link 3317 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] LRS2 lipreading recipe 3346 by LiChenda
- [**Recipe**][**ESPnet2**][**ASR**] OpenSLR Sundanese ASR 3344 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**] Recipe of JTubeSpeech 3311 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] fix path error in local/score.sh in swbd 3349 by wonkyuml
- [**Recipe**][**ESPnet2**][**ASR**] updated javanese and sundanese readmes 3369 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**ASR**][**Installation**] OpenSLR Javanese ASR 2960 by peter-yh-wu
- [**Recipe**][**ESPnet2**][**SLU**] Add initial Switchboard Dialogue Act classification recipe 3395 by akreal
- [**Recipe**][**ESPnet2**][**SLU**] FSC Espnet2 data preparation 3352 by siddhu001
- [**Recipe**][**ESPnet2**][**TTS**] Add HUI-audio-corpus-german recipe for ESPnet2-TTS 3375 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add JMD recipe 3394 by takenori-y
- [**Recipe**][**ESPnet2**][**TTS**] Add RUSLAN recipe for ESPnet2-TTS 3378 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Support KSS dataset recipe for ESPnet2-TTS 3383 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update HUI audio corpus german recipe 3381 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update HUI-audio-corpus-german recipe results of ESPnet2-TTS 3391 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update KSS dataset recipe results of ESPnet2-TTS 3400 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update RUSLAN recipe results of ESPnet2-TTS 3390 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] indic tts without pretrained model 3401 by peter-yh-wu

Enhancement
- [**Enhancement**][**ESPnet2**] Update wav2vec2_encoder.py 3312 by brotheroak
- [**Enhancement**][**ESPnet2**][**TTS**] Add trim_silence for ESPnet2-TTS 3380 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Allow override default 'speed_control_alpha' parameter 3316 by airenas
- [**Enhancement**][**ESPnet2**][**TTS**] Support French G2P 3372 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support German G2P 3371 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Korean G2P 3382 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Russian G2P 3377 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Support Spanish G2P 3373 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update README about G2P 3374 by kan-bayashi

Bugfix
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix a type error of swbd data preparation. 3324 by pengchengguo
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**TTS**] Fixed label modification in Taco2 or Transformer-TTS with R > 1 3392 by kan-bayashi
- [**Bugfix**][**ESPnet2**] fix a bug in OneCycleLR and CyclicLR 3319 by sw005320

Others
- [**Typo**][**ESPnet1**] Update batch_beam_search_online_sim.py 3367 by aky15
- [**Typo**][**ESPnet2**] Fixed typo in model name 3364 by kan-bayashi
- [**Typo**][**ESPnet2**] Update contextual_block_transformer_encoder.py 3354 by aky15

Acknowledgements
Special thanks to LiChenda, YushiUeda, airenas, akreal, aky15, bloodraven66, brotheroak, glynpu, kan-bayashi, pengchengguo, peter-yh-wu, qmpzzpmq, siddhu001, simpleoier, sw005320, takenori-y, wonkyuml.

v.0.10.0
From v.0.10.x, we drop support for PyTorch < 1.3.
See https://github.com/espnet/espnet/issues/3300 for more information.
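
As a rough illustration of what this requirement means for downstream code, the sketch below checks a version string against the 1.3 minimum stated above. The helper names `parse_version` and `is_supported_torch` are hypothetical and are not part of ESPnet or PyTorch; the check only looks at the leading major.minor numbers of the string.

```python
import re

# Minimum (major, minor) taken from the release note above: PyTorch < 1.3 is dropped.
MIN_TORCH = (1, 3)


def parse_version(version: str) -> tuple:
    """Return the leading (major, minor) numbers of a version string.

    Hypothetical helper: suffixes such as "+cu100" or ".post2" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported_torch(version: str) -> bool:
    """True if the given version string is at or above the 1.3 minimum."""
    # Tuples compare element-wise, so (1, 10) is correctly newer than (1, 3).
    return parse_version(version) >= MIN_TORCH
```

In practice one might call `is_supported_torch(torch.__version__)` after `import torch`; this is only one way to perform such a check, not the check ESPnet itself uses.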

New Features and Enhancement
- [**New Features**][**ESPnet1**][**ASR**][**CI**] Dynamic quantization for decoding 3210 by xu-gaopeng
- [**New Features**][**ESPnet1**] Add quantize args 3280 by xu-gaopeng
- [**Enhancement**][**ESPnet2**][**README**] Update W&B integration 3278 by AyushExel
- [**Enhancement**][**ESPnet2**][**README**] Change the default value of use_wandb to False 3287 by kamo-naoyuki

Bugfix
- [**Bugfix**][**ESPnet1**] Fix some bugs in xml2stm.py 3252 by AshrafMahdhi
- [**Bugfix**][**ESPnet1**][**Recipe**] fix the required number of arguments 3249 by AshrafMahdhi
- [**Bugfix**][**ESPnet2**] Bug fix of accum_grad when grad-nan 3283 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix 3255 3257 by tjysdsg
- [**Bugfix**][**ESPnet2**] Fix bug when "--field -5" is passed to espnet2.bin.tokenize_text 3262 by tjysdsg
- [**Bugfix**][**ESPnet2**] Fix typo in asr.sh (espnet2) that might cause bug 3264 by tjysdsg
- [**Bugfix**][**ESPnet2**] Warn ignore_nan_grad with warpctc instead of error. 3298 by ShigekiKarita
- [**Bugfix**][**ESPnet2**][**TTS**] Fix a bug in the TTS transformer initialization 3251 by sw005320

Recipe
- [**Recipe**][**ESPnet1**][**ST**] Minor fix of Fisher-Callhome recipe 3305 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**] ESPnet2 Recipe for swbd 3269 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**][**README**] SWBD Result Update 3308 by roshansh-cmu
- [**Recipe**][**ESPnet2**][**SE**] Add scripts for DNS Interspeech 2020 in ESPnet-SE 3259 by neillu23
- [**Recipe**][**ESPnet2**][**SE**][**README**] Pretrained model for vctk noisy reverberant recipe 3273 by LiChenda
- [**Recipe**][**ESPnet2**][**SE**][**README**] dns_ins20: Add README.md and real_recording testing data. 3281 by neillu23

Refactoring
- [**Refactoring**][**ESPnet2**][**ASR**] Update ctc.py 3292 by 200987299
- [**Refactoring**][**ESPnet1**][**ASR**][**MT**][**CI**][**README**] Delete old pytorch dispatch in espnet1 3301 by ShigekiKarita
- [**Refactoring**][**CI**][**Documentation**][**Installation**][**README**] Remove travis and add .github/workflows/doc.yml to deploy doc 3294 by ShigekiKarita
- [**Refactoring**][**CI**][**Installation**][**README**] Add pytorch 1.9.0 support and remove 1.0.1, 1.1.0, and 1.2.0 3299 by ShigekiKarita

Others
- [**Documentation**][**ESPnet2**] Add a comment for disabling the attention plot 3258 by sw005320
- [**ESPnet2**][**Installation**][**mergify**] Follow up for 3299, about pytorch1.9.0 in ci 3310 by kamo-naoyuki

Acknowledgements
Special thanks to 200987299, AshrafMahdhi, AyushExel, LiChenda, ShigekiKarita, hirofumi0810, kamo-naoyuki, neillu23, roshansh-cmu, sw005320, tjysdsg, xu-gaopeng, yuekaizhang.

v.0.9.10
New Features
- [**New Features**][**ESPnet1**][**ESPnet2**][**Installation**][**README**] CTC Segmentation for ESPnet 2 3087 by lumaku

Bugfix
- [**Bugfix**][**ESPnet1**] Fix merge_short_segments.py 3171 by hirofumi0810
- [**Bugfix**][**ESPnet1**] update layer norm to reflect the dimension variable 3193 by sw005320
- [**Bugfix**][**ESPnet1**][**ASR**] Fix a bug about variable spelling errors 3208 by lzm0706
- [**Bugfix**][**ESPnet1**][**ST**] Fix ST-TED data preparation 3167 by hirofumi0810
- [**Bugfix**][**ESPnet2**] Fix a bug of adding noise to the training data. 3220 by pengchengguo
- [**Bugfix**][**ESPnet2**] fix a bug in the CTC mode 3190 by sw005320
- [**Bugfix**][**ESPnet2**] fix typo for AdapterForSoundScpReader 3096 by deciding
- [**Bugfix**][**ESPnet2**] remove find_unused_parameters from DataParallel 3149 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**ASR**] Changed to include nlsyms.txt in the pretrained model 3236 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**ASR**] Fix missing nlsyms.txt for pretrained models 3234 by lumaku
- [**Bugfix**][**ESPnet2**][**ASR**] Workaround for missing nlsyms.txt 3235 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ASR**][**Installation**] GTN CTC bug fix, unit test, and installer 3199 by brianyan918
- [**Bugfix**][**ESPnet2**][**README**] Update README.md, edit wrong file link. 3164 by xxjjvxb

Enhancement
- [**Enhancement**] Added "trans_type" to utils/remove_longshortdata.sh and utils/update_json.sh 3148 by teinhonglo
- [**Enhancement**][**ESPnet2**][**SE**][**README**] Update the readme file for the SE demo page. 3225 by LiChenda
- [**Enhancement**][**ESPnet2**][**ASR**][**README**] update asr demo 3192 by ftshijt

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix segmentation in IWSLT21 ASR 3169 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Fix tokenization on TEDLIUM2 in IWSLT21 ASR recipe 3142 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] fix add_to_datadir.py in mgb2 recipe 3238 by AshrafMahdhi
- [**Recipe**][**ESPnet1**][**ASR**] fix recipe bug for swbd 3174 by yuekaizhang
- [**Recipe**][**ESPnet1**][**ASR**][**RNNT**] Transducer configs & results for AISHELL-1 3240 by yusshino
- [**Recipe**][**ESPnet1**][**ASR**][**ST**] Fix IWSLT21 recipe for test set evaluation 3155 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ESPnet2**][**README**] endangered language recognition espnet2 recipe 3214 by ftshijt
- [**Recipe**][**ESPnet1**][**MT**] Add IWSLT21 MT recipe 3140 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Add IWSLT21 ST recipe 3150 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Fix IWSLT evaluation data preparation 3168 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] IWSLT21 punctuation restoration recipe 3145 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] Merge short segments in IWSLT test sets 3162 by hirofumi0810
- [**Recipe**][**ESPnet1**][**TTS**] Fix misspelling in ./egs/jsut/tts1/local/download.sh 3227 by muramasa2
- [**Recipe**][**ESPnet2**][**ASR**] Normalization for Open_li52 3215 by ftshijt
- [**Recipe**][**ESPnet2**][**SE**] ESPnet-SE Recipe for noisy reverberant dataset 3243 by LiChenda
- [**Recipe**][**ESPnet2**][**SE**][**README**] Update recipes for speech enhancement task 3153 by LiChenda

Acknowledgements
Special thanks to AshrafMahdhi, LiChenda, brianyan918, deciding, ftshijt, hirofumi0810, kamo-naoyuki, lumaku, lzm0706, muramasa2, pengchengguo, sw005320, teinhonglo, xxjjvxb, yuekaizhang, yusshino.

v.0.9.9
New Features

- [**New Features**][**ESPnet2**] Speaker diarization implementation in ESPnet 2939 by ftshijt
- [**New Features**][**ESPnet2**] Adding gpu_max_cached_mem_GB in reporter's stats 3057 by kamo-naoyuki
- [**New Features**][**ESPnet2**] add --detect_anomaly option 3035 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Further update to speech enhancement task 2929 by shincling

Bugfix

- [**Bugfix**][**ESPnet1**] Fix a typo in the aishell config 3089 by sw005320
- [**Bugfix**][**ESPnet1**] Fix utils/speed_perturb.sh 3062 by hirofumi0810
- [**Bugfix**][**ESPnet1**] fix 3017 3022 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**RNNT**] Fix+update RNN encoder 3048 by b-flo
- [**Bugfix**][**ESPnet1**][**RNNT**] Minor fix for NSC 3030 by b-flo
- [**Bugfix**][**ESPnet2**] Fix 3072 3073 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix ESPnet2-TTS conformer backward compatibility 3108 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix a bug when use_amp=True without fairscale 3029 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix logging for pytorch>=1.8 3056 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fixed backward compatibility issue of new conformer definition 3068 by hfujihara
- [**Bugfix**][**Installation**] Fix a bug of uninstalling typing 3058 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix setup.py to install filelock 3074 by kamo-naoyuki
- [**Bugfix**][**Installation**] fix the condition to install fairscale 3050 by kamo-naoyuki
- [**Bugfix**][**Recipe**][**ESPnet1**] Typo fixed for nahuatl recipe 3044 by ftshijt
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] Bugfix for download_and_untar for nahuatl 3049 by ftshijt
- [**Bugfix**][**Recipe**][**ESPnet1**][**ESPnet2**][**TTS**] Fix CSMSC download script 3109 by kan-bayashi
- [**Bugfix**][**Recipe**][**ESPnet2**][**TTS**][**README**] fixed typo 3121 3123 by kan-bayashi

Enhancement

- [**Enhancement**][**ASR**][**ESPnet1**][**RNNT**] Update loss report 3110 by b-flo
- [**Enhancement**][**ESPnet1**][**RNNT**] Fix related to custom encoder and aux task 3045 by b-flo
- [**Enhancement**][**ESPnet2**][**Documentation**][**Installation**][**README**] modification of freezing option for Wav2Vec encoder, add documents 3036 by simpleoier

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] added results and uploaded models 3063 by sw005320
- [**Recipe**][**ESPnet1**][**ASR**][**ST**] fix download for puebla-nahuatl 3039 by ftshijt
- [**Recipe**][**ESPnet1**][**MT**] Update IWSLT18 MT recipe 3071 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**] IWSLT21-low-resource recipe 3023 by ftshijt
- [**Recipe**][**ESPnet1**][**ST**] Nahuatl Speech Translation 3034 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Added spgispeech recipe in espnet2 2986 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update librispeech result 3082 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Updated ami ihm result 3091 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] added a bpe10000 model and result 3060 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] gigaspeech 3077 by sw005320

Refactoring

- [**Refactoring**][**ESPnet1**] Refactor layer selection in Transformer 3024 by hirofumi0810
- [**Refactoring**][**ESPnet1**][**MT**][**ST**] Unify divide_lang.sh 3066 by hirofumi0810
- [**Refactoring**][**ESPnet2**] Make batch bins sampler faster 3106 by kamo-naoyuki
- [**Refactoring**][**Installation**] Use new pyopenjtalk version 3107 by kan-bayashi
- [**Refactoring**][**ESPnet1**][**ESPnet2**][**Installation**][**Docker**][**Documentation**] Change '#!/bin/bash' to '#!/usr/bin/env bash' 3059 by kamo-naoyuki

Others

- [**CI**][**Installation**][**README**][**mergify**] Using torch=1.8.1 in ci tests 3122 by kamo-naoyuki
- [**CI**][**Installation**][**README**][**mergify**] Adding pytorch=1.8.0 to the ci 3046 by kamo-naoyuki

Acknowledgements
Special thanks to b-flo, ftshijt, hfujihara, hirofumi0810, kamo-naoyuki, kan-bayashi, shincling, simpleoier, sw005320.

v.0.9.8
New Features
- [**New Features**][**ESPnet1**][**ASR**][**RNNT**] Auxiliary task 2951 by b-flo
- [**New Features**][**ESPnet1**][**Recipe**] RTF calculation 2942 by hirofumi0810
- [**New Features**][**ESPnet2**] Supporting multiple optimizers in the default trainer 3014 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Streaming Transformer ASR 2907 by eml914
- [**New Features**][**ESPnet2**][**ASR**][**Installation**] add wav2vec_encoder 2889 by simpleoier
- [**New Features**][**ESPnet2**][**Documentation**][**Installation**][**README**] Support sharded training of fairscale 2980 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Add SeparateSpeech API in espnet2/bin/enh_inference.py 2878 by Emrys365
- [**New Features**][**ESPnet2**][**TTS**][**Installation**][**README**] Support phonemizer for various language G2P 2959 by kan-bayashi

Bugfix
- [**Bugfix**][**CI**][**Installation**] Install warp-ctc using pip>=21.0 2999 by ysk24ok
- [**Bugfix**][**ESPnet1**] Integration testing for asr_mix was using the wrong config. 3006 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**] Fix model averaging 2910 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**] bug fixed for streaming transformer ASR 2981 by eml914
- [**Bugfix**][**ESPnet1**][**ASR**] builtin ctc modification 3001 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**][**CI**] Fix transfer learning w/ pre-trained LM + finetuning tutorial 2967 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**][**RNNT**] Fix a condition in TSD 2965 by b-flo
- [**Bugfix**][**ESPnet1**][**ASR**][**Recipe**] fix egs/ljspeech/asr1 2865 2884 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**ASR**][**Recipe**][**ST**] Fix bug in How2 recipe 2933 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**ASR**][**Refactoring**] Fix data sorting in attention/CTC visualization 2883 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**Docker**] Fix docker error caused by BeamSearchTransducer 2973 by b-flo
- [**Bugfix**][**ESPnet1**][**ESPnet2**] Fix bugs of our Conformer implementation. 2816 by pengchengguo
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**Refactoring**] Fix arguments in dynamic and lightweight conv 3004 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**RNNT**] fix out_dim definition 2915 by b-flo
- [**Bugfix**][**ESPnet1**][**TTS**] Fix attention plot bug 2984 2985 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**mergify**] swbd run.sh is including dev data in the training set 2977 by brianyan918
- [**Bugfix**][**ESPnet2**] Fix sharded_ddp mode 3015 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] bug fix for Wav2Vec encoder 2997 by simpleoier
- [**Bugfix**][**ESPnet2**][**Documentation**] Fix for sharded training with amp 2993 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Documentation**] Fix sharded training for multiple nodes 2994 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] quick fix for librimix (SE) data preparation 2982 by LiChenda

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Fix dev set in IWSLT21 ASR recipe 3000 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] IWSLT'21 ASR recipe 2934 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Update IWSLT21 ASR recipe 2987 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**] Update the pre-trained Conformer model link of Aishell-1 corpus. 2924 by pengchengguo
- [**Recipe**][**ESPnet1**][**ASR**] Update transformer training results on Common Voice dataset 2927 by wenjie-p
- [**Recipe**][**ESPnet1**][**ASR**][**CI**][**Installation**][**Refactoring**] Update IWSLT18 (ST-TED) ASR recipe 2916 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**README**] Must-C v2 recipe 2963 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor Fisher-CallHome recipe 2904 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor How2 recipe 2906 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor Must-C recipe 2901 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**MT**][**ST**][**Refactoring**] Refactor libri-trans recipe 2903 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ASR**][**ST**][**Refactoring**] Update IWSLT'19 recipe 2940 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**][**CI**][**Refactoring**] Refactor ST recipes 2975 by hirofumi0810
- [**Recipe**][**ESPnet1**][**ST**][**Refactoring**] Refactor Mboshi-French corpus 2911 by hirofumi0810
- [**Recipe**][**ESPnet2**][**ASR**] Open-li52(add language id scoring & text case align for test set) 2938 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add Russian open STT recipe for ESPnet2 2972 by akreal
- [**Recipe**][**ESPnet2**][**ASR**][**README**] MLS (multi-lingual librispeech) recipe 2869 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update espnet2 librispeech result 2966 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] added nsc results 2937 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] fix librispeech model url 2976 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] minor fix of li52 and nsc recipes 2936 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update the results of open li52 recipe 2974 by sw005320
- [**Recipe**][**ESPnet2**][**SE**] Librimix separation results for Conv-Tasnet, 8k, min 2928 by anogkongda
- [**Recipe**][**ESPnet2**][**SE**][**README**] ESPnet-SE, Speech enhancement recipes 2888 by LiChenda

Enhancement
- [**Enhancement**][**ESPnet1**][**ASR**] Auto Resampling to 16khz for pretrained models 2969 by siddalmia
- [**Enhancement**][**ESPnet1**][**ASR**][**RNNT**] Minor refactoring 2932 by b-flo
- [**Enhancement**][**ESPnet1**][**ASR**][**RNNT**][**README**][**CI**][**Documentation**] Refactoring RNNT 2887 by b-flo
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**][**LM**][**MT**][**TTS**] Print total params and trainable params. 2996 by siddalmia
- [**Enhancement**][**ESPnet1**][**LM**] Add LM options like embedding dropout and tie weights 3010 by siddalmia
- [**Enhancement**][**ESPnet1**][**ST**][**Refactoring**] Add the latest RPE implementation to the ST task. 3005 by pengchengguo

Others
- [**CI**][**README**][**mergify**] Stop circle ci 2978 by kamo-naoyuki
- [**Documentation**] Update docs for ESPnet contributing (especially for recipes part) 2905 by ftshijt
- [**Documentation**] fix a typo 3016 by Huang17
- [**Installation**] Uninstall typing 2979 by kamo-naoyuki

Acknowledgements
Special thanks to Emrys365, Huang17, LiChenda, akreal, anogkongda, b-flo, brianyan918, eml914, ftshijt, hirofumi0810, kamo-naoyuki, kan-bayashi, pengchengguo, siddalmia, simpleoier, sw005320, wenjie-p, ysk24ok.

v.0.9.7
New Features

- [**New Features**][**ESPnet1**][**ASR**] Option for GTN CTC mode 2866 by brianyan918
- [**New Features**][**ESPnet2**][**SE**][**README**] Update to speech enhancement task 2649 by LiChenda
- [**New Features**][**ESPnet2**][**ASR**][**README**] Lightweight Sinc Convolutions for Espnet2 2768 by lumaku
- [**New Features**][**ESPnet2**][**Documentation**] --freeze_param option 2787 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**TTS**][**README**] Add a new G2P `pyopenjtalk_accent_with_pause` 2843 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**][**README**] Add pyopenjtalk_accent g2p for ESPnet2 TTS 2781 by ota
- [**New Features**][**ESPnet2**][**TTS**][**README**] Support X-vector based multi-speaker TTS model in ESPnet2 2800 by kan-bayashi

Enhancement

- [**Enhancement**][**ESPnet1**][**ESPnet2**] Add version info in args 2841 by kan-bayashi
- [**Enhancement**][**ESPnet1**][**ESPnet2**][**ASR**] AMI Recipe (Short UTT checker) 2802 by ftshijt
- [**Enhancement**][**Installation**] add default activate_python.sh 2788 by kamo-naoyuki
- [**Enhancement**][**Installation**] modified: check_install.py 2834 by kamo-naoyuki
- [**Enhancement**][**Installation**][**Documentation**][**ESPnet1**][**ESPnet2**] Change version info location 2840 by kan-bayashi

Bugfix

- [**Bugfix**][**ESPnet1**][**ASR**] fix greedy decoding 2812 by b-flo
- [**Bugfix**][**ESPnet2**][**ASR**] Fix the compatibility of the pretrained ASR model 2794 by kan-bayashi
- [**Bugfix**][**Installation**] Fix 2799 2830 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix HTS engine installation 2825 by kan-bayashi
- [**Bugfix**][**Installation**] fix the incorrect $PATH setting in tools/extra_path.sh 2833 by jumon
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] Minor fixes in CSJ 2837 by YosukeHiguchi
- [**Bugfix**][**Recipe**][**ESPnet1**][**ASR**] fix recipe bug for librispeech 2735 by yuekaizhang
- [**Bugfix**][**Recipe**][**ESPnet2**][**ASR**] fix a config name 2729 by sw005320
- [**Bugfix**][**Recipe**][**ESPnet2**][**ASR**][**README**] Fix dirha_wsj recipe 2747 by kamo-naoyuki
- [**Bugfix**][**Recipe**][**ESPnet2**][**TTS**] Add missing decoding configs in LibriTTS recipe 2827 by kan-bayashi

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] Add LibriSpeech Conformer results for LibriCSS 2861 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update Commonvoice Recipe with Conformer Settings 2739 by ftshijt
- [**Recipe**][**ESPnet1**][**ASR**] Update Russian open STT recipe for v1.01 of the dataset 2776 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update models and results of Conformer. 2765 by pengchengguo
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**][**README**] ESPnet2 recipe for commonvoice 2793 by hchung12
- [**Recipe**][**ESPnet1**][**VC**][**README**] VCC2020 database 2754 by unilight
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update Dirha WSJ result 2756 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 hkust recipe 2863 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] update the AMI result in espnet2 2817 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] updated the laborotv result 2750 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update reverb result 2876 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Minor fix of laborotv recipe 2877 by hfujihara
- [**Recipe**][**ESPnet2**][**TTS**] Fix total number of iterations 2813 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add libritts recipe for ESPnet2 2807 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Add x-vector based configs for VCTK 2808 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Minor update TTS README 2818 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT TTS results 2792 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT results 2809 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update JSUT results 2871 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update LibriTTS results 2842 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update VCTK results 2814 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] Update libritts results 2828 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**][**README**] update latest CSMSC link address 2777 by meowtech

Others

- [**CI**][**Documentation**][**Installation**] Change warp-ctc and warp-transducer to extra 2748 by kamo-naoyuki
- [**CI**][**README**] Update ci setting 2848 by kan-bayashi
- [**ASR**][**Documentation**][**ESPnet2**] Sinc Convolutions - add documentation for plot_sinc_filters.py 2782 by lumaku
- [**Documentation**][**ESPnet1**] fixed some typos 2855 by jumon
- [**Documentation**][**Installation**] Update documentation 2757 by kamo-naoyuki
- [**Installation**][**Refactoring**] Move the dependencies coming from recipes 2740 by kamo-naoyuki

Acknowledgements

Special thanks to AdolfVonKleist, LiChenda, YosukeHiguchi, akreal, b-flo, brianyan918, ftshijt, hchung12, hfujihara, jumon, kamo-naoyuki, kan-bayashi, lumaku, meowtech, ota, pengchengguo, sw005320, unilight, yuekaizhang.



v.0.9.6
New Features
- [**New Features**][**ESPnet2**] Wandb integration 2707 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Add ignore_nan_grad option for CTC 2699 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**SE**] Touching common modules before the main Enh PR 2705 by LiChenda

Bugfix
- [**Bugfix**][**ESPnet1**] bug fix for pytorch1.7 2656 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ESPnet2**][**TTS**] Use `nkf` in CSMSC data prep 2726 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix flooring for global_mvn.py 2623 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix small bug of tensorboard part 2702 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix wandb mode with multi gpus 2709 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix token-averaged feature in the case when r > 1 2704 by kan-bayashi

Recipe
- [**Recipe**][**ESPnet1**] Extend model averaging condition in run scripts 2613 by b-flo
- [**Recipe**][**ESPnet1**][**ASR**] Enable multi-thread processing of json files. 2681 by Peidong-Wang
- [**Recipe**][**ESPnet1**][**ASR**] Update KsponSpeech conformer results 2624 by jubang0219
- [**Recipe**][**ESPnet1**][**ASR**] Update Voxforge with Conformer results 2642 by YosukeHiguchi
- [**Recipe**][**ESPnet1**][**ASR**] lang was being used before being parsed for user input 2654 by siddalmia
- [**Recipe**][**ESPnet1**][**ASR**][**ESPnet2**][**Installation**][**README**] espnet2 reverb recipe 2691 by kamo-naoyuki
- [**Recipe**][**ESPnet1**][**ASR**][**README**] Update Switchboard with conformer results 2697 by Emrys365
- [**Recipe**][**ESPnet1**][**ASR**][**README**] add librispeech conformer w/ speed perturbation + specaug 2617 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**] ASR template recipe: --srctexts -> --lm_train_text, --bpe_train_text 2660 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Add $token_type to asr_tag and lm_tag 2625 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**Installation**][**README**][**Recipe**] Laborotv recipe 2703 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Add AISHELL w/o LM result 2718 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] ESPnet2 recipe for TIMIT 2568 by sknadig
- [**Recipe**][**ESPnet2**][**ASR**][**README**] JSUT conformer recipe achieving 12.0/13.9 CER(%) for dev/eval1 2720 by hchung12
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update README.md 2659 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**][**README**] Update WSJ result 2628 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 librispeech with conformer 2687 by sw005320
- [**Recipe**][**ESPnet2**][**README**] Corpus README in egs2 2713 by sw005320
- [**Recipe**][**ESPnet2**][**README**] update egs2/README.md 2719 by Emrys365

Enhancement
- [**Enhancement**][**Documentation**][**ESPnet2**] Add --init_param option 2680 by kamo-naoyuki
- [**Enhancement**][**ESPnet1**][**ASR**] Save model snapshot at every epoch even if save_interval_iters > 0 - for model averaging 2637 by sknadig
- [**Enhancement**][**ESPnet2**] Update wandb part 2708 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**ASR**] Add *_stats_dir options in asr.sh 2724 by kan-bayashi


Documentation
- [**Documentation**][**ESPnet2**][**README**] Update egs2 README 2723 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update README about fine-tuning 2685 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update TTS README.md 2650 by kan-bayashi

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**][**README**] Refactor Mask CTC non-autoregressive ASR 2223 by YosukeHiguchi
- [**Refactoring**][**ESPnet2**] Added unicode support for generated configs 2672 by Piteryo

Others
- [**Installation**] python setup.py install -> pip install -e 2619 by kamo-naoyuki
- [**Installation**][**Refactoring**] modify for zsh: tools/extra_path.sh 2696 by kamo-naoyuki
- [**Docker**] Docker flags for extra libraries (VC) 2622 by Fhrozen

Acknowledgements
Special thanks to Emrys365, Fhrozen, LiChenda, Peidong-Wang, Piteryo, YosukeHiguchi, b-flo, hchung12, jubang0219, kamo-naoyuki, kan-bayashi, siddalmia, sknadig, sw005320, yuekaizhang.

v.0.9.5
New Features
- [**New Features**][**ESPnet2**][**TTS**] Support `g2p=none` for text with phonemes 2551 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Add MCD evaluation script for ESPnet2-TTS 2554 by kan-bayashi
- [**New Features**][**ESPnet1**][**ST**] Conformer End-to-End Speech Translation 2523 by hirofumi0810

Bugfix
- [**Bugfix**][**ESPnet1**] CTC segmentation - package update 2566 by lumaku
- [**Bugfix**][**ASR**][**ESPnet1**] fix bug about att_ws in multi-enc case 2549 by lzm0706
- [**Bugfix**][**ESPnet1**] Conformer averaging model support for pytorch 1.6 2604 by siddalmia
- [**Bugfix**][**ESPnet1**][**ASR**] Set built-in CTC for asr_recog 2588 by lumaku
- [**Bugfix**][**ESPnet1**][**ASR**][**Installation**] Transducer float16 loss bug fix 2496 by GNroy

Refactoring
- [**Refactoring**][**ESPnet1**][**ASR**] Refactor BeamSearchTransducer and ErrorCalculatorTrans 2538 by b-flo

Recipe
- [**Recipe**][**ESPnet1**][**ASR**] Alignment recipe for CSJ. 2531 by jnishi
- [**Recipe**][**ESPnet1**][**ASR**] New Recipe for KsponSpeech (Korean spontaneous speech; 969 hours) 2555 by jubang0219
- [**Recipe**][**ESPnet1**][**ASR**] Update TedLium3 conformer results 2600 by LiChenda
- [**Recipe**][**ESPnet1**][**ASR**] Update VIVOS models 2574 by b-flo
- [**Recipe**][**ESPnet1**][**ASR**] Update model link in Puebla-Nahuatl 2607 by ftshijt
- [**Recipe**][**ESPnet1**][**ASR**] Update tedlium2 with conformer results 2599 by Emrys365
- [**Recipe**][**ESPnet1**][**ASR**] update the JSUT recipe with conformer 2546 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] Add CSJ conformer config 2560 by kan-bayashi
- [**Recipe**][**ESPnet2**][**ASR**] Add CSJ conformer results 2552 by kan-bayashi
- [**Recipe**][**ESPnet2**][**ASR**] Small changes for aishell config 2586 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] Update espnet2 AISHELL results 2580 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**] update JSUT espnet2 with pre-trained models 2563 by sw005320
- [**Recipe**][**ESPnet2**][**TTS**] Add JSSS recipe for ESPnet2-TTS 2558 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update ESPnet2 TTS result 2542 by kan-bayashi

CI
- [**CI**][**Documentation**] Support espnet2/bin in sphinx doc. 2544 by ShigekiKarita
- [**CI**][**Installation**][**README**] Add pytorch1.7.0 ci test 2605 by kamo-naoyuki

Other
- [**Installation**] Install warpctc-pytorch wheel when torch version is 1.1 - 1.6 2547 by ysk24ok
- [**Installation**] Modified requirements: "dataclasses; python_version < '3.7'" 2541 by kamo-naoyuki
- [**Installation**] Remove pip3 check in setup_python.sh 2567 by ShigekiKarita

Acknowledgements
Special thanks to Emrys365, GNroy, LiChenda, ShigekiKarita, b-flo, ftshijt, hirofumi0810, jnishi, jubang0219, kamo-naoyuki, kan-bayashi, lumaku, lzm0706, siddalmia, sw005320, ysk24ok.

v.0.9.4
New Features

- [**New Features**][**ESPnet1**][**ASR**] Transducer v4 2444 by b-flo
- [**New Features**][**ESPnet2**] Support audio_format=flac.ark, wav.ark 2451 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Support conformer encoder in ESPnet2 ASR 2515 by kan-bayashi

Bugfix

- [**Bugfix**][**ESPnet1**] Fixed IndexError in BatchBeamSearch.post_process() (2483) 2484 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**LM**] fix multigpu bug if pytorch>=1.5 2492 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] remove cleaner 2529 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix TTS inference bug for GST + Fastspeech2 2498 by kan-bayashi

Documentation

- [**Documentation**] Update espnet2_tutorial.md 2528 by kamo-naoyuki
- [**Documentation**] Update espnet2_tutorial.md 2532 by kamo-naoyuki
- [**Documentation**] Update espnet2_tutorial.md 2534 by kamo-naoyuki
- [**Documentation**] Update notebook submodule 2499 by kan-bayashi
- [**Documentation**][**ESPnet1**] Small fixes for transducer 2514 by b-flo
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update ESPnet2 TTS README 2516 by kan-bayashi
- [**Documentation**][**README**] Update README 2504 by kan-bayashi
- [**Documentation**][**README**][**ESPnet1**] CTC segmentation - checks for blank chars and RNN models 2535 by lumaku

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] add conformer results for librispeech 2510 by yuekaizhang
- [**Recipe**][**ESPnet2**][**ASR**] Update ESPnet2 CSJ Transformer results 2497 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add results for ESPnet2 TTS 2503 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update Transformer-TTS config 2494 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Update Transformer-TTS configs 2502 by kan-bayashi

Refactoring

- [**Refactoring**] Modify uttid to "${spkid}-${uttid}" for trn files 2527 by kamo-naoyuki
- [**Refactoring**][**ESPnet1**][**ASR**][**LM**] Remove all __future__ lines 2481 by ShigekiKarita
- [**Refactoring**][**ESPnet1**][**ASR**][**MT**][**ST**] Unify arguments 2506 by hirofumi0810
- [**Refactoring**][**ESPnet1**][**ESPnet2**][**TTS**] Refactor length regulator to improve the speed 2482 by kan-bayashi
- [**Refactoring**][**ESPnet1**][**MT**][**ST**] Refactor decoding for translation tasks 2501 by hirofumi0810
- [**Refactoring**][**ESPnet2**] Change add_scalars to add_scalar for tensorboard SummaryWriter 2525 by kamo-naoyuki

CI

- [**CI**][**ASR**] Make test_e2e_asr.py faster 2488 by ShigekiKarita
- [**CI**][**ASR**] Make test_e2e_asr_maskctc.py faster. 2493 by ShigekiKarita
- [**CI**][**ASR**] Make test_recog.py faster 2486 by ShigekiKarita
- [**CI**][**ESPnet1**][**ASR**] make test_e2e_asr_mulenc.py faster 2480 by ruizhilijhu
- [**CI**][**ESPnet1**][**Installation**] Update shellcheck url. 2500 by ShigekiKarita
- [**CI**][**ESPnet2**][**Installation**] Limit test execution time to 2.0 sec 2520 by ShigekiKarita
- [**CI**][**SE**] Make test_beamformer_net.py faster 2489 by ShigekiKarita
- [**CI**][**SE**] shorten test time for tasnet 2491 by LiChenda

Other

- [**Installation**] Update h5py version to avoid errors in Python3.8 2519 by shigabeev
- [**Docker**] Docker Updates 2509 by Fhrozen

Acknowledgements

Special thanks to Fhrozen, LiChenda, ShigekiKarita, b-flo, hirofumi0810, kamo-naoyuki, kan-bayashi, lumaku, ruizhilijhu, shigabeev, yuekaizhang.

v.0.9.3
New Features

- [**New Features**][**ESPnet2**] Implement --grad_clip_type 2399 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**ASR**] Implement batch_score() method for ASR decoder and LM 2377 by kamo-naoyuki
- [**New Features**][**ESPnet2**][**README**][**TTS**] Support Conformer-based FastSpeech / FastSpeech2 2413 by kan-bayashi

Bugfix

- [**Bugfix**][**CI**][**ESPnet1**][**ESPnet2**] make sure chainer independent 2411 by kamo-naoyuki
- [**Bugfix**][**CI**][**ESPnet1**][**Installation**] Revert ctc seg installation 2392 by kan-bayashi
- [**Bugfix**][**CI**][**Installation**] Fix the installation error in CI 2476 by kan-bayashi
- [**Bugfix**][**ESPnet1**][**ASR**] Lazy import chainer in asr_utils.py 2407 by kamo-naoyuki
- [**Bugfix**][**ESPnet1**][**ASR**] asr: Fix recog issue on Transformer CTC model 2394 by jaesong
- [**Bugfix**][**ESPnet1**][**MT**][**ST**] Fix score_bleu.sh 2400 by hirofumi0810
- [**Bugfix**][**ESPnet1**][**README**][**Typo**] fixed typo in egs/README.md 2473 by mrazizi
- [**Bugfix**][**ESPnet1**][**TTS**] lazy import chainer: espnet/nets/tts_interface.py 2409 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Add missing database in db.sh 2427 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix the CommonPreprocessor_multi missing issue 2460 by LiChenda
- [**Bugfix**][**ESPnet2**] Minor fix of egs2/commonvoice/asr1/local/data.sh 2438 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix the directory for init_file_prefix 2412 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix typo of log_level choices 2472 by glynpu
- [**Bugfix**][**ESPnet2**][**ASR**] Add grep -H option 2388 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**TTS**] Fix wrong sum axis in energy extraction 2469 by kan-bayashi
- [**Bugfix**][**ESPnet2**][**Typo**] Fix typo in help comment and docstrings 2470 by kan-bayashi
- [**Bugfix**][**Installation**] add warpctc_pytorch version==0.1.2 2403 by kamo-naoyuki

Documentation

- [**Documentation**] Add bug report template 2396 by sw005320
- [**Documentation**] Add installation issue template 2397 by sw005320
- [**Documentation**] Update espnet2_distributed.md 2418 by kamo-naoyuki
- [**Documentation**] Update espnet2_distributed.md 2419 by kamo-naoyuki
- [**Documentation**] Update espnet2_training_option.md 2421 by kamo-naoyuki
- [**Documentation**] Update faq.md 2431 by kamo-naoyuki
- [**Documentation**] Update parallelization.md 2428 by kamo-naoyuki
- [**Documentation**][**ESPnet2**][**README**] Update README.md 2430 by kamo-naoyuki

Enhancement

- [**Enhancement**][**ESPnet1**][**ESPnet2**] Add -c option for multi GPUs mode for slurm.conf 2406 by kamo-naoyuki
- [**Enhancement**][**ESPnet1**][**Installation**] Install warpctc-pytorch wheel when torch version is 1.1, 1.2 or 1.3 2453 by ysk24ok
- [**Enhancement**][**ESPnet1**][**README**] ADD CSJ RNN pretrained model 2452 by jnishi
- [**Enhancement**][**ESPnet2**] Update db.sh 2426 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**TTS**] Update ESPnet2 TTS config 2468 by kan-bayashi
- [**Enhancement**][**ESPnet2**][**TTS**] Update and add fastspeech2 configs 2429 by kan-bayashi
- [**Enhancement**][**Installation**] Add sanity check for setup_cuda_env.sh 2389 by kamo-naoyuki
- [**Enhancement**][**Installation**] Change cudatoolkit to cuda if cuda_version=8.0 2405 by kamo-naoyuki
- [**Enhancement**][**Installation**] Change to refer to https://anaconda.org/pytorch/pytorch/files 2404 by kamo-naoyuki
- [**Enhancement**][**Installation**] Workaround for soundfile issue 2437 by kamo-naoyuki

Recipe

- [**Recipe**][**ESPnet1**][**ASR**] Add LibriCSS recipe 2246 by akreal
- [**Recipe**][**ESPnet1**][**ASR**] Update for the Official Split of YM Recipe 2435 by ftshijt
- [**Recipe**][**ESPnet1**][**ESPnet2**][**ASR**] Update CommonVoice for Latest Version 2455 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] [zeroth korean] Not to use pipe format if feats_type=raw 2402 by kamo-naoyuki
- [**Recipe**][**ESPnet2**][**ASR**][**README**] espnet2 zeroth_korean recipe changing feats_type from fbank_pitch to raw. 2393 by hchung12
- [**Recipe**][**ESPnet2**][**README**][**TTS**] Add ESPnet2 TTS finetuning example recipe (JVS) 2465 by kan-bayashi

CI

- [**CI**] Add codecov actions. 2467 by ShigekiKarita
- [**CI**] Fix hangup of unittests 2424 by kamo-naoyuki
- [**CI**] Make espnet2 tts test faster 2461 by kan-bayashi
- [**CI**] Make test_e2e_{asr,st,mt}_{transformer,conformer}.py faster. 2464 by ShigekiKarita
- [**CI**] Update .gitignore 2434 by kan-bayashi
- [**CI**][**ESPnet1**] Make test_(batch_)beam_search.py faster. 2462 by ShigekiKarita
- [**CI**][**ESPnet1**] Support Debian9 and CentOS7 in Github Actions 2457 by ShigekiKarita
- [**CI**][**ESPnet1**][**Installation**] Fix HKUST recipe 2440 by kamo-naoyuki

Acknowledgements
Special thanks to LiChenda, ShigekiKarita, akreal, ftshijt, glynpu, hchung12, hirofumi0810, jaesong, jnishi, kamo-naoyuki, kan-bayashi, mrazizi, sw005320, ysk24ok.

v.0.9.2
New Features
- [**New Features**][**ESPnet1**] CTC segmentation 2301 by lumaku
- [**New Features**][**ESPnet2**] Support multiple averaged nbest models 2353 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support recursive add in pack_funcs and add images to packed model 2367 by kamo-naoyuki

Bugfix
- [**Bugfix**][**ASR**][**ESPnet1**] remove ff_scale from conformer constructor arguments 2356 by koji-okabe-hub
- [**Bugfix**][**ASR**][**ESPnet2**] use lm_exp instead of lm_tag for inference_tag 2352 by kamo-naoyuki
- [**Bugfix**][**CI**][**ESPnet1**][**Installation**] Remove ctc_segmentation temporary 2385 by kan-bayashi
- [**Bugfix**][**ESPnet1**] Fix import error of conformer module 2384 by kan-bayashi
- [**Bugfix**][**ESPnet1**] Fix issue https://github.com/espnet/espnet/issues/2211 2219 by Emrys365
- [**Bugfix**][**ESPnet2**] Add missing __init__.py 2326 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix --out_filename option: format_wav_scp.sh 2348 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix amp 2362 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] add egs2/an4/asr1/local/path.sh 2343 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix recursive add: espnet2/main_funcs/pack_funcs.py 2369 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] remove unused import 2331 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Installation**][**Typo**] fix typo 2344 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**README**] Fix typo 2372 by Piteryo
- [**Bugfix**][**ESPnet2**][**TTS**] Make vietnamese_cleaner an option 2341 by kamo-naoyuki
- [**Bugfix**][**Installation**] Fix python version check for chainer 2342 by kamo-naoyuki
- [**Bugfix**][**Installation**] add undefined variable: check_pytorch_cuda_compatibility.py 2361 by kamo-naoyuki
- [**Bugfix**][**TTS**] Fix device allocation error in guided attention loss 2282 2317 by kan-bayashi

Documentation
- [**Documentation**] updated comment on the documentation 2351 by GauravPandey892
- [**Documentation**][**ESPnet2**] Update TTS README 2316 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**] Update ESPnet2 TTS README 2376 by kan-bayashi
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Update README 2330 by kan-bayashi
- [**Documentation**][**Installation**] Divide setup_python.sh into setup_venv.sh and setup_python.sh 2382 by kamo-naoyuki
- [**Documentation**][**Installation**] add a description about check install. 2360 by sw005320
- [**Documentation**][**README**] CTC segmentation - Demo 2347 by lumaku
- [**Documentation**][**README**] Update README.md 2379 by kamo-naoyuki

Enhancement
- [**Enhancement**][**ESPnet2**] Change the default inference model to averaged model instead of the best 2346 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**][**TTS**] Add pitch and energy stats in packing 2350 by kan-bayashi
- [**Enhancement**][**Installation**] Add checking for pytorch-cuda compatibility in Makefile 2334 by kamo-naoyuki
- [**Enhancement**][**Installation**] Show raw error message when failed to import packages 2374 by kamo-naoyuki

Refactoring
- [**Refactoring**] Apply new version black 2366 by kamo-naoyuki
- [**Refactoring**][**ASR**][**ESPnet2**] Not to add _sp to $asr_exp if --asr_exp option is specified 2368 by kamo-naoyuki
- [**Refactoring**][**CI**][**ESPnet1**][**ESPnet2**][**Installation**] Add installers for sctk and sph2pipe and create tools/extra_path.sh 2332 by kamo-naoyuki
- [**Refactoring**][**ESPnet1**][**Recipe**] Disable preparation for lm in wsj recipe 2373 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Update Task design 2345 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**][**SE**] Remove unused option from enh.sh:--feats_normalize 2325 by kamo-naoyuki

Recipe
- [**Recipe**][**ASR**][**ESPnet1**] MGB-2 2289 by AmirHussein96
- [**Recipe**][**ASR**][**ESPnet1**] Remove duplicated class definition of Conformer and update some new results of Aishell1 and Switchboard. 2364 by pengchengguo
- [**Recipe**][**ASR**][**ESPnet2**][**README**] ASR WSJ RESULT update: Tuning LM 2355 by kamo-naoyuki
- [**Recipe**][**ASR**][**ESPnet2**][**README**] add pretrained model link 2378 by kamo-naoyuki

CI
- [**CI**][**README**] Update ubuntu images in circle ci 2349 by ShigekiKarita
- [**CI**][**mergify**] Update .mergify.yml 2333 by kamo-naoyuki
- [**CI**][**mergify**] Update .mergify.yml 2354 by kamo-naoyuki

Acknowledgements
Special thanks to AmirHussein96, Emrys365, GauravPandey892, Piteryo, ShigekiKarita, kamo-naoyuki, kan-bayashi, koji-okabe-hub, lumaku, pengchengguo, sw005320.

v.0.9.1
New Features
- [**New Features**] Add metric option to checkpoint averaging for Transformer 2259 by hirofumi0810
- [**New Features**][**ESPnet2**] Generate run.sh in the experiment dir for resuming 2284 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support larger num_iters_per_epoch than the number of batches in small corpus 2255 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support torch native automatic mixed precision for espnet2 2257 by kamo-naoyuki

Documentation
- [**Documentation**] Update comments in MultiHeadAttention 2266 by placebokkk
- [**Documentation**][**ESPnet2**] append comment in reporter.py 2267 by kamo-naoyuki
- [**Documentation**][**ESPnet2**][**README**][**TTS**] Add ESPnet2 TTS recipe document 2312 by kan-bayashi

Enhancement
- [**Enhancement**][**ESPnet2**] Tensorboard stats between iterations 2252 by kamo-naoyuki

Refactoring
- [**Refactoring**][**ESPnet2**] Add some new features and a new recipe for the enhancement task 2238 by Emrys365
- [**Refactoring**][**Documentation**] Remove installation part of Python from Makefile 2245 by kamo-naoyuki

Recipe
- [**Recipe**][**ASR**] aidatatang conformer ESPnet1 recipe 2269 by nzhoward
- [**Recipe**][**ESPnet2**] espnet2 zeroth_korean recipe 2279 by hchung12

Bugfix
- [**Bugfix**] Fix 2295 2311 by kan-bayashi
- [**Bugfix**] Minor fix for Makefile 2268 by kamo-naoyuki
- [**Bugfix**] Not to install cupy-cuda* for python>=3.8 2277 by kamo-naoyuki
- [**Bugfix**] Remove channel: setup_anaconda.sh 2303 by kamo-naoyuki
- [**Bugfix**][**ASR**] ngram single decoding bug fix 2299 by qmpzzpmq
- [**Bugfix**][**ASR**][**ESPnet2**] Add missing __init__.py 2292 by kamo-naoyuki
- [**Bugfix**][**ASR**][**ESPnet2**] decode -> inference 2276 by kamo-naoyuki
- [**Bugfix**][**ASR**][**ESPnet2**] remove chainer dependency from show_asr_result.sh 2281 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Avoid illegal summary name for tensorboard 2294 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix average_nbest_models for pytorch=1.6 2283 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix decode config extension in ESPnet2 CSJ recipe 2258 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix for queue-freegpu.pl 2274 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix samplers about min_batch_size 2305 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Workaround for SGE jobname issue 2253 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] add missing shebang 2306 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix bug of reporter 2263 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Recipe**] Update zeroth_korean 2308 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] add --spk-num 1 2285 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**distributed**] Not to save config.yaml if rank!=0 2287 by kamo-naoyuki

Others
- [**CI**] Remove unnecessary installation when CI 2307 by kamo-naoyuki
- [**CI**] Take integration tests into coverage 2254 by ShigekiKarita
- [**CI**][**ESPnet2**] Add coverage measure for espnet2 integration test 2256 by kamo-naoyuki
- [**CI**][**Installation**] Install wheel 2304 by kamo-naoyuki

Acknowledgements
Special thanks to Emrys365, ShigekiKarita, hchung12, hirofumi0810, kamo-naoyuki, kan-bayashi, nzhoward, placebokkk, qmpzzpmq.

v.0.9.0
New Features
- [**New Features**][**ASR**] Non-autoregressive ASR with Mask CTC 2070 by YosukeHiguchi
- [**New Features**][**ASR**] Support Conformer model. 2144 by pengchengguo
- [**New Features**][**ASR**][**ST**] CTC posterior visualization during training 2221 by hirofumi0810
- [**New Features**][**ESPnet2**] Implement espnet2.bin.zenodo_upload 2168 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Python API for inference 2092 by kamo-naoyuki
- [**New Features**][**ESPnet2**] Support TTS-Transformer in ESPnet2 2134 by kan-bayashi
- [**New Features**][**ESPnet2**][**ASR**] Enable batch joint decoding with CTC in recog API v2 2197 by takaaki-hori
- [**New Features**][**ESPnet2**][**SE**] Speech Enhancement Frontend for ESPNet2 Phase 1 2124 by LiChenda
- [**New Features**][**ESPnet2**][**TTS**] Support FastSpeech for ESPnet2 TTS 2149 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support FastSpeech2 (+FastPitch) 2218 by kan-bayashi
- [**New Features**][**ESPnet2**][**TTS**] Support GST in ESPnet2 TTS 2139 by kan-bayashi
- [**New Features**][**README**][**ASR**] CTC forced alignment in E2E ASR Transformer model 2095 by simpleoier
- [**New Features**][**VC**] Voice Transformer Network 2064 by unilight

Enhancement
- [**Enhancement**] Fix error when downloading large files using `download_from_google_drive.sh` 2074 by unilight
- [**Enhancement**][**ASR**] added more beam search info 2130 by sw005320
- [**Enhancement**][**ESPnet2**] Change packed file of espnet2 to zip format 2161 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Make read_text faster 2114 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] RESULTS.md -> README.md 2077 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Remove long wave in template recipe 2075 by kamo-naoyuki
- [**Enhancement**][**ESPnet2**] Update ESPnet2 JSUT TTS recipe and TTS template 2110 by kan-bayashi
- [**Enhancement**][**MT**][**ST**] Fix ST/MT models for compatibility with ASR 2179 by hirofumi0810
- [**Enhancement**][**ST**] Add source case information to json files in ST task 2208 by hirofumi0810
- [**Enhancement**][**ST**] Refactor multi-task learning in ST 2202 by hirofumi0810

Recipe
- [**Recipe**][**ASR**] Add aidatatang_200zh recipe 2122 by nzhoward
- [**Recipe**][**ASR**] Add chime6 info 2250 by sw005320
- [**Recipe**][**ASR**] CHiME-6 recipe 2171 by GNroy
- [**Recipe**][**ASR**] Fix a bug in espnet wsj recipe. 2145 by houwenxin
- [**Recipe**][**ASR**] New Recipe for Yoloxóchitl-Mixtec (SLR89) 2085 by ftshijt
- [**Recipe**][**ASR**] Support averaging model for Conformer. 2244 by pengchengguo
- [**Recipe**][**ASR**] Updated model after tuning aidatatang_200zh recipe 2204 by nzhoward
- [**Recipe**][**ASR**] created a recipe to run asr on ljspeech 1996 by ibkuroyagi
- [**Recipe**][**ASR**] update model link (add pre-trained bpe model and lm model) 2101 by ftshijt
- [**Recipe**][**ESPnet2**][**ASR**] espnet2 librispeech recipe 2109 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] espnet2 librispeech v2 2189 by sw005320
- [**Recipe**][**ESPnet2**][**ASR**] update espnet2 aishell results 2150 by Cescfangs
- [**Recipe**][**ESPnet2**][**ASR**][**TTS**] fix dev_set/eval_sets issues 2142 by sw005320
- [**Recipe**][**ESPnet2**][**TTS**] Add ESPnet2 CSMSC TTS recipe 2129 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add ESPnet2 LJSpeech recipe 2117 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Add VCTK recipe for ESPnet2 TTS 2165 by kan-bayashi
- [**Recipe**][**ESPnet2**][**TTS**] Create espnet2 jsut/tts recipe 2047 by kamo-naoyuki

Refactoring
- [**Refactoring**][**ESPnet2**] Change stats_dir naming not to overwrite 2111 by kan-bayashi
- [**Refactoring**][**ESPnet2**] Move modules 2086 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Remove $KALDI_ROOT/tools/env.sh from path.sh 2242 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Several update for pretrain model 2212 by kamo-naoyuki
- [**Refactoring**][**ESPnet2**] Update Makefile 2225 by kamo-naoyuki

Documentation
- [**README**] Fix URL in README 2090 by kan-bayashi
- [**README**] Update README about TTS 2079 by kan-bayashi
- [**README**] Update README.md 2046 by kamo-naoyuki
- [**README**] Update README.md 2067 by kamo-naoyuki
- [**README**] Update README.md 2243 by kamo-naoyuki
- [**README**] Update citation 2206 by hirofumi0810
- [**README**] Update installation.md 2233 by kamo-naoyuki
- [**README**][**ESPnet2**] Update egs2/TEMPLATE/README.md 2098 by kamo-naoyuki

Bugfix
- [**Bugfix**] Add cupy.done in make python 2091 by kan-bayashi
- [**Bugfix**] Append a missing space in cmd-line args in utils/dump_pcm.sh 2209 by yistLin
- [**Bugfix**] Fix Makefile 2097 by kamo-naoyuki
- [**Bugfix**] Fix minor bug of Makefile 2055 by kamo-naoyuki
- [**Bugfix**] Fix old model compatibility 2048 2060 2063 by kan-bayashi
- [**Bugfix**] Fix pretrained model 2053 2069 by kan-bayashi
- [**Bugfix**] Fix pyopenjtalk installation 2108 by kan-bayashi
- [**Bugfix**] Fix typo in run.sh of TTS recipes 2216 by hirofumi0810
- [**Bugfix**] Update Makefile to disable cupy for cuda=10.2 or later 2230 by kamo-naoyuki
- [**Bugfix**] fix path of PESQ 2058 by kamo-naoyuki
- [**Bugfix**] scorerinterface warning English correction 2076 by qmpzzpmq
- [**Bugfix**][**CI**] Fix bug in attention plotting 2185 by hirofumi0810
- [**Bugfix**][**CI**] Freeze the matplotlib version with 3.1.0 2181 by sw005320
- [**Bugfix**][**CI**] fix integration_test_ctc_align_wav.bats with a small model 2170 by simpleoier
- [**Bugfix**][**CI**] temporarily disable subsample 6 and 8 tests 2205 by sw005320
- [**Bugfix**][**CI**][**MT**][**ST**] Add integration test for ST/MT tasks 2210 by hirofumi0810
- [**Bugfix**][**ESPnet2**] Add missing path.sh in egs2/vctk/tts1 2167 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix TTS inference 2222 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix `tts_inference` when `feats_extract` is None 2176 by kan-bayashi
- [**Bugfix**][**ESPnet2**] Fix bug for feats_type=extracted 2087 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix bug of iterable dataset when num_workers>=1 2081 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix bug when espnet2/bin/tokenize_text.py --cutoff or --vocabulary_size is used 2158 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Fix log: benchmark -> deterministic 2080 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Implement configargparse in espnet2 2157 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] Select torchaudio version according to torch version 2214 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] avoid UnboundLocalError when lm is not loaded 2227 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix 2050 2051 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix 2198: PhonemeTokenizer can't perform with multiprocessing 2201 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix best_model_criterion: wsj/asr1/conf/tuning/train_lm.yaml 2153 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix bug of lm.py 2056 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix the stage number: enh.sh 2220 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**] fix: decode_config -> inference_config 2239 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**Recipe**] Not removing short/long utterances for eval_sets 2112 by kamo-naoyuki
- [**Bugfix**][**ESPnet2**][**SE**] Fix bugs in espnet2/enh and format related directory structures 2215 by Emrys365
- [**Bugfix**][**ESPnet2**][**TTS**] Fix feature extractor of TTS for compatibility 2102 by kamo-naoyuki

Acknowledgements

Special thanks to Cescfangs, Emrys365, GNroy, LiChenda, YosukeHiguchi, ftshijt, hirofumi0810, houwenxin, ibkuroyagi, kamo-naoyuki, kan-bayashi, nzhoward, pengchengguo, qmpzzpmq, simpleoier, sw005320, takaaki-hori, unilight, yistLin.

v.0.8.0
ESPnet2
- [**ESPnet2**] Solve memory issue with super large corpus training 1972 by kamo-naoyuki
- [**ESPnet2**] Added model parameter count to trainer 1867 by SeanNaren
- [**ESPnet2**] Refactoring espnet2/utils/fileio.py -> espnet2/fileio 1807 by kamo-naoyuki

New Features
- [**New Features**] Lightweight and Dynamic Convolutions. 1599 by yuyfujit
- [**New Features**] Implement Ngram scorer 1946 by qmpzzpmq
- [**New Features**] resampling in utils/compute-fbank-feats.py and utils/compute-stft-feats.py 2035 by kamo-naoyuki

Enhancement
- [**Enhancement**] Ngram scorer update 1992 by qmpzzpmq

Documentation
- [**Documentation**] fix a typo for the decoder add_argument_group 2030 by sw005320
- [**Documentation**] Update multiple GPU descriptions. 2016 by sw005320
- [**Documentation**] Finetuning doc + freezing parameters option 1897 by b-flo

Bugfix
- [**Bugfix**] Fix memory issue when resuming 2040 by kamo-naoyuki
- [**Bugfix**] fixed typo in cmvn.py 1988 by gullyboy007
- [**Bugfix**] update notebook 1986 by ShigekiKarita
- [**Bugfix**] Fix freezing modules (when using multi-gpu) 1983 by atozto9
- [**Bugfix**] Fix BLEU/PPL calculation during training 2009 by hirofumi0810
- [**Bugfix**] Fix download file extension 2042 by takenori-y
- [**Bugfix**] fix tedlium2/3 model link 2032 by sw005320
- [**Bugfix**] Fix bug for pure Transformer-CTC 2023 by hirofumi0810
- [**Bugfix**] li42 recipe: add li42 results; fix bug in adding language id "zh_TW" 1950 by houwenxin

CI
- [**CI**] Add espnet2 in ci/doc.sh 1976 by ShigekiKarita
- [**CI**] Add test for pytorch1.5 1881 by kamo-naoyuki

Acknowledgements
Special thanks to SeanNaren, ShigekiKarita, atozto9, b-flo, gullyboy007, hirofumi0810, houwenxin, kamo-naoyuki, qmpzzpmq, sw005320, takenori-y, yuyfujit.

v.0.7.0
The ESPnet project is now moving on to a new endeavor. We launched [espnet2](https://github.com/espnet/espnet/pull/1372), which aims to improve modularity (Chainer-free, Kaldi-free), provide a more customizable trainer, support distributed training, and improve scalability. The effort is led mainly by kamo-naoyuki, with great effort and leadership on his part. This project is one of the outcomes of our ESPnet hackathon in Tokyo 2019, which involved extensive discussion about the design, new features, and community contributions. espnet2 currently supports the main ASR recipes (with a well-designed recipe template) and a limited set of TTS recipes. We maintain both espnet1 and espnet2, but development will gradually move to espnet2. The ESPnet project is accelerating further!

ESPnet2
- [**ESPnet2**] keep the latest model 1769 by kamo-naoyuki
- [**ESPnet2**] Remove "E2E" from all comments 1766 by kamo-naoyuki
- [**ESPnet2**] Refactoring for ESPnetDataset 1758 by kamo-naoyuki
- [**ESPnet2**] Implement SpecAug for ESPnet2 1746 by kamo-naoyuki
- [**ESPnet2**] Implement BatchBinSampler 1742 by kamo-naoyuki
- [**ESPnet2**] Support torch_optimizer 1739 by kamo-naoyuki
- [**ESPnet2**] Log rotation for launch.py 1737 by kamo-naoyuki
- [**ESPnet2**] Change the type of --chunk_length to str_or_int 1733 by kamo-naoyuki
- [**ESPnet2**] Change cudnn deterministic mode to default 1732 by kamo-naoyuki
- [**ESPnet2**] Add wsj results for espnet2 1724 by kamo-naoyuki
- [**ESPnet2**] Show estimated time to finish 1717 by kamo-naoyuki
- [**ESPnet2**] Add --name option for training job 1714 by kamo-naoyuki
- [**ESPnet2**] Show the log file when the training process fails: espnet2.bin.launch.py 1713 by kamo-naoyuki
- [**ESPnet2**] --max_length -> --fold_length 1712 by kamo-naoyuki
- [**ESPnet2**] Double quoter for NCCL_SOCKET_IFNAME 1706 by kamo-naoyuki
- [**ESPnet2**] Save apex state in checkpoint and support apex optimizer 1705 by kamo-naoyuki
- [**ESPnet2**] Update asr.sh 1694 by zh794390558
- [**ESPnet2**] Update ctc.py 1688 by zh794390558
- [**ESPnet2**] Update launch.py 1681 by zh794390558
- [**ESPnet2**] Update asr.sh 1678 by zh794390558
- [**ESPnet2**] --keep_n_best_checkpoints -> --keep_nbest_models 1647 by kamo-naoyuki
- [**ESPnet2**] Avoid deprecated warning: reduction="none" 1510 by kamo-naoyuki
- [**ESPnet2**] Minor change for speed perturbation 1627 by kamo-naoyuki
- [**ESPnet2**] Fix how2 recipe 1620 by kamo-naoyuki
- [**ESPnet2**] Fix recipes 1617 by kamo-naoyuki
- [**ESPnet2**] Renaming 1610 by kamo-naoyuki
- [**ESPnet2**] Implement chunk iterator 1608 by kamo-naoyuki
- [**ESPnet2**] Update voxforge RESULTS 1601 by kamo-naoyuki
- [**ESPnet2**] vivos recipe: --audio_format wav 1592 by kamo-naoyuki
- [**ESPnet2**] Lower python requirements to 3.6 1565 by kamo-naoyuki
- [**ESPnet2**] dirha_wsj recipe for espnet2 1556 by yuekaizhang
- [**ESPnet2**] Update AISHELL ASR Recipe 1549 by Emrys365
- [**ESPnet2**] Remove short data 1531 by kamo-naoyuki
- [**ESPnet2**] [WIP] Update JSUT ASR Recipe 1529 by YosukeHiguchi
- [**ESPnet2**] Update HOW2 recipe 1522 by b-flo
- [**ESPnet2**] [WIP] Update CSJ ASR Recipe 1520 by YosukeHiguchi
- [**ESPnet2**] Change NoamLR to deprecated and implement WarmupLR 1519 by kamo-naoyuki
- [**ESPnet2**] Implement --max_cache_size option 1509 by kamo-naoyuki
- [**ESPnet2**] distributed training 1506 by kamo-naoyuki
- [**ESPnet2**] ESPNet2 Recipe Update -- commonvoice, babel, ami 1504 by ftshijt
- [**ESPnet2**] Refactoring 1494 by kamo-naoyuki
- [**ESPnet2**] Fix ci of flake8 part 1491 by kamo-naoyuki
- [**ESPnet2**] Tensorboard, --num_iters_per_epoch, etc. 1487 by kamo-naoyuki
- [**ESPnet2**] Fix espnet2.bin.pack 1486 by kamo-naoyuki
- [**ESPnet2**] show_result.sh 1478 by kamo-naoyuki
- [**ESPnet2**] Pack and Unpack model 1477 by kamo-naoyuki
- [**ESPnet2**] collect-stats mode, trainer class, etc. 1462 by kamo-naoyuki
- [**ESPnet2**] add test codes for asr decoders 1445 by kamo-naoyuki
- [**ESPnet2**] Integrate Griffin-Lim with tts_decode() 1442 by kan-bayashi
- [**ESPnet2**] Update ASR recipe 1439 by kan-bayashi
- [**ESPnet2**] Update TTS recipes 1430 by kan-bayashi
- [**ESPnet2**] Disable wer/cer calculation when training 1547 by kamo-naoyuki
- [**ESPnet2**] Change CTC default to builtin 1546 by kamo-naoyuki
- [**ESPnet2**] Update chime4 asr1 Recipe 1570 by yuekaizhang
- [**ESPnet2**] Create documentation for espnet2 1710 by kamo-naoyuki
- [**ESPnet2**] shellcheck for local/data.sh 1524 by kamo-naoyuki
- [**ESPnet2**] commonvoice: RESULTS.md -> README.md 1797 by kamo-naoyuki

Bugfix
- [**Bugfix**] % -> percent: espnet2/tasks/abs_task.py 1767 by kamo-naoyuki
- [**Bugfix**] Fix gpu mode for tts_inference.py 1755 by kamo-naoyuki
- [**Bugfix**] Fix SubReporter 1748 by kamo-naoyuki
- [**Bugfix**] Fix calculate_all_attentions for espnet2 1747 by kamo-naoyuki
- [**Bugfix**] Do not create the averaged model if --keep_nbest_models=1 1744 by kamo-naoyuki
- [**Bugfix**] Fix --best_model_criterions 1743 by kamo-naoyuki
- [**Bugfix**] Fix the gpu device when resuming 1731 by kamo-naoyuki
- [**Bugfix**] Fix error log for espnet2/bin/launch.py 1730 by kamo-naoyuki
- [**Bugfix**] Disable CUDNN deterministic for CTC: espnet2/asr/ctc.py 1720 by kamo-naoyuki
- [**Bugfix**] Update default.py 1698 by zh794390558
- [**Bugfix**] Fix chunk iterator and refactoring for distributed training 1685 by kamo-naoyuki
- [**Bugfix**] Update vgg_rnn_encoder.py 1676 by zh794390558
- [**Bugfix**] [ESPnet2] chmod +x: run.sh for JSUT 1628 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Remove nlsyms when word scoring 1614 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix setup.sh 1596 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix launch.py for slurm 1588 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix ci for local/data.sh 1572 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix nj of scripts/audio/format_wav_scp.sh 1550 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Use load_scp_sequential in format_wav_scp.py 1541 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Minor fix for CSJ recipe 1540 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix transformer 1539 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] fix rnn_type when bidirectional is used 1533 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix format_wav_scp.py 1532 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix bug of using GPU even if CPU mode 1526 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix --accum_grad 1525 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix voxforge config 1511 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Bug fix of splitting files for collect_stats mode 1505 by kamo-naoyuki
- [**Bugfix**] fix to use queue.conf 1431 by sw005320
- [**Bugfix**] [ESPnet2] Fix a bug in TTS 1428 by kan-bayashi
- [**Bugfix**] [ESPnet2] Refactor Encoder and Decoder and bug fix 1427 by kamo-naoyuki
- [**Bugfix**] [ESPnet2] Fix bug of text-chars converter 1426 by kamo-naoyuki
- [**Bugfix**] Optionize trans_type in egs/ljspeech/tts2 1789 by kan-bayashi
- [**Bugfix**] bugfix in ljspeech/tts2 1783 by beckgom
- [**Bugfix**] missing argument for local/data_prep.sh added 1782 by beckgom
- [**Bugfix**] avoid sentencepiece==0.1.90 1923 by kamo-naoyuki
- [**Bugfix**] FIX E523,E541,E741 1918 by kamo-naoyuki
- [**Bugfix**] fix reverse option for cmvn 1906 by magictron
- [**Bugfix**] Error handling for Transformer with CTC-based VAD 1875 by takenori-y
- [**Bugfix**] Revert deletion of init files 1842 by Fhrozen
- [**Bugfix**] fix the missing link of tedlium3 1841 by sw005320
- [**Bugfix**] Add test for torch>1.1 1840 by kamo-naoyuki
- [**Bugfix**] Fix 1808: change the argument order of --batch_type for collect stat… 1810 by kamo-naoyuki
- [**Bugfix**] Change to configargparse>=1.2.1 1803 by kamo-naoyuki
- [**Bugfix**] typo fixed for attention type 1793 by beckgom
- [**Bugfix**] fix https://github.com/espnet/espnet/issues/1780 1784 by qmeeus
- [**Bugfix**] Fix bug of espnet2 asr_inference.py 1952 by kamo-naoyuki
- [**Bugfix**] Minor fix of import place and comments 1959 by kan-bayashi

New Features
- [**New Features**] Add utils/translate_wav.sh 1530 by ShigekiKarita
- [**New Features**] Batch beam search V2 for Transformer (no CTC) 1402 by ShigekiKarita

Enhancement
- [**Enhancement**] Support multiple sentences in synth_wav.sh 1788 by kan-bayashi
- [**Enhancement**] fix+update transducer 1760 by b-flo

Documentation
- [**Documentation**] Update notebook 1963 by kan-bayashi
- [**Documentation**] Update installation manual 1960 by kan-bayashi
- [**Documentation**] Update installation.md 1957 by kamo-naoyuki
- [**Documentation**] Add note in synth_wav.sh 1785 by kan-bayashi
- [**Documentation**] Update docs 1954 1955 by kamo-naoyuki
- [**Documentation**] Update docs 1938 by kamo-naoyuki
- [**Documentation**] docs: added fbank link to the experiment readme 1910 by kdubovikov

Recipe
- [**Recipe**] Added some TIMIT results 1819 by sknadig
- [**Recipe**] add recipe for French Polyphone: ELRA-S0030_02 1711 by AdolfVonKleist
- [**Recipe**] Use espnet_tts_frontend 1794 by kamo-naoyuki

CI
- [**CI**] Use cache in actions 1917 by ShigekiKarita
- [**CI**] Apply black 1850 by kamo-naoyuki
- [**CI**] Create .mergify.yml 1813 by kamo-naoyuki

Acknowledgements
Special thanks to AdolfVonKleist, Emrys365, Fhrozen, ShigekiKarita, YosukeHiguchi, beckgom, b-flo, ftshijt, kamo-naoyuki, kan-bayashi, kdubovikov, magictron, qmeeus, sknadig, sw005320, takenori-y, yuekaizhang, zh794390558.

v.0.6.3
New Features
- [**New Features**] VCC2020 baseline recipe 1641 by unilight
- [**New Features**] Embed defaultlm 1623 by qmpzzpmq

Enhancement
- [**Enhancement**] add test -d $(KALDI): tools/Makefile 1718 by kamo-naoyuki
- [**Enhancement**] Add option to load pretrained model in TTS 1639 by kan-bayashi
- [**Enhancement**] Add reverse_direction option to MT 1658 by hirofumi0810

Recipe
- [**Recipe**] Remove unnecessary lines on Fisher-CallHome Spanish 1650 by hirofumi0810
- [**Recipe**] Add the Aishell2 recipe for the master branch. 1615 by pengchengguo
- [**Recipe**] Reformat the RESULTS.md in vivos 1689 by sw005320

Documentation
- [**Documentation**] Added multiple GPU TIPS 1734 by sw005320
- [**Documentation**] added pure attention decoding TIPS 1725 by sw005320

Docker
- [**Docker**] Docker local updates 1677 by Fhrozen
- [**Docker**] Docker updates 1624 by Fhrozen

Bugfix
- [**Bugfix**] fix 1751 1779 by qmpzzpmq
- [**Bugfix**] Fix v.0.3.0 pretrained Transformer model compatibility 1778 by ShigekiKarita
- [**Bugfix**] Fix torch.ctc not implemented in float16 by casting float32 1777 by ShigekiKarita
- [**Bugfix**] Workaround for bug of configargparse==1.2 1764 by kamo-naoyuki
- [**Bugfix**] change train_iter to be the dataloader object 1741 by bobchennan
- [**Bugfix**] fix 1634 1719 by kamo-naoyuki
- [**Bugfix**] [VCC2020 baseline] Extra reference set 1684 by unilight
- [**Bugfix**] missing torch version in check_install.py 1675 by beckgom
- [**Bugfix**] Fix model link in the tedlium2 recipe 1662 by sw005320
- [**Bugfix**] Update Install for Pytorch version 1659 by Fhrozen
- [**Bugfix**] Fix lm compatibility for v2 1653 by kan-bayashi
- [**Bugfix**] correct results with builtin CTC and PyTorch 1.3 in WSJ recipe 1652 by Emrys365
- [**Bugfix**] Fix lm backward compatibility 1649 by kan-bayashi
- [**Bugfix**] fix 1604 1626 by TitouanT
- [**Bugfix**] Fix a bug in csmsc recipe 1618 by kan-bayashi
- [**Bugfix**] Update e2e_asr_common.py 1735 by zh794390558
- [**Bugfix**] remove non-available options 1738 by sw005320

Acknowledgements
Special thanks to Emrys365, Fhrozen, ShigekiKarita, TitouanT, beckgom, bobchennan, hirofumi0810, kamo-naoyuki, kan-bayashi, pengchengguo, qmpzzpmq, sw005320, unilight, zh794390558.

v.0.6.2
New Features
- [**New Features**] Transducer v3 (w/ transformer support for encoder/decoder) 1422 by b-flo
- [**New Features**] Improving LM training (custom optimizer, custom scheduler, Transformer LM, etc) 1246 by ShigekiKarita

Enhancement
- [**Enhancement**] Add MelGAN pretrained model and support in demo notebook 1581 by kan-bayashi

Recipe
- [**Recipe**] Update fisher-callhome results 1606 by hirofumi0810
- [**Recipe**] Update run_rnnt.sh 1602 by qmpzzpmq
- [**Recipe**] Upload Must-C models 1594 by hirofumi0810
- [**Recipe**] Upload Libri trans models 1569 by hirofumi0810
- [**Recipe**] Upload How2 models 1568 by hirofumi0810
- [**Recipe**] Add Mboshi-French corpus 1545 by hirofumi0810
- [**Recipe**] Update WSJ results using PyTorch 1.3.1 and builtin CTC 1527 by Emrys365
- [**Recipe**] [WIP] IWSLT2016 Recipe 1492 by butsugiri
- [**Recipe**] Update for Common Voice recipe & Multilingual training recipe 1485 by ftshijt
- [**Recipe**] [WIP] DiPCo Recipe 1472 by Fhrozen

Documentation
- [**Documentation**] Support markdown-table for sphinx 1611 by kamo-naoyuki
- [**Documentation**] update docs & README.md 1605 by kamo-naoyuki
- [**Documentation**] fix a link within README.md 1584 by sw005320
- [**Documentation**] Add MT result 1576 by butsugiri
- [**Documentation**] update readme to include Linux installation guides from CI 1567 by sw005320
- [**Documentation**] Update WSJ results in the main README.md 1537 by Emrys365

Bugfix
- [**Bugfix**] Fix a typo in AMI script? 1595 by HuangZiliAndy
- [**Bugfix**] ru_open_stt recipe bug fix 1589 by qmpzzpmq
- [**Bugfix**] Fix pure CTC decoding 1580 by takaaki-hori
- [**Bugfix**] fix snapshot/model test condition 1577 by IceCreamWW
- [**Bugfix**] Fix IWSLT16 Script Permission 1543 by butsugiri
- [**Bugfix**] Fix bug in MT training script 1515 by hirofumi0810
- [**Bugfix**] Use Markdown table instead for WER results 1514 by lijunzh
- [**Bugfix**] Fix a compatibility problem with PyTorch 1.3.0 in ESPnet (v0.6.0) 1421 by Emrys365

Acknowledgements
Special thanks to Emrys365, Fhrozen, HuangZiliAndy, IceCreamWW, ShigekiKarita, b-flo, butsugiri, ftshijt, hirofumi0810, kamo-naoyuki, kan-bayashi, lijunzh, qmpzzpmq, sw005320, takaaki-hori.


v.0.6.1
Happy new year!

New Features
- [**New Features**] Transformer NMT 1479 by hirofumi0810
- [**New Features**] Support knowledge distillation in FastSpeech training 1415 by kan-bayashi
- [**New Features**] Support attention constraint for Tacotron 2 1407 by kan-bayashi

Enhancement
- [**Enhancement**] Add focus rate logging in decoding 1412 by kan-bayashi
- [**Enhancement**] Support Tacotron 2 as a teacher of FastSpeech 1406 by kan-bayashi
- [**Enhancement**] Support length-weighted normalization in loss calculation 1397 by kan-bayashi
- [**Enhancement**] Transformer End-to-End Speech Translation 1348 by hirofumi0810

Recipe
- [**Recipe**] Add LM training/decoding in swbd recipe 1463 by YosukeHiguchi
- [**Recipe**] Add Fisher-CallHome asr1b recipe 1390 by hirofumi0810
- [**Recipe**] RECIPE JESC for MT 1346 by Fhrozen

Documentation
- [**Documentation**] added interspeech 2019 tutorial link and performed spell check 1476 by sw005320
- [**Documentation**] Updated README in ljspeech about FastSpeech training 1468 by kan-bayashi
- [**Documentation**] Add knowledge dist based FastSpeech link in README 1465 by kan-bayashi

Refactoring
- [**Refactoring**] Unify TTS Transformer mask with ASR Transformer 1470 by kan-bayashi

Bugfix
- [**Bugfix**] fixed a small problem in run.sh 1466 by Peidong-Wang
- [**Bugfix**] Fix wrong SC2026 fixing 1458 by kan-bayashi
- [**Bugfix**] Fix multi-encoder ASR integration test 1432 by ShigekiKarita
- [**Bugfix**] Fix wrong type float -> int 1413 by kan-bayashi
- [**Bugfix**] Fix missing key error in Tacotron2 1408 by kan-bayashi
- [**Bugfix**] TransformerST on Fisher-Callhome 1398 by hirofumi0810
- [**Bugfix**] fix rnnlm load bug 1391 by Cescfangs
- [**Bugfix**] Fix gradient accumulation 1388 by hirofumi0810

Acknowledgements
Special thanks to Cescfangs, Fhrozen, Peidong-Wang, ShigekiKarita, YosukeHiguchi, hirofumi0810, kan-bayashi, sw005320.


v.0.6.0
New Features
- [**New Features**] Support Parallel WaveGAN 1333 by kan-bayashi
- [**New Features**] Support save snapshot by iteration 1204 by fanlu
- [**New Features**] Multi-encoder architecture with hierarchical attention and per-encoder CTC 1193 by ruizhilijhu
- [**New Features**] Support multiple inputs 1180 by ruizhilijhu
- [**New Features**] Add E2E-ST specific modules 1139 by hirofumi0810

Enhancement
- [**Enhancement**] Fixing compatibility problems with PyTorch 1.3.0 in ESPnet (v0.5.3) 1343 by Emrys365
- [**Enhancement**] Change log level info -> warning about batchsize 1336 by kan-bayashi
- [**Enhancement**] Support batch decoding for streaming E2E 1270 by takenori-y
- [**Enhancement**] Implement attention cache in Transformer for faster decoding 1240 by ShigekiKarita

Bugfix
- [**Bugfix**] Fix pretrained model URL for master 1351 by kan-bayashi
- [**Bugfix**] Return parser in add_arguments method for transducer 1337 by b-flo
- [**Bugfix**] Disabling nonlinear activation of the last encoder layer 1323 by simpleoier
- [**Bugfix**] Fixed error: "Expected object of device type cuda but got device type cpu" in decoder of transducer 1315 by rai4
- [**Bugfix**] Fix ASR eval for TTS in the case of trans_type=phn 1368 by kan-bayashi
- [**Bugfix**] Make --preprocess_conf optional in pack_model.sh 1365 by kan-bayashi
- [**Bugfix**] Remove set start method to fix 1290 1363 by kan-bayashi
- [**Bugfix**] Fix pretrained model URL 1354 by kan-bayashi
- [**Bugfix**] Fix pretrained model URL 1350 by kan-bayashi
- [**Bugfix**] Fix TTS transformer attention weight calculation in inference 1331 by kan-bayashi
- [**Bugfix**] Fix decoding for chainer transformer 1101 by Fhrozen

Recipe
- [**Recipe**] Update libri_trans asr recipe 1344 by hirofumi0810
- [**Recipe**] Update LJSpeech to limit frequency range 1330 by kan-bayashi
- [**Recipe**] IWSLT19 Speech Translation recipe 1169 by hirofumi0810
- [**Recipe**] Must-C NMT recipe 1168 by hirofumi0810
- [**Recipe**] How2 NMT recipe 1165 by hirofumi0810
- [**Recipe**] Update how2 recipe 1148 by hirofumi0810
- [**Recipe**] Pre-trained CSJ model 1341 by takenori-y
- [**Recipe**] TTS: add FastSpeech config and result for jsut 1321 by r9y9
- [**Recipe**] Asr commonvoice recipe update 1241 by ftshijt

Documentation
- [**Documentation**] Update notebook submodule 1367 by kan-bayashi
- [**Documentation**] Fix sphinx warning of TTS modules 1366 by kan-bayashi
- [**Documentation**] Update notebook and add to Sphinx document 1364 by kan-bayashi
- [**Documentation**] Update notebook 1352 by kan-bayashi
- [**Documentation**] Doc for Chainer transformer 1017 by Fhrozen
- [**Documentation**] Update README 1342 by takenori-y


Refactoring
- [**Refactoring**] Indirect call for training method [chainer] 1256 by Fhrozen
- [**Refactoring**] Refactor transformer for transformer LM 1223 by Fhrozen
- [**Refactoring**] Refine NMT 1152 by hirofumi0810
- [**Refactoring**] Small changes in chainer backend 1110 by Fhrozen
- [**Refactoring**] Format Chainer E2E transformer forward (fixed) 1034 by Fhrozen

Acknowledgements
Special thanks to Emrys365, Fhrozen, ShigekiKarita, b-flo, fanlu, ftshijt, hirofumi0810, kan-bayashi, r9y9, rai4, ruizhilijhu, simpleoier, takenori-y.


v.0.5.4
Bugfix
- [**Bugfix**] Fixed pretrained model URL in CSMSC recipe 1314 by kan-bayashi
- [**Bugfix**] Fix CSMSC wavenet link 1298 by kan-bayashi
- [**Bugfix**] Minor fix of FastSpeech 1295 by kan-bayashi
- [**Bugfix**] [bug fixing] Using inplace masked_fill_() 1273 by Emrys365
- [**Bugfix**] Fix RuntimeError in setting spawn multiple times 1267 by kan-bayashi
- [**Bugfix**] Use spawn in multiprocessing to fix 404 1251 by kan-bayashi

Documentation
- [**Documentation**] Update README.md 1309 by kan-bayashi
- [**Documentation**] Fix docstrings 1288 by kan-bayashi
- [**Documentation**] Fixed a typo in swbd asr1 1220 by Shujian2015
- [**Documentation**] update notebook 1219 by ShigekiKarita

Recipe
- [**Recipe**] Update VAIS1000 recipe RESULTS.md 1308 by kan-bayashi
- [**Recipe**] Fix VAIS1000 recipe 1305 by kan-bayashi
- [**Recipe**] Update CSMSC results 1299 by kan-bayashi
- [**Recipe**] Add vais1000 recipe - Vietnamese TTS 1283 by enamoria
- [**Recipe**] Add VIVOS recipe - Vietnamese ASR 1271 by hieuthi
- [**Recipe**] Add JNAS tts1 recipe 1269 by kan-bayashi
- [**Recipe**] Support Polish speakers in M-AILABS 1265 by kan-bayashi
- [**Recipe**] Add TWEB recipe 1263 by kan-bayashi
- [**Recipe**] Update M-AILABS results 1262 by kan-bayashi
- [**Recipe**] Add CSMSC recipe 1259 by kan-bayashi
- [**Recipe**] Add JVS recipe 1258 by kan-bayashi
- [**Recipe**] Add CMU Arctic recipes 1257 by kan-bayashi
- [**Recipe**] Add M-AILABS pretrained models 1229 by kan-bayashi

New Features
- [**New Features**] Add eval-interval-epochs for the tiny dataset 1306 by kan-bayashi
- [**New Features**] ASR-based CER/WER eval for TTS 1190 by potato-inoue

Enhancement
- [**Enhancement**] Add Mandarin Pretrained Wavenet 1292 by kan-bayashi
- [**Enhancement**] Add pretrained models: JSUT and LibriTTS 1260 by r9y9
- [**Enhancement**] Improved JSUT TTS recipe 1216 by r9y9

Acknowledgements
Special thanks to Emrys365, ShigekiKarita, Shujian2015, enamoria, hieuthi, kan-bayashi, potato-inoue, r9y9.


v.0.5.3
Bugfix
- [**Bugfix**] Fix a bug in building docker container 1197 by protoget
- [**Bugfix**] fixed h5py version as 2.9.0 1183 by ruizhilijhu
- [**Bugfix**] Fix error on waveform generation by WaveNet 1170 by r9y9
- [**Bugfix**] Sort nbest_hyps without limiting them to beam size 1157 by elgeish
- [**Bugfix**] fix recursive make 1153 by b-flo
- [**Bugfix**] missing file in iwslt19 1147 by sw005320
- [**Bugfix**] Wsj mix 1145 by simpleoier

Enhancement
- [**Enhancement**] Install warp-ctc from PyPI 1196 by ysk24ok
- [**Enhancement**] TTS: MoL WaveNet minor update 1195 by r9y9
- [**Enhancement**] Transducer v1.2 1173 by b-flo

New Features
- [**New Features**] Add support for MoL WaveNet to synth_wav.sh 1186 by r9y9
- [**New Features**] Using pytorch dataloader for pytorch backend 1138 by bobchennan

Recipe
- [**Recipe**] dirha_wsj recipe 1179 by ruizhilijhu
- [**Recipe**] Update Russian open STT recipe for v0.5 of the dataset 1160 by akreal
- [**Recipe**] Blizzard recipe 1056 by potato-inoue

Refactoring
- [**Refactoring**] Install warpctc-pytorch from pytorch-0.4 branch when PyTorch version is 0.4.X 1162 by ysk24ok
- [**Refactoring**] using python3 as default 1159 by zh794390558
- [**Refactoring**] Fix download_from gdrive.sh on osx 1158 by r9y9

Documentation
- [**Documentation**] Fix doc/module2rst.py to use glob and remove --nowarn from travis-sphinx 1155 by ShigekiKarita

Acknowledgements
Special thanks to ShigekiKarita, akreal, b-flo, bobchennan, elgeish, potato-inoue, protoget, r9y9, ruizhilijhu, simpleoier, sw005320, ysk24ok, zh794390558.


v.0.5.2
Documentation
- [**Documentation**] Clean up TTS module docstrings 1143 by kan-bayashi
- [**Documentation**] update readme for warp-transducer 1125 by sw005320
- [**Documentation**] Fix flake8 blacklist 1107 by ShigekiKarita

Bugfix
- [**Bugfix**] Minor fix 1142 by kan-bayashi
- [**Bugfix**] Fix apex error when opt == "noam" 1134 by kan-bayashi
- [**Bugfix**] Fix model compatibility 1133 by kan-bayashi
- [**Bugfix**] Fix backward compatibility problem in PositionalEncoding by adding pre-hook to ignore `self.pe` 1127 by ShigekiKarita
- [**Bugfix**] fix iwslt19 recipe 1124 by sw005320
- [**Bugfix**] Fix best validation perplexity LM averaging 1122 by akreal
- [**Bugfix**] Fix bug in how2 asr1 1117 by hirofumi0810
- [**Bugfix**] Fix: wrong variable in greedy decode 1113 by b-flo
- [**Bugfix**] Chainer fix mixed input 1096 by Fhrozen
- [**Bugfix**] Fix deleted argument atype 1095 by Fhrozen
- [**Bugfix**] Fix guided attention loss in Tacotron2 when reduction factor > 1 1087 by kan-bayashi
- [**Bugfix**] Fix multi gpu LM issues and add hdf5 LM dataset dump 1083 by ShigekiKarita

Enhancement
- [**Enhancement**] Add stdout.pl for debugging version run.pl 1141 by ShigekiKarita
- [**Enhancement**] Update recog_wav.sh 1140 by kan-bayashi
- [**Enhancement**] Update spm_train and test it 1135 by ShigekiKarita
- [**Enhancement**] Transducer v1.1 1129 by b-flo
- [**Enhancement**] Allow to extend the length of positional encoding at training and inference 1105 by ShigekiKarita
- [**Enhancement**] Update batchfy.py 1104 by zh794390558
- [**Enhancement**] Add PYTHONIOENCODING=UTF-8 in path.sh 1099 by kan-bayashi
- [**Enhancement**] Improve batch decoding 980 by takaaki-hori
- [**Enhancement**] Implement add_arguments method of E2E for rnn. 941 by kamo-naoyuki

Recipe
- [**Recipe**] Update swbd 1137 by sw005320
- [**Recipe**] Updated symlink in Librispeech 1130 by kan-bayashi
- [**Recipe**] Add missing lines to iwslt19 LM training data 1126 by hirofumi0810
- [**Recipe**] Add iwslt19 ASR recipe 1120 by hirofumi0810
- [**Recipe**] How2 speech translation recipe 1102 by hirofumi0810
- [**Recipe**] Must-C ASR recipe 1098 by hirofumi0810
- [**Recipe**] Must-C speech translation corpus 1085 by hirofumi0810
- [**Recipe**] Replace character-level recipe with the BPE one in iwslt18 1079 by hirofumi0810
- [**Recipe**] Fix swbd recipe v2 1072 by sw005320
- [**Recipe**] Updated REVERB multi-channel E2E recipe 1057 by Xiaofei-Wang

New Features
- [**New Features**] Add --train-dtype option for float16/float32/float64 precision training in pytorch ASR and LM 1119 by ShigekiKarita
- [**New Features**] transfer learning 1103 by b-flo
- [**New Features**] New beam-search framework: ScorerInterface, CPU/GPU float16/32/64 decoding, and new language models (SeqRNNLM and TransformerLM) 1092 by ShigekiKarita
- [**New Features**] Support pretrained WaveNet vocoder 1081 by kan-bayashi
- [**New Features**] RNN-Transducer 1065 by b-flo

Acknowledgements
Special thanks to Fhrozen, ShigekiKarita, Xiaofei-Wang, akreal, b-flo, hirofumi0810, kamo-naoyuki, kan-bayashi, sw005320, takaaki-hori, zh794390558.



v.0.5.1
Bugfix
- [**Bugfix**] Fix conda installation error 1076 by kan-bayashi
- [**Bugfix**] Minor fix batchsize log when batchsize = 0 1068 by kan-bayashi
- [**Bugfix**] Fix spm decode 1062 by ShigekiKarita
- [**Bugfix**] Minor fix to use fastspeech in synth_wav.sh 1061 by kan-bayashi
- [**Bugfix**] Fix help message to enable line break 1059 by kan-bayashi
- [**Bugfix**] Fix tensorboard interval in validation 1054 by ShigekiKarita
- [**Bugfix**] Update E2E-ASR test 1041 by kan-bayashi
- [**Bugfix**] Fix Loss Calculation 1039 by Fhrozen

Refactoring
- [**Refactoring**] Remove unused conf 1070 by kan-bayashi
- [**Refactoring**] [Reopen] Support default arguments 1067 by kan-bayashi
- [**Refactoring**] Refactor E2E-TTS test 1042 by kan-bayashi

CI
- [**CI**] Add TTS integration test 1069 by kan-bayashi
- [**CI**] Make test smaller to speed up 1044 by kan-bayashi
- [**CI**] Separate tasks in each job of circleci 1043 by kan-bayashi

Recipe
- [**Recipe**] Add data augmentation to ami recipe 1066 by Jzmo
- [**Recipe**] Update accum_grad for a single gpu in CSJ 1050 by kan-bayashi
- [**Recipe**] add commonvoice recipe 1000 by YosukeHiguchi
- [**Recipe**] REVERB multi-channel E2E recipe 985 by Xiaofei-Wang

New Features
- [**New Features**] Support multi gpu in pytorch lm 1063 by ShigekiKarita

Enhancement
- [**Enhancement**] Use librosa's fast Griffin-Lim 1058 by kan-bayashi
- [**Enhancement**] Add option to select the integration type of speaker embedding 1047 by kan-bayashi
- [**Enhancement**] update tedlium3 recipe with transformer 1037 by ShigekiKarita
- [**Enhancement**] update tedlium2 config 1036 by ShigekiKarita
- [**Enhancement**] Support of other recipe in recog_wav.sh 1026 by hiratake55

Acknowledgements
Special thanks to Fhrozen, Jzmo, ShigekiKarita, Xiaofei-Wang, YosukeHiguchi, hiratake55, kan-bayashi.

v.0.5.0
CI
- [**CI**] Integration test with mini AN4 1035 by ShigekiKarita
- [**CI**] codecov support 850 by ShigekiKarita

Bugfix
- [**Bugfix**] [Bug] Fix error calculator for report false 1032 by Fhrozen
- [**Bugfix**] fix unk scoring 1002 by sw005320
- [**Bugfix**] make tensorboard logging done every 100 iters 996 by sw005320

Refactoring
- [**Refactoring**] TTS: avoid using asr module in TTS 1031 by r9y9
- [**Refactoring**] Exit 1 when source command return 1 1030 by kan-bayashi
- [**Refactoring**] Refactor FileReaderWrapper and FileWriterWrapper 947 by kamo-naoyuki

Enhancement
- [**Enhancement**] Use pypi sentencepiece 1029 by ShigekiKarita
- [**Enhancement**] Add log of the inference speed of TTS models 1027 by kan-bayashi
- [**Enhancement**] Add GPU decodable test for TTS modules 1025 by kan-bayashi
- [**Enhancement**] Support multi-speaker FastSpeech 1006 by kan-bayashi
- [**Enhancement**] Custom Training extensions for ASR chainer 1004 by Fhrozen
- [**Enhancement**] Support multi-speaker Transformer 1001 by kan-bayashi
- [**Enhancement**] RFC: Add keep_all_data_on_mem option 999 by r9y9
- [**Enhancement**] Support saving of attention weights and probability in decoding 995 by kan-bayashi
- [**Enhancement**] Implement Fast Speech 848 by kan-bayashi
- [**Enhancement**] Transformer Chainer 774 by Fhrozen
- [**Enhancement**] Neural Machine Translation 563 by hirofumi0810

Recipe
- [**Recipe**] fix bugs to make a swbd recipe run 1024 by sw005320
- [**Recipe**] Add multi-speaker Transformer config in LibriTTS 1022 by kan-bayashi
- [**Recipe**] Rename RESULTS to RESULTS.md 1021 by kan-bayashi
- [**Recipe**] Clean LibriTTS RESULTS.md 1020 by kan-bayashi
- [**Recipe**] Clean LJSpeech RESULTS.md 1019 by kan-bayashi
- [**Recipe**] Update JSUT TTS RESULTS.md 1018 by kan-bayashi
- [**Recipe**] Add Transformer config in JSUT 1009 by kan-bayashi
- [**Recipe**] Update libri trans 949 by hirofumi0810
- [**Recipe**] iwslt18 NMT recipe 937 by hirofumi0810
- [**Recipe**] libri_trans NMT recipe 931 by hirofumi0810
- [**Recipe**] Add fastspeech.v2 result 925 by kan-bayashi

Documentation
- [**Documentation**] [Docstrings] Removing empty init files to avoid docs 1016 by Fhrozen
- [**Documentation**] add egs info 1015 by sw005320
- [**Documentation**] Update docstrings in espnet.nets.chainer_backend 974 by Masao-Someki
- [**Documentation**] Reformat docstrings in espnet/asr 914 by Masao-Someki
- [**Documentation**] Update TTS module’s docstrings and refactor some modules 898 by kan-bayashi

Acknowledgements
Special thanks to Fhrozen, Masao-Someki, ShigekiKarita, hirofumi0810, kamo-naoyuki, kan-bayashi, r9y9, sw005320.

v.0.4.3
Enhancement

- [**Enhancement**] Use queue-freegpu.pl in all cmd.sh [1013](https://github.com/espnet/espnet/pull/1013)

Documentation

- [**Documentation**] nbsphinx support [1003](https://github.com/espnet/espnet/pull/1003)
- [**Documentation**] Update docstrings [994](https://github.com/espnet/espnet/pull/994)

Recipe

- [**Recipe**] CSJ asr1: prettify RESULTS.md [1008](https://github.com/espnet/espnet/pull/1008)
- [**Recipe**] WSJ asr1: prettify RESULTS.md [1007](https://github.com/espnet/espnet/pull/1007)

Bugfix

- [**Bugfix**] fix Cupy Import Error 969 [1010](https://github.com/espnet/espnet/pull/1010)
- [**Bugfix**] Fix a bug in synthesis_wav.sh [989](https://github.com/espnet/espnet/pull/989)
- [**Bugfix**] Fix lm_n_average in lang_model [988](https://github.com/espnet/espnet/pull/988)

Refactoring

- [**Refactoring**] Remove "free-gpu" from *_train and create queue-freegpu.pl [938](https://github.com/espnet/espnet/pull/938)

CI

- [**CI**] reduce travis jobs [1011](https://github.com/espnet/espnet/pull/1011)

Acknowledgements

Special thanks to Fhrozen, kamo-naoyuki, Magic-Bubble, ShigekiKarita, takenori-y, Xiaofei-Wang.




v.0.4.2
Bugfix

- [**Bugfix**] Fix pytorch LM GPU training without cupy [981](https://github.com/espnet/espnet/pull/981)
- [**Bugfix**] make tensorboard logging done every 100 iters [966](https://github.com/espnet/espnet/pull/966)
- [**Bugfix**] Fix ER calculator [955](https://github.com/espnet/espnet/pull/955)
- [**Bugfix**] Fix a typo bug in computing guided attention loss [956](https://github.com/espnet/espnet/pull/956)
- [**Bugfix**] run.sh should exit if sourcing path.sh return error [954](https://github.com/espnet/espnet/pull/954)

Recipe

- [**Recipe**] Update Librispeech recipe [970](https://github.com/espnet/espnet/pull/970)
- [**Recipe**] New RNN and Transformer result of AMI recipe(ihm) [978](https://github.com/espnet/espnet/pull/978)
- [**Recipe**] BPE support for SwitchBoard & Transformer config [909](https://github.com/espnet/espnet/pull/909)
- [**Recipe**] Update li10 [965](https://github.com/espnet/espnet/pull/965)
- [**Recipe**] Update libri trans [949](https://github.com/espnet/espnet/pull/949)

Enhancement

- [**Enhancement**] transform: expose pad_mode for logmelspectrogram [957](https://github.com/espnet/espnet/pull/957)


Acknowledgements

Special thanks to Fhrozen, geekboood, hirofumi0810, Jzmo, naxingyu, r9y9, ShigekiKarita.

v.0.4.1
Bugfix
- [**Bugfix**] Fix a bug in calculate_all_attentions [862](https://github.com/espnet/espnet/pull/862)
- [**Bugfix**] Fix bugs in frontend [875](https://github.com/espnet/espnet/pull/875)
- [**Bugfix**] Fix grad noise v2 [912](https://github.com/espnet/espnet/pull/912)
- [**Bugfix**] Fix plot fail [913](https://github.com/espnet/espnet/pull/913)
- [**Bugfix**] Fix tgz typo [892](https://github.com/espnet/espnet/pull/892)
- [**Bugfix**] Fix: Output dimension of Conv2dSubsampling 822 [921](https://github.com/espnet/espnet/pull/921)
- [**Bugfix**] Fix: espnet/transform/transformation.py [866](https://github.com/espnet/espnet/pull/866)
- [**Bugfix**] Fixed certain typos [893](https://github.com/espnet/espnet/pull/893)
- [**Bugfix**] Modified if conditions [908](https://github.com/espnet/espnet/pull/908)
- [**Bugfix**] fix bugs in grad noise [886](https://github.com/espnet/espnet/pull/886)
- [**Bugfix**] CER/WER & CER_CTC in Transformer pytorch [936](https://github.com/espnet/espnet/pull/936)
- [**Bugfix**] Update iwslt18 recipe [808](https://github.com/espnet/espnet/pull/808)

Documentation
- [**Documentation**] Add model link [899](https://github.com/espnet/espnet/pull/899)
- [**Documentation**] Document espnet tools and modules [884](https://github.com/espnet/espnet/pull/884)
- [**Documentation**] Fix typo [930](https://github.com/espnet/espnet/pull/930)
- [**Documentation**] Reformat docstrings in espnet/asr [914](https://github.com/espnet/espnet/pull/914)
- [**Documentation**] Update CONTRIBUTING.md [880](https://github.com/espnet/espnet/pull/880)
- [**Documentation**] add recipe related documentations to CONTRIBUTING.md [872](https://github.com/espnet/espnet/pull/872)
- [**Documentation**] skip ci when gh-pages is deployed [901](https://github.com/espnet/espnet/pull/901)
- [**Documentation**] use only conda to build doc [895](https://github.com/espnet/espnet/pull/895)

Enhancement
- [**Enhancement**] Script for docker builds from the local repo [877](https://github.com/espnet/espnet/pull/877)
- [**Enhancement**] Demo script for TTS [871](https://github.com/espnet/espnet/pull/871)
- [**Enhancement**] Fix plot attention for chainer transformer [940](https://github.com/espnet/espnet/pull/940)
- [**Enhancement**] Implement Fast Speech [848](https://github.com/espnet/espnet/pull/848)
- [**Enhancement**] Move the dependency links to github from Makefile to setup.py [858](https://github.com/espnet/espnet/pull/858)
- [**Enhancement**] Support new version in Docker containers [836](https://github.com/espnet/espnet/pull/836)
- [**Enhancement**] gradient noise injection from std normal dis [881](https://github.com/espnet/espnet/pull/881)
- [**Enhancement**] [Discussion] Create show_result.sh [874](https://github.com/espnet/espnet/pull/874)

Recipe
- [**Recipe**] Add Jsut asr recipe [793](https://github.com/espnet/espnet/pull/793)
- [**Recipe**] AURORA4 RESULTS.md file [835](https://github.com/espnet/espnet/pull/835)
- [**Recipe**] Add Librispeech French corpus [882](https://github.com/espnet/espnet/pull/882)
- [**Recipe**] Add transformer config in m_ailabs/tts1 recipe [924](https://github.com/espnet/espnet/pull/924)
- [**Recipe**] Change librispeech_french to libri_trans [903](https://github.com/espnet/espnet/pull/903)
- [**Recipe**] Fix: utils/show_result.sh [915](https://github.com/espnet/espnet/pull/915)
- [**Recipe**] Minor update for speech translation recipe [907](https://github.com/espnet/espnet/pull/907)
- [**Recipe**] Transformer for CHiME4 Single Channel [837](https://github.com/espnet/espnet/pull/837)
- [**Recipe**] Update LJSpeech RESULTS.md [861](https://github.com/espnet/espnet/pull/861)
- [**Recipe**] Update LJSpeech RESULTS.md [887](https://github.com/espnet/espnet/pull/887)
- [**Recipe**] Update Librispeech recipe [885](https://github.com/espnet/espnet/pull/885)
- [**Recipe**] Update fisher callhome spanish for speech translation [868](https://github.com/espnet/espnet/pull/868)
- [**Recipe**] libri_trans NMT recipe [931](https://github.com/espnet/espnet/pull/931)

Refactoring
- [**Refactoring**] Refactor TTS Transformer [865](https://github.com/espnet/espnet/pull/865)
- [**Refactoring**] test: avoid using grep and sed in subprocess and use python stdlib instead [854](https://github.com/espnet/espnet/pull/854)
- [**Refactoring**] Update TTS module’s docstrings and refactor some modules [898](https://github.com/espnet/espnet/pull/898)

Acknowledgements
Special thanks to 27jiangziyan, Fhrozen, Masao-Someki, ShigekiKarita, SuperGops7, creatorscan, hirofumi0810, kamo-naoyuki, lumaku, naxingyu, r9y9, simpleoier, takenori-y.



v.0.4.0
New features and improvements
- E2E multi-channel system 596
- Changed to use pip-install for pytorch_wpe 843
- Transformer
  - ASR chainer 655
  - ASR pytorch 690
  - TTS pytorch 752
- Specaugment 734 745 754
- Streaming attention encoder-decoder E2E-ASR 757
- Offline recognition demo 809
- New batch making strategies 759
- Guided Attention Loss 816
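
The Specaugment entries above refer to masking-based data augmentation of input features. As a rough illustration of the frequency-masking and time-masking steps only (time warping is omitted), here is a minimal standard-library sketch. The function name `mask_spectrogram`, its parameters, and the choice of zero as the fill value are assumptions made for this example and are not taken from the ESPnet code.

```python
import random
from typing import List


def mask_spectrogram(
    spec: List[List[float]],
    max_freq_width: int,
    max_time_width: int,
    rng: random.Random,
) -> List[List[float]]:
    """Return a copy of `spec` (frames x bins) with one frequency mask
    and one time mask set to zero. Hypothetical helper for illustration."""
    n_frames, n_bins = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: zero `f` consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_width, n_bins))
    f0 = rng.randint(0, n_bins - f)
    for row in out:
        row[f0:f0 + f] = [0.0] * f

    # Time mask: zero `t` consecutive frames across all bins.
    t = rng.randint(0, min(max_time_width, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * n_bins

    return out


# Toy input: 10 frames x 8 bins, all ones.
spec = [[1.0] * 8 for _ in range(10)]
masked = mask_spectrogram(spec, max_freq_width=3, max_time_width=4,
                          rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the masks reproducible in tests.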

Important changes
- **drop python2 support**
- use `utils/fix_data_dir.sh` as default 660
- CPU-only installation 677 687 704
- fix to use python2 as default in travis 685
- add CUDA_VERSION in Makefile 687
- use Pytorch 1.0.1 as default 721
- use `yaml` format configuration file 722
- modularize TTS components 746 815
- use Chainer/Cupy 6.0.0 as default 753
- reinforce CI 763
- Google drive downloader 798
- New scripts to pack model and get system info 790 802
- change the scoring in multi-speaker case from shell to python 805
- update patience in TTS recipes 817
- `n_average` option in TTS 823
- update TTS recipes to use config files 780
- make `ngpu=1` as default for all of the recipes 800
- deprecate `egs/librispeech/tts1` recipe 806
- maintain the pytorch warp-ctc under espnet 838


New recipes
- AURORA4 722 770 824
- JNAS 725
- LibriTTS 795
- Tedlium release3 739
  - added the model link and missing files 831
- TIMIT 698
- Russian Open STT 768

Recipe updates
- Aishell
  - support Transformer 827
  - fix the indent of RESULTS.md in the aishell recipe 828
- CSJ
  - support Transformer 737 742 782
- HKUST
  - support Transformer 840
- IWSLT18
  - add missing files for iwslt18 recipe 767
- Librispeech
  - support Transformer 781
- LJSpeech
  - added more samples 825 842
  - support Transformer 752
- Tedlium release2
  - support word LM in TEDLIUM recipe 683
  - fix duplicated line in tedlium recipe 714
  - fix a bug in the TEDLIUM recipe 771
  - support Transformer 803
- Voxforge
  - bugfix in voxforge 684
  - unify rnn and transformer recipes for the voxforge task 769
  - support Transformer 758
  - update config files in the voxforge recipe 783
- WSJ
  - support Specaugment 745
  - support Transformer 655 690

Documentation
- add citation bibtex entry for ESPnet 676
- add NAACL paper replication link for CMU Wilderness Multilingual Speech Dataset 717 731
- update library information 789
- Add table of contents 812
- add GPU decoding document 813
- minibatch explanation 821

Bugfix
- fix recognize_batch for 2d, location_recurrent, multi-head attentions for 665 and add test 681
- fix CER/WER calculation during training 678
- add version check for matplotlib installation 679
- make sure `hlens` is tensor in recognize_batch 680
- fix choice between pytorch and pytorch-cpu 702
- fix `merge_json` behavior (699) when no labels for 708
- fix `check_install.py` 728
- use `ensure_ascii=False` to make json human-readable 730
- Fix argument name for SummaryWriter 747
- use scikit-learn 0.20 749
- fix pytorch for chainer v6.0.0 772
- fix model compatibility 799
- fix minor typos in the recipes 801
- bug fix: `egs/chime4/asr1_multich/conf/train.yaml` 826
- bug fix: `espnet/utils/training/batchfy.py` 833
- fix to use sentencepiece v.0.1.82 839
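
For the `ensure_ascii=False` entry above, the effect can be seen with Python's standard `json` module. This is a minimal sketch; the utterance ID and text are made-up values for illustration, not taken from an ESPnet recipe.

```python
import json

# Hypothetical data entry containing non-ASCII text (illustration only).
entry = {"utt1": {"text": "こんにちは"}}

# Default behaviour: non-ASCII characters are escaped as \uXXXX sequences.
escaped = json.dumps(entry)

# With ensure_ascii=False the characters are written as-is, which keeps
# the JSON readable. Both strings load back to the same object.
readable = json.dumps(entry, ensure_ascii=False)

print(escaped)
print(readable)
```

When writing such output to a file, opening it with an explicit `encoding="utf-8"` avoids depending on the platform default.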

Acknowledgements
Special thanks to 27jiangziyan, akreal, bobchennan, creatorscan, danoneata, Fhrozen, gtache, hirofumi0810, jan-schuchardt, jnishi, kamo-naoyuki, Masao-Someki, oadams, simpleoier, sknadig, ShigekiKarita, takenori-y

v.0.3.1
New improvements

- Add instant speech recognition 581
- Add CTC greedy decoding CER monitor 587
- Add Streaming encoder 638
- Add Uni-directional encoder 624 629
- Add model compatibility test 615 649
- Update fisher_callhome_spanish recipe 625
- Improve swbd scoring 614 620
- Improve memory usage in json merge script 579
- Improve background job failure check in decoding state 627 643 648
- Separate installation of basic tools and extra tools 628
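
The CTC greedy decoding mentioned above follows a well-known procedure: take the best label for each frame, merge consecutive repeats, then drop blank symbols. The sketch below shows that procedure in plain Python. The function name `ctc_greedy_decode`, the use of index 0 as the blank, and the toy scores are assumptions for illustration and do not reflect the ESPnet implementation.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = 0
) -> List[int]:
    """Best label per frame, merge repeats, then remove blanks."""
    # Index of the highest score in each frame.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Collapse runs of identical labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove the blank symbol.
    return [label for label in collapsed if label != blank]


# Toy per-frame scores over 3 labels (label 0 is assumed to be blank).
scores = [
    [0.1, 0.8, 0.1],    # label 1
    [0.2, 0.7, 0.1],    # label 1 again, merged with the previous frame
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.7, 0.2],    # label 1, a new token because a blank intervened
    [0.1, 0.2, 0.7],    # label 2
]
hyp = ctc_greedy_decode(scores)
```

A character error rate monitor would then compare such a hypothesis with the reference transcript by edit distance.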

Bugfix

- Fix CTC type selection 617 618
- Fix MultiProcessIterator 613
- Fix chainer sortgrad bug
- Fix installer 594 595 604 609 622
- Fix WSJ-mix recipe 610 630 641
- Fix remove_longshortdata.sh 646

Thank you for the many contributions from kamo-naoyuki, gtache, simpleoier, takenori-y, Fhrozen, JaejinCho, pzelasko, zh794390558, kan-bayashi, sw005320.


v.0.3.0-beta
New features and improvements

- Support Pytorch 1.0 553
- Support the use of Tensorboard 506
- Support early stopping 508
- Support `stop_stage` option 539
- Support sortgrad 550
- Add GRU architecture 496
- Add GPU batch decoding 318
- Support HDF5 format instead of kaldi ark 412 493
- Add speech separation recipe 531
- Add TTS recipes (German, Spanish, Italian, Japanese...) 562 569 519
- Add ASR recipes 574 519
- Improve ASR recipes 491 521 546 435 467 469
- Improve speech translation recipes 468
- Improve Python2/3 compatibility 567
- Improve cmd.sh usage 538 547
- Add test scripts for shell scripts 484 498
- Change to use conda with Python3.7 as default 567
- Python code modularization 440 484
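
The sortgrad entry above refers to a curriculum-style ordering of training data: early epochs see utterances from shortest to longest, and later epochs use a shuffled order. The sketch below illustrates only that general idea. The function name `epoch_order` and its parameters are hypothetical, and this is not the ESPnet implementation.

```python
import random
from typing import List, Sequence


def epoch_order(
    lengths: Sequence[int],
    epoch: int,
    sorted_epochs: int = 1,
    seed: int = 0,
) -> List[int]:
    """Return utterance indices in the order to visit them this epoch."""
    indices = list(range(len(lengths)))
    if epoch < sorted_epochs:
        # Curriculum phase: shortest utterances first.
        return sorted(indices, key=lengths.__getitem__)
    # Afterwards: an ordinary shuffle, seeded per epoch for reproducibility.
    rng = random.Random(seed + epoch)
    rng.shuffle(indices)
    return indices


# Toy utterance lengths in frames.
lengths = [50, 10, 30]
first_epoch = epoch_order(lengths, epoch=0)
later_epoch = epoch_order(lengths, epoch=1)
```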

We really appreciate the many contributions from gtache, kamo-naoyuki, hirofumi0810, ShigekiKarita, takenori-y, simpleoier, Fhrozen, sas91, mn5k, JaejinCho, Xiaofei-Wang, jnishi, Magic-Bubble.

v.0.2.0
New feature and improvement

- add data prefetch 340
- add new recipes
  - IWSLT speech translation recipe 325
  - REVERB challenge recipe 359
- add test codes
  - for checking warp ctc behaviors in the multitask mode 369
  - for multiple GPUs 362
  - for a single GPU 376
  - for read/write models 362 376
- add check script for python library installation 373 389
- improve some ASR baseline recipes by using a shallow and wide BLSTM encoder and subwords
  - librispeech 354 386
  - CSJ 326
  - HKUST 366

Important changes

- fix to use PyTorch 0.4.1 (stop supporting PyTorch 0.3.x) 332
- rename some functions
  - `e2e_asr_attctc.py` -> `e2e_asr.py`
  - `e2e_asr_attctc_th.py` -> `e2e_asr_th.py`
- change the format of model.conf from pickle to JSON 342
- remove deprecated options 336
- unify the data converter with TTS one 343
- unify model variable arguments between TTS and ASR 337
- fix pytorch backend snapshot functions including the save of optimizers 362
- avoid using `feat-to-len`. Use `write_utt2num_frames=true`, and read utt2num instead of executing `feat-to-len` 339
- refactor `asr_pytorch.py` and `asr_chainer.py`.
  - refactor the recog part in asr_chainer.py and asr_pytorch especially after it gets nbest. 370
  - make `nets/e2e_common.py`, and move some common functions there

Bug fix

- warpctc gradient scaling (Thanks jnishi)
- warpctc multi-gpu bug (Thanks jnishi)
- undefined gpuid bug in cpu RNN training 379
- no hypothesis bug 378
- Python3 compatibility 375 341 (Thanks akreal)

v.0.1.5
- update the Librispeech ASR recipe and use subword modeling as default.
- attached Librispeech ASR model (librispeech_asr1.tgz):
  - RNNLM: `exp/train_rnnlm_2layer_bs256_unigram2000/rnnlm.model.best`
  - ASR models: `exp/train_960_vggblstm_e4_subsample1_2_2_1_1_unit1024_proj1024_d1_unit1024_location1024_aconvc10_aconvf100_mtlalpha0.5_adadelta_bs30_mli800_mlo150_unigram2000/results/{model.acc.best,model.conf}`
  - performance:

| | WER (%) |
|-----------|:----:|

