SpeechRecognition

Latest version: v3.10.4


Page 2 of 5

3.8.1

Lots of changes since June! Summary below. Get all of these and more with a quick `pip install --upgrade SpeechRecognition`.

* **Snowboy hotwords support** for highly efficient, performant listening (thanks beeedy!). This is implemented as the `snowboy_configuration` parameter of `recognizer_instance.listen`.
* **Configurable Pocketsphinx models** - you can now specify your own acoustic parameters, language model, and phoneme dictionary, using the `language` parameter of `recognizer_instance.recognize_sphinx` (thanks frawau!).
* `audio_data_instance.get_segment(start_ms=None, end_ms=None)` is a new method that can be called on any AudioData instance to **get a segment of the audio** starting at `start_ms` and ending at `end_ms`. This is really useful when you want to get, say, only the first five seconds of some audio.
* The `stopper` function returned by `listen_in_background` now accepts one parameter, `wait_for_stop` (defaulting to `True` for backwards compatibility), which determines whether the function waits for the background thread to shut down fully before returning. One advantage is that if `wait_for_stop` is `False`, **you can call the `stopper` function from any thread**!
* New example, demonstrating how to **simultaneously listen to and recognize speech** with the threaded producer/consumer pattern: [threaded_workers.py](https://github.com/Uberi/speech_recognition/blob/master/examples/threaded_workers.py).
* Various improvements and bugfixes:
    * [Python 3 style type annotations](https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst) in library documentation.
    * `recognize_google_cloud` now uses the v1 rather than the beta API (thanks oort7!).
    * `recognize_google_cloud` now returns timestamp info when the `show_all` parameter is `True`.
    * `recognize_bing` won't time out as often on credential requests, due to a longer default timeout.
    * `recognize_google_cloud` timeouts respect `recognizer_instance.operation_timeout` now (thanks reefactor!).
    * Any recognizers using FLAC audio were broken inside Linux on Docker - this is now fixed (thanks reefactor!).
    * Various documentation and lint fixes (thanks josh-hernandez-exe!).
    * Lots of small build system improvements.
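
The `wait_for_stop` behaviour described above can be illustrated without a microphone. The following is a minimal standard-library sketch of the pattern, not the library's implementation: `run_in_background` and `on_tick` are hypothetical names, and the idea that waiting corresponds to joining the worker thread is an assumption made for illustration. It shows one plausible reason a non-waiting stop is safe from any thread: a thread cannot wait for itself to finish.

```python
import threading
import time


def run_in_background(callback, poll_interval=0.01):
    """Call callback(stopper) repeatedly on a worker thread.

    Hypothetical helper for illustration; not part of SpeechRecognition.
    """
    running = threading.Event()
    running.set()

    def stopper(wait_for_stop=True):
        running.clear()  # ask the loop to exit after its current iteration
        if wait_for_stop:
            # join() blocks until the worker exits. Calling it from the worker
            # itself raises RuntimeError, so that caller needs wait_for_stop=False.
            worker.join()

    def loop():
        while running.is_set():
            callback(stopper)
            time.sleep(poll_interval)

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return stopper


ticks = []
third_tick = threading.Event()


def on_tick(stop):
    ticks.append(time.monotonic())
    if len(ticks) == 3:
        stop(wait_for_stop=False)  # called from inside the worker thread
        third_tick.set()


stop = run_in_background(on_tick)
third_tick.wait(timeout=2)
stop()  # called from the main thread, so waiting for shutdown is fine
```

With the library itself, the equivalent call would be made on the function returned by `listen_in_background`, passing `wait_for_stop=False` when stopping from inside the recognition callback.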

3.7.1

As usual, get it with `pip install --upgrade SpeechRecognition`.

* **New `grammar` parameter for `recognizer_instance.recognize_sphinx`** - now, you can specify a JSGF or FSG grammar to PocketSphinx (thanks aleneum!).
* **Update PyAudio to version 0.2.11** - this fixes a couple of memory management issues users have been experiencing.
* **Update FLAC to 1.3.2 on all platforms** - this will make it easier to support more audio formats in the near future.
* **Fixes for various APIs on Python 3.6+** - small changes in `urllib.request` behavior made requests fail in certain situations.
* **Fixes for Bing Speech API timing out** due to some backwards incompatible changes to their API.
* **Restore original IBM audio segmentation behaviour** - previously, it would stop recognizing after the first pause. Now, it will recognize all speech in the input audio, as it did before IBM's changes.
* Fix links in PocketSphinx docs and library reference. Add-on language models are now available from Google Drive, including the now-officially-supported Italian model.
* New troubleshooting entries for JACK server in README.
* Documentation and build process updates.
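
As a rough illustration of the new `grammar` parameter, the sketch below writes a small JSGF grammar to disk and passes its path to `recognize_sphinx`. The grammar contents, the file name `commands.gram`, and the helper names are assumptions made for this example. The recognition function imports the library lazily and needs PocketSphinx installed plus an audio file of your own; the words in the grammar would also have to exist in the phoneme dictionary in use.

```python
import os
import tempfile

# A tiny JSGF grammar (illustrative content, not taken from the library).
JSGF_GRAMMAR = """#JSGF V1.0;
grammar commands;
public <command> = <action> the <device>;
<action> = turn on | turn off;
<device> = light | fan;
"""


def write_grammar(directory, text=JSGF_GRAMMAR):
    """Write the grammar into `directory` and return the file path."""
    path = os.path.join(directory, "commands.gram")
    with open(path, "w", encoding="utf-8") as grammar_file:
        grammar_file.write(text)
    return path


def recognize_with_grammar(audio_path, grammar_path):
    """Transcribe an audio file, constraining PocketSphinx to the grammar."""
    # Imported lazily so the rest of the sketch runs without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_sphinx(audio, grammar=grammar_path)


grammar_path = write_grammar(tempfile.mkdtemp())
```

Calling `recognize_with_grammar("some_recording.wav", grammar_path)` (with a recording you supply) should then restrict results to phrases the grammar can produce.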

3.6.5

Quick bugfix for `PortableNamedTemporaryFile`:

* Fix file descriptor opening on Python 2.
* Add tests for Sphinx keyword matching.

3.6.4

Bugfix release!

* Fix `tempfile.NamedTemporaryFile` on Windows, by replacing it with a `PortableNamedTemporaryFile` class. Previously, the file could not necessarily be re-opened by name while the original handle was still open.
* Documentation/troubleshooting improvements (thanks hassanmian!).
* Add support for 24-bit FLAC audio files (thanks sudevschiz!).
* Fix `phrase_time_limit` being ignored for `listen_in_background` (thanks dodysw!).
* Added lots of new audio regression tests.
* Code cleanup for tests and examples.
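
The re-opening problem can be sketched with the standard library alone. Python's documentation notes that a file created by `tempfile.NamedTemporaryFile` cannot be opened a second time by name on Windows while it is still open. One common workaround, shown here as an assumption about the general approach rather than the library's actual class, is to build on `tempfile.mkstemp` and delete the file manually. `ReopenableTempFile` is a hypothetical name.

```python
import os
import tempfile


class ReopenableTempFile:
    """Temporary file that can be re-opened by name while still open.

    Hypothetical stand-in for illustration; not the library's class.
    """

    def __init__(self, mode="w+b"):
        self.mode = mode

    def __enter__(self):
        # mkstemp() creates the file and returns an OS-level descriptor and a path.
        # The file is not deleted automatically, so cleanup happens in __exit__.
        descriptor, self.name = tempfile.mkstemp()
        self._file = os.fdopen(descriptor, self.mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._file.close()
        os.remove(self.name)
        return False  # do not swallow exceptions

    def write(self, data):
        return self._file.write(data)

    def flush(self):
        return self._file.flush()


with ReopenableTempFile() as tmp:
    tmp.write(b"RIFF")
    tmp.flush()
    # A second, independent handle on the same path while the first is open.
    with open(tmp.name, "rb") as second_handle:
        reread = second_handle.read()
```

The trade-off is that cleanup is no longer automatic: if the process dies before `__exit__` runs, the file stays on disk.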

3.6.3

Small bugfix release:

* Handle the case where Google Speech Recognition (GSR) doesn't return a confidence value (thanks jcsilva!).
* Config, style, and release improvements.
* Fix console window sometimes popping up on Windows (thanks Qdrew!).
* Switch release over to universal Wheels rather than source distribution.
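
To make the missing-confidence case concrete, here is a small sketch of choosing a transcript from a list of alternatives. The data shape (a list of dicts with a `transcript` key and an optional `confidence` key) and the function name `best_transcript` are assumptions for illustration. The fallback rule, taking the first alternative when no confidence is reported, is likewise one reasonable choice and not necessarily what the library does.

```python
def best_transcript(alternatives):
    """Return the transcript of the most confident alternative.

    Falls back to the first alternative when none carries a confidence value.
    """
    if not alternatives:
        raise ValueError("no alternatives to choose from")
    if any("confidence" in alt for alt in alternatives):
        # Alternatives without a confidence value are treated as least likely.
        best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    else:
        best = alternatives[0]
    return best["transcript"]


with_scores = [
    {"transcript": "recognize speech", "confidence": 0.92},
    {"transcript": "wreck a nice beach", "confidence": 0.41},
]
without_scores = [
    {"transcript": "hello world"},
    {"transcript": "hello word"},
]
chosen_scored = best_transcript(with_scores)
chosen_unscored = best_transcript(without_scores)
```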

3.6.0

This is more of a maintenance release, but a few features slipped in as well:
- **Support for the Google Cloud Speech API** with `recognizer_instance.recognize_google_cloud` (thanks Thynix!), plus documentation and examples.
- **Automatic sample rate detection** in `speech_recognition.Microphone` - this should fully resolve all the "Invalid sample rate" issues from PyAudio.
- Project now has **automated tests and continuous integration** with Travis CI. It's pretty nifty, and has already caught a few things during development!
- Keywords example for `recognizer_instance.recognize_sphinx`.
- Documentation improvements and updated advice in troubleshooting and library reference.
- Bugfix - Google Speech Recognition sometimes didn't return the text with the highest confidence (thanks akabraham!).
- Bugfix - `EOFError` upon encountering malformed audio files; a proper exception message is now given.
- Updated FLAC binaries for OS X.
- Bugfix - invalid FLAC binary path on OS X (thanks akabraham!).
- Code cleanup.

