Chardet

4.0.0

Benchmarking chardet 4.0.0 on CPython 3.7.5 (default, Sep  8 2020, 12:19:42)
  [Clang 11.0.3 (clang-1103.0.32.62)]
  --------------------------------------------------------------------------------
  Total time: 268.0230791568756s (13.394368915143872 calls per second)
  
  Comparison run (version header not preserved), calls per second for each encoding:
  
  big5: 7.187002209518091 X
  cp932: 4.71090956645177 X
  cp949: 2.937256786994428 X
  euc-jp: 4.870580412090848 X
  euc-kr: 6.6910755971933416 X
  euc-tw: 87.71098043480079 X
  gb2312: 6.614302607154443 X
  ibm855: 27.595893549680685 X
  iso-2022-jp: 3379.5052775763434 X
  iso-2022-kr: 26181.67290886392 X
  iso-8859-1: 120.63424740403983 X
  iso-8859-5: 32.65106262196898 X
  iso-8859-7: 62.480089080556084 X
  maccyrillic: 33.018537255804496 X
  shift_jis: 4.996013583677438 X
  tis-620: 14.323112928341818 X
  utf-16: 166771.53081510935 X
  utf-32: 198782.18009478672 X
  utf-8: 13.966236809766901 X
  utf-8-sig: 193732.28637413395 X
  
  Total time: 357.05358052253723s (10.054513372323958 calls per second)
  
  Thank you to aaaxx, edumco, hrnciar, hroncok, jdufresne, mdamien, saintamh, and xeor for submitting pull requests, and to all of our users for being patient with how long this release has taken.
  
  Full changelog
  
  - Convert single-byte charset probers to use nested dicts for language models (121) dan-blanchard
  - Add API option to get the confidence of all encodings (111) mdamien
  - Make sure pyc files are not in tarballs (d7c7343) dan-blanchard
  - Add benchmark script (d702545, 8dccd00, 726973e, 71a0fad) dan-blanchard
  - Include license file in the generated wheel package (141) jdufresne
  - Drop support for Python 2.6 (143) jdufresne
  - Remove unused coverage configuration (142) jdufresne
  - Document the chardet package as suitable for production (144) jdufresne
  - Pass python_requires argument to setuptools (150) jdufresne
  - Update pypi.python.org URL to pypi.org (155) jdufresne
  - Typo fix (159) saintamh
  - Support pytest 4, don't apply marks directly to parameters (PR 174, Issue 173) hroncok
  - Test Python 3.7 and 3.8 and document support (175) jdufresne
  - Drop support for end-of-life Python 3.4 (181) jdufresne
  - Workaround for distutils bug in Python 2.7 (165) xeor
  - Remove deprecated license_file from setup.cfg (182) jdufresne
  - Remove deprecated 'sudo: false' from Travis configuration (200) jdufresne
  - Add testing for Python 3.9 (201) jdufresne
  - Adds explicit os and distro definitions (140) edumco
  - Remove shebang from nonexecutable script (192) hrnciar
  - Remove use of deprecated 'setup.py test' (187) jdufresne
  - Remove unnecessary numeric placeholders from format strings (176) jdufresne
  - Update links (152) aaaxx
  - Remove shebang and executable bit from chardet/cli/chardetect.py (171) jdufresne
  - Handle weird logging edge case in universaldetector.py (056a2a4) dan-blanchard
  - Switch from Travis to GitHub Actions (204) dan-blanchard
  - Properly set CharsetGroupProber.state to FOUND_IT (PR 203, Issue 202) dan-blanchard
  - Add language to detect_all output (1e208b7) dan-blanchard
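
The benchmark output for this release is reported in calls per second. The following is a minimal sketch of how such a figure could be measured; it is not the project's benchmark script. The names `calls_per_second` and `fake_detect` are hypothetical, and `fake_detect` is only a stand-in so that the sketch runs without chardet installed.

```python
import time


def calls_per_second(detect, payload, repeat=100):
    """Time `repeat` calls of detect(payload) and return calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        detect(payload)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast callables or coarse timers.
    return repeat / elapsed if elapsed > 0 else float("inf")


def fake_detect(data):
    # Hypothetical stand-in that mimics the shape of a detection result.
    return {"encoding": "ascii", "confidence": 1.0, "language": ""}


rate = calls_per_second(fake_detect, b"hello world", repeat=1000)
print(f"fake_detect: {rate} calls per second")
```

With chardet installed, passing `chardet.detect` in place of `fake_detect` should give a comparable measurement for a real payload; absolute numbers depend on the machine and the input.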

3.0.4

This minor bugfix release just fixes some packaging and documentation issues:
  
  -  Fix issue with `setup.py` where `pytest_runner` was always being installed. (PR 119, thanks zmedico)
  -  Make sure `test.py` is included in the manifest (PR 118, thanks zmedico)
  -  Fix a bunch of old URLs in the README and other docs. (PRs 123 and 129, thanks qfan and jdufresne)
  -  Update documentation to no longer imply we test/support Python 3 versions before 3.3 (PR 130, thanks jdufresne)

3.0.3

This release fixes a crash when debugging logging was enabled.  (Issue 115, PRs 117 and 125)

3.0.2

Fixes an issue where `detect` would sometimes return `None` instead of a `dict` with the keys `encoding`, `language`, and `confidence` (Issue 113, PR 114).
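
Because `detect` returns a `dict` with the keys `encoding`, `language`, and `confidence`, callers can read those keys without first checking for `None`. A minimal sketch of consuming such a result follows. The helper `decode_with_detection` is hypothetical and not part of chardet, and it assumes the `encoding` value itself may still be `None` when nothing is detected.

```python
def decode_with_detection(data, result, fallback="utf-8"):
    """Decode bytes using a detect()-style result dict.

    `result` is assumed to have the keys described above; `encoding`
    may be None, in which case `fallback` is used instead.
    """
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: decode leniently instead.
        return data.decode(fallback, errors="replace")


# Hand-written result dicts, shaped like the return value described above.
latin = decode_with_detection(
    "h\xe9llo".encode("latin-1"),
    {"encoding": "ISO-8859-1", "confidence": 0.73, "language": ""},
)
undetected = decode_with_detection(
    b"abc", {"encoding": None, "confidence": 0.0, "language": None}
)
```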

3.0.1

This bugfix release fixes a crash in the EUC-TW prober when it encountered certain strings (Issue 67).

3.0.0

This release is long overdue, but still mostly serves as a placeholder for the impending 4.0.0 release, which will have retrained models for better accuracy.  For now, this release will get the following improvements up on PyPI:
  
  -  Added support for Turkish ISO-8859-9 detection (PR 41, thanks queeup)
  -  Commented out large unused sections of Big5 and EUC-KR tables to save memory (8bc4b89)
  -  Removed Python 3.2 from testing and added 3.4 through 3.6
  -  Ensure that stdin is open with mode `'rb'` for `chardetect` CLI. (PR 38, thanks lpsinger)
  -  Fixed `chardetect` crash with non-ascii file names (PR 39, thanks nkanaev)
  -  Made naming conventions more Pythonic throughout (no more `mTypicalPositiveRatio`, and instead `typical_positive_ratio`)
  -  Modernized test scripts and infrastructure so we've got Travis testing and all that stuff
  -  Rename `filter_without_english_words` to `filter_international_words` and make it match current Mozilla implementation (PR 44, thanks rsnair2)
  -  Updated `filter_english_letters` to match C implementation (c6654595)
  -  Temporarily disabled Hungarian ISO-8859-2 and Windows-1250 detection because it is very inaccurate (da6c0a079)
  -  Allow CLI sub-package to be importable (PR 55)
  -  Add a `hypothesis`-based test (PR 66, thanks DRMacIver)
  -  Strip endianness from UTF with BOM predictions so that the encoding can be passed directly to `bytes.decode()` (PR 73, thanks snoack)
  -  Fixed broken links in docs (PR 90, thanks roskakori)
  -  Added early exit to `chardetect` when encoding is detected instead of looping through entire file (PR 103, thanks jpz)
  -  Use `bytearray` objects internally instead of `wrap_ord` calls, which provides a nice performance boost across the board (PR 106)
  -  Add `language` property to probers and `UniversalDetector` results (PR 180)
  -  Mark the 5 known test failures as such so we can have more useful Travis build results in the meantime (d588407)
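
The entry about stripping endianness from BOM predictions matters because of how Python's own codecs treat the BOM. The sketch below uses only the standard library and does not call chardet. It shows that the generic `utf-16` codec consumes the BOM, while the endian-specific `utf-16-le` codec leaves it in the decoded text.

```python
import codecs

text = "h\xe9llo"
# Little-endian UTF-16 bytes with a leading byte order mark.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Generic label: the codec reads the BOM, picks the byte order, and drops it.
generic = data.decode("utf-16")

# Endian-specific label: the BOM survives as a leading U+FEFF character.
specific = data.decode("utf-16-le")

print(repr(generic))
print(repr(specific))
```

Reporting the generic name for BOM-prefixed input therefore lets the result be passed straight to `bytes.decode()` without a stray U+FEFF at the start of the text.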

2.3.0

In this release, we:
  - Added support for CP932 detection (thanks to hashy).
  - Fixed an issue where UTF-8 with a BOM would not be detected as UTF-8-SIG (8).
  - Modified `chardetect` to use `argparse` for argument parsing.
  - Moved docs to a `gh-pages` branch.  You can now access them at http://chardet.github.io.

2.2.1

Fix missing paren in chardetect.py

2.2.0

First version after merger with charade. Loads of little changes.