Rapidfuzz

1.1.1

Fixed
  - Fix result conversion in process.extract (see 79)

1.1.0

Changed
  - string_metric.normalized_levenshtein now supports all weights
  - when different weights are used for Insertion and Deletion, the strings are no longer swapped inside the Levenshtein implementation, so different weights for Insertion and Deletion are now supported
  - replaced the C++ implementation with a Cython implementation. This has the following advantages:
    - the implementation is less error prone, since a lot of the complex things are done by Cython
    - slightly faster than the previous implementation (up to 10% for some parts)
    - about 33% smaller binary size
    - reduced compile time
  - Added **kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
  - Add max argument to hamming distance
  - Add support for whole Unicode range to utils.default_process
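
To illustrate why the strings can no longer be swapped once Insertion and Deletion have different weights, here is a minimal pure-Python sketch of a weighted Wagner-Fischer distance. The function name and the `(insertion, deletion, substitution)` tuple layout are assumptions made for this example; it is not the library's implementation.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Cost of turning s1 into s2 (illustrative sketch, not RapidFuzz code).

    weights = (insertion, deletion, substitution); this layout is an
    assumption of the sketch.
    """
    ins, dele, sub = weights
    # prev[j] holds the cost of turning the processed prefix of s1 into s2[:j]
    prev = [j * ins for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, 1):
        cur = [i * dele]  # turning s1[:i] into the empty string
        for j, c2 in enumerate(s2, 1):
            cur.append(min(
                prev[j] + dele,                          # delete c1
                cur[j - 1] + ins,                        # insert c2
                prev[j - 1] + (0 if c1 == c2 else sub),  # match or substitute
            ))
        prev = cur
    return prev[-1]


# Insertions cost 1 and deletions cost 3, so the direction matters.
grow = weighted_levenshtein("abc", "abcde", (1, 3, 1))    # two insertions
shrink = weighted_levenshtein("abcde", "abc", (1, 3, 1))  # two deletions
print(grow, shrink)
```

Swapping the two arguments turns every insertion into a deletion, which only leaves the result unchanged when both operations cost the same.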
  
Performance
  - replaced the Wagner-Fischer algorithm in the normal Levenshtein distance with a bit-parallel implementation
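
As a rough idea of what a bit-parallel Levenshtein distance looks like, the sketch below follows the well-known Myers/Hyyrö formulation for uniform weights. It is an assumption that this resembles what the release uses; the function name is hypothetical, and Python integers stand in for the fixed-width machine words a native implementation would use.

```python
def levenshtein_bitparallel(s1, s2):
    """Uniform-weight Levenshtein distance, bit-parallel sketch."""
    m = len(s1)
    if m == 0:
        return len(s2)
    # peq[ch] has bit i set when s1[i] == ch
    peq = {}
    for i, ch in enumerate(s1):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    mask = (1 << m) - 1   # keep every vector m bits wide
    last = 1 << (m - 1)   # bit of the last row of the matrix
    vp, vn = mask, 0      # vertical deltas of +1 and -1
    dist = m
    for ch in s2:
        pm = peq.get(ch, 0)
        d0 = ((((pm & vp) + vp) ^ vp) | pm | vn) & mask
        hp = (vn | ~(d0 | vp)) & mask  # horizontal delta +1
        hn = d0 & vp                   # horizontal delta -1
        if hp & last:
            dist += 1
        if hn & last:
            dist -= 1
        hp = ((hp << 1) | 1) & mask
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0
    return dist


print(levenshtein_bitparallel("kitten", "sitting"))
```

Each character of `s2` is handled with a constant number of word operations instead of one inner loop over `s1`, which is where the speedup over the row-by-row Wagner-Fischer table comes from.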

1.0.2

Fixed
  - The bit-parallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
  The old algorithm is used again until this bug is fixed.

1.0.1

Changed
  - string_metric.normalized_levenshtein now supports the weights (1, 1, N) with N >= 1
  
Performance Improvements
  - The Levenshtein distance with the weights (1, 1, >2) now uses the same implementation as the weights (1, 1, 2), since a substitution weight above `Insertion + Deletion` has no effect on the result
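
The reasoning behind this can be sketched as follows: once a substitution costs at least as much as one insertion plus one deletion, it can always be replaced by that pair without increasing the total cost. With insertion and deletion weights of 1, the distance then depends only on the longest common subsequence. The helper names below are hypothetical and the code is an illustration, not the library's implementation.

```python
def lcs_length(s1, s2):
    """Length of the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(s2) + 1)
    for c1 in s1:
        cur = [0]
        for j, c2 in enumerate(s2, 1):
            if c1 == c2:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def indel_distance(s1, s2):
    """Distance for weights (1, 1, N) with N >= 2.

    Every character outside the common subsequence is either deleted
    from s1 or inserted from s2, each at a cost of 1.
    """
    return len(s1) + len(s2) - 2 * lcs_length(s1, s2)


print(indel_distance("kitten", "sitting"))
```

For "kitten" and "sitting" the common subsequence is "ittn", so the distance is 6 + 7 - 2 * 4 = 5, whatever value above 2 is chosen for the substitution weight.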
  
Fixed
  - fix uninitialized variable in bit-parallel Levenshtein distance with the weights (1, 1, 1)

1.0.0

Changed
  - all normalized string_metrics can now be used as scorer for process.extract/extractOne
  - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
  - increased test coverage, which already helped to fix some bugs and will help to prevent regressions in the future
  - improved docstrings of functions
  
Performance Improvements
  - Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
  - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, which is even faster than the bit-parallel implementation.
  - Improved performance of `fuzz.partial_ratio`. Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
  - Improved performance of `process.extract` and `process.extractOne`
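
A small maximum edit distance helps because the computation can stop as soon as the limit is certain to be exceeded. The sketch below shows only that early-exit idea on top of a plain dynamic-programming table; it is not the specialized algorithm this release refers to, and both the function name and the `None` return convention are assumptions of the example.

```python
def levenshtein_with_max(s1, s2, max_dist):
    """Uniform-weight distance, or None when it exceeds max_dist (sketch)."""
    # The length difference is a lower bound on the distance.
    if abs(len(s1) - len(s2)) > max_dist:
        return None
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        cur = [i]
        for j, c2 in enumerate(s2, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (c1 != c2),  # match or substitution
            ))
        # Row minima never decrease, so the result can no longer fit.
        if min(cur) > max_dist:
            return None
        prev = cur
    return prev[-1] if prev[-1] <= max_dist else None


print(levenshtein_with_max("kitten", "sitting", 3))
print(levenshtein_with_max("kitten", "sitting", 2))
```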
  
Deprecated
  - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0. Its functions now live in `rapidfuzz.string_metric`: `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.
  
Added
  - added normalized version of the hamming distance in `string_metric.normalized_hamming`
  - added `process.extract_iter`, a generator that yields the similarity of all elements with a similarity >= score_cutoff
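
The two additions can be pictured together with a pure-Python sketch. Everything here is illustrative: the `_sketch` names are hypothetical, the 0 to 100 scaling and the equal-length requirement of the Hamming scorer are assumptions, and so is the `(choice, score, index)` shape of the yielded items.

```python
def normalized_hamming_sketch(s1, s2):
    """Hamming-based similarity between 0 and 100 (assumed scaling)."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    if not s1:
        return 100.0
    mismatches = sum(c1 != c2 for c1, c2 in zip(s1, s2))
    return 100.0 - 100.0 * mismatches / len(s1)


def extract_iter_sketch(query, choices, scorer, score_cutoff=0):
    """Lazily yield every choice whose score reaches score_cutoff."""
    for index, choice in enumerate(choices):
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score, index


codes = ["karolin", "kathrin", "kerstin", "karolin"]
matches = list(
    extract_iter_sketch("karolin", codes, normalized_hamming_sketch, score_cutoff=60)
)
print(matches)
```

Because the function is a generator, a caller can stop consuming results at any point without scoring the remaining choices.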
  
Fixed
  - multiple bugs in extractOne when used with a scorer that is not from RapidFuzz
  - fixed bug in `token_ratio`
  - fixed bug in result normalization causing zero division

0.14.2

Fixed
  - UTF-8 usage in the copyright header caused problems with Python 2.7 on some platforms (see 70)

0.14.1

Fixed
  - when a custom processor like `lambda s: s` was used with any of the methods inside fuzz.*, it always returned a score of 100. This release fixes this and adds better test coverage to prevent this bug in the future.

0.14.0

Added
  - added hamming distance metric in the levenshtein module
  
Changed
  - improved performance of default_process by using a lookup table
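
The lookup-table idea can be sketched in pure Python with `str.translate`. This assumes the processor lowercases the text, replaces non-alphanumeric characters with spaces and trims the ends; the function name is hypothetical and the table only covers ASCII, so it shows the technique and not the library's code.

```python
# Precomputed once: ASCII alphanumerics map to their lowercase form,
# every other ASCII character maps to a space.
_ASCII_TABLE = {
    code: (chr(code).lower() if chr(code).isalnum() else " ")
    for code in range(128)
}


def default_process_sketch(text):
    """Normalize text with a single table lookup per character."""
    # Characters missing from the table (non-ASCII) pass through unchanged.
    return text.translate(_ASCII_TABLE).strip()


print(repr(default_process_sketch("  Hello, World! ")))
```

The table replaces a chain of per-character checks with one dictionary lookup, and it is built only once at import time.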

0.13.4

Fixed
  - Add missing virtual destructor that caused a segmentation fault on macOS

0.13.3

Added
  - C++11 Support
  - manylinux

0.13.2

Fixed
  - Levenshtein was not imported from __init__
  - The reference count of a Python object inside process.extractOne was decremented too early

0.13.1

Improved
  - process.extractOne exits early when a score of 100 is found, so the remaining strings no longer have to be preprocessed.
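
The early exit can be pictured with the following pure-Python sketch. The function name, the counter and the default processor are hypothetical, and `difflib.SequenceMatcher` from the standard library is used as a stand-in scorer; the real function is implemented natively and differs in detail.

```python
from difflib import SequenceMatcher


def ratio_sketch(a, b):
    """Stand-in scorer from the standard library, scaled to 0..100."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def extract_one_sketch(query, choices, scorer=ratio_sketch, processor=str.lower):
    """Return the best (choice, score) plus the number of processed choices."""
    query = processor(query)
    best = None
    processed = 0
    for choice in choices:
        processed += 1
        score = scorer(query, processor(choice))
        if best is None or score > best[1]:
            best = (choice, score)
        if score == 100:
            break  # a perfect match cannot be beaten, skip the rest
    return best, processed


best, processed = extract_one_sketch("New York", ["new york", "Newark", "York"])
print(best, processed)
```

Here the first choice already scores 100, so the two remaining choices are never preprocessed or scored.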

0.13.0

Fixed
  - objects passed to scorers had to be strings even before preprocessing. This was changed, so they only have to be strings after preprocessing, similar to process.extract/process.extractOne
  
Improved
  - process.extractOne is now implemented in C++, making it a lot faster
  - When token_sort_ratio or partial_token_sort_ratio is used in process.extractOne, the words in the query are only sorted once to improve the runtime
  
Changed
  - process.extractOne/process.extract now return the index of the match when the choices are a list.
  - process.extractIndices was removed, since the indices are now already returned by process.extractOne/process.extract

0.12.5

Fixed
  - fix documentation of process.extractOne (see 48)

0.12.4

Changed
  - Added wheels for
    - CPython 2.7 on Windows 64-bit
    - CPython 2.7 on Windows 32-bit
    - PyPy 2.7 on Windows 32-bit

0.12.3

Fixed
  - fix bug in partial_ratio (see 43)

0.12.2

Fixed
  - fix inconsistency with fuzzywuzzy in partial_ratio when using strings of equal length

0.12.1

Fixed
  - MSVC has a bug and therefore crashed on some of the templates used. This release simplifies the templates so compiling on MSVC works again

0.12.0

Improved
  - partial_ratio now uses the Levenshtein distance, which is a lot faster. Since many of the other algorithms use partial_ratio, this helps to improve the overall performance

0.11.3

Fixed
  - fix partial_token_set_ratio returning 100 all the time

0.11.2

Changed
  - add rapidfuzz.__author__, rapidfuzz.__license__ and rapidfuzz.__version__

0.11.1

Fixed
  - do not use auto junk when searching the optimal alignment for partial_ratio

0.11.0

Changed
  - added support for Python 2.7 (see 40)
  - added wheels for Python 2.7 (both PyPy and CPython) on macOS and Linux

0.10.0

Changed
  - wheels are now built for Python 3.9 as well
  
Fixed
  - tuple scores in process.extractOne are now supported (see 39)