Errant

Latest version: v3.0.0


Page 2 of 3

2.2.2

1. Added a copy of the NLTK Lancaster stemmer to `errant.en.lancaster` and removed the NLTK dependency. It was overkill to require the entire NLTK package just for this stemmer so we now bundle it with ERRANT.

2. Replaced the deprecated `tokens_from_list` function from spaCy v1 with the `Doc` constructor from spaCy v2 in `Annotator.parse`.

2.2.1

Fixed a key error in the classifier for rare spaCy 2 POS tags: `_SP`, `BES`, `HVS`.

2.2.0

1. ERRANT now works with spaCy v2.2. It is 4x slower, but this change was necessary to make it work on Python 3.7.

2. SpaCy 2 uses slightly different POS tags to spaCy 1 (e.g. auxiliary verbs are now tagged AUX rather than VERB) so I updated some of the merging rules to maintain performance.

2.1.0

1. The character-level cost in the sentence alignment function is now computed by the much faster [python-Levenshtein](https://pypi.org/project/python-Levenshtein/) library instead of Python's native `difflib.SequenceMatcher`. This makes ERRANT 3x faster!

2. Various minor updates:
* Updated the English wordlist.
* Fixed a broken rule for classifying contraction errors.
* Changed a condition in the calculation of transposition errors to be more intuitive.
* Partially updated the ERRANT POS tag map to match the updated [Universal POS tag map](https://universaldependencies.org/tagset-conversion/en-penn-uposf.html). Specifically, EX now maps to PRON rather than ADV, LS maps to X rather than PUNCT, and CONJ has been renamed CCONJ. I did not change the mapping of RP from PART to ADP yet because this breaks several rules involving phrasal verbs.
* Added an `errant.__version__` attribute.
* Added a warning about using ERRANT with spaCy 2.
* Tidied some code in the classifier.
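
The POS tag map changes listed above can be pictured as a small dictionary update. This is only an illustrative sketch: the names `OLD_MAP`, `UPDATES` and `NEW_MAP` are hypothetical, only the handful of tags mentioned in these notes are shown, and `CC` is assumed to be the Penn Treebank tag that carried the old `CONJ` label.

```python
# Hypothetical excerpt of a Penn Treebank -> Universal POS tag map.
# Only the tags mentioned in the release notes are included.
OLD_MAP = {
    "EX": "ADV",    # existential "there"
    "LS": "PUNCT",  # list item marker
    "CC": "CONJ",   # coordinating conjunction (assumed source tag)
    "RP": "PART",   # particle, e.g. "up" in "give up"
}

# Changes described for this release. RP is deliberately left as PART
# because remapping it to ADP breaks rules involving phrasal verbs.
UPDATES = {
    "EX": "PRON",
    "LS": "X",
    "CC": "CCONJ",
}

# Later keys override earlier ones, so untouched tags keep their old value.
NEW_MAP = {**OLD_MAP, **UPDATES}

for tag in sorted(NEW_MAP):
    print(f"{tag}: {OLD_MAP[tag]} -> {NEW_MAP[tag]}")
```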

2.0.0

1. ERRANT has been significantly refactored to accommodate a new API (see README). It should now also be much easier to extend to other languages.

2. Added a `setup.py` script to make ERRANT `pip` installable.

3. The Damerau-Levenshtein alignment code has been rewritten in a much cleaner Python implementation. This also makes ERRANT ~20% faster.
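
For readers unfamiliar with Damerau-Levenshtein, here is a minimal pure-Python sketch of the distance computation. It is not ERRANT's implementation: it assumes unit costs and the restricted "optimal string alignment" variant (adjacent transpositions only), whereas ERRANT aligns tokens using its own linguistically informed costs. The function name is hypothetical.

```python
def damerau_levenshtein(a, b):
    """Unit-cost edit distance with insertion, deletion, substitution and
    adjacent transposition (optimal string alignment variant).

    Works on any two sequences, e.g. strings or lists of tokens.
    """
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    # Transforming a prefix to or from the empty sequence costs its length.
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,        # deletion
                dist[i][j - 1] + 1,        # insertion
                dist[i - 1][j - 1] + sub,  # match or substitution
            )
            # Adjacent transposition: "ab" -> "ba" counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


print(damerau_levenshtein("ab", "ba"))
print(damerau_levenshtein(["the", "cat", "sat"], ["the", "sat", "cat"]))
```

Because the function only indexes and compares elements, the same code handles character-level and token-level input.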

Note: All these changes do **not** affect system output compared with the previous version. For the first `pip` release, we wanted to make sure v2.0.0 was fully compatible with the [BEA-2019 shared task](https://www.cl.cam.ac.uk/research/nl/bea2019st/) on Grammatical Error Correction.

Thanks to [sai-prasanna](https://github.com/sai-prasanna) for inspiring some of these changes!

1.4

1. The `compare_m2.py` evaluation script was refactored to make it easier to use.

2. We tweaked the alignment code and merging rules to not only make ERRANT ~700% faster, but also slightly more accurate.

Specifically, we simplified the lemma cost so that it no longer repeatedly calls the lemmatiser for different parts-of-speech, and we now compute the character cost with Python's native `difflib.SequenceMatcher` instead of a character-based Damerau-Levenshtein alignment.
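
A character cost based on `difflib.SequenceMatcher` might look roughly like the sketch below. The helper name `char_cost` and the choice of `1 - ratio()` as the cost are assumptions made for illustration, not a copy of the alignment code.

```python
from difflib import SequenceMatcher


def char_cost(orig, cor):
    """Return a cost in [0, 1]: 0.0 for identical strings, 1.0 when the
    two strings share no matching characters."""
    # ratio() is 2 * matches / (len(orig) + len(cor)).
    return 1.0 - SequenceMatcher(None, orig, cor).ratio()


for orig, cor in [("cat", "cat"), ("color", "colour"), ("color", "flavour")]:
    print(orig, cor, round(char_cost(orig, cor), 3))
```

A cost like this lets the aligner prefer pairing tokens that look alike (such as spelling variants) over pairing unrelated tokens.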

This significantly increased the speed, but also slightly decreased performance (~0.5 F1 worse), so we additionally revisited the merging rules. The new implementation now processes the largest combinations of adjacent non-matches first, instead of processing one alignment at a time, and now also features some new or slightly modified rules (see `scripts/align_text.py` for more information).
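
The "largest combinations first" idea can be sketched as follows. This is a simplified illustration under stated assumptions: the operation codes (`M` for match, `S`, `I`, `D` for substitution, insertion, deletion) and both helper names are hypothetical, and the real merging rules in `scripts/align_text.py` decide what to do with each candidate span.

```python
from itertools import combinations, groupby


def non_match_runs(ops):
    """Yield (start, end) spans of consecutive non-match operations."""
    idx = 0
    for is_match, group in groupby(ops, key=lambda op: op == "M"):
        length = len(list(group))
        if not is_match:
            yield idx, idx + length
        idx += length


def spans_largest_first(start, end):
    """List every sub-span of [start, end) covering at least two
    operations, longest first (ties broken by start position)."""
    spans = [
        (i, j)
        for i, j in combinations(range(start, end + 1), 2)
        if j - i >= 2
    ]
    return sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))


ops = ["M", "S", "I", "D", "M", "S"]
for start, end in non_match_runs(ops):
    # A merging rule would be tried on each candidate span in this order.
    print((start, end), spans_largest_first(start, end))
```

Trying the longest span first means a rule that merges a whole run of edits gets a chance to fire before any shorter sub-span is considered.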

The differences between the old and new version are summarised in the following table.

| Dataset | Sents | Setting | P | R | F1 | Time (secs) |
|--------------|------:|-----------:|---------------:|---------------:|-------------------:|----------------:|
| FCE Dev | 2371 | Old | 82.77 | 85.22 | 83.98 | 260 |
| | | New | 84.00 | 85.52 | **84.75** | **40** |
| FCE Test | 2805 | Old | 83.88 | 85.84 | 84.85 | 300 |
| | | New | 85.17 | 85.93 | **85.55** | **45** |
| FCE Train | 30200 | Old | 82.69 | 85.12 | 83.89 | 2965 |
| | | New | 84.06 | 85.38 | **84.72** | **340** |
| CoNLL-2013 | 1381 | Old | 82.64 | 82.45 | 82.54 | 315 |
| | | New | 83.27 | 82.24 | **82.75** | **45** |
| CoNLL-2014.0 | 1312 | Old | 78.48 | 80.38 | 79.42 | 350 |
| | | New | 79.02 | 80.18 | **79.59** | **45** |
| CoNLL-2014.1 | 1312 | Old | 82.50 | 82.73 | 82.61 | 385 |
| | | New | 84.04 | 82.85 | **83.44** | **50** |
| NUCLE | 57151 | Old | 70.14 | 80.27 | 71.95 | 7565 |
| | | New | 73.20 | 81.16 | **76.97** | **725** |
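
As a quick worked example of how to read the table, the sketch below recomputes F1 as the harmonic mean of P and R for the two FCE Dev rows and derives the speed-up from the timing column. The helper name `f_score` is hypothetical, and since the inputs are the rounded values printed above, the results are only expected to agree to two decimal places.

```python
def f_score(p, r, beta=1.0):
    """F-beta score from precision and recall (beta=1 gives F1)."""
    if p + r == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# FCE Dev rows from the table: (P, R, time in seconds).
old = (82.77, 85.22, 260)
new = (84.00, 85.52, 40)

print("Old F1:", round(f_score(old[0], old[1]), 2))
print("New F1:", round(f_score(new[0], new[1]), 2))
print("Speed-up:", old[2] / new[2])
```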
