SoMaJo

Latest version: v2.4.2

1.10.2

- The error that 1.10.1 tried to fix was not really caused by the
version numbers of regex but by specifying our own version number in
__init__.py where we also indirectly load required modules.
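A common way to avoid this kind of problem is to read the version string from the file as text instead of importing the package. The following is only a minimal sketch, not SoMaJo's actual fix: it assumes the build script needs the version before the dependencies are installed, and the helper name read_version is hypothetical.

```python
import re
from pathlib import Path


def read_version(init_path):
    """Return the __version__ string from a file without importing it.

    Hypothetical helper: because the file is only read as text, the
    imports it contains (e.g. of regex) are never executed.
    """
    text = Path(init_path).read_text(encoding="utf-8")
    # Look for a line of the form: __version__ = "1.2.3"
    match = re.search(
        r"^__version__\s*=\s*[\"']([^\"']+)[\"']", text, re.MULTILINE
    )
    if match is None:
        raise RuntimeError(f"No __version__ found in {init_path}")
    return match.group(1)
```

A setup script could then call read_version on the package's __init__.py without triggering any of its imports.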

1.10.1

- Use semantic versioning to specify the minimum required version of
regex. This fixes a bug where the dependency was not correctly
installed.

1.10.0

- Treat emoji sequences that render as a single grapheme as a single
token. This includes flags and sequences containing modifiers and
zero-width joiners.
- Recognize underscores used for "underlining" and split them off.
- Added a few Unicode formatting characters to the “nasty” characters.
- Replaced POSIX character classes with built-ins or Unicode
properties.
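
The first item above (emoji sequences as single tokens) can be illustrated with a simplified sketch. This is not SoMaJo's implementation: the code point ranges are rough approximations chosen for this example, and the function name find_emoji_sequences is hypothetical.

```python
import re

ZWJ = "\u200d"    # zero-width joiner
VS16 = "\ufe0f"   # emoji variation selector
# Rough approximations of the relevant code point ranges (assumption).
BASE = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
MODIFIER = "[\U0001F3FB-\U0001F3FF]"   # skin tone modifiers
FLAG = "[\U0001F1E6-\U0001F1FF]{2}"    # pair of regional indicators

# One emoji, optionally followed by a modifier or variation selector.
ELEMENT = f"{BASE}(?:{MODIFIER}|{VS16})?"
# A flag, or one or more elements glued together by zero-width joiners.
EMOJI_SEQUENCE = re.compile(f"{FLAG}|{ELEMENT}(?:{ZWJ}{ELEMENT})*")


def find_emoji_sequences(text):
    """Return every emoji sequence in text as one string each."""
    return EMOJI_SEQUENCE.findall(text)
```

With this pattern, a family emoji built from three people and two joiners, a thumbs-up with a skin tone modifier, and a two-letter flag each come back as a single item rather than as separate code points.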

1.9.0

- New method Tokenizer.tokenize_file for easy tokenization of files
from Python.
- Added text and emoji variation selectors.
- Added new English abbreviation (Appl'n.).

1.8.3

- Fixed a bug that caused abbreviations with internal dots but without
final dot to be split up erroneously (e.g. E.ON).

1.8.2

- Fixed a bug with degree measurements in English (°F, etc.).
- Fixed a bug that caused SoMaJo to hang when an XML tag occurred
within a token that is allowed to contain whitespace.
