PyThaiNLP

Latest version: v5.0.3

2.0.6

- Fix #230
- Newly trained ThaiNER model

2.0.5

- Clean word lists in `pythainlp.corpus` (remove duplicates, etc.)
- Fix/add return type hinting for functions in `pythainlp.corpus`
- Fix deprecated inline flag for regular expression in `pythainlp.corpus.tnc` (Thai National Corpus)
- Bug fix: reorder condition checks in `pythainlp.tokenize.dict_trie` so it catches `Trie` before `Iterable`
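
The reordering in the last item matters because a trie object is usually iterable itself, so an `Iterable` check placed first would also match it. The following is a minimal sketch of that pitfall, not the library's actual code: `SimpleTrie`, `to_trie_wrong_order`, and `to_trie` are hypothetical names used only for illustration.

```python
from collections.abc import Iterable


class SimpleTrie:
    """Hypothetical stand-in for a trie type; note that it is itself iterable."""

    def __init__(self, words):
        self._words = frozenset(words)

    def __iter__(self):
        return iter(self._words)

    def __contains__(self, word):
        return word in self._words


def to_trie_wrong_order(source):
    # Iterable is tested first, so an existing SimpleTrie is rebuilt needlessly.
    if isinstance(source, Iterable):
        return SimpleTrie(source)
    if isinstance(source, SimpleTrie):  # never reached for SimpleTrie inputs
        return source
    raise TypeError("expected a SimpleTrie or an iterable of words")


def to_trie(source):
    # The more specific type is tested first, so an existing trie is reused.
    if isinstance(source, SimpleTrie):
        return source
    if isinstance(source, Iterable):
        return SimpleTrie(source)
    raise TypeError("expected a SimpleTrie or an iterable of words")


existing = SimpleTrie(["แมว", "หมา"])
reused = to_trie(existing)               # same object is returned
rebuilt = to_trie_wrong_order(existing)  # a new object is constructed
```

The general rule the sketch illustrates is to test the most specific type before any broader abstract type that it also satisfies.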

2.0.4

- `word_tokenize()`'s argument `whitespaces` is renamed to `keep_whitespace` to make it more explicit; the default behavior is to keep whitespace
- `word_tokenize()` can now take a custom dictionary through the `custom_dict` parameter
- `dict_word_tokenize()` will be deprecated soon
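
A short usage sketch of the two `word_tokenize()` parameters named above may help. It assumes a PyThaiNLP 2.0.x installation in which `dict_trie` is importable from `pythainlp.tokenize` (the location mentioned in the 2.0.5 notes); other releases may expose it elsewhere. The wrapper name `tokenize_examples` is hypothetical.

```python
def tokenize_examples(text, extra_words):
    """Tokenize `text` three ways to compare the parameters (illustrative only)."""
    # Imported lazily so the sketch can be read without pythainlp installed.
    # Assumption: pythainlp 2.0.x, where dict_trie lives in pythainlp.tokenize.
    from pythainlp.tokenize import dict_trie, word_tokenize

    # Build a trie from the caller's own word list for use as a custom dictionary.
    custom = dict_trie(extra_words)

    return {
        # Default behavior: whitespace tokens are kept.
        "default": word_tokenize(text),
        # Drop whitespace tokens from the result.
        "no_whitespace": word_tokenize(text, keep_whitespace=False),
        # Segment using the custom dictionary.
        "custom_dict": word_tokenize(text, custom_dict=custom),
    }


# Example call (requires pythainlp):
# tokenize_examples("ฉันรักภาษาไทย มาก", ["ฉัน", "รัก", "ภาษาไทย", "มาก"])
```

No expected output is shown because the exact segmentation depends on the engine and the dictionary in use.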

2.0.3

- Fix issue where the TTC (Thai Textbook Corpus) corpus always downloaded a new file
- Words and their frequencies from TTC (Thai Textbook Corpus) now have a local copy at `ttc_freq.txt` inside `pythainlp.corpus`
- Other refactoring and code improvements, including ones related to subword tokenization (Thai Character Cluster / TCC and ETCC); see #193

2.0.2

- Fixed tree map
- Improve subword tokeniser documentation (https://github.com/PyThaiNLP/pythainlp/pull/190)

2.0.1

- Add `Tokenizer` class, available as `pythainlp.tokenize.Tokenizer` (79432c2)
- NER fixes, code cleaning, and type hinting (#186)
