Documentation: [https://pythainlp.github.io/docs/2.3/index.html](https://pythainlp.github.io/docs/2.3/index.html
)
Report bug: [https://github.com/PyThaiNLP/pythainlp/issues](https://github.com/PyThaiNLP/pythainlp/issues)
You can install or upgrade using *pip install -U pythainlp*
See [PyThaiNLP 2.3 change log](https://github.com/PyThaiNLP/pythainlp/issues/445) #445
Deprecation and other API changes
- NER change a ThaiNER model (from ThaiNER 1.4 to ThaiNER 1.5). If you need use ThaiNER 1.4 model, You can use version in ThaiNameTagger class. `pythainlp.tag.named_entity.ThaiNameTagger(version: str = '1.4')` (Docs: https://pythainlp.github.io/dev-docs/api/tag.html#pythainlp.tag.named_entity.ThaiNameTagger)
Tokenizer
- 484 Add: model option for `attacut.tokenize()`
- 502 Add: `corpus.util.revise_wordset()` to revise tokenization dictionary
- 503 Add: `NERCut` tokenization engine
Corpus
- **License change:**
- All corpora, datasets, and documentation created by PyThaiNLP project are now released under [Creative Commons Zero 1.0 Universal Public Domain Dedication License](https://creativecommons.org/publicdomain/zero/1.0/) (CC0).
- All language models created by PyThaiNLP project are released under [Creative Commons Attribution 4.0 International Public License](https://creativecommons.org/licenses/by/4.0/) (CC-by).
- 449 Fix: remove instances with `[` or `]` from etcc.txt
- 467 Add: `corpus.common.provinces()` can now return romanized names
- 476 Add: `thai_family_names()` to get a set of Thai family names
- 487 Fix: `thailand_provinces_th.csv` not found issue
- 492 Fix: remove erroneous `AITT` tag from ORCHID to UD table -- thanks c4n for the fix
POS Tagger
- 464 Add: `LST20` language model for part-of-speech tagging
- 468 Add: port `PerceptronTagger` from NTLK. POS tagging no longer needs NLTK for dependency.
- 478 Update: ORCHID POS tags documentation
Name Entity Tagging
- 526 Update ThaiNER 1.4 to ThaiNER 1.5
- 538 Add ThaiNameTagger version and add ThaiNER 1.4 support
Transliterate
- 485 Fixed Romanize failed in some examples
- 511 Add Thai W2P (Thai Word-to-Phoneme converter)
Text Summarize
- 523 Add mT5 text summarize to `pythainlp.summarize`
Chunk parser
- 524 Add `pythainlp.tag.chunk`
Util
- 481 Fix: `remove_repeat_vowels()` bug that remove spaces between different vowels
- 483 Add: add `remove()` method to remove a word from a trie -- thanks korakot
- 490 Fix: `thai_strftime()` - normalize output for unsupported directive (running in glibc and musl should produce the same output)
- 512 Add: `emoji_to_thai()` to convert emoji to Thai description -- thanks ppirch for the development
- 513 Add: `thai_keyboard_dist()` to calculate euclidean distance between two characters according to their location on a Thai keyboard layout -- thanks ppirch for the development
Thanks all the [contributors](https://github.com/PyThaiNLP/pythainlp/graphs/contributors). (Image made with [contributors-img](https://contributors-img.firebaseapp.com))
<a href="https://github.com/PyThaiNLP/pythainlp/graphs/contributors">
<img src="https://contributors-img.firebaseapp.com/image?repo=PyThaiNLP/pythainlp" />
</a>
We build Thai NLP.
PyThaiNLP