English
Hello World. Today, we're happy to announce the availability of PyThaiNLP. It has been four years since PyThaiNLP's first release. Thank you very much for supporting PyThaiNLP.
Summary – Release Highlights
New Features
Tokenizer
- Fix **longest** engine: the last character of the input is now consumed
- Add **CRFCut** sentence segmentation
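To illustrate what "the last character is now consumed" means for a longest-matching tokenizer, here is a minimal, self-contained sketch of greedy longest matching. It is not PyThaiNLP's implementation: the function name `longest_match_tokenize` and the tiny vocabulary are assumptions made for this example. The property to notice is that joining the tokens reproduces the input, including a trailing character that is not in the vocabulary.

```python
def longest_match_tokenize(text, vocabulary):
    """Greedy longest-match segmentation (illustrative sketch only).

    `vocabulary` is a hypothetical set of known words; unknown
    characters are emitted as single-character tokens.
    """
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        # A single character is always accepted, so the loop always advances
        # and the final character of the text is never dropped.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary or size == 1:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical vocabulary for the demonstration.
vocab = {"ทดสอบ", "ระบบ", "ตัด", "คำ"}
tokens = longest_match_tokenize("ทดสอบระบบตัดคำx", vocab)
print(tokens)
```

A real dictionary-based engine would use a full word list and additional rules for Thai character clusters; the sketch only shows the matching loop.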
Transliteration
- Add Thai Grapheme-to-Phoneme (Thai G2P) deep learning sequence-to-sequence model
Normalization
- Add more normalization functions, such as removing zero-width characters and removing duplicate spaces
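As a rough picture of what these two normalization steps do, the sketch below uses only the Python standard library. The function names `remove_zero_width` and `remove_duplicate_spaces`, and the exact set of characters treated as zero-width, are assumptions for this example and may differ from what PyThaiNLP ships.

```python
import re

# Assumed set of zero-width characters: ZWSP, ZWNJ, ZWJ, and BOM.
ZERO_WIDTH_CHARS = "\u200b\u200c\u200d\ufeff"
_ZERO_WIDTH_TABLE = {ord(ch): None for ch in ZERO_WIDTH_CHARS}


def remove_zero_width(text):
    """Delete invisible zero-width characters from the text."""
    return text.translate(_ZERO_WIDTH_TABLE)


def remove_duplicate_spaces(text):
    """Collapse runs of spaces into one and trim both ends."""
    return re.sub(r" {2,}", " ", text).strip()


raw = "ภาษา\u200bไทย   ง่าย  "
clean = remove_duplicate_spaces(remove_zero_width(raw))
print(repr(clean))
```

Zero-width characters are removed first so that spaces separated only by an invisible character are also collapsed.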
Utilities
- Add thaiword_to_date() and thaiword_to_time()
- Fix countthai() to handle a case where the text has only numbers and symbols
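The sketch below shows why text made up only of numbers and symbols can be an edge case for a Thai-character percentage. It is an assumption-based illustration, not the library's code: `thai_percentage` is a hypothetical name, and the rule of ignoring whitespace, digits, and ASCII punctuation is assumed. Under that rule, such input leaves nothing to count, so the function returns 0.0 instead of dividing by zero.

```python
import string

# Characters assumed to be ignored when computing the percentage.
IGNORED_CHARS = set(string.whitespace + string.digits + string.punctuation)


def thai_percentage(text):
    """Return the percentage of counted characters in the Thai Unicode block."""
    counted = [ch for ch in text if ch not in IGNORED_CHARS]
    if not counted:
        # Only numbers, symbols, or whitespace: nothing to measure.
        return 0.0
    thai = sum(1 for ch in counted if "\u0e00" <= ch <= "\u0e7f")
    return 100.0 * thai / len(counted)


print(thai_percentage("ไทย"))
print(thai_percentage("123 !?"))
```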
Command line
- Update command and sub-command syntax - see [command line docs](https://github.com/PyThaiNLP/pythainlp/blob/dev/docs/notes/command_line.rst)
Others
- Code improvement: Move non-init code out of __init__.py files, etc.
- Remove dependency: the unigram POS tagger no longer needs the NLTK module
Installation
You can install or upgrade using *pip install -U pythainlp*
Change log: https://github.com/PyThaiNLP/pythainlp/issues/330