- 2021-12-10, Euclase
- Important changes
- Upgrade: spaCy v3.2 and Sudachi.rs (SudachiPy v0.6.2)
- Change token information fields #208 #209
- `doc.user_data["reading_forms"][token.i]` -> `token.morph.get("Reading")`
- `doc.user_data["inflections"][token.i]` -> `token.morph.get("Inflection")`
- `force_using_normalized_form_as_lemma(True)` -> `token.norm_`
- All spaCy models, including non-Japanese ones, are now available through the `ginza` command #217
- Download a model and run the analysis in one step by specifying the model name in the following form #219
- `ginza -m en_core_web_md`
- The `ginza -f json` option always analyzes lines that start with `` regardless of the value of the `-c` option. #215
- Improvements
- Batch analysis is 50-60% faster in GPU environments and 10-40% faster in CPU environments
- Improved processing efficiency of the parallel execution options (`ginza -p {n_process}` and `ginzame`) of the `ginza` command #204
- Add tests #198 #210 #214
- Add benchmark #207 #220
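
The token field changes listed above can be sketched roughly as follows. This is a minimal illustration, not code from the GiNZA repository: the helper names `token_rows` and `load_and_describe` are hypothetical, and the sketch assumes the `ja_ginza` model package is installed and that the pipeline fills `token.morph` and `token.norm_` as described in this release.

```python
def token_rows(doc):
    """Collect per-token fields using the accessors introduced in this release.

    `doc` is expected to be a spaCy Doc (any iterable of token-like objects works).
    """
    rows = []
    for token in doc:
        rows.append({
            "text": token.text,
            # Replaces doc.user_data["reading_forms"][token.i]
            "reading": token.morph.get("Reading"),
            # Replaces doc.user_data["inflections"][token.i]
            "inflection": token.morph.get("Inflection"),
            # Replaces force_using_normalized_form_as_lemma(True)
            "norm": token.norm_,
        })
    return rows


def load_and_describe(text, model="ja_ginza"):
    """Hypothetical convenience wrapper: load a model and describe `text`.

    Assumes spaCy and the named model package are installed.
    """
    import spacy  # imported lazily so token_rows stays usable without spaCy

    nlp = spacy.load(model)
    return token_rows(nlp(text))
```

With spaCy and `ja_ginza` installed, `load_and_describe("銀座でランチをご一緒しましょう。")` should return one dictionary per token. Note that `token.morph.get(...)` returns a list of strings, which is empty when the feature is absent.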