Annif

Latest version: v1.1.0

0.51.0

This release includes a new [STWFSA backend](https://github.com/NatLibFi/Annif/wiki/Backend%3A-STWFSA) which is a wrapper around [STWFSAPY](https://github.com/zbw/stwfsapy), a lexical algorithm based on finite state automata. It achieves best results with short texts, i.e., titles and author keywords, and is best suited for English language data.

The NN ensemble backend has been improved with better handling of source weights. Retraining NN ensemble models after updating Annif to this version is recommended, since the quality of results can decrease if old models are used. Several CLI commands have a new `--docs-limit`/`-d` option that limits the number of documents to process, for example to create learning-curve data. Several bugs have also been fixed.
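As a rough illustration of how a document limit can yield learning-curve data, the sketch below caps a toy corpus at increasing sizes and scores each slice. It is not Annif's implementation: `limit_docs`, `learning_curve` and the `score` callback are hypothetical names, and the scoring function is a stand-in for a real train-and-evaluate step. In Annif itself the limit is given on the command line with `--docs-limit`/`-d`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def limit_docs(docs: Iterable[str], docs_limit: int) -> Iterator[str]:
    """Yield at most docs_limit documents (hypothetical helper)."""
    return islice(docs, docs_limit)


def learning_curve(
    docs: List[str],
    limits: List[int],
    score: Callable[[List[str]], float],
) -> List[Tuple[int, float]]:
    """Score the first n documents for each n in limits."""
    return [(n, score(list(limit_docs(docs, n)))) for n in limits]


# Toy corpus; the "score" here is just the average document length,
# standing in for a real quality metric measured after training.
corpus = ["a", "bb", "ccc", "dddd"]
curve = learning_curve(corpus, [1, 2, 4], lambda d: sum(map(len, d)) / len(d))
print(curve)
```

Each tuple pairs a document count with the score obtained from that many documents, which is the shape of data a learning-curve plot needs.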

New features:
- 438 Lexical STWFSAPY Backend (credit mo-fu)
- 465 Limit document number CLI option

Improvements:
- 457/458 Improved handling of source weights in NN ensemble

Bug fixes:
- 454/455 Address SonarCloud complaints
- 459/460 Pass limit parameter to Maui Server during train
- 463 Fix TruncatingCorpus iterator

0.50.0

This release introduces a setting to use only a part of the input text for subject indexing: the new `input_limit` project parameter truncates the input text to the given number of characters. This can improve the quality of the suggestions, as the beginning of a long document typically includes an abstract and introduction. The default value for `input_limit` is zero, which means that truncation is not performed.
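The truncation behaviour described above can be sketched in a few lines. This is a minimal illustration, not Annif's own code: the function name `apply_input_limit` is hypothetical, and treating negative values like zero is an assumption the release notes do not cover.

```python
def apply_input_limit(text: str, input_limit: int = 0) -> str:
    """Keep only the first input_limit characters of text.

    A limit of 0 (the default) disables truncation. Negative values are
    treated the same way here, which is an assumption of this sketch.
    """
    if input_limit <= 0:
        return text
    return text[:input_limit]


doc = "Abstract: subject indexing. " + "Body text. " * 3
print(apply_input_limit(doc, 27))   # only the leading abstract sentence
print(apply_input_limit(doc) == doc)  # default limit leaves the text intact
```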

Improvements include better handling of cached data in nn_ensemble training and optimization of memory usage in evaluation by using sparse matrices for suggested subjects. Many dependencies have been updated and a few minor issues fixed.

New features:
- 446 Add a backend parameter to limit input characters in suggest
- 452 Apply the input_limit backend parameter to texts in train & learn

Improvements:
- 441 Sparse subjects (credit mo-fu)
- 443/444 Allow use of cached data after cancelled training of nn_ensemble backend

Maintenance:
- 448 Upgrade dependencies
- 445 Upgrade LMDB dependency from 0.98 to 1.0.0
- 449 Resolve DeprecationWarning: change warn to warning

Bug fixes:
- 447 Fix missing default params in pav and nn ensemble

0.49.0

This release introduces the hyperopt CLI command for hyperparameter optimization. Initially it can only be used for finding optimal ensemble weights. The Web UI now follows the same visual style as the annif.org website. There are also some improvements to CLI commands, memory optimizations and bug fixes.
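To make the idea of optimizing ensemble weights concrete, here is a minimal sketch. It assumes an ensemble that combines per-subject scores by weighted average, and it swaps in a plain exhaustive grid search with top-1 accuracy as the objective. The release notes do not describe the optimizer or objective that the `hyperopt` command uses, so every function name and the toy data below are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Scores = Dict[str, float]                 # subject -> score from one source
Doc = Tuple[Sequence[Scores], str]        # (scores per source, gold subject)


def combine(sources: Sequence[Scores], weights: Sequence[float]) -> Scores:
    """Weighted average of per-subject scores across sources."""
    total = sum(weights)
    subjects = set().union(*sources)
    return {
        s: sum(w * src.get(s, 0.0) for src, w in zip(sources, weights)) / total
        for s in subjects
    }


def top1_accuracy(docs: List[Doc], weights: Sequence[float]) -> float:
    """Fraction of documents whose top-ranked subject is the gold one."""
    hits = 0
    for sources, gold in docs:
        merged = combine(sources, weights)
        best = max(sorted(merged), key=merged.get)  # sorted() fixes tie order
        hits += best == gold
    return hits / len(docs)


def search_weights(docs: List[Doc], n_sources: int, grid=(1, 2, 3)):
    """Exhaustive grid search (a stand-in for a real optimizer)."""
    return max(product(grid, repeat=n_sources),
               key=lambda w: top1_accuracy(docs, w))


# Toy validation set with two sources; the first is the more reliable one.
docs: List[Doc] = [
    ([{"cats": 0.9, "dogs": 0.2}, {"cats": 0.1, "dogs": 0.6}], "cats"),
    ([{"cats": 0.8, "dogs": 0.3}, {"cats": 0.2, "dogs": 0.9}], "cats"),
]
best = search_weights(docs, n_sources=2)
print(best, top1_accuracy(docs, best))
```

With equal weights the second document is ranked wrongly; the search settles on weights that favour the first source.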

New features:
* 240/321/414 Hyperparameter optimization of ensemble weights

Improvements:
* 424/426 New style for Web UI
* 430 Define short form for CLI options and fix some of their docstrings
* 428 Memory optimization: Avoid double allocation of NumPy arrays in eval operation

Maintenance:
* 437 Upgrade TensorFlow to version 2.3.0 (from 2.2.0)

Bug fixes:
* 431 Problem parsing timestamps from Maui Server
* 432 Make modification timestamps timezone-aware

0.48.0

This release brings a major upgrade of the fastText library, switching from the old fasttextmirror package to the new official fasttext Python bindings. The generation of fastText training files has been rewritten. The release also introduces an experimental feature to speed up model evaluation using multiprocessing; a `--jobs N` option can be used with [the `eval` command](https://github.com/NatLibFi/Annif/wiki/Commands#evaluate-on-a-collection-of-manually-indexed-files) to perform evaluation in N parallel jobs. Another new feature is the addition of project state details to project information listings (whether a project has been trained, and the timestamp of training). Minor improvements and bug fixes are also included.

New features:
- 65/417/418/425 Evaluate documents in parallel
- 329/415 Show project train state and modification time

Improvements:
- 290/292/409/412 Upgrade fastText to official version 0.9.2 (credit: mvsjober)
- 413 Upgrade to omikuji 0.3.x

Maintenance:
- 411 Run Travis CI fastText tests on Python 3.7 instead of 3.6
- 421 Pin SciPy to 1.4.1 as required by TensorFlow 2.2.0

Bug fixes:
- 422 Assign first retrieved project to selected variable (credit: mo-fu)
- 419 WEB-UI: Remove empty entry from list of projects (credit: mo-fu)
- 357/410 fastText training file incorrectly generated

0.47.1

This patch release installs [TensorFlow 2.2 without GPU support](https://pypi.org/project/tensorflow-cpu/). GPU support has been included by default since [TF 2.1](https://github.com/tensorflow/tensorflow/releases/tag/v2.1.0), but Annif does not currently benefit from it and it takes up a lot of disk space. This patch reduces the size of Annif's Docker image from 2.4 GB to 1.4 GB.

0.47.0

This release changes the Python version requirement to 3.6+ and drops the usage of Pipenv in development installations. The TensorFlow library has been upgraded to version 2.2, which means that all features are now also supported under Python 3.8.

The `eval` command gains a weighted subject average metric and the option to output metrics separately for each subject (making it possible to explore, for example, how often a subject was suggested correctly). The output also gives a more specific interpretation for some metrics.
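The following sketch shows one common way to compute per-subject scores and a weighted average over subjects. It assumes that "weighted subject average" means averaging per-subject F1 scores weighted by support (the number of documents where the subject appears in the gold standard); the release notes do not define the metric, and the function names are hypothetical rather than Annif's own.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def per_subject_f1(
    gold: List[Set[str]], predicted: List[Set[str]]
) -> Dict[str, Tuple[float, int]]:
    """Return {subject: (F1, support)}; support counts gold occurrences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        for s in g & p:
            tp[s] += 1          # suggested and correct
        for s in p - g:
            fp[s] += 1          # suggested but not in gold
        for s in g - p:
            fn[s] += 1          # in gold but not suggested
    result = {}
    for s in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[s] + fp[s] + fn[s]
        f1 = 2 * tp[s] / denom if denom else 0.0
        result[s] = (f1, tp[s] + fn[s])
    return result


def weighted_average_f1(per_subject: Dict[str, Tuple[float, int]]) -> float:
    """Average the per-subject F1 scores, weighted by support."""
    total = sum(support for _, support in per_subject.values())
    if total == 0:
        return 0.0
    return sum(f1 * support for f1, support in per_subject.values()) / total


# Toy data: three documents, a system that always suggests "cats".
gold = [{"cats"}, {"cats", "dogs"}, {"dogs"}]
predicted = [{"cats"}, {"cats"}, {"cats"}]
stats = per_subject_f1(gold, predicted)
for subject in sorted(stats):
    print(subject, stats[subject])
print(weighted_average_f1(stats))
```

The per-subject breakdown makes it visible that "dogs" is never suggested, a detail that a single aggregate number would hide.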

Other changes include the option to display notation codes (when available) in the web UI, as well as minor improvements, bug fixes, and maintenance tasks.

New features:
- 392 Evaluate samples: specify interpretation of metrics (credit: Veldhoen)
- 391/393 Evaluation per-subject (credit: Veldhoen)
- 390/397 Show notations in web UI results list

Improvements:
- 405/403 Upgrade to TensorFlow 2.2, Python 3.6+, drop Pipenv
- 395/396 Don't give suggestions for empty input
- 389/401/402 Improved error handling in maui backend
- 399 Miscellaneous minor improvements for readthedocs builds

Maintenance:
- 400 Dockerfiles reorg and cleanup
- 407 Adding secrets needed by new Drone instance

Bug fixes:
- 394 Fix click 7.1 compatibility in tests
- 398 Fix silently failing readthedocs builds
