Python-sumo

Latest version: v0.3.0

0.3.0

------------------
* Updated class structure, to allow for addition of new solvers.
* Implemented a *supervised solver* for sumo, which makes it possible to include "a priori" knowledge about the labels of a fraction of samples to improve the factorization results. This solver is enabled automatically when the '-labels' parameter is used.
* Fixed error that prevented using sumo *interpret* with newer hyperopt versions (>0.2.5).

0.2.7

------------------
* Add random seed parameter for sumo *run*.
* Add more arrays in sumo_results.npz files:
- 'steps' array, with number of iterations/steps reached in each repetition of the factorization;
- 'config' array with simulation parameters (including sparsity).
* Add warning in eta*.log files if more than 90% of factorization repetitions finished by reaching the set maximum number of iterations.
* Update plotting function and add 'steps' plot, produced when using the -DEBUG flag.
* Remove incorrect assertion about Euclidean Distance being bound to [0,1] range.
* Add entry point to run sumo directly from the repository (run.py).
* Updated the function checking if the feature matrix is standardized in sumo *prepare*. It now reports the range of feature means and standard deviations.
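
The iteration-cap warning described above can be illustrated with a minimal sketch. It is not taken from the sumo code base: the function names, the `max_iter` argument and the example values are hypothetical, and only the 90% threshold comes from the entry above.

```python
def fraction_at_max_iter(steps, max_iter):
    """Return the fraction of repetitions that stopped at the iteration cap."""
    if len(steps) == 0:
        raise ValueError("steps must contain at least one repetition")
    capped = sum(1 for s in steps if s >= max_iter)
    return capped / len(steps)


def should_warn(steps, max_iter, threshold=0.9):
    """True when more than `threshold` of the repetitions hit the cap."""
    return fraction_at_max_iter(steps, max_iter) > threshold


# Hypothetical step counts for 10 repetitions with a cap of 500 iterations.
example_steps = [500, 500, 500, 500, 500, 500, 500, 500, 500, 500]
if should_warn(example_steps, max_iter=500):
    print("warning: most repetitions reached the maximum number of iterations")
```

In practice the step counts would presumably be read from the 'steps' array of a results file; the list above only stands in for it.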

0.2.6

------------------
* Updated REs scaling for consensus matrix creation.
* Add sample identifiers to sumo *run* result files.
* Updated documentation to include a detailed description of the arrays in each .npz result file and an example of integrating somatic mutation data into the SUMO workflow.
* Improved execution speed of sumo *prepare* by updating the filtering of loaded datasets and incorporating numba for euclidean distance calculation.
* Improved execution speed of sumo *run* by updating the resampling during the factorization.

0.2.5

------------------
* Added Dockerfile.
* Improved clustering quality assessment by better utilization of consensus clustering in sumo *run*:
- introduced clustering of different random subsets of samples in each run of factorization (the fraction of samples removed in each run can be set with the '-subsample' parameter);
- increased default number of runs and introduced creation of multiple consensus matrices based on subsets of runs;
- results .npz file now contains multiple PAC and CCC values (which are calculated for each consensus matrix);
- updated plotting of PAC and CCC curves to show error bars.
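
As a rough illustration of the subsampling idea above, the sketch below builds one consensus matrix from runs that each see only a subset of samples, then computes PAC (proportion of ambiguous clustering). It is a generic sketch, not sumo's implementation: all function names are hypothetical, the runs are simulated from known labels, and the 0.1 and 0.9 PAC bounds are assumed values commonly used for this measure.

```python
import random
from itertools import combinations


def subsampled_runs(true_labels, n_runs, subsample, seed=0):
    """Simulate runs that each drop a fraction `subsample` of the samples."""
    rng = random.Random(seed)
    n = len(true_labels)
    keep = n - int(round(subsample * n))
    runs = []
    for _ in range(n_runs):
        kept = rng.sample(range(n), keep)
        runs.append({i: true_labels[i] for i in kept})
    return runs


def consensus_matrix(runs, n_samples):
    """Build a consensus matrix from runs given as {sample_index: cluster_label}.

    Entry (i, j) is the fraction of runs containing both samples in which they
    shared a cluster; pairs that were never drawn together are left at 0.0.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    same = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            together[i][j] += 1
            if labels[i] == labels[j]:
                same[i][j] += 1
    cons = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        cons[i][i] = 1.0
        for j in range(i + 1, n_samples):
            if together[i][j]:
                cons[i][j] = cons[j][i] = same[i][j] / together[i][j]
    return cons


def pac(cons, lower=0.1, upper=0.9):
    """Fraction of sample pairs whose consensus value lies strictly between the bounds."""
    n = len(cons)
    pairs = [cons[i][j] for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(1 for v in pairs if lower < v < upper) / len(pairs)
```

Computing several such matrices from different subsets of the runs would yield multiple PAC values, which is what makes error bars on the curves possible.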
* Updated scikit-learn version requirement.

0.2.4

------------------
* Sumo *interpret* now creates two output files:
- .tsv file containing matrix (features x clusters), where the value in each cell is the importance of the feature in that cluster;
- .hits.tsv file containing features of most importance (number of top hits can be set with '-hits' parameter).
* Fixed training dataset in *interpret* to contain 80% of every unique class label.
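
A per-class 80% split of the kind mentioned above could look like the following sketch. The function name, the rounding rule and the "at least one training sample per class" guard are assumptions of this example, not details confirmed by the changelog.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices so every class contributes about `train_fraction` to training."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Assumption of this sketch: keep at least one sample per class in training.
        n_train = max(1, round(train_fraction * len(members)))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return sorted(train), sorted(test)


# Hypothetical labels: 10 samples of class "a" and 5 of class "b".
labels = ["a"] * 10 + ["b"] * 5
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class, rather than over all samples at once, keeps rare classes from disappearing from the training set by chance.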

0.2.3

------------------
* Handle NaN values of cophenetic correlation coefficient.
* Update vignette.
* Fix issue resulting in not closing log files in *run*.
* If output directory of *run* already exists, remove it instead of overwriting.
* Change error information for data not meeting standardization thresholds in *prepare*.
* Add column-wise normalization of H matrix in *run* before cluster extraction using max_value method.
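
To show why column-wise normalization matters before a maximum-value assignment, here is a small sketch. It assumes that H has one row per sample and one column per cluster and that each column is scaled to sum to 1; both choices, along with the function names and numbers, are assumptions of this example rather than documented sumo behaviour.

```python
def normalize_columns(h):
    """Scale each column of H (rows = samples, columns = clusters) to sum to 1."""
    n_cols = len(h[0])
    col_sums = [sum(row[k] for row in h) for k in range(n_cols)]
    return [
        [row[k] / col_sums[k] if col_sums[k] else 0.0 for k in range(n_cols)]
        for row in h
    ]


def extract_clusters(h):
    """Assign each sample to the cluster with the largest normalized value."""
    normalized = normalize_columns(h)
    return [max(range(len(row)), key=row.__getitem__) for row in normalized]


# Hypothetical H matrix in which the first column has a much larger scale.
h = [
    [9.0, 1.0],
    [8.0, 0.5],
    [6.0, 1.5],
]
labels = extract_clusters(h)
```

Without the normalization step every sample in this example would land in the first cluster, simply because that column holds larger raw values.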
