Sourmash

Latest version: v4.8.8

4.5.0

sourmash v4.5.0 provides several minor bug fixes, as well as a number of new features.

This release also includes two minor breaking changes to the Python API: `SourmashSignature` objects loaded from files are now "frozen" by default, and `MinHash` object construction now requires explicit keyword arguments.
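
As a rough illustration of the keyword-argument change, the sketch below uses plain Python with a stand-in class rather than sourmash itself. `SketchParams` and its parameter list are hypothetical; which `MinHash` parameters are actually keyword-only is an assumption here and should be checked against the sourmash API documentation.

```python
class SketchParams:
    """Hypothetical stand-in for a constructor with keyword-only options."""

    def __init__(self, n, ksize, *, scaled=0, track_abundance=False):
        # Everything after the bare `*` must be passed by name.
        self.n = n
        self.ksize = ksize
        self.scaled = scaled
        self.track_abundance = track_abundance


def accepts_positional_options():
    """Return True if options can still be passed positionally."""
    try:
        SketchParams(0, 31, 1000)  # third argument is positional: rejected
    except TypeError:
        return False
    return True


params = SketchParams(0, 31, scaled=1000)  # explicit keyword: accepted
print(params.scaled, accepts_positional_options())
```

Code that passed such options positionally before an upgrade like this would fail with a `TypeError` until the call sites name each option.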

Finally, this release updates the sourmash documentation with several new tutorials, including one on using `sourmash tax` to classify metagenomes with MAGs + GTDB.

Bug fixes

* Fix `sourmash tax` argument parsing for multiple `-g` and `-t` arguments (2218)
* Prevent loading multiple independent gather results files in `sourmash tax` (2244)
* Fix `query_abundance` column when `--ignore-abundance` is set in gather (2251)
* fix pickle protocol to properly adjust `ksize` in `__getstate__` (2265)
* clean up zip error handling for bad zip files (2270)

Minor new features

* Use the bias factor for containment when estimating ANI (2057)
* add human output format to `sourmash tax`; provide tutorials (2158)
* add kreport output format to tax metagenome (2239, 2249)
* add `--distance-matrix` option to `sourmash compare` (2225)
* update database load UX for `gather` etc. (2204)
* add generic support for gzipped and zipfile CSVs (2195)
* implement `tax grep` to produce identifier picklists from taxonomies (2178)
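
To illustrate the idea behind the `--distance-matrix` option listed above, the sketch below converts a similarity matrix into a distance matrix by taking `1 - similarity` for each entry. This conversion rule is an assumption about what "distance" means here, and `to_distance_matrix` is a hypothetical helper, not part of sourmash.

```python
def to_distance_matrix(similarity):
    """Convert a square similarity matrix (values in [0, 1]) to distances."""
    # Assumed definition: distance = 1 - similarity, entry by entry.
    return [[1.0 - value for value in row] for row in similarity]


# Toy pairwise similarities for two signatures (illustrative values only).
similarity = [
    [1.0, 0.25],
    [0.25, 1.0],
]
distance = to_distance_matrix(similarity)
print(distance)
```

Under this definition identical signatures have distance 0, and the matrix stays symmetric whenever the input is symmetric.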

Cleanup and documentation fixes

* add `sourmash tax` tutorial (2158)
* revise command-line docs for `sourmash sig` subcommands (1714, 1717)
* Clarify containment direction for matrix output (2215)
* Add ANGUS tutorial to docs (1114)
* update links to static rmd (1177)
* update `search` documentation, help, and output. (2222)
* Fix signature filter command (2159)
* fix notification message about query scaled (2183)
* adjust gather output width on terminal (2176)

Developer updates

* Add `FrozenSourmashSignature` (1610)
* force explicit kwargs on MinHash constructor (2174)
* fix ReadTheDocs by using a more recent conda version (2231)
* refactor and add tests for containment direction for ANI calculation (2215)
* fix `test_storage_convert` to allow success of `sourmash convert` (2232)
* Updating `tests/test_sourmash.py::test_storage_convert` to use `runtmp` fixture instead of `utils.TempDirectory()` (1739)
* Bump pypa/cibuildwheel from 2.8.1 to 2.9.0 (2207)
* use stderr for test output printing (2217)
* fix for sphinx 5.10 (2147)

4.4.3

Minor new features:
* use and report ANI from `tax genome` summarization (2005)

Performance improvements:
* avoid instantiating a hashes class (2132)

Cleanup and documentation fixes:
* update various descriptions to talk about k-mers, not just DNA (2137)

Developer updates:
* fix docs building for pip 22.2 (2143)
* change dependabot rebase-strategy to disabled for rust dependencies (2142)
* Rust deps and nix flakes updates (2141)
* add `pytest-xdist` and `-n4` to pytest and tox configs (2138)
* update release instructions after v4.4.2 (2131)

4.4.2

Minor fixes and performance improvements:

* circumvent a very slow `MinHash.remove_many(...)` call in `sourmash gather` (2123)

Developer updates:

* substantial refactoring of `CounterGather` and related `Index` code. (2116)
* update `Index` protocol tests to include tests for `peek` and `consume` (2111)
* Bump pypa/cibuildwheel from 2.7.0 to 2.8.0 (2118)
* test insert after downsample for LCA_Database (2117)
* update release notes & pyproject.toml after v4.4.1 (2114)

4.4.1

Major new features:

* less stringent size accuracy parameters for ANI accuracy reporting (2074)
* only skip distance estimation if containment/Jaccard are 0 or 1 (2060)
* emit fewer warnings about potential ANI estimation issues (2061)

Minor new features:

* fix `lca summarize` to support general collections for queries (2107)
* add compare --avg-containment (2056)

Documentation updates:

* fix search and gather docs (2105)
* fix `CITATION.cff` YAML and add a test for parseability and content. (2103)

Developer updates:

* move setup.cfg into pyproject.toml (2097)
* Fix downsample_scaled in `core` (2108)
* add picklist tests; support for allow_empty (2106)
* remove LazyLoadedIndex (2104)
* Bump web-sys from 0.3.57 to 0.3.58 (2092)
* Bump getrandom from 0.2.6 to 0.2.7 (2090)
* Bump wasm-bindgen-test from 0.3.30 to 0.3.31 (2093)
* Bump pypa/cibuildwheel from 2.6.1 to 2.7.0 (2089)
* Build: nix updates (2088)
* CI: split wheel building (2087)
* rust version bumps (2086)
* Update sphinx requirement from <5,>=4.4.0 to >=4.4.0,<6 (2068)
* Bump actions/setup-python from 3 to 4 (2080)
* Bump myst-parser from 0.17.2 to 0.18.0 (2081)
* Bump pypa/cibuildwheel from 2.5.0 to 2.6.1 (2079)
* remove unnecessary `object` from `class` definitions (2077)

4.4.0

This release contains many new features! Of particular note:
* sourmash now estimates and outputs average nucleotide identity (ANI) based on k-mer measures;
* `sourmash sketch translate` is no longer unusably slow;
* we provide macOS arm64 wheels for the new M1 Macs;
* we've added a number of support features for managing large collections of signatures and building very large databases;
* and we've added support for SQLite databases that can be used for storing and searching signatures and doing Kraken-style LCA analysis of genomes and metagenomes.
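
The k-mer based ANI estimate mentioned in the list above can be illustrated with a commonly used point estimate. If each base matches independently with probability ANI, a k-mer of length k is shared with probability ANI^k, so a containment C gives ANI ≈ C^(1/k). The sketch below is a minimal example under that assumption, not sourmash's implementation; `ani_from_containment` is a hypothetical name, and sourmash's own calculation may apply further corrections.

```python
def ani_from_containment(containment, ksize):
    """Point estimate of ANI from k-mer containment: ANI ~= C ** (1 / k)."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return containment ** (1.0 / ksize)


# Example: half of the query's k-mers (k=31) are found in the match.
estimate = ani_from_containment(0.5, 31)
print(f"estimated ANI: {estimate:.4f}")
```

Because the exponent is 1/k, even a modest containment corresponds to a high estimated identity at large k, which is why k-mer containment is a sensitive signal for closely related genomes.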

In addition, we have built updated Genbank genome databases (with contents from March 2022) as well as GTDB R07-RS207 databases; see [the prepared databases page](https://sourmash.readthedocs.io/en/latest/databases.html). We've also made some benchmarks available for these databases, so you can get some idea of the necessary computational resources for your searches.

Last but by no means least, we have begun providing a number of examples and recipes for using sourmash - see the new [sourmash examples](https://sourmash-bio.github.io/sourmash-examples/) Web site!

---

Major new features:

* add ANI output to search, prefetch, and gather (1934, 1952, 1955, 1966, 1967, 2011, 2031, 2032)
* new GTDB and Genbank database releases (2013, 2038)
* provide macos arm64 wheels (1935)
* support for SQLite databases (1808)
* implement `sourmash sketch fromfile` (1884, 1885, 1886, 2009)
* add `sourmash sig check` for comparing picklists and databases (1907, 1915, 1917)
* add `sig collect` command for building standalone manifests from many databases (2036)
* Add direct loading of manifest CSVs as sourmash indices (1891)
* add `-A/--abundance-from` to `sig subtract` & add `sig inflate` (1889)
* advanced database format documentation (2025)

Minor new features:

* add `-d/--debug` to `sourmash sig describe`; upgrade output errors. (1782)
* add `sum_hashes` to `sourmash sig describe` output. (1882)

Bug fixes:

* catch `TypeError` in search with abundance vs. flat signatures at the command line (1928)
* speed up `SeqToHashes` `translate` (1938, 1946)

Cleanup and documentation fixes:

* better handle some pickfile errors (1924)
* remove unnecessary downsampling warnings (1971)
* use same wording for dayhoff/hp as for dna/protein (1929)
* rename `covered_bp` property to better reflect function (2050)

Developer updates:

* provide "protocol" tests for `Index`, `CollectionManifest`, and `LCA_Database` classes (1936)
* remove khmer CI tests (1950)
* Benchmarks for seq_to_hashes in protein mode (1944)
* add some tests for Jaccard output ordering (1926)
* Oxidize ZipStorage (1909)
* cleanup and commenting of `test_index.py` tests. (1898, 1900)
* rationalize `_signatures_with_internal` (1896)
* Convert nix to flakes (1904)
* fix docs build (1897)
* Fix build/CI and unused imports papercuts (1974)
* fix hypothesis CI (2028)
* dependabot version updates (1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1991, 1993, 1994, 1995, 1996, 1997, 1998, 2017, 2019, 2020, 2021, 2022, 2023, 2042)

4.3.0

New features:

* add `sourmash sig grep` (1864)
* add `sourmash sig summarize` (1837, 1863)
* add `--include-db-pattern` and `--exclude-db-pattern` to many commands (1871)
* update lca summarize output to output total counts (1838)

Bug fixes:

* fix `sourmash prefetch` to work when db scaled is larger than query scaled (1870)
* fix `sourmash prefetch` for multiple ksizes in database (1866)
* allow missing columns in tax CSV files (1869)
* fix containment calculation for nodegraphs (1862)
* fix `tax prepare` SQL code for empty/blank taxonomic ranks (1843)

Cleanup and documentation fixes:
* clean up 'describe' a little bit, add a test (1861)
* add --output-dir as alias for every --outdir (1817)
* fix doc titles in `command-line.md` and update description a bit (1874)

Developer updates:

* move greyhound-core into sourmash (1238)
* drop Python 3.7, default most of CI to Python 3.10 (1839)
* reorganize traits for easier wasm and native compilation (1836)
* update asv to newly released version (1834)
* pin setuptools < 60 (1879)
