Nextstrain-augur

Latest version: v24.4.0

Page 2 of 19

24.1.0

Features

* `augur.io.read_metadata`: A new optional `dtype` argument allows custom data types for all columns. Automatic type inference still happens by default, so this is not a breaking change. [1252][] (victorlin)
* `augur.io.read_vcf` has been removed. Usage has been replaced with TreeTime's function of the same name, which has improved validation of the VCF file. [1366][] (jameshadfield)
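
To illustrate why an explicit `dtype` can matter, here is a minimal standard-library sketch contrasting automatic type inference with a forced string type. It is not Augur's implementation: `infer` and `read_column` are hypothetical helpers, and the sample metadata is made up for the example.

```python
import csv
import io


def infer(value):
    """Naive type inference: try int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def read_column(tsv_text, column, dtype=None):
    """Return one column's values; infer types unless `dtype` is given.

    Hypothetical helper for illustration only (not part of Augur).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    convert = dtype if dtype is not None else infer
    return [convert(row[column]) for row in reader]


# Made-up metadata: `clade_id` looks numeric but is really a label.
metadata = "strain\tclade_id\nA\t01\nB\t10\n"

inferred = read_column(metadata, "clade_id")            # leading zero is lost
as_text = read_column(metadata, "clade_id", dtype=str)  # labels kept verbatim
```

With inference, the label `01` becomes the number `1`. Forcing a string type keeps the values as written, which is the kind of control the new `dtype` argument is described as providing.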

Bug Fixes

* filter, frequencies, refine: Speed up reading of the metadata file. [1252][] (victorlin)
* traits: Previously, columns with only numeric values were treated as numerical data. These are now treated as categorical data for discrete trait analysis. [1252][] (victorlin)
* Support Biopython `>=1.82` by requiring bcbio-gff `>=0.7.1`. [1400][] (victorlin)

[1252]: https://github.com/nextstrain/augur/pull/1252
[1366]: https://github.com/nextstrain/augur/pull/1366
[1400]: https://github.com/nextstrain/augur/pull/1400

24.0.0

Major Changes

* ancestral, translate: For VCF inputs please ensure you are using TreeTime 0.11.2 or later. A large number of bugfixes and improvements have been added in both Augur and TreeTime. [1355][] and [TreeTime 263][] (jameshadfield)
* ancestral, translate: GenBank files now require the (GFF mandatory) source feature to be present. [1351][] (jameshadfield)
* ancestral, translate: For GFF files, we extract the genome/sequence coordinates by inspecting the sequence-region pragma, region type and/or source type. This information is now required. [1351][] (jameshadfield)
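
The coordinate lookup described for GFF files can be sketched roughly as follows. This is an assumption-laden illustration, not Augur's code: `genome_coordinates` is a hypothetical function that checks the `##sequence-region` pragma first and then falls back to a row whose type is `region` or `source`.

```python
def genome_coordinates(gff_text):
    """Return (start, end) 1-based coordinates from GFF3 text, or None.

    Hypothetical helper: prefers the ##sequence-region pragma and falls
    back to the first row whose type column is 'region' or 'source'.
    """
    fallback = None
    for line in gff_text.splitlines():
        if line.startswith("##sequence-region"):
            # Pragma layout: ##sequence-region <seqid> <start> <end>
            parts = line.split()
            if len(parts) == 4:
                return int(parts[2]), int(parts[3])
        elif line and not line.startswith("#"):
            cols = line.split("\t")
            is_genome_row = len(cols) >= 5 and cols[2] in ("region", "source")
            if is_genome_row and fallback is None:
                fallback = (int(cols[3]), int(cols[4]))
    return fallback


# Made-up GFF snippets for the three cases.
with_pragma = (
    "##gff-version 3\n"
    "##sequence-region ref 1 2000\n"
    "ref\t.\tgene\t10\t100\t.\t+\t.\tID=g1\n"
)
with_region = "##gff-version 3\nref\t.\tregion\t1\t1500\t.\t+\t.\tID=ref\n"
without_either = "##gff-version 3\nref\t.\tgene\t10\t100\t.\t+\t.\tID=g1\n"

coords_from_pragma = genome_coordinates(with_pragma)
coords_from_region = genome_coordinates(with_region)
coords_missing = genome_coordinates(without_either)
```

Since this information is now required, a caller would treat a `None` result as an error rather than guessing the genome length.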

Features

* ancestral, translate: Improvements to VCF inputs / outputs. [1355][] and [TreeTime 263][] (jameshadfield)
    * Output VCF will better match the input VCF, including CHROM name and ploidy encoding.
    * VCF inputs now require `--vcf-reference-output`.
    * AA sequences are now exported for the tree root.
    * VCF writing is now 3 orders of magnitude faster (dataset dependent).
* ancestral, translate: A range of improvements to how we parse GFF and GenBank reference files. [1351][] (jameshadfield)
    * translate will now always export a 'nuc' annotation in the output JSON, allowing it to pass validation.
    * Gene/CDS names of 'nuc' are now forbidden.
    * If a Gene/CDS in the GFF/GenBank file is unparsed we now print a warning.
* ancestral: For VCF alignments, a VCF output file is now only created when requested via `--output-vcf`. [1344][] (jameshadfield)
* ancestral: Improvements to command line arguments. [1344][] (jameshadfield)
    * Incompatible arguments are now checked, especially related to VCF vs FASTA inputs.
    * `--vcf-reference` and `--root-sequence` are now mutually exclusive.
* translate: Tree nodes are checked against the node-data JSON input to ensure sequences are present. [1348][] (jameshadfield)
* utils::load_features: This function may now raise `AugurError`. [1351][] (jameshadfield)
* export v2: Automatically minify large outputs. Use `--no-minify-json` to disable this default behavior. [1352][] (victorlin)
* Added a new file [DEPRECATED.md](./DEPRECATED.md) to document timelines and progress of deprecated features in the Augur CLI and Python API. [1371][] (victorlin)
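
As a rough illustration of how mutually exclusive options such as `--vcf-reference` and `--root-sequence` can be enforced, the sketch below uses Python's `argparse`. The flag names come from the notes above; the parser itself (`build_parser`, the `ancestral-sketch` program name, the file names) is hypothetical and is not Augur's actual command-line definition.

```python
import argparse


def build_parser():
    """Build a toy parser; only the flag names are taken from the changelog."""
    parser = argparse.ArgumentParser(prog="ancestral-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--vcf-reference")
    group.add_argument("--root-sequence")
    # Optional output: nothing is written unless the user asks for it.
    parser.add_argument("--output-vcf")
    return parser


parser = build_parser()

# A compatible combination parses normally.
args = parser.parse_args(["--vcf-reference", "ref.fasta", "--output-vcf", "out.vcf"])

# Supplying both exclusive options makes argparse print an error and exit.
try:
    parser.parse_args(["--vcf-reference", "ref.fasta", "--root-sequence", "root.gb"])
    conflict_rejected = False
except SystemExit as exit_info:
    conflict_rejected = exit_info.code == 2
```

Declaring the conflict in the parser means the user gets an error before any analysis starts, rather than a silently ignored argument.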

Bug Fixes

* ancestral, translate: Various fixes to VCF inputs / outputs. [1355][] and [TreeTime 263][] (jameshadfield)
    * Fix incorrect (but passing) tests.
    * Fix case-sensitive sequence comparisons between the root and reference sequences.
    * Fix a bug where ambiguous alleles are not inferred (see [1380][] for full details).
    * Fix a bug where positions with no sequence information were assigned a base because the mask was not being computed (see [1382][] for full details).
    * More than one ALT allele is now correctly parsed.
    * Mutations followed by an insertion are now parsed.
    * Unchanged ref genotypes are now encoded as '0' rather than '.'.
    * ALT alleles "*" are now valid (introduced in VCF spec 4.2, but observed in VCF 4.1 files).
    * Positions with no variation are no longer exported.
* ancestral, translate: Fixes for JSON (non-VCF) inputs. [1355][] (jameshadfield)
    * The "reference" translations are now from the provided reference sequence, not from the root of the tree. [1355][] (jameshadfield)
    * Fix a bug where positions with no sequence information were assigned a base because the mask was not applied (see [1382][] for full details).
* ancestral, translate: Avoid incompatibilities with Biopython >=1.82. [1374][], [1387][] (victorlin)
* ancestral, translate: Address Biopython deprecation warnings. [1379][] (victorlin)
* ancestral: `--genes` now accepts a file. Previously, the help text claimed this, but the option did not support it. [1353][] (victorlin)
* translate: The 'source' ID for GFF files is now ignored as a potential gene feature (it is still used for overall nuc coords). [1348][] (jameshadfield)
* translate: Improvements to command line arguments. [1348][] (jameshadfield)
    * `--tree` and `--ancestral-sequences` are now required arguments.
    * VCF-only arguments are now separated into their own group.
* translate: Fixes a bug in the parsing behaviour of GFF files whereby the presence of the `--genes` command line argument would change how we read individual GFF lines. Issue [1349][], PR [1351][] (jameshadfield)
* If `TreeTimeError` is encountered, Augur now exits with code 2 rather than 0. (This restores the original behaviour.) [1367][] (jameshadfield)
* Deprecate `read_strains` from `augur.utils` and add it to the public API under `augur.io`. [1353][] (victorlin)
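
Several of the VCF fixes above concern how genotype values map to alleles. The sketch below shows that mapping for the haploid case, assuming the usual VCF convention that `0` is the REF allele and `1..n` index the comma-separated ALT alleles. `decode_haploid_genotype` is a hypothetical helper, not a function from Augur or TreeTime.

```python
def decode_haploid_genotype(ref, alt_field, genotype):
    """Map a haploid GT value to its allele string.

    '0' is the REF allele, '1'..'n' index the comma-separated ALT alleles,
    and '.' is a missing call (returned as None).
    """
    if genotype == ".":
        return None
    index = int(genotype)
    if index == 0:
        return ref
    alts = alt_field.split(",")
    if index < 0 or index > len(alts):
        raise ValueError(f"genotype {genotype} has no matching ALT allele")
    return alts[index - 1]


unchanged = decode_haploid_genotype("A", "G,T", "0")   # REF, encoded as '0'
second_alt = decode_haploid_genotype("A", "G,T", "2")  # more than one ALT
star_alt = decode_haploid_genotype("A", "G,*", "2")    # '*' kept as an allele
missing = decode_haploid_genotype("A", "G", ".")       # no call at this site
```

Here `'0'` (unchanged reference) and `'.'` (no call) are distinct states, which is why encoding unchanged genotypes as `'0'` rather than `'.'` matters.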


[1344]: https://github.com/nextstrain/augur/pull/1344
[1348]: https://github.com/nextstrain/augur/pull/1348
[1351]: https://github.com/nextstrain/augur/pull/1351
[1349]: https://github.com/nextstrain/augur/issues/1349
[1367]: https://github.com/nextstrain/augur/pull/1367
[1371]: https://github.com/nextstrain/augur/pull/1371
[1374]: https://github.com/nextstrain/augur/pull/1374
[1379]: https://github.com/nextstrain/augur/pull/1379
[1352]: https://github.com/nextstrain/augur/pull/1352
[1353]: https://github.com/nextstrain/augur/pull/1353
[1355]: https://github.com/nextstrain/augur/pull/1355
[1380]: https://github.com/nextstrain/augur/issues/1380
[1382]: https://github.com/nextstrain/augur/issues/1382
[1387]: https://github.com/nextstrain/augur/pull/1387
[TreeTime 263]: https://github.com/neherlab/treetime/pull/263

23.1.1

Bug Fixes

* Fix Python 3.11 installation for Conda environments. [1334][] (victorlin)
* Bump `pyfastx` dependency to major versions 1 and 2. [1335][] (victorlin)

[1334]: https://github.com/nextstrain/augur/issues/1334
[1335]: https://github.com/nextstrain/augur/pull/1335

23.1.0

Features

* Support TreeTime `0.11.*`. [1310][] (corneliusroemer)
* export: Allow minimal export using only a (newick) tree in `augur export v2`. [1299][] (jameshadfield)
* A number of schema updates and improvements. [1299][] (jameshadfield)
    * We now require all nodes to have `node_attrs` on them with one of `div` or `num_date` present.
    * Some never-used properties are removed from the schemas, including a pattern for defining nucleotide INDELs which was never used by augur or auspice.
    * Tip label defaults are now settable within the auspice-config JSON.
    * Empty colorings definitions are allowed (the tree will be grey in Auspice).
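
The new `node_attrs` requirement can be checked with a short recursive walk. This is only a sketch: the real check is performed by schema validation, `nodes_missing_div_and_date` is a hypothetical helper, and the `name` and `children` keys are assumptions about the tree layout in the dataset JSON.

```python
def nodes_missing_div_and_date(node, missing=None):
    """Collect names of nodes lacking both `div` and `num_date` in `node_attrs`."""
    if missing is None:
        missing = []
    attrs = node.get("node_attrs", {})
    if "div" not in attrs and "num_date" not in attrs:
        missing.append(node.get("name", "<unnamed>"))
    for child in node.get("children", []):
        nodes_missing_div_and_date(child, missing)
    return missing


# Made-up tree: two tips violate the requirement.
tree = {
    "name": "root",
    "node_attrs": {"div": 0},
    "children": [
        {"name": "tipA", "node_attrs": {"num_date": {"value": 2020.5}}},
        {"name": "tipB", "node_attrs": {}},
        {"name": "tipC"},
    ],
}

problems = nodes_missing_div_and_date(tree)
```

A node passes if either key is present, so divergence-only and time-only trees both satisfy the rule.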

Bug Fixes

* ancestral: Export amino acid sequences inferred for the root node of the tree in the node data JSON output for compatibility with `augur translate` output. [1317][] (huddlej)

[1299]: https://github.com/nextstrain/augur/pull/1299
[1310]: https://github.com/nextstrain/augur/pull/1310
[1317]: https://github.com/nextstrain/augur/pull/1317

23.0.0

Major Changes

* Drop support for Python 3.7. [1296][] (victorlin)

Features

* export v2: Allow the root-sequence data to be included (inlined) in the main dataset JSON file, avoiding the need for a sidecar `_root-sequence.json` file. [1295][] (jameshadfield)

[1295]: https://github.com/nextstrain/augur/pull/1295
[1296]: https://github.com/nextstrain/augur/pull/1296

22.4.0

Features

* refine: Export covariance matrix and standard deviation for clock rate regression in the node data JSON output when these values are calculated by TreeTime. These new values appear in the `clock` data structure of the JSON output as `cov` and `rate_std` keys, respectively. [1284][] (huddlej)
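
A downstream script might read the new keys defensively, since they appear only when TreeTime calculates them. In the sketch below, `clock_uncertainty` is a hypothetical helper, the numbers are illustrative placeholders rather than real results, and the `rate` key and the 2x2 shape of `cov` are assumptions about the surrounding structure.

```python
import json

# Illustrative values only; not output from a real analysis.
node_data_text = """
{
  "clock": {
    "rate": 0.003,
    "rate_std": 0.0002,
    "cov": [[4e-08, -8e-05], [-8e-05, 0.16]]
  }
}
"""


def clock_uncertainty(node_data):
    """Return (rate_std, cov) from the `clock` block; None for absent keys."""
    clock = node_data.get("clock", {})
    return clock.get("rate_std"), clock.get("cov")


rate_std, cov = clock_uncertainty(json.loads(node_data_text))

# Output written without these values simply yields None for both.
older_rate_std, older_cov = clock_uncertainty({"clock": {"rate": 0.003}})
```

Using `.get` keeps the script working with node data JSON that lacks `cov` and `rate_std`.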

Bug Fixes

* clades: Fix outputs for genes named `NA` (previously the value was replaced by `nan`). [1293][] (rneher)
* distance: Improve documentation by describing how gaps get treated as indels and how users can ignore specific characters in distance calculations. [1285][] (huddlej)
* Fix help output compatibility with non-Unicode streams. [1290][] (victorlin)

[1284]: https://github.com/nextstrain/augur/pull/1284
[1285]: https://github.com/nextstrain/augur/pull/1285
[1290]: https://github.com/nextstrain/augur/pull/1290
[1293]: https://github.com/nextstrain/augur/pull/1293
