Treesapp

Latest version: v0.11.4

0.11.4

Added
- Centroid inference for pOTUs based on the midpoint, or balance point, of all cluster members.
- A table summarizing the intra-cluster evolutionary distances ('phylotu_cluster_stats.tsv').
- Automatically removes trailing semicolons from accession2lin and seq2lineage tables.
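The trailing-semicolon cleanup listed above can be pictured with a minimal Python sketch. It assumes lineage strings are semicolon-separated ranks. The function name `strip_trailing_semicolons` is hypothetical and is not part of TreeSAPP's API.

```python
def strip_trailing_semicolons(lineage: str) -> str:
    """Drop trailing semicolons and whitespace from a lineage string.

    Hypothetical helper for illustration only; not TreeSAPP's own code.
    """
    return lineage.rstrip("; \t\r\n")


# Example lineage with a stray trailing separator
cleaned = strip_trailing_semicolons("d__Bacteria; p__Proteobacteria;")
print(cleaned)
```

Internal separators are left untouched; only the right-hand end of the string is trimmed.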

Fixed
- Estimation of local-alignment distances for a taxonomic rank - now only considers sequences from monophyletic taxa.
- `--min_seq_length` always overrides the minimum profile HMM proportion threshold in `treesapp update`.
- Fixed previously unhandled exceptions if reference package training failed during `treesapp create`

Changed
- The minimum sequence length of 30 (AA) has been removed by default, but can still be used as before with `--min_seq_length`.
Results likely will not change as more stringent filtering thresholds were already applied in downstream steps.
- `--pc` was removed from `treesapp create`'s argument list

0.11.3

Added
- Option to use pairwise local-alignment clustering (with MMseqs2) in `treesapp phylotu`

Fixed
- Bad error statement when estimating _alpha_ in `treesapp phylotu`
- Can append unannotated features to "Unknown" label if already present in taxa_map.
- Problem updating a reference package with sequences from UniProt (>sp|... header).

Changed
- Checkpointing is improved in `treesapp assign`.
It is able to pick up outputs at any stage and decide what needs to be run for each reference package.
Reference package targets can be modified between reruns.

0.11.2

Added
- Option '--unknown_colour' for `treesapp colour`, which sets the colour used for "Unknown" features or taxa in the iTOL files.
- New options for pre-clustering the classified sequences using either Barbera et al.'s placement-space method or
pairwise alignment to speed up pOTU inference. Controlled with the "-p/--pre_mode" argument.
- Dynamic evolutionary distance threshold for query sequences based on branch lengths descendant from placement position
- RecA, RadA and RpoB reference packages being distributed as part of the core set
- The new '--query_coverage' command-line parameter is available in `treesapp assign` and drastically improves precision
and recall in conjunction with '--hmm_coverage'. Both are set to 80% by default.
- '--delete' flag added to `treesapp phylotu` to optionally remove all intermediate files and directories.
Useful for de novo methods when multiple phylogenies are inferred.
- Silent mode in `treesapp assign` can be activated by the '--silent' flag.
No logging to console but log file is still populated.
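As a rough illustration of the coverage thresholds described above, the following Python sketch computes alignment coverage as a percentage and applies two 80% cut-offs. The function names, the 1-based inclusive coordinates, and the assumption that both thresholds must be met together are illustrative choices; the notes above do not state how TreeSAPP combines '--query_coverage' and '--hmm_coverage' internally.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Percentage of a sequence (or profile) spanned by an alignment.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return 100.0 * (end - start + 1) / length


def passes_coverage(query_cov: float, hmm_cov: float,
                    min_query_cov: float = 80.0,
                    min_hmm_cov: float = 80.0) -> bool:
    """Assumption: an alignment is kept only if both thresholds are met."""
    return query_cov >= min_query_cov and hmm_cov >= min_hmm_cov


# A 100-residue query aligned over positions 5-94, against a 200-state
# profile aligned over states 11-190: both coverages are 90%.
keep = passes_coverage(coverage(5, 94, 100), coverage(11, 190, 200))
```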

Fixed
- `treesapp package edit` assigns a leaf node only to the most resolved feature annotation
- Estimating `treesapp phylotu`'s alpha threshold is improved
- Setting distal and pendant lengths during aELW summary allows placements to be correctly filtered
- Final rank of a query sequence's assigned taxonomic lineage is not adjusted with aELW placement summary
- Detecting input sequence type for `treesapp evaluate`

Changed
- Non-taxonomic features are coloured in alphabetical order (according to the palette used) in `treesapp colour`
- iTOL colour-strip files dataset labels are now the feature name
- Users are warned if multiple feature annotations are assigned to a leaf node during `treesapp package edit`
- `treesapp phylotu`'s _de novo_ pOTU workflow adds the most related reference sequences when inferring each phylogeny
to handle truncated query sequences

0.11.0

TreeSAPP version 0.11.0 changes how users store and interact with reference package feature annotations.
These feature annotations are clade-specific labels that indicate some extra-taxonomic features that are characteristic of sequences in the reference package.

For example, in the particulate methane monooxygenase and ammonia monooxygenase subunit A reference package, XmoA,
the feature annotations indicate which paralog is represented by a clade (PmoA, AmoA, EmoA, etc.).
As another example, the methyl coenzyme M reductase subunit A (McrA) reference package contains feature annotations for
each pathway of methanogenesis that is used by the different clades.

We recommend updating to this version, and updating reference packages you have created.

Added
- A new attribute called 'feature_annotations' has been introduced to reference packages.
It can store what was previously saved to iTOL-compatible annotation files by `treesapp colour`.
- `treesapp package edit` accepts a taxonomy-phenotype mapping file to populate the feature_annotations attribute.
See [Wiki](https://github.com/hallamlab/TreeSAPP/wiki/Reference-package-operations) for details.
- `treesapp update` will automatically propagate feature annotations from the original reference package by mapping
the reference sequences through their unique descriptions (organism name and accession).
- `treesapp package view tree` will print a Newick tree with each leaf node's accession and description.
- `treesapp abundance` creates a simple_bar.txt file for each sample analyzed.
- Ability to automatically detect the sequence type based on the input provided.
- PQuery classification data is stored in each reference package in the 'training_df' attribute as a pandas.DataFrame.
- Improved query sequence filtering by phylogenetic placement information in `treesapp update`
- Now able to update a reference package's 'lineage_ids' attribute with `treesapp package edit`
- `treesapp create` is able to accept multiple fasta files through --fastx_input and concatenate them into the one
file used to build the reference package.
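The concatenation of multiple FASTA inputs mentioned above can be sketched in Python as follows. This is a minimal illustration under the assumption that the files are plain text and simply joined in the order given; `concatenate_fasta` is a hypothetical name, not a TreeSAPP function.

```python
from pathlib import Path
from typing import Iterable


def concatenate_fasta(inputs: Iterable[str], output: str) -> None:
    """Write the contents of several FASTA files into one file, in order."""
    with open(output, "w") as out:
        for path in inputs:
            text = Path(path).read_text()
            out.write(text)
            # Guard against a missing final newline so that the next
            # file's first header starts on its own line.
            if text and not text.endswith("\n"):
                out.write("\n")
```

The newline guard matters because a header fused onto the previous file's last sequence line would silently corrupt that record.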

Fixed
- Segmentation fault from Prodigal is no longer possible as `treesapp assign` verifies input presence earlier.
- `treesapp purity` bug where the reference package path was not correctly passed to `treesapp assign` if in the same directory
- Calculation of tree coverage in `treesapp purity`

Changed
- Renamed the classification table made by `treesapp assign` (and used by subcommands like `layer`) to 'classifications.tsv'.
- The reference package attribute 'refpkg_code' is automatically set and
does not need to be changed as it is guaranteed to be unique.
- The reference package disband path has been changed to just the reference package code.
- `treesapp colour` accesses and uses the 'feature_annotations' to write iTOL-compatible annotation files
(i.e. colour_strip.txt and colours_styles.txt). It no longer accepts taxonomy-phenotype tables.
- `treesapp layer` uses the 'feature_annotations' attribute in reference packages to annotate classified sequences.
- The versioned sequence accessions (or first split for unformatted sequence headers) are used in the
ReferencePackage lineage_ids attribute. This ensures unique sequence IDs and helps with iterative updates.

0.10.4

Fixed
- Checkpoint determination in `treesapp abundance`
- '--report append' and '--report update' were not working properly in `treesapp abundance`.
Fixed by deduplicating PQueries prior to appending.
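The general idea of deduplicating records when combining old and new results can be sketched in Python. This is an illustrative assumption about the approach, not TreeSAPP's implementation: the record layout, the key (sample plus sequence name), and the "last record wins" rule are all hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def merge_unique(existing: Iterable[T], new: Iterable[T],
                 key: Callable[[T], Hashable]) -> List[T]:
    """Combine two record collections, keeping one record per key.

    When a key appears more than once, the last record seen wins.
    """
    merged = {}
    for record in list(existing) + list(new):
        merged[key(record)] = record
    return list(merged.values())


# Hypothetical records: (sample, sequence name, abundance)
old = [("s1", "seqA", 1.0), ("s1", "seqB", 2.0)]
new = [("s1", "seqB", 2.5), ("s2", "seqA", 0.5)]
combined = merge_unique(old, new, key=lambda r: (r[0], r[1]))
```

Because Python dictionaries preserve insertion order, each record keeps the position at which its key first appeared.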

Changed
- Checks whether all FASTQ file paths exist earlier in `treesapp abundance`

0.10.3

Fixed
- Replaced duplicate SAM file paths with unique ones when multiple FASTQ files are provided to `treesapp abundance`
- Prevent `treesapp abundance` from overwriting `treesapp assign` outputs when '--overwrite' is used
