Metaphlan

Latest version: v4.1.0

4.1.0

Database updates
* We just released the new vJun23 database
* Addition of ~45k reference genomes from NCBI
* Addition of ~50k MAGs from ocean, ~40k MAGs from soil, ~30k MAGs from domestic animals and non-human primates, ~4k MAGs from giant turtles, ~7.5k MAGs from skin microbiome, ~20k MAGs from dental plaque, ~15k MAGs from Asian populations, ~2.7k MAGs from ancient and modern Bolivians and other small datasets from diverse sources
* Expansion of the markers database with 36,822 SGBs (6,272 more SGBs than in vOct22)
* Inclusion of the new Viral Sequence Clusters (VSCs) database
  * Containing 3,944 VSCs clustered into 1,345 Viral Sequence Groups (VSGs)
  * Including a total of 45,872 representative VSG sequences
  * Each cluster/group is labeled as known (kVSG) or unknown (uVSG) depending on whether it contains at least one viral RefSeq reference genome
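
The known/unknown labeling rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in these notes, not code from MetaPhlAn: the function name `label_groups`, the group identifiers, and the genome identifiers are all hypothetical.

```python
from typing import Dict, Set


def label_groups(members: Dict[str, Set[str]], refseq_ids: Set[str]) -> Dict[str, str]:
    """Label a group 'kVSG' if it holds at least one RefSeq genome, else 'uVSG'."""
    return {
        group: "kVSG" if genomes & refseq_ids else "uVSG"
        for group, genomes in members.items()
    }


# Hypothetical toy data: two groups, one of which contains a RefSeq genome.
groups = {
    "VSG_1": {"genome_a", "genome_b"},
    "VSG_2": {"genome_c"},
}
refseq = {"genome_b"}
labels = label_groups(groups, refseq)
```

Here `VSG_1` is labeled known because it shares `genome_b` with the RefSeq set, while `VSG_2` has no RefSeq member and is labeled unknown.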
New features
* [MetaPhlAn] The new `--profile_vsc` parameter (together with `--vsc_out` and `--vsc_breadth`) enables the profiling of viral sequence clusters.
* [MetaPhlAn] The `--subsampling` parameter now subsamples the FASTQ files instead of the mapping results
* [MetaPhlAn] The new `--mapping_subsampling` parameter enables the previous mapping subsampling behaviour
* [MetaPhlAn] The new `--subsampling_output` parameter saves the subsampled FASTQ file
* [MetaPhlAn] The new `create_toy_database.py` script enables the custom filtering of the MetaPhlAn databases
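
To make the difference between read-level and mapping-level subsampling concrete, the following minimal sketch subsamples FASTQ records before any mapping takes place. It is not MetaPhlAn's implementation: the helpers `read_fastq` and `subsample_reads` are hypothetical, and reservoir sampling is simply one reasonable way to draw a fixed number of reads in a single pass. A fixed seed makes the draw reproducible.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, quality


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Group FASTQ lines into 4-line records (a trailing partial record is dropped)."""
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping the same iterator four times consumes the lines in groups of four.
    yield from zip(stripped, stripped, stripped, stripped)


def subsample_reads(records: Iterable[Record], n: int, seed: Optional[int] = None) -> List[Record]:
    """Draw n records uniformly at random in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, record in enumerate(records):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


# Hypothetical toy input: ten identical-length reads.
fastq_lines: List[str] = []
for k in range(10):
    fastq_lines += [f"@read{k}", "ACGT", "+", "IIII"]

first = subsample_reads(read_fastq(fastq_lines), 3, seed=42)
second = subsample_reads(read_fastq(fastq_lines), 3, seed=42)
```

Because the reads are selected before mapping, the subsampled set can also be written back out as a FASTQ file, which is not possible when only the mapping results are subsampled.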
Changed features
* [MetaPhlAn] The average read length is included in the output header with the `-t rel_ab_w_read_stats` parameter
* [StrainPhlAn] Quasi-marker behaviour is now in line with that of MetaPhlAn
* [StrainPhlAn] `sample2markers.py` output is now in JSON format
* [StrainPhlAn] Simplified sample and marker filtering parameters, integrated with primary/secondary samples
* [StrainPhlAn] Faster inference of small and medium phylogenies
* [StrainPhlAn] Faster execution with the `--print_clades_only` parameter

4.0.6

Changed features
* [MetaPhlAn] The GTDB taxonomic assignment for the vOct22 database is now available.

4.0.5

Database updates
* We just released the new vOct22 database
* Addition of ~200k new genomes
* 3,580 more SGBs than in vJan21
* 2,548 genomes considered reference genomes in vJan21 were relabelled as MAGs in NCBI, so 1,550 kSGBs in vJan21 are now uSGBs in vOct22
* Removed redundant reference genomes from the vJan21 genomic database using a MASH distance threshold at 0.1%
* Local reclustering to improve SGB definitions of oversized or too-close SGBs
* Improved GGB and FGB definitions by reclustering SGB centroids from scratch
* Improved phylum assignment of SGBs with no reference genomes at FGB level using MASH distances on amino acids to find the closest kSGB
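
As a rough illustration of distance-based redundancy removal, the sketch below keeps a genome only if it is farther than the threshold from every genome already kept. This is an assumption-laden toy, not the procedure used to build the database: the function `remove_redundant`, the greedy strategy, the genome names, and the distances are hypothetical, and the 0.1% threshold is written as `0.001`.

```python
from typing import Dict, FrozenSet, List


def remove_redundant(
    genomes: List[str],
    dist: Dict[FrozenSet[str], float],
    threshold: float = 0.001,  # 0.1% distance, assumed to mean "redundant at or below"
) -> List[str]:
    """Greedily keep genomes that are farther than `threshold` from all kept genomes."""
    kept: List[str] = []
    for genome in genomes:
        if all(dist[frozenset((genome, other))] > threshold for other in kept):
            kept.append(genome)
    return kept


# Hypothetical pairwise distances between three reference genomes.
genomes = ["ref_a", "ref_b", "ref_c"]
dist = {
    frozenset(("ref_a", "ref_b")): 0.0004,  # near-identical pair
    frozenset(("ref_a", "ref_c")): 0.03,
    frozenset(("ref_b", "ref_c")): 0.031,
}
kept = remove_redundant(genomes, dist)
```

With these toy values `ref_b` is dropped as redundant with `ref_a`, while `ref_c` is retained. A greedy pass like this depends on input order, so a real pipeline would need a rule for which genome of a redundant pair to keep.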
Changed features
* [StrainPhlAn] Improved StrainPhlAn's speed when running with the `--print_clades_only` option
Missing features
* [MetaPhlAn] The GTDB taxonomic assignment for the vOct22 database is not available yet (expected release: end of Feb 2023)
* [MetaPhlAn] The phylogenetic tree of life for the vOct22 database is not available yet (expected release: TBD).

4.0.4

Changed features
* [MetaPhlAn] Download of the pre-computed Bowtie2 database is now the default option during installation
* [StrainPhlAn] Improved the performance and speed of StrainPhlAn's `sample2markers.py` script
Fixes
* [StrainPhlAn] Fixes error when using `--abs_n_samples_threshold` in the PhyloPhlAn call

4.0.3

Changed features
* [MetaPhlAn] Removal of the NCBI taxID from the merged profiles produced by the `merge_metaphlan_profiles.py` script
* [StrainPhlAn] Improved StrainPhlAn's performance in the markers/samples filtering step
Fixes
* [MetaPhlAn] `-t rel_ab_w_read_stats` now produces the reads stats also at the SGB level
* [MetaPhlAn] Fixes overestimation of reads aligned to known clades
* [MetaPhlAn] Fixes error when the number of reads is not provided with SAM files as input
* [StrainPhlAn] Fixes `No markers were found for the clade` error while executing StrainPhlAn without providing the clade markers FASTA file

4.0.2

New features
* [MetaPhlAn] The new `--subsampling` parameter allows subsampling of reads on the fly
* [MetaPhlAn] The new `--subsampling_seed` parameter enables a deterministic or randomized subsampling of the reads
* [MetaPhlAn] The new `--gtdb_profiles` parameter of the `merge_metaphlan_profiles.py` script allows merging GTDB-based MetaPhlAn profiles
* [StrainPhlAn] The new `--breadth_thres` parameter allows StrainPhlAn to filter the consensus marker sequences after the execution of `sample2markers.py`
* [StrainPhlAn] Interactive selection of the available SGBs when the clade is specified at the species level
* [StrainPhlAn] The new `--non_interactive` parameter disables user interaction when running StrainPhlAn
* [StrainPhlAn] The new `--abs_n_markers_thres` and `--abs_n_samples_thres` parameters enable the specification of the samples/markers filtering thresholds in absolute numbers
* [StrainPhlAn] The new `--treeshrink` parameter enables StrainPhlAn to run TreeShrink for outlier removal in the tree
* [StrainPhlAn] Addition of the `VallesColomerM_2022_Jan21_thresholds.tsv` for compatibility with the mpa_vJan21 database
* [StrainPhlAn] The new `--clades` parameter enables `sample2markers.py` to restrict the reconstruction of markers to the specified clades
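
The idea of filtering consensus marker sequences by breadth can be sketched as follows. This is a simplified illustration under stated assumptions, not StrainPhlAn's code: breadth is assumed here to be the percentage of positions that are not gaps or `N`, and the helpers `breadth` and `filter_markers`, the marker names, and the threshold value are hypothetical.

```python
from typing import Dict


def breadth(consensus: str, gap_chars: str = "-N") -> float:
    """Percentage of positions in a consensus sequence that carry a called base."""
    if not consensus:
        return 0.0
    covered = sum(1 for base in consensus if base not in gap_chars)
    return 100.0 * covered / len(consensus)


def filter_markers(markers: Dict[str, str], breadth_thres: float) -> Dict[str, str]:
    """Keep only markers whose breadth reaches the threshold (in percent)."""
    return {name: seq for name, seq in markers.items() if breadth(seq) >= breadth_thres}


# Hypothetical consensus sequences for two markers.
markers = {
    "marker_1": "ACGTACGTAC",  # fully covered
    "marker_2": "AC--------",  # mostly gaps
}
kept = filter_markers(markers, breadth_thres=80.0)
```

Applying the filter after marker reconstruction means the same reconstructed markers can be re-filtered at different thresholds without rerunning the reconstruction step.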

Changed features
* [StrainPhlAn] The `-c` parameter of the `extract_markers.py` script now allows the specification of multiple clades
* [StrainPhlAn] The `--print_clades_only` parameter now produces an output `print_clades_only.tsv` report
* [StrainPhlAn] Compatibility with clade markers compressed in bz2 format
* [StrainPhlAn] The `strain_transmission.py` script now uses the `VallesColomerM_2022_Jan21_thresholds.tsv` thresholds by default
Fixes
* [MetaPhlAn] `metaphlan2krona.py` and `hclust2` have been added to the bioconda recipe
