Quast

Latest version: v5.2.0

4.2

- physical coverage is added (the coverage of the reference by the paired-end fragments,
counting the reads and the gap between the paired-end reads as covered);
- normal/log scale switcher is added;
- samtools max coverage depth is increased, so now even very large coverages are shown.
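
To illustrate how physical coverage differs from plain read coverage, here is a
minimal Python sketch. It is not QUAST's implementation: the function name and the
0-based, half-open coordinates are assumptions made for this example only.

```python
def physical_coverage(ref_len, pairs):
    """Per-base physical coverage of a reference of length ref_len.

    pairs: list of ((start1, end1), (start2, end2)) mate alignments,
    0-based half-open. The whole fragment, including the gap between
    the two mates, is counted as covered.
    """
    diff = [0] * (ref_len + 1)  # difference array: +1 at start, -1 at end
    for (s1, e1), (s2, e2) in pairs:
        start = max(0, min(s1, s2))
        end = min(ref_len, max(e1, e2))
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    coverage, depth = [], 0
    for delta in diff[:ref_len]:
        depth += delta
        coverage.append(depth)
    return coverage


# Mates at [0, 3) and [7, 10): the gap [3, 7) is covered as well.
print(physical_coverage(12, [((0, 3), (7, 10))]))
```

Read coverage for the same pair would be zero inside the gap; physical coverage
treats the fragment as one continuous interval.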

2. Icarus update -- search window is added:
- searching contigs and genes/operons;
- tooltip shows possible suggestions as one types.

3. Icarus update -- Contig Alignment Viewer changes:
- coordinates are counted for each chromosome separately;
- all good alignments of ambiguous contigs are shown if --ambiguity-usage=all.

4. Read coverage histograms are added for assemblies with SPAdes/Velvet-like contig
naming style (i.e. all names are ..._length_X_cov_Y_...).
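
A contig name in this style can be parsed with a short regular expression. The
sketch below is only an illustration: the pattern, the function name and the sample
names are assumptions, not code taken from QUAST.

```python
import re

# Assumed pattern for names such as NODE_1_length_5000_cov_12.5
NAME_RE = re.compile(r"_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_contig_name(name):
    """Return (length, coverage), or None if the name does not match."""
    match = NAME_RE.search(name)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


print(parse_contig_name("NODE_1_length_5000_cov_12.5"))  # hypothetical name
print(parse_contig_name("contig_17"))                    # no length/cov fields
```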

5. Several new options are added:
- "--references-list" for specifying list with reference names to be searched
and downloaded from NCBI (MetaQUAST only);
- "--min-identity" for setting threshold on minimal IDY% of alignments (Nucmer
parameter, default is 95%);
- "--ambiguity-score" for choosing how close good ambiguous alignments should be to
the best one (default is 0.99);
- "--sam" and "--bam" for specifying SAM and BAM files instead of/in addition to
raw reads.
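
The effect of "--ambiguity-score" can be pictured with a small Python sketch. It
assumes that an alignment is scored as length multiplied by identity and that an
alignment counts as "close to the best" when its score is at least the threshold
times the best score. Both the scoring and the function name are assumptions made
for illustration.

```python
def close_to_best(alignments, ambiguity_score=0.99):
    """Keep alignments whose score is within a factor of the best score.

    alignments: list of (length_bp, identity_percent) tuples.
    Assumption for this sketch: score = length * identity.
    """
    if not alignments:
        return []
    scores = [length * idy for length, idy in alignments]
    threshold = ambiguity_score * max(scores)
    return [aln for aln, score in zip(alignments, scores) if score >= threshold]


candidates = [(1000, 99.0), (1000, 98.5), (900, 99.0)]
print(close_to_best(candidates))        # default threshold 0.99
print(close_to_best(candidates, 1.0))   # only alignments equal to the best
```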

6. Several new metrics are added:
- Total aligned length (total size of contig fragments aligned to reference);
- Icarus similarity statistics (in HTML report only, at least 2 assemblies are
needed).

7. MetaQUAST updates:
- requests to NCBI database in no-reference mode are now repeated if error occurs;
- Krona updated to v2.6, total aligned length is used for building Krona charts
(instead of Total length in previous versions);
- possible interspecies translocations are not reported if unaligned parts consist
mostly of Ns;
- Total length and Largest contigs are substituted with their "aligned" version in
summary HTML report (i.e. calculated for contig aligned fragments rather than
original contigs).

8. HTML report is updated:
- Genome statistics are moved to the top, Statistics without reference are moved to
the bottom;
- reference line is added to GC content plot.

9. GeneMark now outputs nucleotide and protein sequences for all found genes.

10. Best set selection (BSS), the misassembly detection algorithm used in case of
multiple ambiguous mappings of contig fragments, is improved. BSS can now find not only
the very best set but all sets with scores close to the best one (controlled by the
--ambiguity-usage and --ambiguity-score options). Several minor bugs are also fixed,
and BSS is now deterministic in case of equivalent scores.

11. Embedded E-MEM aligner is updated to version 2.

12. Significant refactoring of the source code. Most changes are in Icarus,
Contigs Analyzer, QUAST and MetaQUAST running scripts.

13. Fixed several minor bugs.

14. Icarus citation is updated.

4.1

- all blocks are drawn with transparency, so overlaps are more visible;
- controls for switching between overlapped blocks are added to Contig info panel;
- y-coordinates offsets and shades of colors for neighbouring contigs are removed.

2. Icarus update -- integration of Contig Size Viewer with Contig Alignment Viewer
(if reference genome is available and thus alignment information is present):
- contigs are colored according to their mapping on the reference (correct,
misassembled, unaligned);
- positions of the breakpoints in the misassembled contigs are shown;
- full scale Contig info is now available with links to Contig Alignment Viewer;

3. Nucmer aligner is replaced with significantly faster E-MEM (for Linux only).

4. Significant speed up of misassembly detection algorithm in case of complicated
genomes (with many short repeats).

5. New option is added:
- "--no-sv" for skipping structural variants calling and processing. Makes sense
only if reads are specified: in this case they will be used only for building Icarus
read coverage histograms.

6. Embedded Manta is updated to version 0.29.6.

7. Reference line on cumulative plot is redesigned to be more consistent with
assemblies lines (starts at 0, depends on number of chromosomes/plasmids).

8. Fixed several minor bugs.


4.0

1. Icarus is added! Icarus stands for Interactive Contig Assessment browseR.
Icarus complements QUAST output with interactive visualisations of assembly
alignments to the reference genome, and all assembly features detected by
QUAST (e.g. misassemblies, genes). Icarus generates two types of viewers:
- Contig alignment viewer (available if a reference genome is provided). Shows
contig alignments to the reference, misassemblies, similarities between assemblies,
genome annotations (if genes/operons are provided), read coverage along the genome
(if reads are provided).
- Contig size viewer. Draws contigs ordered from largest to smallest. Allows hiding
of short contigs. Explicitly shows contigs of length N50, N75 (and NG50, NG75).
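
For readers unfamiliar with these metrics, the following Python sketch computes
N50/N75 and their NG counterparts from a list of contig lengths. The function name
and the sample lengths are made up for this example; QUAST's own code may differ in
details.

```python
def nx(lengths, x=50, total=None):
    """Length L such that contigs of length >= L cover x% of `total`.

    With total=None the assembly size is used (N50, N75). Pass the
    reference genome size as `total` to get NG50, NG75.
    Returns None if the contigs never reach the target.
    """
    if total is None:
        total = sum(lengths)
    target = total * x / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


contigs = [80, 70, 50, 40, 30, 20, 10]   # hypothetical contig lengths, sum = 300
print(nx(contigs, 50))                   # N50  -> 70
print(nx(contigs, 75))                   # N75  -> 40
print(nx(contigs, 50, total=500))        # NG50 -> 30 for a 500 bp reference
```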

2. Improved misassembly detection algorithm in case of multiple ambiguous mappings of
contig fragments. The algorithm identifies the best set of non-overlapping contig fragment
mappings. The set minimises the possible number of misassemblies, i.e. this algorithm
reduces the number of false positives (erroneously detected misassemblies).

3. Several new options are added:
- "--significant-part-size" for setting threshold on detecting partially unaligned
contigs with both significant aligned and unaligned parts (default: 500 bp as earlier).
- "--fragmented" for notifying QUAST about fragmented reference genome (e.g. scaffold
reference). QUAST will try to detect misassemblies caused by the fragmentation and
mark them "fake".
- "--unique-mapping" for forcing -a=one in QUAST run on the combined reference (MetaQUAST only).
- "--sv-bedpe" for specifying BEDPE file with structural variations. Note that QUAST may
create such BEDPE automatically using Manta SV caller if reads are specified.
However, this is rather slow because it includes aligning the reads to the reference.
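
The "--significant-part-size" rule from the list above can be sketched as a simple
check. This is an illustration under the assumption that both the aligned and the
unaligned part of a contig must reach the threshold, with inclusive comparisons. The
function name is hypothetical.

```python
def has_significant_parts(contig_len, aligned_len, significant_part_size=500):
    """True if a partially unaligned contig has both a significant
    aligned part and a significant unaligned part (lengths in bp)."""
    unaligned_len = contig_len - aligned_len
    return (aligned_len >= significant_part_size
            and unaligned_len >= significant_part_size)


print(has_significant_parts(3000, 2000))  # 2000 bp aligned, 1000 bp unaligned
print(has_significant_parts(3000, 2800))  # only 200 bp unaligned
```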

4. Fixed bug with skipping "broken" scaffolds in per reference runs of MetaQUAST.

5. Slightly rearranged MetaQUAST summary HTML report. New order is: statistics,
plots, table with links to per reference runs (the table was previously at the beginning).

6. Fixed several minor bugs.

7. Embedded samtools updated to version 1.3.

8. Citation for MetaQUAST paper is updated, citation for Icarus paper is added.

3.2

feature, try with care):
- reads should be provided with --reads1 (or -1), --reads2 (or -2) options,
- reads are aligned to reference genome using bowtie2 (embedded),
- Manta structural variation (SV) calling tool (embedded) is run on bowtie2 output,
- found SVs are used for classifying QUAST misassemblies into true ones and fake
ones (caused by structural differences between reference sequence and
sequenced organism). Fake misassemblies are excluded from "# misassemblies" metric
and reported in novel "# structural variants" metric.

2. HTML reports content is reformatted, especially in MetaQUAST reports:
- GC % and all metrics based on genome length (NG50, NGA50, LGA75, etc) are excluded
from the combined reference statistics (they don't make sense there);
- N50 is replaced with more fair and comparable metrics such as "Total length >= 1000 bp",
"Total length >= 10000 bp" in the main MetaQUAST report. Extended version still has N50.
- N50 is also hidden in single-genome QUAST reports, and NGA50 is exposed instead.
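
A metric such as "Total length >= 1000 bp" is simply the summed length of all
contigs at or above the threshold. A minimal sketch, with a made-up function name
and sample data:

```python
def total_length_at_least(lengths, min_len):
    """Sum of contig lengths that are >= min_len (in bp)."""
    return sum(length for length in lengths if length >= min_len)


contigs = [500, 999, 1000, 12000]            # hypothetical contig lengths
print(total_length_at_least(contigs, 1000))   # counts the 1000 and 12000 bp contigs
print(total_length_at_least(contigs, 10000))  # counts only the 12000 bp contig
```

Unlike N50, this value does not depend on how many short contigs an assembler
chooses to output, which makes it easier to compare across assemblies.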

3. Scaffold gap size misassemblies are introduced. They are reported only when --scaffolds
is used.

4. Several new options are added:
- "--memory-efficient" for running QUAST with minimal memory consumption (but significantly slower)
- "--test-sv" for testing structural variants mode (see 1.)
- "--silent" for minimal output to stdout (full verbose output is saved in the logs anyway)

5. MetaQUAST output directory content is reformatted. Per reference reports are saved
inside the directory <output_dir>/runs_per_reference, summary reports are under summary directory.
The only top-level files are metaquast.log and report.html (summary HTML report with links to
all subreports).

6. Colors in HTML reports and PDF/PNG plots are synchronized, now they are the same
for the same assemblies.

7. Additional check for Matplotlib v1.1 is added (needed for drawing PDF/PNG plots since QUAST v3.1).

8. Fixed several minor and major bugs.

9. Citation for MetaQUAST paper added.

3.1

- more specific algorithm for reference searching and downloading, particularly
only Bacteria and Archaea are downloaded;
- Krona charts are added for showing taxonomic profile based on found references;
- metric-level and misassemblies plots are added to summary HTML report;
- better structure for text reports and plots in summary folder.

2. Significantly reduced size of the installation package. This is done by removing
BLAST binaries which are needed only for MetaQUAST run without references. On the
first such run, BLAST binary for target OS will be automatically downloaded.

3. Heatmaps are added to HTML reports.

4. Separate install.sh and more complex install_full.sh scripts are added for installing the regular
versions of QUAST and MetaQUAST or the extended one (with the ability to run MetaQUAST without
reference genomes).

5. New option "--plots-format" for selecting output format for plots (PDF by default).

6. Default number of threads is changed from 100% of CPUs to 25% of CPUs.
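
One way to derive such a default in Python is shown below. The rounding down and
the lower bound of one thread are assumptions of this sketch, not necessarily what
QUAST does.

```python
import os


def default_threads(fraction=0.25):
    """Return a thread count equal to `fraction` of the available CPUs,
    rounded down but never less than 1."""
    cpus = os.cpu_count() or 1   # os.cpu_count() may return None
    return max(1, int(cpus * fraction))


print(default_threads())
```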

7. Changes in one-letter options, including replacing the confusing -T with -t for --threads
and -M with -m for --min-contig.

8. Broken versions of scaffolds are skipped from analysis if they are equal to the original
ones (see --scaffolds option for details).

9. Fixed several minor bugs.

3.0

- if no references are provided, MetaQUAST downloads references from NCBI database
based on best hits of assemblies alignments vs SILVA 16S rRNA database
(included in QUAST package);
- multiple summarising reports are added: plots and text tables for each metric
(all assemblies vs all references in one file), histograms of misassemblies
per reference for all assemblies separately, summary HTML-report with all metrics
per each assembly and each reference (expandable lines);
- new metrics: interspecies translocations and possibly misassembled contigs;
- fixed handling of reference with more than one entry in FASTA file
(chromosome plus plasmids or multiple chromosomes);
- option --reference now accepts directory (takes all references from this dir);
- fixed handling of closely related species by using --ambiguity-usage 'all' on
combined reference run.

2. Speed ups:
- 5x-100x speed up in case of running QUAST without reference;
- processing of large (> 50 Mbp) multi-chromosome references in parallel (per chromosome);
- new speed up options: --no-check, --no-gc, --no-snps, --fast (a combination of the others).

3. More reports about misassemblies (in <output_dir>/contigs_reports/):
- misassemblies_plot.pdf with histogram of misassembly types distribution per assembly.
- contigs_report_<assembly_name>.mis_contig.info with brief details about misassembled
contigs only.

4. Improved and updated misassemblies detection algorithm:
- more accurate algorithm for processing multiple ambiguous alignments;
- using --ambiguity-usage value for processing internal overlaps between adjacent aligned
blocks of a misassembled contig;
- marking of local misassemblies with small inconsistency (<= 85 bp) as fake misassemblies
(indels or mismatches) if gap on the contig is filled mostly with Ns;
- option --extensive-mis-size/-x to set extensive misassembly size, i.e. min inconsistency
size for relocations (default is 1000 as earlier);
- option --min-alignment/-i to set minimum alignment length (Nucmer's parameter), default
is 0.
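
The thresholds described above can be combined into a simplified classifier. This
Python sketch is only an illustration: the function name, the 50% cut-off for
"mostly Ns" and the inclusive comparisons are assumptions, and the real algorithm
handles more cases (for example inversions and translocations).

```python
def classify_breakpoint(inconsistency, gap_seq, extensive_mis_size=1000,
                        small_inconsistency=85, n_fraction=0.5):
    """Classify a breakpoint between two adjacent aligned blocks.

    inconsistency: absolute difference (bp) between the distance on the
    contig and the distance on the reference.
    gap_seq: contig sequence between the two blocks (may be empty).
    n_fraction: assumed cut-off for a gap being "mostly Ns".
    """
    if inconsistency >= extensive_mis_size:
        return "relocation"
    mostly_n = (bool(gap_seq)
                and gap_seq.upper().count("N") / len(gap_seq) > n_fraction)
    if inconsistency <= small_inconsistency and mostly_n:
        return "fake (indel or mismatch)"
    return "local misassembly"


print(classify_breakpoint(5000, ""))
print(classify_breakpoint(50, "NNNNNNNNAC"))
print(classify_breakpoint(50, "ACGTACGT"))
```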

5. Fixed HTML-reports issues:
- Y-axis coordinates in interactive hover tooltips on plots;
- NAx plot small bug.

6. GeneMark-ES is used for predicting genes in eukaryotic genomes instead of Glimmer-HMM.
To use Glimmer-HMM instead, a new option --glimmer is added.

7. GeneMarkS incorporation is fixed. Previously it was run on predefined heuristic models based
on GC content. Now it uses self-training module for getting the correct model.

8. Fixed several minor bugs.

9. GeneMark license is updated.

10. Updated LICENSE (third-party tools details) and Manual.

2.3

detection algorithm and one major bug caused by linear representation of circular
references and contigs in fasta format.

2. Added contig alignment plots. See details in manual and in the QUAST paper (Fig. 1)

3. Genome analyzer module (computation of genome fraction, duplication ratio,
number of genes and operons) is parallelized.
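
As a reminder of what two of these metrics mean, the sketch below computes genome
fraction and duplication ratio from alignment intervals on a single reference
sequence. It is a simplification: it assumes ref_len > 0, uses the reference span of
each alignment as a proxy for the aligned bases in the assembly, and the function
name is made up for this example.

```python
def genome_stats(ref_len, alignments):
    """Return (genome_fraction_percent, duplication_ratio).

    alignments: list of (ref_start, ref_end), 0-based half-open,
    one entry per aligned contig block.
    """
    covered = [False] * ref_len
    aligned_total = 0                      # aligned bases, counting repeats
    for start, end in alignments:
        start, end = max(0, start), min(ref_len, end)
        if start >= end:
            continue
        aligned_total += end - start
        for pos in range(start, end):
            covered[pos] = True
    covered_bases = sum(covered)           # reference bases hit at least once
    genome_fraction = 100.0 * covered_bases / ref_len
    duplication_ratio = aligned_total / covered_bases if covered_bases else None
    return genome_fraction, duplication_ratio


# Two blocks overlapping by 25 bp on a 100 bp reference.
print(genome_stats(100, [(0, 50), (25, 75)]))
```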

4. Option --test becomes an analogue of an installation utility. It compiles all required binaries and
checks correctness of QUAST and MetaQUAST execution on test datasets.

5. Former plots.pdf is upgraded with report tables and renamed to report.pdf. Now it is
a file with all tables and plots generated by QUAST.

6. New option "--no-plots" for speeding up computation if plots are not needed.

7. GeneMark license is updated, instructions for manual updating are added.

8. Generation of misleading single-column histograms is removed (when only a single assembly
file was specified).

9. More error and exception handlers are added.

10. Fixed bug with indel counting (caused slightly overestimated indels rate in some cases).

11. Fixed several minor bugs.

12. Code is refactored.
