Megalodon

Latest version: v2.5.0


2.2.2

This release adds creation of basecaller training data sets aided by per-site modified base ground truth. See the [docs](https://nanoporetech.github.io/megalodon/modbase_training.html).

This release also fixes a number of bugs:
* Correct base quality scores in `mod_mappings.bam` output (#56)
* `--num-reads` option functionality restored
* Fixed bug where reads were skipped
* Add threading and multiprocessing to read enumeration and raw signal extraction for more robust performance (#49)

2.2.1

This release includes two internal features to produce more robust processing speeds.

1. Read batching

- Each worker process sends a batch of reads (default 50) to the Guppy basecall server while processing results as they are returned to the client.
- This should enable the Guppy basecall server to process reads more efficiently.

2. Signal extraction threading

- Extract read IDs and raw signal using multiple threads (default 8).
- This makes signal extraction more efficient on slower disks or when analyzing single-read FAST5 format data.
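
The two ideas above can be illustrated with a minimal Python sketch. This is not Megalodon's implementation: `batched`, `extract_signal`, and `extract_all` are hypothetical names, and `extract_signal` is a placeholder for reading raw signal from a FAST5 file. Only the defaults (batches of 50 reads, 8 extraction threads) are taken from the notes above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, batch_size=50):
    """Yield successive lists of up to batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def extract_signal(read_id):
    """Hypothetical stand-in for reading one read's raw signal from disk."""
    return read_id, [0.0] * 4


def extract_all(read_ids, num_threads=8):
    """Extract signals with a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(extract_signal, read_ids))


read_ids = [f"read_{i}" for i in range(120)]
signals = extract_all(read_ids)

# Each batch is what a worker would hand to the basecall server in one request.
batch_sizes = [len(batch) for batch in batched(signals)]
print(batch_sizes)  # [50, 50, 20]
```

Threads suit the extraction step because it is dominated by disk I/O, while batching reduces the number of round trips between each worker and the basecall server.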

2.2.0

This release includes a number of new features, optimizations and performance improvements:
- Optimized read processing, enabling reference-anchored (highest quality) modified base calling at the same speed as standalone Guppy.
- Live processing mode with all outputs enabled.
- Support for (and new requirement of) Guppy 4.0+.
- Basecall-anchored modified base calls now output in hts-spec unmapped BAM format ([see specification here](https://github.com/samtools/hts-specs/pull/418)).
- Optimized modified base database schema for a smaller memory footprint, faster processing and faster aggregation.
- Fixed bug in the sequencing summary file when processing multi-FAST5 reads (fixes #45).
- Fixed bug where scaling factors were incorrect for signal mapping output in some settings (used for basecaller training).
- Various other optimizations and minor bug fixes.

2.1.1

Add support for updated Taiyaki signal mapping interface.

2.1.0

This release includes a number of new features, optimizations and performance improvements:

- Default calibration files for all released modified base models, including newly released Rerio "research" models (CpG 5mC for MinION/GridION and PromethION; CpG 5mC and 5hmC model for MinION/GridION). Sequence variant calibration files are also provided for all currently released Flip-flop models (fixes #22).
- New output type, `mod_mappings`, to visualize per-read modified base calls in a genome browser. This output annotates the per-read reference sequence with modified base calls, including confidence scores, and can be viewed in a genome browser using existing bisulfite settings ([see example here](https://nanoporetech.github.io/megalodon/file_formats.html#modified-base-mapping)).
- The modified base processing steps have been optimized to output modified base calls in all contexts more efficiently. Internal tests show that all-context modified base output can now keep up with basecalling using the Guppy backend on 2 V100 GPUs.
- Updates to make model training data preparation easier. Training a new basecalling model can be performed in two command line steps (once software is successfully installed). [See documentation for this process here](https://nanoporetech.github.io/megalodon/model_training.html). New methods have been added to prepare training datasets for modified base basecalling models. These are models that specifically detect modified bases along with canonical bases. See documentation for [these new commands here](https://nanoporetech.github.io/megalodon/modbase_training.html).
- Support for analysis of direct RNA sequencing data including the output of RNA training datasets. This does not include the release of any RNA modified base models or default calibration files.
- Megalodon helper scripts are now accessible from the command line via the `megalodon_extras` command. These commands have been refactored to be more user-friendly and to be accessible from a standard pip or Conda installation.
- The standard sequencing summary file output by Guppy is now output from Megalodon when the `basecalls` output is selected (fixes #24).
- This release also includes various bug fixes and other optimizations (fixes #28; fixes #33).

2.0.0

* Added support for Guppy basecalling backend (via pyguppy).
* Optimized modified base and variant processing.
* PromethION biological context model modified base support.
* Bug fixes, notably a fix for modified base calling in FAST5 input mode.
