Bcbio-nextgen

Latest version: v1.1.5

0.9.1

- Fix novoalign to work with parallel split alignments. Thanks to Tyler Funnell.
- Move lumpy-sv to latest version which uses lumpyexpress instead of speedseq.
- Support the manta SV caller from Illumina. Validations: http://imgur.com/a/Gajsg
- Remove high depth regions from structural variant calling exclusion file
to avoid false positives with lumpy. Thanks to Miika Ahdesmaki.
- Move some structural variant calling, like CNV detection, prior to variant
calling. Allows use of CNV calls as inputs for variant detection tools.
- Generalize support for interaction with blob storage and graphing to support
alternative cloud providers. Initial support for interacting with Azure.
Thanks to Alexandru Coman.
- Remove VarDict call lines where reference and alternative allele are
identical.
- Fix assignment issues during prioritization with new GEMINI and sqlite.
- Support updated versions of sambamba, which provide headers for window depth
commands.
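
The VarDict clean-up in this release (dropping call lines where the reference and alternative alleles are identical) can be illustrated with a minimal Python sketch. This is not bcbio's actual implementation: the function name `drop_identical_alleles` is hypothetical, and the sketch assumes tab-separated VCF records with REF and ALT in the fourth and fifth columns.

```python
def drop_identical_alleles(vcf_lines):
    """Yield VCF lines, skipping records whose REF and ALT alleles match.

    Hypothetical helper for illustration only; header lines and
    records with fewer than five columns pass through unchanged.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and fields[3] == fields[4]:
            # REF (column 4) equals ALT (column 5): not a real variant.
            continue
        yield line
```

Because it is written as a generator, a filter of this shape could sit in a streaming pipeline without loading the whole VCF into memory.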

0.9.0

- GATK 3.4: support HaplotypeCaller by avoiding setting downsampling (-dcov)
option by default.
- Single sample structural variant calling: correctly handle multiple variant
callers. Thanks to Sven-Eric Schelhorn.
- Make VarDictJava the default caller when `vardict` specified. `vardict-perl`
is now required to specifically use the Perl version.
- VarDict and VarDictJava: limit regions to 1Mb with overlaps to avoid memory
errors. Ignore regions without BED reads which can lead to large genomic
sections and memory errors.
- VarDict and VarDictJava: annotate outputs with dbSNP.
- Add `tools_on` configuration with `svplots` option. This turns off structural
variant plotting by default, which can be time consuming compared to calling.
- Add a `--only-metadata` argument to template preparation that will only
include BAM or fastq files in sample YAML if they are present in the metadata
CSV file.
- samblaster: support -M flag in 0.1.22 release.
- Fix VEP/GEMINI incompatibility where empty fields are included in VCF output.
- VarDict: restrict maximum region size within a BED file to 2Mb to avoid high
memory usage and failures for longer regions.
- Include snpEff effects summary file in output directory when used for effects
prediction.
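
The VarDict region limiting in this release (windows of at most 1Mb with overlaps) might look roughly like the following Python sketch. It is illustrative only and not taken from bcbio: the function name `split_region` is hypothetical, and the 150-base overlap is an assumed value, since the release notes do not state the overlap size.

```python
def split_region(chrom, start, end, max_size=1_000_000, overlap=150):
    """Split [start, end) into windows of at most max_size bases.

    Consecutive windows share `overlap` bases. The overlap default
    is an assumption for illustration, not a documented bcbio value.
    """
    if overlap >= max_size:
        raise ValueError("overlap must be smaller than max_size")
    windows = []
    cur = start
    while cur < end:
        window_end = min(cur + max_size, end)
        windows.append((chrom, cur, window_end))
        if window_end == end:
            break
        # Step back so the next window overlaps the one just emitted.
        cur = window_end - overlap
    return windows
```

Regions shorter than `max_size` come back as a single unchanged window, so only large regions are affected by the split.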

0.8.9

- Upgrade variant effect predictor (VEP) to the latest Ensembl version (79) with
support for hg38. The latest VEP has better support for multiple versions
but incompatible database naming. This requires an update of tools and data in
a two step process. First `bcbio_nextgen.py upgrade -u stable --tools`
(or `-u development`) then `bcbio_nextgen.py upgrade --data`.
- Improve de-duplication for split alignments. Do not sort/merge during splits,
and instead perform a global merge sort and de-duplication of the final set of
reads.
- Initial support for new human genome build (hg38/GRCh38) including alternative
alleles. Usage is in place but still requires validation and additional testing.
- Remove alternative alleles from downstream variant calling after using them in
alignment, to avoid issues with chromosome names like `HLA*`.
- Enable installation of external conda-managed tools. Adds in builds for
heterogeneity analysis.
- Clean up preparation process for multi-allelic inputs to GEMINI to avoid
needing to split/merge. Thanks to Sven-Eric Schelhorn.

0.8.8

- Automatically calculate `coverage_interval` based on coverage calculations,
avoiding need to set this directly in input configuration.
- Update vt decompose to handle additional multi-allelic adjustments including
all format attributes, providing full support for new GEMINI changes. Thanks
to Brent Pedersen and Adrian Tan.
- Add `default` configuration target to `bcbio_system.yaml` reducing the need
to set program specific arguments for everything.
- Ensure `resources` specified in input YAML get passed to global system
configuration for making parallelization decisions. Thanks to Miika Ahdesmaki.
- Run upload process on distributed machines, allowing upload to S3 on AWS to take
advantage of machines with multiple cores. Thanks to Lorena Pantano.
- Re-write interactions with external object stores like S3 to be more general
and incorporate multiple regions and future support for non-S3 storage.
- Scale local jobs by total memory usage when memory constrains resource usage
instead of cores. Thanks to Sven-Eric Schelhorn and Lorena Pantano.
- Disambiguation: improve parallelization by disambiguating on split alignment
parts prior to merging. Thanks to Sven-Eric Schelhorn.
- Disambiguation: ensure ambiguous and other organism reads are sorted, merged
and passed to final upload directory. Thanks to Sven-Eric Schelhorn.
- Fix problem with sambamba name sorting not being compatible with samtools.
Thanks to Sven-Eric Schelhorn.
- FreeBayes: update to latest version (0.9.21-7) with validation
(http://imgur.com/a/ancGz).
- Allow bz2 files in bcbio_prepare_sample.py script.
- Ensure GEMINI statistics run for project summary file. Thanks to Luca
Beltrame.
- Better error checking for booleans in input configuration. Thanks to Daryl
Waggott.
- Implement qualimap for RNAseq QC metrics, but not active yet.
- collect statistics graphing capabilities moved from bcbio-nextgen-vm, enabling
plotting of resource usage during runs. Thanks to John Morrissey and Lorena
Pantano.
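
The idea behind scaling local jobs by memory rather than cores can be sketched as follows. This is a simplified illustration, not bcbio's scheduling code: the function name `max_parallel_jobs` and its parameters are hypothetical.

```python
def max_parallel_jobs(total_cores, total_memory_gb, cores_per_job, memory_gb_per_job):
    """Return how many jobs fit on a machine at once.

    The limit is whichever resource runs out first, cores or memory.
    Hypothetical helper for illustration only.
    """
    by_cores = total_cores // cores_per_job
    by_memory = int(total_memory_gb // memory_gb_per_job)
    # Always allow at least one job so the run can make progress.
    return max(1, min(by_cores, by_memory))
```

For example, on a 16-core machine with 32GB of memory, single-core jobs that each need 4GB are capped at 8 concurrent jobs by memory, even though the cores alone would allow 16.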

0.8.7

- Run snpEff 4.1 in back-compatibility mode to work with GEMINI database
loading. Fixes snpEff 4.1/GEMINI effects loading.
- Add PED file to GEMINI database load, containing family, gender and phenotype
information from bcbio metadata. Thanks to Luca Beltrame and Roy Ronen.
- Enable specification of input PED files into template creation, extracting
family, gender and phenotype information. Any sample rows from PED files get
used when creating the GEMINI database.
- Fix preparation of multi-allelic inputs to GEMINI by implementing custom merge
of bi-allelic and split multi-allelic. Previous implementation using GATK
CombineVariants re-merged some split multi-allelic, losing effects annotations.
- Skip contig order naming checking with bedtools 2.23.0+ to avoid potential
issues with complex naming schemes.
- Installation and upgrade: Set pip SSL certificates to point at installed conda
SSL package if present. Avoids SSL errors when pip can't find system
certificates. Thanks to Andrew Oler.
- Enable support for PBSPro schedulers through ipython-cluster-helper.

0.8.6

- Calculate high depth regions with more than 20x median coverage as targets for
filtering in structural variants. Attempts to detect and avoid spurious calls
in repetitive regions.
- Support snpEff 4.1, including re-download of snpEff databases on demand if out
of sync with older versions.
- Split multi-allelic variants into bi-allelic calls prior to loading into
GEMINI, since it only handles bi-allelic inputs. Thanks to Pär Larsson.
- Pass ploidy to GATK HaplotypeCaller, supporting multiple ploidies and correct
calling of X/Y/MT chromosomes. Requires GATK 3.3.
- Remove extra 'none' sample when calling tumor-only samples using
MuTect. Harmonizes headers with other tumor-only callers and enables
tumor-only ensemble calling. Thanks to Miika Ahdesmaki.
- Perform variant prioritization as part of tumor-only calling, using population
based frequencies like 1000 genomes and ExAC and presence in known disease
causing databases like COSMIC and Clinvar.
- Switch to samtools sort from sambamba sort during alignment streaming. Saves
processing and conversion steps for single sample inputs without deduplication.
- On AWS, download inputs from S3 instead of streaming into fastq preparation to
avoid issues with converting BAM to fasta. Thanks to Roy Ronen.
- Provide better defaults for mincores that packs together multiple single IPython
processes on a single cluster request -- use core specification from input
configuration. Thanks to Miika Ahdesmaki.
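
The high depth filter in this release (regions with more than 20x median coverage) can be illustrated with a short Python sketch. It is an assumption-laden example rather than bcbio's implementation: the function name `high_depth_regions` is hypothetical, and it assumes coverage has already been summarized as one depth value per window, with the median taken across those windows.

```python
from statistics import median


def high_depth_regions(window_depths, multiplier=20):
    """Return (chrom, start, end) for windows far above median depth.

    window_depths is a list of (chrom, start, end, depth) tuples.
    Hypothetical helper; the per-window input layout is an assumption.
    """
    if not window_depths:
        return []
    cutoff = multiplier * median(depth for _, _, _, depth in window_depths)
    return [(chrom, start, end)
            for chrom, start, end, depth in window_depths
            if depth > cutoff]
```

The resulting regions could then be used as exclusion targets when filtering structural variant calls in repetitive sequence.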
