Changelogs » Biobakery-workflows



* Added check in visualization to throw error if feature only has one type.
  * Increase metaphlan2 and strainphlan task times for larger grid runs.
  * Added option for custom databases for 16s usearch/vsearch methods.


* Added WDL workflow.
  * Increased default strainphlan slurm memory request to allow for larger runs.
  * Change formatting of panphlan task command to allow for grid runs.


* Allow for single QC database (fix ratio table in wmgx vis).
  * Add target for strainphlan best tree if created with generic clade name.


* Fix error when running wmgx_vis with metadata for ec heatmaps.


* Added heatmaps for ecs to wmgx vis.
  * Changed functional profiling outputs to optional in wmgx vis.
  * In demultiplex script make the index file optional.
  * Add script to pull out reads mapping to metaphlan2 species by marker.
  * Add two utility scripts for renaming tables and fastq files to sample ids.
  * Add script to update anadama2 database with new files.
  * Add workflow burst script.


* Added isolate workflow.
  * Increase kneaddata tasks time/memory to allow for new kneaddata feature which reorders sequences.
  * Add option to use bz2 input files for wmgx workflows.
  * Update dada2 ASV taxonomy table to sync ASV ids.


* Remove the taxonomy from the picrust2 input file to resolve the int/str error.


* Add PICRUSt v2 option (now also included as a task for the dada2 method).
  * Update 16s workflow for python3 compatibility.
  * Modify strainphlan task to use new folder naming convention (to work with latest metaphlan2/strainphlan packages).


* Change kneaddata tasks to work with the latest version so as not to overwrite the final pairs output file.
  * New DADA2 workflow options added: minoverlap and maxmismatch.
  * Require a minimum length for cutadapt to prevent zero-length reads from being passed to dada2 tasks, which would cause an error.
  * New option to allow wmgx workflow to just run panphlan.
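
The minimum-length requirement for cutadapt noted above guards against empty reads reaching later tasks. As a rough illustration only (this is not the workflow's code, and cutadapt applies its own minimum-length filter), the idea can be sketched in Python. The function name `filter_short_reads` and the tuple layout of a read are assumptions made for this example.

```python
def filter_short_reads(records, min_length=1):
    """Yield only reads whose sequence has at least `min_length` bases.

    `records` is an iterable of (header, sequence, quality) tuples; this
    layout is a hypothetical stand-in for parsed FASTQ entries.
    """
    for header, sequence, quality in records:
        # A read trimmed down to nothing has an empty sequence and is dropped.
        if len(sequence) >= min_length:
            yield header, sequence, quality


reads = [
    ("@read1", "ACGT", "IIII"),
    ("@read2", "", ""),  # fully trimmed read, zero length
    ("@read3", "AC", "II"),
]
kept = list(filter_short_reads(reads, min_length=1))
print([header for header, _, _ in kept])
```

Raising `min_length` in this sketch drops progressively more reads, which is the same trade-off a stricter length cutoff has in practice.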


* Default kneaddata (QC) tasks no longer use the rRNA database for filtering.
  * Default kneaddata tasks now run with trf to filter repeats.
  * For database install, the metaphlan2 folder name has been updated to work with the latest metaphlan2 version.
  * Users can now run without any filtering databases for kneaddata tasks.
  * Users can now pass custom arguments to humann2 tasks.


* Add option to provide a list of strains to run (instead of default of top 25 by average abundance).
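
The selection rule described above (an explicit list of strains if one is given, otherwise the top 25 by average abundance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow's actual implementation: the function names, the dictionary input layout, and the alphabetical tie-breaking rule are all hypothetical.

```python
from statistics import mean


def top_species_by_average_abundance(abundance, limit=25):
    """Return up to `limit` species ranked by mean abundance across samples.

    `abundance` maps a species name to a list of per-sample abundances
    (an assumed layout for this example).
    """
    averages = {
        species: mean(values)
        for species, values in abundance.items()
        if values  # skip species with no measurements
    }
    # Sort by descending average; ties are broken alphabetically (an assumption).
    ranked = sorted(averages, key=lambda species: (-averages[species], species))
    return ranked[:limit]


def select_strains(abundance, user_list=None, limit=25):
    """Prefer an explicit user-supplied list; otherwise use the top `limit`."""
    if user_list:
        return list(user_list)
    return top_species_by_average_abundance(abundance, limit)


example = {
    "s__Bacteroides_vulgatus": [10.0, 20.0, 30.0],
    "s__Escherichia_coli": [1.0, 2.0, 3.0],
    "s__Faecalibacterium_prausnitzii": [40.0, 0.0, 5.0],
}
print(select_strains(example, limit=2))
print(select_strains(example, user_list=["s__Escherichia_coli"]))
```

Ranking by the mean rather than the maximum favors species that are consistently present across samples over those abundant in only one sample.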


* Add panphlan optional tasks to wmgx workflow.
  * Add ITS option to 16s workflow.


* Add colorbar for continuous data to PCoA plots.


* In 16s usearch/vsearch methods combine truncate and filter (needed for data sets that require more filtering).
  * Add option to bypass msa generation in 16s workflows.


* Add vsearch as 16s workflow option (now the default).
  * Add assembly as option to wmgx workflow.
  * Add fasttree task to 16s workflow.
  * Add dual indexing option to 16s and wmgx workflows.


* Add DADA2 16s workflow option.
  * Add option to bypass taxonomic profiling for the wmgx workflow.
  * Strainphlan option now selects top species by average abundance for profiling.
  * Add genera visualizations to both 16s and shotgun workflows.
  * Add average grouped metadata plots for relative abundance.


* Add metadata input option to wmgx and 16s visualization workflows.
  * Add multiple sequence alignment task for closed reference sequences to 16s workflow.
  * Include 16s data products in 16s report archive.
  * Update 16s visualization workflow to pull variables from data processing workflow log to write report introduction.
  * Add optional picard input files to 16s visualization workflow.
  * Add options to bypass quality control and functional profiling to wmgx workflow.
  * Improve error message in database install script printed when required dependencies are not found.
  * Add counter to track PyPI download stats.
  * Update dependencies to require anadama2 v0.4.0.


* Change the default size of heatmaps in reports based on format. Increase the size in pdf and decrease in html.


* Added utility to automatically install databases for each data processing workflow.
  * Reorganized shotgun output file locations so the products are stored with respect to the software used to generate them.


* Shotgun task names updated to include sample names.
  * Strain profiling was added to the shotgun data processing workflows.
  * Executables are tracked for tasks in the data processing workflows.
  * Tutorial data files were added to the examples folder.
  * Eestats table added to 16s report.
  * Discordant alignments are now allowed in qc shotgun tasks (kneaddata v0.6.1+ now required).
  * Option to remove intermediate output added to shotgun data processing workflows to reduce output size.
  * Table of contents added to reports.
  * Added workflow information (i.e., commands, software versions) to reports.
  * Reports updated to allow for large numbers of samples.
  * Updated qc tables to show serial filtering in the report.
  * Added rna/dna norm to data processing and visualization workflows.
  * Refactored visualization input to use output of data processing to allow for more input files.
  * Rna/dna renorm script refactored to increase speed (minutes vs days).
  * Time/memory equations added for compute intense tasks.
  * Reorganizing and polishing of reports.
  * Allow for different template formats and report formats.
  * Allow for custom contaminant databases in the shotgun reports.
  * Added archive generation to visualization workflows.
  * Allow for compressed 16s paired end files as input.
  * Allow for fasta shotgun files as input.
  * Initial 16s workflow added.
  * Initial visualization workflows added.


* Initial shotgun data processing workflows added.