Qiime

Latest version: v1.9.1

1.9.1
===========

Bug fixes
---------

* **Critical**: Updated minimum required version of the [qiime-default-reference](http://github.com/biocore/qiime-default-reference) package to 0.1.2. **This release includes an important bug fix described in more detail in [this QIIME blog post](https://qiime.wordpress.com/2015/04/15/qiime-1-9-0-bug-affecting-pynast-alignment-of-16s-amplicons-generated-with-non-515f806r-primers/) and in [biocore/qiime-default-reference#14](https://github.com/biocore/qiime-default-reference/issues/14).**
* **Critical**: Fixed bug in ``differential_abundance.py`` fitZIG algorithm ([1960](https://github.com/biocore/qiime/pull/1960)). **This was a serious bug that was encountered when users would call ``differential_abundance.py -a metagenomeSeq_fitZIG``. Any results previously generated with that command should be re-run.**
* **Critical**: Fixed bug in ``observation_metadata_correlation.py``, described in [2009](https://github.com/biocore/qiime/issues/2009). **All previous output generated with ``observation_metadata_correlation.py`` was incorrect, and analyses using those results should be re-run.** This most commonly would have resulted in massive Type 2 error (false negatives), where observations whose abundance is correlated with metadata are not reported, though Type 1 errors (false positives) are also possible.
* ``count_seqs.py`` no longer fails on empty files. [1991](https://github.com/biocore/qiime/issues/1991)
* Updated minimum required version of [biom-format](http://github.com/biocore/biom-format) package to 2.1.4. This is a bug fix release. Details are available in the [biom-format ChangeLog](https://github.com/biocore/biom-format/blob/master/ChangeLog.md).
* Updated minimum required version of [Emperor](http://github.com/biocore/emperor) package to 0.9.51.
* Forced BIOM table type to "OTU table" for all tables written with QIIME. This fixes [1928](https://github.com/biocore/qiime/issues/1928).
* The ``--similarity`` option in ``pick_otus.py`` now only accepts sequence similarity thresholds between 0.0 and 1.0 (inclusive). Previous behavior would allow values outside this range, which would cause uninformative error messages to be raised by the external tools that ``pick_otus.py`` wraps ([1979](https://github.com/biocore/qiime/issues/1979)).
* ``split_libraries_fastq.py`` now explicitly disallows ``-p 0``. This could lead to empty sequences being written to the resulting output file ([1984](https://github.com/biocore/qiime/issues/1984)).
* Fixed issue where ``filter_samples_from_otu_table.py`` could only filter the mapping file when ``--valid_states`` was passed as the filtering method ([2003](https://github.com/biocore/qiime/issues/2003)).
* Fixed bug where distance matrix files generated by QIIME (e.g., using ``beta_diversity.py``) could have diagonals with values that were close to zero in rare cases (depending on input data, machine architecture, installed dependencies, etc.). These files could not be loaded by QIIME scripts that accepted distance matrix files as input (e.g., ``principal_coordinates.py``) and would result in an error message stating that the distance matrix was not hollow. Values on the diagonal that are close to zero are now set to 0.0 ([1933](https://github.com/biocore/qiime/issues/1933)).
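
The diagonal fix in the last item above can be illustrated with a minimal Python sketch. The function names and the tolerance `tol` are assumptions made for illustration; the changelog does not state how close to zero a value must be, nor how QIIME implements the check.

```python
def zero_near_zero_diagonal(matrix, tol=1e-9):
    """Return a copy of a square distance matrix whose diagonal is exactly 0.0.

    `tol` is an assumed tolerance (not taken from QIIME). A diagonal value
    farther from zero than `tol` is treated as a real error.
    """
    cleaned = [list(row) for row in matrix]
    for i, row in enumerate(cleaned):
        if len(row) != len(cleaned):
            raise ValueError("distance matrix must be square")
        if abs(row[i]) > tol:
            raise ValueError(
                "diagonal entry %d is not close to zero: %r" % (i, row[i]))
        row[i] = 0.0  # snap floating-point noise to an exact zero
    return cleaned


def is_hollow(matrix):
    """A matrix is hollow when every diagonal entry is exactly zero."""
    return all(row[i] == 0.0 for i, row in enumerate(matrix))


# Floating-point noise on the diagonal, as might arise on some machines.
dm = [[1e-17, 0.25],
      [0.25, -3e-16]]
fixed = zero_near_zero_diagonal(dm)
```

Here `is_hollow(dm)` is false while `is_hollow(fixed)` is true, which mirrors why a strict hollowness check rejected such files before the fix.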

Usability enhancements
----------------------

* Removed parallel PyNAST ``formatdb`` step ([1989](https://github.com/biocore/qiime/issues/1989)). The formatted database wasn't actually being used; this step was just left over from when BLAST was required by PyNAST.
* ``count_seqs.py`` can now count records in fastq files that have the ``.fq`` extension. This previously was only possible for fastq files that have the ``.fastq`` extension.
* If ``temp_dir`` is not defined in the QIIME config file, QIIME will use the system's default temporary directory instead of assuming that ``/tmp`` is present and writeable. Note that the location of this default temporary directory [can be changed with environment variables](https://docs.python.org/2/library/tempfile.html#tempfile.tempdir) ([1995](https://github.com/biocore/qiime/issues/1995)).
* Improve error reporting from ``filter_taxa_from_otu_table.py``, ``filter_otus_from_otu_table.py``, and ``filter_samples_from_otu_table.py`` when all OTUs/samples are filtered out resulting in an empty table ([1963](https://github.com/biocore/qiime/issues/1963)), and generally when attempting to write an empty BIOM table from QIIME.
* Added ability to pass user-defined runtime limit for jobs to ``start_parallel_jobs_slurm.py``. This can be achieved by setting the ``slurm_time`` variable in ``qiime_config``, or by passing ``--time`` to ``start_parallel_jobs_slurm.py``.
* Distance matrices and UPGMA trees generated from the full (unrarefied) OTU table are now stored under ``unrarefied_bdiv`` in the output directory from ``jackknifed_beta_diversity.py``. That UPGMA tree is optionally used (if the user passes ``--master_tree full``). This change makes their content more explicit so they're less likely to be used by accident ([2024](https://github.com/biocore/qiime/issues/2024)).
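
The ``temp_dir`` fallback described above can be sketched in a few lines of Python. This is an illustration only: `resolve_temp_dir` is a hypothetical helper and the config is modeled as a plain dictionary, which is an assumption about shape rather than QIIME's actual config object.

```python
import tempfile


def resolve_temp_dir(qiime_config):
    """Return the configured temp_dir, or the system default when unset.

    `qiime_config` is assumed to be a dict-like mapping of config keys to
    values; a missing, None, or empty temp_dir triggers the fallback.
    """
    configured = qiime_config.get("temp_dir")
    if configured:
        return configured
    # tempfile.gettempdir() checks the TMPDIR, TEMP and TMP environment
    # variables before falling back to platform-specific locations.
    return tempfile.gettempdir()
```

Calling `resolve_temp_dir({})` returns whatever the standard library considers the default temporary directory, so exporting `TMPDIR` before starting Python changes the result without any edit to the config file.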

1.9.1dev
=============================================

Bug fixes
---------

* **Critical**: Fix incorrect list of taxa in ``compute_taxonomy_ratios.py``. **This was a serious bug that was encountered when users would call ``compute_taxonomy_ratios.py`` using the MD-index; custom ratios did not suffer from this bug. Any computations of the MD-index previously generated with that command should be re-run.**
* Add ``--read_arguments_from_file`` to ``split_libraries_fastq.py``, thus preventing ``multiple_split_libraries_fastq.py`` from failing with an `Argument list too long` error when the number of input files is large (see [2069](https://github.com/biocore/qiime/issues/2069)).
* Fixed bug in ``start_parallel_jobs_slurm.py``, which would cause jobs to not run if ``slurm_memory`` was specified in ``qiime_config``.

1.9.0
===========

New scripts
-----------

* ``observation_metadata_correlation.py``: Allows the calculation of correlations between feature abundances and continuous-valued metadata. This script replaces the continuous-valued correlation functionality that was in ``otu_category_significance.py`` in QIIME 1.7.0 and earlier.
* ``compare_trajectories.py``: Allows analysis of volatility using different algorithms.
* ``compute_taxonomy_ratios.py``: Implements the microbial dysbiosis index (MD-index) from [Gevers et al 2014](http://www.ncbi.nlm.nih.gov/pubmed/24629344).
* ``collapse_samples.py``: Allows collapsing groups of samples in BIOM tables and mapping files based on their metadata (see [1678](https://github.com/biocore/qiime/issues/1678)). This can be used, for example, to collapse samples belonging to a replicate group. This also has replaced ``summarize_otu_by_cat.py`` (see discussion on [1798](https://github.com/biocore/qiime/issues/1798)).
* ``multiple_split_libraries_fastq.py``, ``multiple_join_paired_ends.py``, and ``multiple_extract_barcodes.py``: Facilitate initial QIIME processing of already-demultiplexed fastq files, as these are commonly being provided by sequencing centers.
* ``differential_abundance.py``: Supplements ``group_significance.py`` to support metagenomeSeq's fitZIG algorithm and DESeq2's negative binomial algorithm. The input for this is an unnormalized, raw BIOM table.
* ``normalize_table.py``: Adds support for BIOM table normalization algorithms in addition to rarefaction. Supported methods are metagenomeSeq's CSS and DESeq's variance stabilizing transformation.
* ``start_parallel_jobs_slurm.py``: Allows for parallel job submission using [slurm](https://computing.llnl.gov/linux/slurm/).
* ``split_libraries_lea_seq.py``: Allows for demultiplexing of sequences using the LEA-Seq protocol, described in [Faith et al. (2013)](http://www.sciencemag.org/content/341/6141/1237439). This script should be considered to be in **beta testing status**.
* ``extract_reads_from_interleaved_file.py``: Splits an interleaved FASTQ file (like the ones produced by JGI) into forward and reverse reads. See [this section](http://qiime.org/tutorials/processing_illumina_data.html#processing-joint-genome-institute-fastq-files) of the Illumina data preparation tutorial for more details.
* ``parallel_pick_otus_sortmerna.py``: Perform parallel OTU picking with SortMeRNA ([Kopylova et al. (2012)](http://www.ncbi.nlm.nih.gov/pubmed/23071270)).

Features
--------

* ``split_otu_table.py`` now allows multiple fields to be passed to split a biom table, and optionally a mapping file. Check out the new documentation for the naming conventions (which have changed slightly) and an example.
* Added new options to ``make_otu_heatmap.py``:
  * ``--color_scheme``, which allows users to choose from different color schemes [here](http://matplotlib.org/examples/color/colormaps_reference.html)
  * ``--observation_metadata_category``, which allows users to select a column other than taxonomy to use when labeling the rows
  * ``--observation_metadata_level``, which allows the user to specify which level in the hierarchical metadata category to use in creating the row labels.
  * ``-g``/``--imagetype``, ``--dpi``, ``--width``, and ``--height``, which offer more control over the generation of heatmap figures.
* ``-m/--mapping_fps`` is no longer required for ``split_libraries_fastq.py``. The mapping file is not needed when running with ``--barcode_type 'not-barcoded'``, but previously it was still required and would fail to validate when passing multiple sequence files and sample ids together with a mapping file without barcodes (see [1400](https://github.com/biocore/qiime/issues/1400)).
* Added alphabetical sorting option (based on boxplot labels) to ``make_distance_boxplots.py``. Sorting by boxplot median can now be performed by passing ``--sort median`` (this was previously invoked by passing ``--sort``). Sorting alphabetically can be performed by passing ``--sort alphabetical``.
* Scripts that write an OTU table will now write BIOM files in HDF5 format if HDF5 is installed. This improves performance for very large OTU tables.
* ``merge_mapping_files.py`` can now take an argument to convert the header names to upper case, so it will merge for example a category named `treatment` and another one named `TREATMENT` from two different mapping files.
* The script ``make_distance_histograms.py`` has been removed. This functionality should be accessed through ``make_distance_boxplots.py``.
* Beta support has been added for performing OTU picking with open source software:
  * subsampled open reference OTU picking using SortMeRNA ([Kopylova et al. (2012)](http://www.ncbi.nlm.nih.gov/pubmed/23071270)) (for the closed-reference steps) and [SumaClust](http://metabarcoding.org/sumatra) (for the open reference steps). This can be accessed with ``pick_open_reference_otus.py -m sortmerna_sumaclust``.
  * closed-reference OTU picking using SortMeRNA ([Kopylova et al. (2012)](http://www.ncbi.nlm.nih.gov/pubmed/23071270)). This can be accessed with ``pick_closed_reference_otus.py -p params.txt`` where params.txt includes the line ``pick_otus:otu_picking_method sortmerna``.
  * de novo OTU picking using [SumaClust](http://metabarcoding.org/sumatra) or swarm ([Mahe et al. (2014)](https://peerj.com/articles/593/)). This can be accessed with ``pick_de_novo_otus.py -p params.txt`` where params.txt includes the line ``pick_otus:otu_picking_method sumaclust`` or ``pick_otus:otu_picking_method swarm``.
* sumaclust v1.0.00, swarm 1.2.19, and sortmerna 2.0 are now optional dependencies (see the [QIIME install docs](http://qiime.org/install/install.html) for details).
* Renamed ``split_fasta_on_sample_ids_to_files.py`` to ``split_sequence_file_on_sample_ids.py``, which now supports splitting FASTQ files, as well. Added a parameter, ``--file_type``, which is used to specify the type of the input file.
* Added ``--assign_taxonomy`` option to ``pick_closed_reference_otus.py`` to allow taxonomy assignment using a classifier, rather than the default of using the taxonomic assignment of the cluster centroid.
* Added ``--suppress_taxonomy_assignment`` option to ``pick_closed_reference_otus.py``.
* Updated output of ``identify_paired_differences.py`` to include more information in the pseudo-mapping file that it generates. This includes the "pre" and "post" values for all of the analysis categories on a per-subject basis. This is useful for plotting with other tools, or for generating legends for the plots that are currently generated by the script (see [issue 1707](https://github.com/biocore/qiime/issues/1707)).
* Added ``pick_otus_reference_seqs_fp`` to the QIIME config file. This is a filepath to reference sequences to use with QIIME's OTU picking scripts/workflows. See the [QIIME config docs](http://qiime.org/install/qiime_config.html) and [1696](https://github.com/biocore/qiime/issues/1696) for more details.
* The QIIME config settings ``assign_taxonomy_id_to_taxonomy_fp``, ``assign_taxonomy_reference_seqs_fp``, ``pick_otus_reference_seqs_fp``, and ``pynast_template_alignment_fp`` now default to reference data files in the [qiime-default-reference project](http://github.com/biocore/qiime-default-reference).
* Installing QIIME via ``pip install qiime`` now works out-of-the-box by providing a functioning QIIME minimal (base) install (see [1696](https://github.com/biocore/qiime/issues/1696)).
* ``cluster_jobs_fp`` in the QIIME config file now defaults to ``start_parallel_jobs.py``. ``seconds_to_sleep`` now defaults to 1.
* Added ``--negate_sample_id_fp`` option to ``filter_samples_from_otu_table.py`` (see [1117](https://github.com/biocore/qiime/issues/1117)).
* Added ``--percent_variation_below_one`` flag to ``make_2d_plots.py`` for when the percent variation is actually below 1 and not a relative measure.
* The default confidence threshold for the Naive Bayes taxonomy assigners (RDP Classifier and mothur) is now ``0.50``, as [recommended by the RDP Classifier developers](https://rdp.cme.msu.edu/classifier/class_help.jsp#conf) for partial sequences.
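
The two ``--sort`` modes added to ``make_distance_boxplots.py`` above differ only in the sort key. The following Python sketch illustrates the idea; `sort_boxplots` and the example labels are hypothetical and are not taken from QIIME's code.

```python
from statistics import median


def sort_boxplots(labels, distributions, sort_by=None):
    """Order (label, distribution) pairs for plotting.

    sort_by may be None (keep input order), 'median' (ascending median of
    each distribution) or 'alphabetical' (by boxplot label).
    """
    pairs = list(zip(labels, distributions))
    if sort_by is None:
        return pairs
    if sort_by == "median":
        return sorted(pairs, key=lambda pair: median(pair[1]))
    if sort_by == "alphabetical":
        return sorted(pairs, key=lambda pair: pair[0])
    raise ValueError("unknown sort option: %r" % (sort_by,))


# Made-up labels and distances, for illustration only.
labels = ["Gut vs. Skin", "All within", "Skin vs. Skin"]
dists = [[0.8, 0.9, 0.7], [0.3, 0.5, 0.4], [0.2, 0.1, 0.3]]

by_median = [label for label, _ in sort_boxplots(labels, dists, "median")]
by_label = [label for label, _ in sort_boxplots(labels, dists, "alphabetical")]
```

Whether QIIME sorts medians in ascending or descending order is not stated in these notes; ascending order is assumed here.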

Usability enhancements
----------------------

* Simplified and improved QIIME install documentation.
* Errors raised by scripts are easier to read and include a supplementary message on how to get help (see [1794](https://github.com/biocore/qiime/issues/1794)).
* QIIME is now easier to install! Removed ``qiime_scripts_dir``, ``python_exe_fp``, ``working_dir``, ``cloud_environment``, and ``template_alignment_lanemask_fp`` from the QIIME config file. If these values are present in your QIIME config file, they will be flagged as unrecognized by ``print_qiime_config.py -t`` and will be ignored by QIIME. QIIME will now use the ``python`` executable and QIIME scripts that are found in your ``PATH`` environment variable, and ``temp_dir`` will be used in place of ``working_dir`` (this value was used by some parts of parallel QIIME previously). ``filter_alignment.py`` will now use the 16S alignment Lane mask (Lane, D.J. 1991) by default if one is not provided via ``--lane_mask_fp``.
* ``--tail_type`` option in ``compare_distance_matrices.py`` now accepts "two-sided" instead of "two sided" for specifying a two-sided alternative hypothesis. The new name is easier to specify via the command-line (quotes aren't needed because it is a single word).
* ``print_qiime_config.py -t`` now tests a QIIME minimal (base) install instead of a QIIME full install. ``print_qiime_config.py -tf`` tests a QIIME full install.
* Standardized use of underscores in option longnames. Affected scripts and options:
  * ``scripts/demultiplex_fasta.py``
    * `start-numbering-at` is now `start_numbering_at`
  * ``scripts/denoiser.py``
    * `low_cut-off` is now `low_cut_off`
    * `high_cut-off` is now `high_cut_off`
  * ``scripts/multiple_rarefactions.py``
    * `num-reps` is now `num_reps`
  * ``scripts/multiple_rarefactions_even_depth.py``
    * `num-reps` is now `num_reps`
  * ``scripts/parallel_multiple_rarefactions.py``
    * `num-reps` is now `num_reps`
  * ``scripts/plot_rank_abundance_graph.py``
    * `no-legend` is now `no_legend`
  * ``scripts/split_libraries.py``
    * `min-seq-length` is now `min_seq_length`
    * `max-seq-length` is now `max_seq_length`
    * `trim-seq-length` is now `trim_seq_length`
    * `min-qual-score` is now `min_qual_score`
    * `keep-primer` is now `keep_primer`
    * `keep-barcode` is now `keep_barcode`
    * `max-ambig` is now `max_ambig`
    * `max-homopolymer` is now `max_homopolymer`
    * `max-primer-mismatch` is now `max_primer_mismatch`
    * `barcode-type` is now `barcode_type`
    * `dir-prefix` is now `dir_prefix`
    * `max-barcode-errors` is now `max_barcode_errors`
    * `start-numbering-at` is now `start_numbering_at`
* Removed ``--output_dir`` optional option from ``make_otu_heatmap.py`` and replaced it with the required option ``--output_fp``.
* The parameters ``--uclust_min_consensus_fraction`` and ``--uclust_similarity`` in ``*_assign_taxonomy_*`` scripts have been changed to ``--min_consensus_fraction`` and ``--similarity`` since both of these parameters apply to the SortMeRNA taxon assigner as well.
* Several changes were made to ``alpha_diversity.py`` metric names:
  * ``ACE`` is now ``ace``
  * ``chao1_confidence`` is now ``chao1_ci``
  * Added ``observed_otus``, which is equivalent to ``observed_species`` but is generally a more accurate name. ``observed_species`` is retained for backward-compatibility.
* SortMeRNA 2.0, SUMACLUST 1.0.00, and swarm 1.2.19 are now installed automatically when QIIME is installed (e.g., via `pip install qiime`).
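
Because the option renames listed above all follow one rule (hyphens in the long name become underscores), older command lines can be migrated mechanically. The helpers below are a hypothetical sketch for updating saved commands; they are not part of QIIME.

```python
def standardize_longname(option):
    """Replace hyphens with underscores in an option name.

    Leading dashes are preserved, so '--num-reps' becomes '--num_reps'
    and a bare name such as 'low_cut-off' becomes 'low_cut_off'.
    """
    stripped = option.lstrip("-")
    prefix = option[:len(option) - len(stripped)]
    return prefix + stripped.replace("-", "_")


def migrate_args(argv):
    """Rewrite only long options ('--name' or '--name=value') in argv.

    Short options and option values are left untouched, so a value such
    as 'out-dir' keeps its hyphen.
    """
    migrated = []
    for token in argv:
        if token.startswith("--"):
            name, sep, value = token.partition("=")
            token = standardize_longname(name) + sep + value
        migrated.append(token)
    return migrated


old = ["--num-reps", "10", "--dir-prefix=run-1", "-o", "out-dir"]
new = migrate_args(old)
```

After this runs, `new` holds `--num_reps` and `--dir_prefix=run-1`, while `-o` and `out-dir` are unchanged.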

Bug fixes
---------

* Relaxed sanity tests for ``compare_categories.py --method adonis`` so that unique values are only checked for categories that are non-numeric (see [issue 1316](https://github.com/biocore/qiime/issues/1360)).
* ``core_diversity_analyses.py`` now requires ``--tree_fp`` unless ``--nonphylogenetic_diversity`` is passed (see [1671](https://github.com/biocore/qiime/issues/1671)).
* Fixed bug in ``assign_taxonomy.py -m blast`` and ``parallel_assign_taxonomy_blast.py`` that prevented multiple instances of either to run at the same time (see [1768](https://github.com/biocore/qiime/issues/1768)).
* Fixed bug where ``--phred_offset`` in ``split_libraries_fastq.py`` was ignored (see [1656](https://github.com/biocore/qiime/issues/1656)).
* Spaces in taxa will not cause an error when using ``--assignment_method=mothur`` in ``assign_taxonomy.py``.
* Fixed bug where long axis labels were cut off in heatmaps generated by ``make_otu_heatmap.py`` (see [1571](https://github.com/biocore/qiime/issues/1571)).
* Fixed bug where ``-S``/``--suppress_submit_jobs`` was being ignored by several of the parallel scripts (e.g. ``parallel_pick_otus_uclust_ref.py``) (see [1665](https://github.com/biocore/qiime/issues/1665)).
* Fixed bug where ``make_distance_comparison_plots.py`` would create empty groups (see [1627](https://github.com/biocore/qiime/issues/1627)).
* ``qiime/workflow/pick_open_reference_otus.py`` no longer copies the permission bits of the reference file which caused a file permission failure in some cases.
* Fixed bug in ``make_rarefaction_plots.py`` where ``--generate_per_sample_plots`` wasn't working (see [1475](https://github.com/biocore/qiime/issues/1475)).
* Fixed bug that resulted in samples being mislabeled in ``make_otu_heatmap.py`` when one of the following options was passed: ``--category``, ``--map_fname``, ``--sample_tree``, or ``--suppress_column_clustering``. This is discussed in [1790](https://github.com/biocore/qiime/issues/1790).

Removal of outdated and unsupported functionality
-------------------------------------------------

* Removed ``-Y``/``--python_exe_fp`` and ``-N`` options from ``parallel_merge_otu_tables.py`` script as these are not available in any of the other parallel QIIME scripts and we do not have good reason to support them (see QIIME 1.6.0 release notes below for more details).
* Removed ``insert_seqs_into_tree.py``. This code needs additional testing and documentation, and was not widely used. We plan to add this support back in the future, and progress on that can be followed on [1499](https://github.com/biocore/qiime/issues/1499).
* ``summarize_otu_by_cat.py`` has been replaced with ``collapse_samples.py``.
* Removed options ``-c``/``--ci_type``, ``-a``/``--alpha``, and ``-f``/``--f_ratio`` from ``conditional_uncovered_probability.py`` as these weren't being used by the script (i.e., supplying different values didn't change the computed CIs because the default were always used).
* Removed ``tax2tree`` as a method in ``assign_taxonomy.py``.
* Fasttree v1.x is no longer supported by ``make_phylogeny.py`` (see [issue 1516](https://github.com/biocore/qiime/issues/1516)).
* Removed ``submit_to_mgrast.py`` script (see [1780](https://github.com/biocore/qiime/issues/1780)).
* Removed ``make_otu_heatmap_html.py`` in favor of ``make_otu_heatmap.py`` (see discussion on [1724](https://github.com/biocore/qiime/issues/1724)).
* Removed ``-m``/``--include_html_counts`` option from the ``plot_taxa_summary.py`` script as the behavior was no longer useful or accurate.

Performance enhancements
------------------------

* Changed default parameters for uclust-based OTU picking: ``max_accepts`` is now 1 (was 20), ``max_rejects`` is now 8 (was 500), ``stepwords`` is now 8 (was 20), and ``word_length`` is now 8 (was 12). These changes greatly reduce runtime, with minimal effect on the results. See Rideout et al., 2014 ([PeerJ pre-print](https://peerj.com/preprints/411/)) for more details.
* Disabled the prefilter by default in ``pick_open_reference_otus.py``. This change greatly reduces runtime, with minimal effect on the results. See Rideout et al., 2014 ([PeerJ pre-print](https://peerj.com/preprints/411/)) for more details.
* The alpha diversity measures available in QIIME (e.g., ``alpha_diversity.py``) are now powered by [scikit-bio](http://scikit-bio.org/), and several of these methods are now considerably faster! See the scikit-bio docs on [alpha diversity](http://scikit-bio.org/docs/latest/generated/skbio.diversity.alpha.html) for more details on the methods.
* ANOSIM and PERMANOVA (available in ``compare_categories.py``) are now powered by [scikit-bio](http://scikit-bio.org/) and are approximately 1000 times faster than previous implementations. These additionally now provide more useful information in the output file. See the scikit-bio docs for [ANOSIM](http://scikit-bio.org/docs/latest/generated/generated/skbio.stats.distance.anosim.html) and [PERMANOVA](http://scikit-bio.org/docs/latest/generated/generated/skbio.stats.distance.permanova.html) for more detail.
* Renamed ``compare_categories.py``'s BEST method to BIO-ENV to match the name used in R's vegan package (``vegan::bioenv``) and the name of the program in the original paper. Use ``compare_categories.py --method bioenv`` instead of ``compare_categories.py --method best``. The underlying implementation has also been rewritten and is considerably faster than before, and the output more closely matches the vegan package, as environmental variables are now scaled before computing Euclidean distances. See the scikit-bio docs for [BIO-ENV](http://scikit-bio.org/docs/latest/generated/generated/skbio.stats.distance.bioenv.html) for more detail.
* The Mantel test (``--method mantel``) and Mantel correlogram (``--method mantel_corr``) in ``compare_distance_matrices.py`` are considerably faster than previous implementations. See the scikit-bio docs for [Mantel](http://scikit-bio.org/docs/latest/generated/generated/skbio.stats.distance.mantel.html) for more detail.

1.8.0
=========================
* New script, extract_barcodes.py, and associated tutorial added to support alternative illumina barcoding schemes.
* Added script join_paired_ends.py, which supports joining of overlapping paired-end reads in fastq files. This wraps fastq-join and SeqPrep.
* extract_barcodes.py script added. This script is intended to help process fastq data that is not in a format compatible with split_libraries_fastq.py.
* otu_category_significance.py has been removed in favor of a new script called ``group_significance.py`` which has significantly more functionality.
* map_reads_to_reference.py has a new parameter, ``--genetic_code``, which can be used to specify which genetic code should be used when doing translated searches (from nucleotide sequences against a protein database). Genetic codes are specified numerically, corresponding to the genetic codes detailed on the [NCBI page here](http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)
* core_diversity_analyses.py has a new parameter, ``--recover_from_failure``, that allows the user to re-run on an existing output directory and will only re-run analyses that haven't already been run. This also allows the user to add categories to a previous run, which is very common and previously required a full re-run.
* Added new script, ``estimate_observation_richness.py``, which implements some of the interpolation and extrapolation richness estimators in Colwell et al. (2012), Journal of Plant Ecology. IMPORTANT: This script should be considered beta software; it is currently an experimental feature in QIIME.
* QIIME now depends on [qcli 0.1.0](ftp://thebeast.colorado.edu/pub/qcli-releases/qcli-0.1.0.tar.gz), a stand-alone package which performs command line interface parsing and testing.
* make_qiime_rst_file.py has been removed in favor of qcli_make_rst.
* transform_coordinate_matrices.py can now take more than two input coordinate matrices. When used this way, the first coordinate matrix will be treated as the reference, and the 2nd through nth will be compared against that reference. The output file names, which were all previously hard-coded, are now generated on the fly for clarity of the results.
* split_libraries_fastq.py can now handle per-sample, non-barcoded fastq files. Some sequencing centers are now providing data in this way - if this becomes more common, we'll want to make this more convenient, but for now it's possible.
* Added a parallel merge OTUs method that will combine OTU tables in parallel where possible.
* Added identify_paired_differences.py to support paired difference (i.e., Pre/Post) testing as discussed in issue 1040.
* Added new taxonomic assignment method, ``qiime.assign_taxonomy.UclustConsensusTaxonAssigner``. This is accessible through ``assign_taxonomy.py -m uclust``, ``parallel_assign_taxonomy_uclust.py``, ``pick_de_novo_otus.py`` and ``pick_open_reference_otus.py``. This is being tested as an alternative to QIIME's existing taxonomic assignment methods.
* Refactored beta_diversity_through_plots.py, jackknifed_beta_diversity.py, and core_diversity_analyses.py workflows to generate emperor PCoA plots instead of KiNG PCoA plots. QIIME now depends on Emperor 0.9.3. One interface change that will be noticeable to users is that the output PCoA plots from these workflows are no longer separated into "continuous" and "discrete" directories. Users can make these color choices from within emperor, so only one PCoA plot is necessary. This refactoring also involved script interface changes to beta_diversity_through_plots.py, which no longer generates 2d plots (interested users can call make_2d_plots.py directly - these won't be needed as often, since we no longer have a Java dependency) or distance histograms (these data are better accessed through make_distance_boxplots.py, which is better written and tested, though users can still call make_distance_histograms.py directly). As a result, beta_diversity_through_plots.py no longer takes the --suppress_2d_plots, --suppress_3d_plots, or --histogram_categories parameters, and now takes a new --suppress_emperor_plots parameter which can be used to disable PCoA plotting.
* Modified compare_alpha_diversity.py to generate box plots in addition to statistics, and added the ability to pass multiple categories (instead of just a single category) on the command line. Also fixed issue where options contain ``dest`` parameter, and therefore could have a different name than their longform parameter name. This involves several script interface changes: the --category option is now called --categories; script now takes --output_dir instead of --output_fp (because multiple files can be created, instead of just a single file); --alpha_diversity_filepath is now --alpha_diversity_fp; and --mapping_filepath is now --mapping_fp.
* Refactored make_rarefaction_plots.py to add options --generate_per_sample_plots and --generate_average_tables. These are now suppressed by default to reduce run time and size of output.
* Refactored alpha_rarefaction.py to add option --retain_intermediate_files. Rarefied BIOM tables and alpha diversity results for each rarefied BIOM table are now removed by default to reduce size of output.
* Update to rtax 0.984.
* Required PyNAST version is now 1.2.2.
* Updated default taxonomy assigner to be the new uclust-based consensus taxonomy assigner. This was shown to be more accurate and faster than the existing methods in Bokulich, Rideout et al. (submitted).
* Renamed check_id_map.py to validate_mapping_file.py for clarity
* Change short option names in summarize_otu_by_cat.py to be consistent with other scripts.
* Increased default rdp_max_memory from 1500M to 4000M as this was almost always needing to be increased when re-training on modern reference databases.
* Required biom-format version is now 1.3.1.
* convert_unifrac_sample_mapping_to_otu_table.py and convert_otu_table_to_unifrac_sample_mapping.py have been moved to the FastUnifrac repo (https://github.com/qiime/FastUnifrac)
* Required matplotlib version is now >= 1.1.0, <= 1.3.1.
* Required numpy version is now >= 1.5.1, <= 1.7.1.
* QIIME has been added to [PyPi](https://pypi.python.org/pypi) and can be installed using ``pip``.

1.7.0
=========================
* Required biom-format version is now 1.1.2.
* core_qiime_analyses.py has been replaced with core_diversity_analyses.py. This follows a re-factoring to support only "downstream" analyses (i.e., starting with a BIOM table). This makes the script more widely applicable as it's now general to any BIOM data and/or different OTU picking strategies.
* Added support for usearch v6.1 OTU picking and chimera checking. This is in addition to existing support for usearch v5.2.236.
* Added section on using usearch 6.1 chimera checking with ``identify_chimeric_seqs.py`` to "Chimera checking sequences with QIIME" tutorial.
* ``compare_alpha_diversity.py`` output now includes average alpha diversity values as well as the comparison p and t vals.
* ``compare_distance_matrices.py`` has a new option ``--variable_size_distance_classes`` for running Mantel correlogram over distance classes that vary in size (i.e. width) but contain the same number of pairwise distances in each class.
* ``qiime.filter.sample_ids_from_category_state_coverage`` now supports splitting on a category.
* Modified add_qiime_labels.py script to use standard metadata mapping file with a column specified for fasta file names to make more consistent with other scripts.
* otu_category_significance.py now makes better use of the BIOM Table API, addressing a performance issue when using CSMat as the sparse backend.
* Added qiime.group.get_adjacent_distances, which is useful for plotting distances between "adjacent" sample ids in a list provided by the user. This is useful, for example, in plotting distances between adjacent temporal samples in a time series.
* Fixed a bug in make_3d_plots.py related to biplot calculations. This bug would change the placement of taxonomic groups based on how many taxa were included in the biplot analysis. Examples and additional details can be found here: [677](https://github.com/qiime/qiime/issues/677).
* Major refactoring of workflow tests and organization of workflow code. The workflow library code and tests have now been split apart into separate files. This makes it a lot more manageable, which will support a more general refactoring of the workflow code in the future to make it easier to develop new workflows. The workflow tests have also been updated to use the new test data described in [582](https://github.com/qiime/qiime/issues/582), which is now accessible through ``qiime.test.get_test_data()`` and ``qiime.test.get_test_data_fps()``. This provides improved testing of boundary cases in each workflow, as well as more consistent tests across the workflows.
* otu_category_significance.py now supports an input directory of BIOM tables, and can write out either a single collated results file or an individual file for every input table in the directory. The -o output_fp is now a required parameter rather than an optional parameter.
* simsam.py now has a -m/--mapping_fp option and writes output to a directory instead of a single file. -n/--num and -d/--dissim now accept a single number or comma-separated list of values.
* supervised_learning.py can now handle input directories of OTU tables and can write a single collated results file if the input directory contains rarefied OTU tables. The -o output fp option is now a required parameter.
* The qiime_test_data repository has been merged into the main qiime repository, which will facilitate development by not requiring users to time pull requests against two repositories. Users will no longer have to specify qiime_test_data_dir in their qiime_config files to include the script usage tests in runs of all_tests.py. all_tests.py will now know how to find qiime_test_data, and will run all of the script usage tests by default.
* pick_reference_otus_through_otu_table.py now outputs otu_table.biom in top-level output directory rather than nested in the otu picking output directory.
* pick_reference_otus_through_otu_table.py has been renamed pick_closed_reference_otus.py (issue [708](https://github.com/qiime/qiime/issues/708)).
* pick_subsampled_reference_otus_through_otu_table.py has been renamed pick_open_reference_otus.py (issue [708](https://github.com/qiime/qiime/issues/708)).
* pick_otus_through_otu_table.py has been renamed pick_de_novo_otus.py (issue [708](https://github.com/qiime/qiime/issues/708)).
* make_distance_comparison_plots.py now supports auto-sizing of distribution plots via --distribution_width (which is the new default) and better handles numeric label types with very large or small ranges (e.g. elevation) by scaling x-axis units to [1, (number of data points)]. --group_spacing has been removed in favor of the new auto-sizing feature.
* per_library_stats.py removed in favor of biom-format's print_biom_table_summary.py.
* Added SourceTracker tutorial and changed QIIME to depend on SourceTracker 0.9.5 (which is modified to facilitate use with QIIME).
* Moran's I (in compare_categories.py) now supports identical samples (i.e. zeros in the distance matrix that aren't on the diagonal).
* summarize_taxa.py now outputs taxa summary tables in both classic (TSV) and BIOM formats by default. This will allow taxa summary tables to be used with other QIIME scripts that expect BIOM files as input. This change is the first step towards adding full support for BIOM taxon tables in QIIME. summarize_taxa.py also has two new options: --suppress_classic_table_output and --supress_biom_table_output.
* make_distance_boxplots.py and make_distance_comparison_plots.py now explicitly state the alternative hypothesis used in the t-tests.
* parallel_blast.py now has a different option for providing a blast db (--blast_db). This implies that the current --refseqs_path should be used only for providing a fasta file of reference sequences. The --suppress_format_blastdb option has been removed since it is no longer needed.
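
The ``--variable_size_distance_classes`` entry above describes distance classes that differ in width but contain the same number of pairwise distances. A minimal sketch of that idea in plain Python follows. The function name ``equal_count_classes`` and the sample distances are hypothetical, and this is not taken from QIIME's implementation.

```python
def equal_count_classes(distances, num_classes):
    """Split distances into classes that each hold (nearly) the same count.

    Hypothetical helper for illustration only; not part of QIIME.
    """
    if num_classes <= 0:
        raise ValueError("num_classes must be positive")
    ordered = sorted(distances)
    n = len(ordered)
    classes = []
    for i in range(num_classes):
        # Integer arithmetic spreads any remainder across the classes.
        start = i * n // num_classes
        stop = (i + 1) * n // num_classes
        classes.append(ordered[start:stop])
    return classes


# Made-up pairwise distances (e.g. the upper triangle of a distance matrix).
pairwise = [0.9, 0.1, 4.0, 0.25, 0.3, 1.5, 0.2, 0.6]
classes = equal_count_classes(pairwise, 4)

# Every class holds the same number of distances...
sizes = [len(c) for c in classes]
# ...while the width (max - min) of each class is free to vary.
widths = [c[-1] - c[0] for c in classes]
```

With eight distances and four classes, each class receives two distances, while the class widths grow as the distances become sparser.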

1.6.0

=========================
* Added ``filter_taxa_from_otu_table.py`` to support filtering OTUs with (or without) specific taxonomy assignments from an OTU table.
* Added parameters to ``pick_subsampled_reference_otus_through_otu_table.py`` to suppress taxonomy assignment (``--suppress_taxonomy_assignment``), and alignment and tree building steps (``--suppress_align_and_tree``). These are useful for cases where a taxonomy may not exist for the reference collection (not too common) or when the region doesn't work well for phylogenetic reconstruction (e.g., fungal ITS). Additionally fixed a bug where alternate ``assign_taxonomy`` parameters provided in the parameters file would be ignored when running in parallel.
* Detrending of quadratic curvature in ordination coordinates is now a feature of QIIME. This approach was used in [Harris JK, et al. "Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat."](http://www.nature.com/ismej/journal/v7/n1/full/ismej201279a.html).
* Supervised learning mislabeling output now includes binary "mislabeled" columns at 5%, 10%, ..., 95%, 99%.
* Added tutorial on Fungal ITS analysis.
* Added tutorial on predicting mislabeled samples.
* Changed the USEARCH-related parameters of ``pick_otus.py`` (de novo chimera detection, reference chimera detection, and size filtering) to ``suppress_X`` options that are ``False`` by default, rather than options that are ``True`` by default and turned off when passed, to make them more intuitive to use and to work better with the workflow scripts.
* Added a ``simpson_reciprocal`` measure of alpha diversity, which is ``1/D``, following the [definition here](http://www.countrysideinfo.co.uk/simpsons.htm) among other places. Note that the measure ``reciprocal_simpson`` is ``1/simpson``, not ``1/D``; it was removed for clarity.
* Added new script, ``compute_core_microbiome.py``, which identifies the core OTUs (i.e., those observed in some user-defined percentage of the samples).
* Major refactoring of parallel QIIME. Repetitive code was consolidated into the ParallelWrapper class, which may ultimately move to PyCogent. The only script interface changes are that the ``-Y/--python_exe_fp``, ``-N (serial script filepath)``, and ``-P/--poller_fp`` parameters are no longer available to the user. These were very infrequently (if ever) modified from defaults, so it doesn't make sense to continue to support these. These changes will allow for easier development of new parallel wrappers and facilitate changes to the underlying parallel functionality.
* Added new script, ``compare_taxa_summaries.py``, and supporting library and test code (``qiime/compare_taxa_summaries.py`` and ``tests/test_compare_taxa_summaries.py``) to allow for the comparison of taxa summary files, including sorting and filling, expected, and paired comparisons using pearson or spearman correlation. Added accompanying tutorial (``doc/tutorials/taxa_summary_comparison.rst``).
* New script for parallel trie otu picker.
* Made ``loaddata.r`` more robust when making mapping files, distance matrices, etc. compatible with each other. There were rare cases that caused some R functions (e.g. ``betadisper``) to fail if empty levels were left in the parsed mapping file.
* Fixed issue in ``ParallelWrapper`` class that could have caused a deadlock if run from within a subprocess with pipes.
* ``make_distance_boxplots.py`` and ``make_distance_comparison_plots.py`` can now perform Student's two-sample t-tests to determine whether a pair of boxplots/distributions are significantly different (using both parametric and nonparametric Monte Carlo-based tests of significance). These changes include three new options to the two scripts (``--tail_type``, ``--num_permutations``, and ``--suppress_significance_tests``), as well as a new function ``all_pairs_t_test`` in ``qiime.stats``. The accompanying tutorial has also been updated to cover the new statistical tests.
* Checks are now in place to prevent asymmetric and non-hollow distance matrices from being used in ``make_distance_boxplots.py``, ``make_distance_comparison_plots.py``, ``make_distance_histograms.py``, ``compare_categories.py``, and ``compare_distance_matrices.py``. The relevant script help and underlying library code have been documented to warn against their use, and the symmetry checks can be easily disabled if performance becomes an issue in the future.
* ``qiime.util.DistanceMatrix`` has new method ``is_symmetric_and_hollow``.
* Added the new Illumina Overview Tutorial which was developed for the ISME 14 Bioinformatics Workshop and added the IPython notebook files that were used in the ISME 14 workshop under the new ``examples/ipynb`` directory. These can be used by changing to the ``ipynb`` directory and running ``ipython notebook`` on a system with IPython and the IPython Notebook dependencies installed. Also moved the ``qiime_tutorial`` directory to the new ``examples`` folder.
* Added support for translated database mapping through ``map_reads_to_reference.py`` and ``parallel_map_reads_to_reference.py`` and related library code, parallel code, etc. This is analogous to closed-reference OTU picking, but can translate queries, so it is useful for mapping metagenomic or metatranscriptomic data against databases of functional genes (e.g., KEGG). Currently BLAT and usearch are supported for translated searching.
* ``qiime.util.qiime_system_call`` now has an optional shell parameter that is passed through to ``subprocess.Popen``.
* Changed ``compare_categories.py`` script interface such that ``--method rda`` is no longer supported and must now be ``--method dbrda``, as the method we provide is db-RDA (capscale), not traditional RDA; added the ability to pass the number of permutations (``-n``) for PERMDISP and db-RDA (these were previously not supported); updated script documentation, statistical method descriptions, and accompanying tutorial to improve overall quality and clarity; output filename when method is PERMDISP is now ``permdisp_results.txt`` instead of ``betadisper_results.txt``, which is consistent with the rest of the methods; significantly refactored the underlying code so that it is better tested and easier to maintain; added better error checking and handling for the types of categories that are accepted by the statistical methods (e.g. checking that categories are numeric if they need to be, making sure categories do not contain all unique values, or a single value); fixed output format for BEST method to be easier to read and consistent with the other methods; ``qiime.util.MetadataMap`` class has a few new utility methods to support some of these changes.
* ``compare_alpha_diversity.py`` now supports both parametric and nonparametric two-sample t-tests (nonparametric is the default) with the new options ``-t/--test_type`` and ``-n/--num_permutations``. Also fixed a bug that used the wrong degrees of freedom in the t-tests, yielding incorrect t statistics and p-values, and added correction for multiple comparisons.
* Removed tree method ``raxml`` from ``make_phylogeny.py``'s choices for ``-t/--tree_method``. Tree method ``raxml_v730`` should now be used instead. RAxML v703 is no longer supported.
* Minimum PyNAST version requirement upgraded to PyNAST 1.2.
* ``make_distance_boxplots.py``, ``make_distance_comparison_plots.py``, and ``make_distance_histograms.py`` now correctly output TSV data files with ``.txt`` extension instead of ``.xls`` (this allows them to be opened more easily in programs such as Excel).
* ``make_distance_boxplots.py`` has a new option ``--color_individual_within_by_field`` that allows the "individual within" boxplots to be optionally colored to indicate their membership in another mapping file field. A legend is also included.
* Added ``sample_ids_from_category_state_coverage`` function to ``qiime/filter.py`` to support filtering of samples based on a subject's category coverage. For example, this function is useful for filtering individuals out of a time series study that do not meet some sort of timepoint coverage criteria.
* ``assign_taxonomy.py`` now supports assignment with tax2tree version 1.0 and mothur version 1.25.0.
* Added new script ``load_remote_mapping_file.py`` and accompanying tutorial to allow exporting and downloading of mapping files stored as Google Spreadsheets.
* Fixed bug in ``parallel_assign_taxonomy_blast.py`` which would cause the script to hang if a relative path was passed for ``-o``.
* Added the [``qiime_test_data``](https://github.com/qiime/qiime_test_data) repository which contains example input and output for most QIIME scripts. The individual script documentation was completely refactored so that usage examples correspond to the example input and output files. The *basic script testing* functionality was removed from ``all_tests.py`` and replaced with more detailed testing of the scripts based on their usage examples.
* ``add_taxa.py`` was removed in favor of ``add_metadata.py`` (a ``biom-format`` project script). See the new [tutorial on adding metadata to BIOM files](http://biom-format.org/documentation/adding_metadata.html).
* Updated ``qiime.util.get_qiime_library_version`` to return git commit hash rather than svn revision number (as we're using git for revision control now).
* Added java version in output of ``print_qiime_config.py`` to assist with debugging.
* Changed ``plot_rank_abundance_graph.py`` so ``-o`` now specifies the filename of the figure rather than the output directory.
* Added new script ``add_alpha_to_mapping_file.py`` which adds alpha diversity data to a mapping file for incorporation in plots, etc.
* Moved the QIIME website files from ``Qiime/web`` to their own GitHub repository: [qiime.github.com](https://github.com/qiime/qiime.github.com).
* Fixed bug in installation of QIIME Denoiser with setup.py.
* ``supervised_learning.py`` now produces mislabeling.txt and cv_probabilities.txt that look like QIIME mapping files, allowing them to be used for coloring points in PCoA plots, etc.
* Updated RDP Classifier training code to allow any number of ranks in training files, as long as number of ranks is uniform. This removes the need for special RDP training files in reference OTU collections.
* Added table density and metadata listings to ``per_library_stats.py``.
* Updates to several dependencies. New dependencies (for those that changed in this release) are: Python 2.7.3; PyCogent 1.5.3; biom-format 1.1.1; PyNAST 1.2; usearch 5.2.236; rtax 0.983; AmpliconNoise 1.27; Greengenes OTUs 12_10; and RDP Classifier 2.2.
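
Several entries above refer to symmetric and hollow distance matrices: symmetric means entry (i, j) equals entry (j, i), and hollow means every diagonal entry is zero. The sketch below illustrates such a check on a list-of-lists matrix. ``check_symmetric_and_hollow`` is a hypothetical standalone helper written for illustration; it is not the ``qiime.util.DistanceMatrix.is_symmetric_and_hollow`` method, whose implementation is not shown in this changelog.

```python
def check_symmetric_and_hollow(matrix):
    """Return True if matrix is square, symmetric, and has a zero diagonal.

    Hypothetical helper for illustration only; not QIIME's implementation.
    Values are compared exactly, with no floating-point tolerance.
    """
    n = len(matrix)
    # A ragged or non-square input cannot be a valid distance matrix.
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        # Hollow: the distance from a sample to itself must be zero.
        if matrix[i][i] != 0:
            return False
        # Symmetric: only the upper triangle needs to be compared.
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                return False
    return True


valid = [[0.0, 0.5, 0.8],
         [0.5, 0.0, 0.3],
         [0.8, 0.3, 0.0]]

asymmetric = [[0.0, 0.5],
              [0.4, 0.0]]

non_hollow = [[0.1, 0.5],
              [0.5, 0.0]]

results = [check_symmetric_and_hollow(m)
           for m in (valid, asymmetric, non_hollow)]
```

Only the first matrix passes; the second fails the symmetry test and the third has a nonzero diagonal entry.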
