Cutadapt

Latest version: v4.8


1.8
-----------------

* Support single-pass paired-end trimming with the new ``-A``/``-G``/``-B``/``-U``
parameters. These work just like their ``-a``/``-g``/``-b``/``-u`` counterparts,
but they specify sequences that are removed from the *second read* in a pair.

Also, if you use any of those options, the read modification options
such as ``-q`` (quality trimming) are applied to *both* reads. For backwards
compatibility, read modifications are applied to the first read only if
none of ``-A``/``-G``/``-B``/``-U`` is used. See `the
documentation <http://cutadapt.readthedocs.io/en/latest/guide.html#paired-end>`_
for details.

This feature has not been extensively tested, so please give feedback if
something does not work.
* The report output has been reworked in order to accommodate the new paired-end
trimming mode. This also changes how the report looks in single-end
mode. It is hopefully now more accessible.
* Chris Mitchell contributed a patch adding two new options: ``--trim-n``
removes any ``N`` bases from the read ends, and the ``--max-n`` option can be
used to filter out reads with too many ``N``.
* Support notation for repeated bases in the adapter sequence: Write ``A{10}``
instead of ``AAAAAAAAAA``. Useful for poly-A trimming: Use ``-a A{100}`` to
get the longest possible tail.
* Quality trimming at the 5' end of reads is now supported. Use ``-q 15,10`` to
trim the 5' end with a cutoff of 15 and the 3' end with a cutoff of 10.
* Fix incorrectly reported statistics (> 100% trimmed bases) when ``--times``
is set to a value greater than one.
* Support .xz-compressed files (if running in Python 3.3 or later).
* Started to use the GitHub issue tracker instead of Google Code. All old issues
have been moved.
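
The two-cutoff form ``-q 15,10`` can be illustrated with a small running-sum sketch. This is not cutadapt's own code: the function name ``quality_trim_bounds`` is hypothetical, qualities are assumed to be plain integer Phred scores, and the running-sum scheme is only one plausible way to pick the trim points.

```python
def quality_trim_bounds(qualities, cutoff_front, cutoff_back):
    """Return (start, stop) so that qualities[start:stop] is the part to keep.

    Hypothetical helper: `qualities` is a list of integer Phred scores.
    """
    start, stop = 0, len(qualities)

    # 5' end: accumulate (cutoff - quality) from the left and cut after the
    # position where the running sum is largest. Stop scanning as soon as
    # the sum turns negative, i.e. good bases outweigh the bad ones.
    running, best = 0, 0
    for i, q in enumerate(qualities):
        running += cutoff_front - q
        if running < 0:
            break
        if running > best:
            best, start = running, i + 1

    # 3' end: the same scan, from the right, with its own cutoff.
    running, best = 0, 0
    for i in range(len(qualities) - 1, -1, -1):
        running += cutoff_back - qualities[i]
        if running < 0:
            break
        if running > best:
            best, stop = running, i

    # If the two trimmed regions meet, nothing is left of the read.
    if start >= stop:
        return 0, 0
    return start, stop


# Two poor bases at the 5' end, two at the 3' end (cutoffs as in -q 15,10).
phred = [10, 10, 30, 30, 30, 5, 5]
start, stop = quality_trim_bounds(phred, cutoff_front=15, cutoff_back=10)
kept = phred[start:stop]
```

With these inputs, only the three high-quality bases in the middle are kept.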

1.7
-----------------

* IUPAC characters are now supported. For example, use ``-a YACGT`` for an
adapter that matches both ``CACGT`` and ``TACGT`` with zero errors. Disable
with ``-N``. By default, IUPAC characters in the read are not interpreted in
order to avoid matches in reads that consist of many (low-quality) ``N``
bases. Use ``--match-read-wildcards`` to enable them also in the read.
* Support for demultiplexing was added. This means that reads can be written to
different files depending on which adapter was found. See `the section in the
documentation <http://cutadapt.readthedocs.org/en/latest/guide.html#demultiplexing>`_
for how to use it. This is currently only supported for single-end reads.
* Add support for anchored 3' adapters. Append ``$`` to the adapter sequence to
force the adapter to appear at the end of the read (as a suffix). Closes
issue 81.
* Option ``--cut`` (``-u``) can now be specified twice, once for each end of the
read. Thanks to Rasmus Borup Hansen for the patch!
* Options ``--minimum-length``/``--maximum-length`` (``-m``/``-M``) can be used
standalone. That is, cutadapt can be used to filter reads by length without
trimming adapters.
* Fix bug: Adapters read from a FASTA file can now be anchored.
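
The IUPAC behavior described above can be sketched as follows. This is a simplified illustration, not cutadapt's alignment code: the names ``bases_match`` and ``matches_exactly`` are hypothetical, only zero-error comparison of equal-length sequences is shown, and the ``match_read_wildcards`` parameter merely mimics the effect of ``--match-read-wildcards``.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_match(adapter_char, read_char, match_read_wildcards=False):
    """Compare one adapter character with one read character."""
    adapter_set = set(IUPAC[adapter_char])
    if match_read_wildcards:
        # Wildcards are interpreted on both sides: any shared base matches.
        return bool(adapter_set & set(IUPAC[read_char]))
    # Default: the read character is taken literally, so an N in the read
    # does not match any adapter character in this sketch.
    return read_char in adapter_set


def matches_exactly(adapter, read, match_read_wildcards=False):
    """True if `read` matches `adapter` position by position, zero errors."""
    return len(adapter) == len(read) and all(
        bases_match(a, r, match_read_wildcards) for a, r in zip(adapter, read)
    )


both_match = matches_exactly("YACGT", "CACGT") and matches_exactly("YACGT", "TACGT")
all_n_default = matches_exactly("YACGT", "NNNNN")
all_n_enabled = matches_exactly("YACGT", "NNNNN", match_read_wildcards=True)
```

Treating read characters literally by default is what keeps reads consisting mostly of ``N`` from matching every adapter.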

1.6
-----------------

* Fix bug: Ensure ``--format=...`` can be used even with paired-end input.
* Fix bug: Sometimes output files would be incomplete because they were not
closed correctly.
* Alignment algorithm is a tiny bit faster.
* Extensive work on the documentation. It's now available at
https://cutadapt.readthedocs.org/ .
* For 3' adapters, statistics about the bases preceding the trimmed adapter
are collected and printed. If one of the bases is overrepresented, a warning
is shown since this points to an incomplete adapter sequence. This happens,
for example, when a TruSeq adapter is used but the A overhang is not taken
into account when running cutadapt.
* Due to code cleanup, there is a change in behavior: If you use
``--discard-trimmed`` or ``--discard-untrimmed`` in combination with
``--too-short-output`` or ``--too-long-output``, then cutadapt now also writes
the discarded reads to the output files given by the ``--too-short-output`` or
``--too-long-output`` options. If anyone complains, I will consider reverting this.
* Galaxy support files are now in `a separate
repository <https://bitbucket.org/lance_parsons/cutadapt_galaxy_wrapper>`_.
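
The preceding-base statistics for 3' adapters can be pictured with a short sketch. Everything here is illustrative: the function names, the input format (pairs of read sequence and adapter start position), and the 80% warning threshold are assumptions, not values taken from cutadapt.

```python
from collections import Counter


def preceding_base_counts(trimmed):
    """Count the base directly before each trimmed 3' adapter.

    `trimmed` is an iterable of (read_sequence, adapter_start) pairs.
    Reads in which the adapter starts at position 0 have no preceding base
    and are skipped.
    """
    counts = Counter()
    for sequence, adapter_start in trimmed:
        if adapter_start > 0:
            counts[sequence[adapter_start - 1]] += 1
    return counts


def overrepresented_base(counts, threshold=0.8):
    """Return the base whose share exceeds `threshold` (assumed), else None."""
    total = sum(counts.values())
    if total == 0:
        return None
    base, n = counts.most_common(1)[0]
    return base if n / total > threshold else None


adapter = "AGATCGGAAGAGC"  # example sequence for illustration only
trimmed = [
    ("CCGTA" + adapter, 5),
    ("TTGCA" + adapter, 5),
    ("GGA" + adapter, 3),
    ("GTTCA" + adapter, 5),
    ("TA" + adapter, 2),
    ("ACGTC" + adapter, 5),
    (adapter, 0),
]
counts = preceding_base_counts(trimmed)
suspect = overrepresented_base(counts)
if suspect is not None:
    print(f"Warning: base {suspect!r} precedes the adapter unusually often; "
          "the adapter sequence may be incomplete.")
```

In this toy data set, almost every adapter occurrence is preceded by ``A``, which is the pattern one would expect when an A overhang is missing from the given adapter sequence.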

1.5
-----------------

* Adapter sequences can now be read from a FASTA file. For example, write
``-a file:adapters.fasta`` to read 3' adapters from ``adapters.fasta``. This works
also for ``-b`` and ``-g``.
* Add the option ``--mask-adapter``, which masks adapters with ``N`` characters
instead of removing them. Thanks to Vittorio Zamboni
for contributing this feature!
* U characters in the adapter sequence are automatically converted to T.
* Do not run Cython at installation time unless the ``--cython`` option is provided.
* Add the option ``-u``/``--cut``, which can be used to unconditionally remove a number
of bases from the beginning or end of each read.
* Make ``--zero-cap`` the default for colorspace reads.
* When the new option ``--quiet`` is used, no report is printed after all reads
have been processed.
* When processing paired-end reads, cutadapt now checks whether the reads are
properly paired.
* To properly handle paired-end reads, the option ``--untrimmed-paired-output``
was added.
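
The effect of ``-u``/``--cut`` can be sketched in a few lines. The helper name ``cut`` is hypothetical, and the sketch assumes the sign convention from the cutadapt user guide: a positive length removes bases from the beginning of the read, a negative length removes them from the end.

```python
def cut(sequence, lengths):
    """Unconditionally remove bases from a read.

    `lengths` is a list of integers, applied in order. Positive values
    remove bases from the beginning, negative values from the end
    (assumed sign convention).
    """
    for length in lengths:
        if length > 0:
            sequence = sequence[length:]
        elif length < 0:
            sequence = sequence[:length]
    return sequence


read = "ACGTACGT"
from_start = cut(read, [3])      # drop the first three bases
from_end = cut(read, [-2])       # drop the last two bases
both_ends = cut(read, [3, -2])   # one value for each end
```

In a real FASTQ record, the quality string would have to be shortened by the same amounts; that is left out here for brevity.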

1.4
-----------------

* This release of cutadapt reduces the overhead of reading and writing files.
On my test data set, a typical run of cutadapt (with a single adapter) takes
40% less time due to the following two changes.
* Reading and writing of FASTQ files is faster (thanks to Cython).
* Reading and writing of gzipped files is faster (up to 2x) on systems
where the ``gzip`` program is available.
* The quality trimming function is four times faster (also due to Cython).
* Fix the statistics output for 3' colorspace adapters: The reported lengths were one
too short. Thanks to Frank Wessely for reporting this.
* Support the ``--no-indels`` option. This disallows insertions and deletions while
aligning the adapter. Currently, the option is only available for anchored 5' adapters.
This fixes issue 69.
* As a side effect of implementing the ``--no-indels`` option: For colorspace, the
length of a read (for ``--minimum-length`` and ``--maximum-length``) is now computed after
primer base removal (when ``--trim-primer`` is specified).
* Added one column to the info file that contains the name of the found adapter.
* Add an explanation about colorspace ambiguity to the README.
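
To show what disallowing indels means for an anchored 5' adapter, here is a minimal sketch that compares the adapter with the read prefix position by position and counts mismatches only. The function name ``trim_anchored_5prime`` and the ``max_error_rate`` parameter are assumptions made for this example; cutadapt's real alignment code is more general.

```python
def trim_anchored_5prime(adapter, read, max_error_rate=0.1):
    """Remove `adapter` from the start of `read` if it matches without indels.

    Only substitutions are tolerated: the adapter is compared with the read
    prefix of the same length, one position at a time. Returns the read
    unchanged when there are too many mismatches or the read is too short.
    """
    if len(read) < len(adapter):
        return read
    mismatches = sum(1 for a, r in zip(adapter, read) if a != r)
    allowed = int(max_error_rate * len(adapter))
    if mismatches <= allowed:
        return read[len(adapter):]
    return read


adapter = "ACGTACGTAC"

# One substitution (position 4): within the allowed error rate, so trimmed.
with_mismatch = trim_anchored_5prime(adapter, "ACGTTCGTAC" + "GGGG")

# One deleted base (the T at position 3): without indels, everything after
# the deletion is shifted and counts as a mismatch, so nothing is trimmed.
with_deletion = trim_anchored_5prime(adapter, "ACGACGTAC" + "GGGG")
```

The second case is the price of the restriction: a single missing base near the start of the adapter is enough to prevent a match.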

1.3
-----------------

* Preliminary paired-end support with the ``--paired-output`` option (contributed by
James Casbon). See the README section on how to use it.
* Improved statistics.
* Fix incorrectly reported amount of quality-trimmed Mbp (issue 57, fix by Chris Penkett).
* Add the ``--too-long-output`` option.
* Add the ``--no-trim`` option, contributed by Dave Lawrence.
* Port handwritten C alignment module to Cython.
* Fix the ``--rest-file`` option (issue 56).
* Slightly speed up alignment of 5' adapters.
* Support bzip2-compressed files.
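
Transparent handling of compressed input and output is commonly done by choosing an opener from the file extension. The sketch below uses only the Python standard library; the helper name ``open_maybe_compressed`` is hypothetical, and this is not how cutadapt itself is implemented.

```python
import bz2
import gzip
import lzma
import os
import tempfile


def open_maybe_compressed(path, mode="rt"):
    """Open `path`, picking a decompressor from the file extension."""
    if path.endswith(".bz2"):
        return bz2.open(path, mode)
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    if path.endswith(".xz"):
        return lzma.open(path, mode)
    return open(path, mode)


# Round trip: write a FASTQ record to a bzip2-compressed file, read it back.
record = "@read1\nACGTACGT\n+\nIIIIIIII\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reads.fastq.bz2")
    with open_maybe_compressed(path, "wt") as f:
        f.write(record)
    with open_maybe_compressed(path) as f:
        round_trip = f.read()
```

The calling code stays the same whether or not the file is compressed, which is the main appeal of this design.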
