Umi-tools

Latest version: v1.1.4

Page 2 of 6

1.0.0

This release is intended to be a stable release with no plans for significant updates to UMI-tools functionality in the near future. As part of this release, much of the code base has been refactored. It is possible this may have introduced bugs which have not been picked up by the regression testing. If so, please raise an issue and we'll try and rectify with a minor release update ASAP.

**Documentation**

UMI-tools documentation is now available online: https://umi-tools.readthedocs.io/en/latest/index.html

Along with the previous documentation, the readthedocs pages also include new pages:

- FAQ
- Making use of our Algorithms: The API

**New knee method for whitelist**

- The method to detect the "knee" in `whitelist` has been updated (317). This method should always identify a threshold and is now the default. Note that this knee method appears to be slightly more conservative (fewer cells above the threshold), but having identified the knee, one can always re-run `whitelist` with `--set-cell-number` to expand the whitelist if desired.
- The old method is still available via `--knee-method=density`
- In addition, to run the old knee method but allow whitelist to exit without error even if a suitable knee point isn't identified, use the new `--allow-threshold-error` option (249)
- Putative errors in CBs above the knee can be detected using `--ed-above-threshold` (309)
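
To illustrate the idea of expanding a whitelist with a fixed cell number, here is a minimal sketch. It is not the UMI-tools implementation: the helper `top_barcodes` and the barcode counts are hypothetical, and the sketch assumes that fixing the cell number amounts to keeping the most abundant cell barcodes.

```python
from collections import Counter


def top_barcodes(counts, cell_number):
    """Return the `cell_number` most abundant barcodes, highest count first.

    Hypothetical helper for illustration only; not part of UMI-tools.
    """
    return [barcode for barcode, _ in Counter(counts).most_common(cell_number)]


# Made-up cell barcode counts.
counts = {"AAAC": 950, "CCGT": 870, "GGTA": 610, "TTAG": 40, "ACGT": 12}

# Suppose a conservative knee kept only two barcodes ...
knee_whitelist = top_barcodes(counts, 2)

# ... re-running with a larger fixed cell number admits the next barcode.
expanded_whitelist = top_barcodes(counts, 3)
```

In practice the knee estimate gives a starting point, and the cell number is then raised if inspection suggests real cells were excluded.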

**Explicit options for handling chimeric & improper read pairs** (312)

The behaviour for chimeric read pairs, improper read pairs and unmapped reads can now be explicitly set with the `--chimeric-pairs`, `--unpaired-reads` and `--unmapped-reads` options.

**New options**

- `--temp-dir`: Set the directory for temporary files (254)
- `--either-read` & `--either-read-resolve`: Extract the UMI from either read (175)

**Misc**

- Updates Python testing version to 3.6.7 and drops Python 2 testing
- Replace deprecated imp import (318)
- Debug error with `pysam <0.14` (319)
- Refactor module files
- Moves documentation into dedicated module

0.5.5

Mainly minor debugs and improved detection of incorrect command line options. Minor updates to documentation.

- Resolves issues with correctly skipping reads which have not been assigned (191 & 273). This involves the addition of the `--assigned-status-tag` option

Testing for OSX has been dropped due to unresolved issues with travis. We hope to resurrect this in the future!

In line with major Python packages (e.g. https://www.numpy.org/neps/nep-0014-dropping-python2.7-proposal.html), support for Python 2 will be dropped from January 1st 2019.

0.5.4

- The default value for `--skip_regex` was incorrectly formatted. Thanks to ekernf01 for spotting (231/256)

0.5.3

Debugs wide-format output for `count` (227). Thanks kevin199011

0.5.2

- Adds options to specify a delimiter for a cell barcode or UMI which should be concatenated, plus options to specify a string splitting the cell barcode or UMI into multiple parts, of which only the first will be used. Note that these options will only work if the barcodes are contained in a BAM tag. If they were appended to the read name using `umi_tools extract`, there is no need for these options. See 217 for motivation:
  - `--umi-tag-delimiter=[STRING]` = remove the delimiter STRING from the UMI. Defaults to `None`
  - `--umi-tag-split=[STRING]` = split UMI by STRING and take only the first portion. Defaults to `None`
  - `--cell-tag-delimiter=[STRING]` = remove the delimiter STRING from the cell barcode. Defaults to `None`
  - `--cell-tag-split=[STRING]` = split cell barcode by STRING and take only the first portion. Defaults to `-` to deal with 10X GEMs
- Reduced memory requirements for `count --wide-format-cell-counts`: 222
- Debugs issues with `--bc-pattern2`: 201, 221
- Updates documentation: 204, 210, 211 - Thanks kohlkopf, hy09 & cbrueffer
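
The effect of the tag delimiter and split options described above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not UMI-tools code: the helper `clean_tag` and the example tag values are hypothetical, and the order in which the two operations are applied when both are set is an assumption.

```python
def clean_tag(value, delimiter=None, split=None):
    """Hypothetical helper mimicking the described tag handling.

    split:     keep only the portion before the first occurrence of `split`
    delimiter: remove every occurrence of `delimiter`, concatenating the parts
    """
    if split is not None:
        # Assumption: splitting is applied before delimiter removal.
        value = value.split(split)[0]
    if delimiter is not None:
        value = value.replace(delimiter, "")
    return value


# A 10X-style cell barcode with a "-1" GEM suffix, split on "-".
cell = clean_tag("AAACCTGAGAAACCAT-1", split="-")

# A UMI stored in two parts joined by "_", concatenated by removing "_".
umi = clean_tag("ACGT_TTGA", delimiter="_")
```

With both arguments left as `None` the value is returned unchanged, matching the `None` defaults listed above.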

0.5.1

Minor update. Improves detection of duplicate reads with paired end reads, reduces run time with `dedup --output-stats` and a few simple debugs.

* Improved identification of duplicate reads from paired end reads - will now use the position of the FIRST splice junction in the read (in reference coords) (187)
* Speeds up `dedup` when running with `--output-stats` - (184)
* Fixes bugs:
  * `whitelist --set-cell-number --plot-prefix` -> unwanted error
  * `dedup` gave non-informative error when input contains zero valid reads/read pairs. Now raises a warning but exits with status 0 (190, 195)
  * `count` errored if gene identifier contained a ":" (198)
* Renames `--whole-contig` option to `--buffer-whole-contig` to avoid confusion with `per-contig` option. `--whole-contig` option will still work but will not be visible in documentation (196)
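
To make "the position of the first splice junction in reference coords" concrete, here is a minimal sketch that walks a read's CIGAR operations. It is not the UMI-tools implementation: the function name is hypothetical, the input is assumed to be a 0-based alignment start plus a list of `(operation, length)` pairs using the BAM numeric operation codes, and the junction is taken to be the reference position where the first `N` (reference skip) operation begins.

```python
# BAM CIGAR operation codes as defined in the SAM specification.
MATCH, INS, DEL, REF_SKIP, SOFT_CLIP, HARD_CLIP, PAD, EQUAL, DIFF = range(9)

# Operations that advance the reference position before a junction is reached.
CONSUMES_REFERENCE = {MATCH, DEL, EQUAL, DIFF}


def first_splice_junction(ref_start, cigar):
    """Return the reference position where the first N operation starts.

    Hypothetical helper for illustration. Returns None for unspliced reads.
    """
    position = ref_start
    for operation, length in cigar:
        if operation == REF_SKIP:
            return position
        if operation in CONSUMES_REFERENCE:
            position += length
    return None


# Two reads with the same start and first junction but different downstream
# alignments (soft clip, insertion) report the same junction position.
read_a = first_splice_junction(100, [(SOFT_CLIP, 5), (MATCH, 50), (REF_SKIP, 1000), (MATCH, 45)])
read_b = first_splice_junction(100, [(MATCH, 50), (REF_SKIP, 1000), (MATCH, 30), (INS, 2), (MATCH, 13)])
unspliced = first_splice_junction(100, [(MATCH, 100)])
```

Soft clips and insertions do not consume reference bases, so they do not shift the reported junction.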
