Kb-python

Latest version: v0.28.2

0.28.1

Anndata/loom files now have nascent/mature layers rather than unspliced/spliced layers.

--workflow=custom can take in multiple FASTA inputs

Allow --d-list to have comma-separated multiple FASTA files with URLs

Command-line options menu cleaned up a bit

0.28.0

Implements all the updates detailed in the protocols paper: https://doi.org/10.1101/2023.11.21.568164

* kallisto version 0.50.1
* bustools version 0.43.1

0.27.3

General
* Bumped `ngs-tools>=1.7.3`.

`ref`
* **[DEPRECATION]** Split index generation using `-n` has been fully deprecated. (Thanks to amcdavid for catching a bug)

`count`
* Fixed a minor issue with `--workflow kite:10xFB`, where `bustools project` was called before `bustools correct` (the correct order is the reverse). This fix required a bump to the `ngs-tools` dependency.
* Support for `--workflow lamanno` for `-x smartseq3`.
* **[DEPRECATION]** Counting using split indices by providing a comma-delimited list to `-i` has been fully deprecated.
* Support for a whitelist (`-w` option) for `bulk`, `smartseq2` and `smartseq3` technologies.
* Added support for `-x 10XV3_ULTIMA`.

0.27.2

`count`
* Whitelist for technology `-x BDWTA` is now provided.

0.27.1

General
* **[DEPRECATION]** Support for split indices (with the `-n` option) will be deprecated in the next major release. To reduce index size and memory usage, it is now recommended to pass the `--include-attribute` and `--exclude-attribute` options to `kb ref` instead. These are similar to Cellranger's `mkref` options (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/references).

`ref`
* A remote URL may be provided as the `fasta` (genomic FASTA) and/or `gtf` (gene annotation GTF) arguments. This relies on support from `ngs_tools 1.5.13`.
* GTF is now allowed to have 0-length segments (https://github.com/pachterlab/kallisto/issues/340).

`count`
* **[DEPRECATION]** Technology `SMARTSEQ` is now deprecated. All future uses should use `BULK`, `SMARTSEQ2` or `SMARTSEQ3`.
* Genes that do not have a gene name will now have their gene IDs in the `gene_name` column (or in `adata.var_names` if `--gene-names` is used).
* Support for `--workflow lamanno` for `-x BULK` and `-x SMARTSEQ2` technologies.
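
The gene-name fallback noted in the `count` entries for this release can be pictured with a short sketch. This is not kb-python's own code: the helper `fill_gene_names` and the sample IDs and names are hypothetical, and it only illustrates the idea of substituting the gene ID wherever a name is missing.

```python
def fill_gene_names(gene_ids, gene_names):
    """Return gene names, using the gene ID wherever the name is empty or None.

    Hypothetical helper for illustration only; not part of kb-python.
    """
    return [name if name else gene_id
            for gene_id, name in zip(gene_ids, gene_names)]


# Made-up IDs and names; the second and third genes have no name.
ids = ["id1", "id2", "id3"]
names = ["GeneA", "", None]

filled = fill_gene_names(ids, names)
print(filled)
```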

0.27.0

General
* Added the `compile` command. See below for more information. (139)
* Fixed an issue where a call to kallisto would hang indefinitely due to a full stderr buffer.
* Changed docstring style to Google-style. Added typings to all functions.
* Updated kallisto binaries to `v0.48.0`.
* Updated bustools binaries to `v0.41.0`.
* Added binary compatibility checks. If a binary is incompatible, `kb compile` is suggested.

`compile`
* This command can be used to compile the `kallisto` and/or `bustools` binary from source. At the most basic level, it downloads the latest release source distributions from the respective GitHub repositories, compiles them, and places them where `kb` can automatically detect them.
* The `target` positional argument specifies which binary (or both) to compile. Possible values are `kallisto`, `bustools` and `all`.
* The `--url` optional argument may be provided with a URL to a remote archive that will be used instead of the latest GitHub release. When this option is used, `target` may not be `all`.
* The `--ref` optional argument may be provided with a commit hash or git tag. When this option is used, `target` may not be `all`.
* The `-o` optional argument may be used to place the compiled binaries in a different directory. Note that if this option is used, `--kallisto` and `--bustools` options will have to be set appropriately when running `ref` or `count`.
* The `--view` option may be used to simply view what binaries (their locations and versions) will be used by `kb`.
* The `--remove` option may be used to remove existing compiled binaries.
* The `--overwrite` option may be used to overwrite existing compiled binaries.
* The `kallisto` compilation follows https://pachterlab.github.io/kallisto/source and has the same dependencies.
* The `bustools` compilation follows https://bustools.github.io/source and has the same dependencies.
* The `--cmake-arguments` argument may be used to supply a string of additional arguments that is passed directly to the `cmake` command. For instance, to manually specify additional include directories: `--cmake-arguments "-DCMAKE_CXX_FLAGS='-I /usr/include'"`
* Note that the compilation is performed in shared mode, which means the binary will contain links to shared libraries (i.e. not statically linked).
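
As a rough illustration of how the `compile` options above combine, the sketch below assembles an argument list and enforces the one restriction these notes state: `--url` and `--ref` may not be used when `target` is `all`. The helper `build_compile_command` is hypothetical, the argument order is an assumption, and the tag value is only an example; `kb compile --help` is the authority on the real interface.

```python
def build_compile_command(target, url=None, ref=None, out_dir=None, overwrite=False):
    """Assemble an argument list for `kb compile` (hypothetical helper).

    Only flags named in the release notes are used. The order of the
    arguments is an assumption, not something the notes specify.
    """
    if target not in ("kallisto", "bustools", "all"):
        raise ValueError(f"unknown target: {target!r}")
    if target == "all" and (url is not None or ref is not None):
        raise ValueError("--url and --ref may not be combined with target 'all'")

    command = ["kb", "compile", target]
    if url is not None:
        command += ["--url", url]
    if ref is not None:
        command += ["--ref", ref]
    if out_dir is not None:
        command += ["-o", out_dir]
    if overwrite:
        command.append("--overwrite")
    return command


# Example: build kallisto from a specific git tag (the tag is illustrative).
command = build_compile_command("kallisto", ref="v0.48.0", overwrite=True)
print(" ".join(command))
```

The list form can be handed to `subprocess.run` without shell quoting concerns, which is why the sketch returns a list rather than a single string.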

`ref`
* Added `--include-attribute` and `--exclude-attribute` options which can be used to include/exclude specific GTF entries based on their attributes. The argument to these options must be in the form of a `key:value` pair, where `key` is a GTF attribute name and `value` is the value of the aforementioned attribute to include/exclude. Only one of these two options may be specified, and each option may be specified more than once. When multiple `--include-attribute` are provided, GTF entries that have any one of the attributes will be processed. When multiple `--exclude-attribute` are provided, GTF entries that have any one of the attributes will not be processed.
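
The include/exclude semantics described above can be sketched in a few lines. This is an illustration under stated assumptions, not the filtering code used by `kb ref`: the helpers `parse_attribute` and `keep_entry` are hypothetical, GTF entries are modeled as plain dictionaries of attributes, and the attribute names and values in the example are made up for demonstration.

```python
def parse_attribute(option_value):
    """Split a 'key:value' option argument into a (key, value) tuple."""
    key, sep, value = option_value.partition(":")
    if not sep or not key or not value:
        raise ValueError(f"expected key:value, got {option_value!r}")
    return key, value


def keep_entry(attributes, include=(), exclude=()):
    """Decide whether a GTF entry, given as an attribute dict, is processed.

    Hypothetical helper: mirrors the "any one of the attributes" rule
    from the release notes, and rejects using both option kinds at once.
    """
    if include and exclude:
        raise ValueError("only one of include/exclude may be specified")
    if include:
        # Keep the entry if it matches any of the include pairs.
        return any(attributes.get(k) == v for k, v in include)
    if exclude:
        # Drop the entry if it matches any of the exclude pairs.
        return not any(attributes.get(k) == v for k, v in exclude)
    return True


# Illustrative entries; real GTF attributes depend on the annotation source.
entries = [
    {"gene_id": "g1", "gene_biotype": "protein_coding"},
    {"gene_id": "g2", "gene_biotype": "lncRNA"},
    {"gene_id": "g3", "gene_biotype": "pseudogene"},
]

include = [parse_attribute("gene_biotype:protein_coding"),
           parse_attribute("gene_biotype:lncRNA")]
kept_by_include = [e["gene_id"] for e in entries if keep_entry(e, include=include)]

exclude = [parse_attribute("gene_biotype:pseudogene")]
kept_by_exclude = [e["gene_id"] for e in entries if keep_entry(e, exclude=exclude)]

print(kept_by_include, kept_by_exclude)
```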

`count`
* Added `--filter-threshold` option to specify the barcode filter threshold. This option may only be used when also providing `--filter bustools` and sets the minimum number of times a barcode must appear to be retained by the filter. (142)
* Added `--strand` option to override automatic strandedness setting by `kallisto bus`. Available options are `unstranded`, `forward`, and `reverse`.
* Changed the `transcript_ids` column to be a semicolon-delimited string instead of a list (only applicable when `--tcc` is provided) as a workaround for an issue with writing lists to h5ad with `h5py>=3`. (141)
* Added `BULK` and `SMARTSEQ2` technologies. The two technologies behave identically. The FASTQs may be provided either directly on the command line (*only for multiplexed samples*), in which case `kb` will perform demultiplexing, or as a single batch definition text file (*only for demultiplexed samples*). See the section about `batch.txt` in https://pachterlab.github.io/kallisto/manual for formatting. This batch text file may also contain remote URLs to FASTQ files, which will be streamed on supported operating systems. Additionally, added `--parity`, `--fragment-l` and `--fragment-s` options, which may only be provided for these technologies. `--parity` must always be provided and indicates the parity of the reads (`single`, `paired`). `--fragment-l` and `--fragment-s` may only be provided together with `--parity single`; they specify the mean and the standard deviation of the fragment lengths, respectively.
* **[DEPRECATION]** The `SMARTSEQ` technology has been deprecated and will be removed in the next release. Instead, `SMARTSEQ2` should be used. See previous point for more information.
* Added `SMARTSEQ3` technology.
* The full binary path is used for `--dry-run` instead of an alias.
* Added `--umi-gene` option, which deduplicates UMIs by gene. Cannot be used with smartseq or bulk technologies.
* Added `--em` option, which estimates gene abundances using the EM algorithm. Cannot be used with smartseq or bulk technologies, or with `--tcc`.
* Fixed an issue that occurred when the path given to the `-o` option of `bustools count` already existed as a directory (for instance, `counts_unfiltered/cells_x_genes`). Such directories are now removed before running the command.
* Improved output file validation so that all expected files must exist.
* Added `--gene-names` option, which may only be used with `--h5ad` or `--loom` and not with `--tcc`. By specifying this option, the output h5ad or loom matrix will be aggregated by gene names instead of IDs.
* Added support for the following technologies: `BDWTA` (BD Rhapsody), `SPLIT-SEQ`, `Visium` (10x).
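
Several of the `count` options listed for this release come with usage constraints. The sketch below collects a subset of them in one place as a reading aid. It is not kb-python's argument validation: the function `check_count_options` and its parameter names are hypothetical, the numeric values in the example are illustrative, and the technology restrictions on `--umi-gene` and `--em` are deliberately left out because the notes do not say exactly which technology names they cover.

```python
def check_count_options(technology, *, filter_method=None, filter_threshold=None,
                        parity=None, fragment_l=None, fragment_s=None,
                        tcc=False, h5ad=False, loom=False,
                        gene_names=False, em=False):
    """Return a list of rule violations (empty if none). Hypothetical helper."""
    problems = []
    batch_tech = technology.upper() in {"BULK", "SMARTSEQ2"}

    # --filter-threshold is only meaningful together with --filter bustools.
    if filter_threshold is not None and filter_method != "bustools":
        problems.append("--filter-threshold requires --filter bustools")

    if batch_tech:
        # --parity is mandatory for BULK/SMARTSEQ2.
        if parity not in ("single", "paired"):
            problems.append("--parity must be 'single' or 'paired' for BULK/SMARTSEQ2")
        # Fragment length options only apply to single-end reads.
        if (fragment_l is not None or fragment_s is not None) and parity != "single":
            problems.append("--fragment-l and --fragment-s require --parity single")
    elif any(v is not None for v in (parity, fragment_l, fragment_s)):
        problems.append("--parity, --fragment-l and --fragment-s are only for BULK/SMARTSEQ2")

    # --gene-names needs an h5ad or loom output and excludes --tcc.
    if gene_names and (tcc or not (h5ad or loom)):
        problems.append("--gene-names requires --h5ad or --loom and excludes --tcc")

    # --em cannot be combined with --tcc.
    if em and tcc:
        problems.append("--em cannot be combined with --tcc")

    return problems


ok = check_count_options("BULK", parity="single", fragment_l=200, fragment_s=20)
bad = check_count_options("SMARTSEQ3", filter_threshold=100,
                          gene_names=True, tcc=True, h5ad=True)
print(ok)
print(bad)
```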
