Swan-vis

Latest version: v3.2

3.2

- Fixed dependency issues

3.0

SwanGraph initialization
- Added Boolean options to SwanGraph initialization (`edge_adata`, `end_adata`, and `ic_adata`) that can be set to `False` to skip tracking abundance for the corresponding transcript features

AnnData compatibility
- Allows abundance information to be added directly from an AnnData object (`SwanGraph.add_adata()`), bypassing dense-matrix representations of the data

Documentation
- Updated sample data links
- Added AnnData data format to the file format specs
- Added examples showcasing functionality introduced in v3.0 and v2.5

Deprecation of differential gene and transcript expression tests
- Deprecated `SwanGraph.de_gene_test()` and `SwanGraph.de_transcript_test()` as I have not had luck running `diffxpy` in a while
- Added example tutorial on how to directly use a Swan AnnData to perform differential expression testing with PyDESeq2

Known issues
- Currently Cerberus does not output transcript novelty assignments to GTFs and they are therefore not parsed by Swan; will fix in a future update

2.5

SwanGraph structure changes
- Counts and other expression data (i.e. TPM, PI) are now stored as sparse matrices, substantially reducing both on-disk and in-memory storage
- Added the ability to store gene-level abundance information (`SwanGraph.gene_adata`), calculated separately from transcript-level abundance
- Added AnnData to store intron chain level abundance information (`SwanGraph.ic_adata`)
- Added tracking of stable gene IDs for cases where reference annotation versions don't match (e.g. ENSG000000014.5 --> ENSG000000014)
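
The stable gene ID idea from the last item can be illustrated with a minimal sketch. It assumes that a "stable" ID is simply the gene ID with a trailing numeric `.version` suffix removed; the function name `stable_gene_id` is hypothetical and is not part of Swan's API.

```python
def stable_gene_id(gene_id: str) -> str:
    """Drop a trailing '.<version>' suffix from a gene ID, if present.

    Hypothetical helper for illustration; not Swan's implementation.
    """
    base, sep, version = gene_id.rpartition(".")
    # Only strip when the text after the last dot is purely numeric,
    # so IDs without a version suffix are returned unchanged.
    if sep and version.isdigit():
        return base
    return gene_id


ids = ["ENSG000000014.5", "ENSG000000014", "ENSG000000014.12"]
stable = [stable_gene_id(g) for g in ids]
print(stable)
```

All three inputs map to the same stable ID, which is what lets entries from differently versioned annotations be matched.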

Native compatibility with cerberus transcriptomes
- Will track TSSs, ICs, and TESs called by cerberus based on the names of transcripts provided from the GTF

Other changes
- DIE test now reports top 2 DPI isoforms
- Faster counts and TPM calculations using Scanpy tools
- Added option to sort by isoform's cumulative PI value in the gene report sorting
- Added option to plot browser models directly onto a preexisting Matplotlib axis (`SwanGraph.plot_browser()`)
- Added option to plot BED regions (`SwanGraph.pg.plot_regions()`)
- Added options to calculate TPM across multiple datasets as either the minimum or maximum of the values between the datasets
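
As a rough illustration of the min / max option in the last item, the sketch below combines TPM across datasets in plain Python. It assumes TPM means counts scaled to a per-million library total with no length normalization; the function names `tpm` and `combine_tpm` and the `how` parameter are hypothetical and do not reflect Swan's actual interface.

```python
def tpm(counts):
    """Scale one dataset's transcript counts to per-million units."""
    total = sum(counts.values())
    if total == 0:
        return {tid: 0.0 for tid in counts}
    return {tid: c * 1e6 / total for tid, c in counts.items()}


def combine_tpm(datasets, how="max"):
    """Reduce per-dataset TPM to one value per transcript (min or max)."""
    if how not in ("min", "max"):
        raise ValueError("how must be 'min' or 'max'")
    agg = min if how == "min" else max
    per_dataset = [tpm(c) for c in datasets.values()]
    # Union of transcript IDs seen in any dataset; missing ones count as 0.
    tids = set().union(*per_dataset)
    return {tid: agg(d.get(tid, 0.0) for d in per_dataset) for tid in sorted(tids)}


# Toy counts for two datasets (made-up numbers).
datasets = {
    "rep1": {"tx1": 30, "tx2": 70},
    "rep2": {"tx1": 50, "tx2": 50},
}
tpm_max = combine_tpm(datasets, how="max")
tpm_min = combine_tpm(datasets, how="min")
print(tpm_max, tpm_min)
```

Taking the minimum is the stricter choice when filtering (a transcript must be expressed in every dataset), while the maximum keeps transcripts expressed in at least one.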

Minor bug fixes
- Fixed DIE test bug that occurred when a gene has more than 11 isoforms
- Fixed bugs in `SwanGraph.gen_report()`

2.0

General workflow update
- changed workflow from adding datasets / samples one at a time; now users can pass in one GTF with the union of all expressed transcripts in their data

SwanGraph representation updates
- removed strand from genomic location in `SwanGraph.loc_df`
- added strand to edges in `SwanGraph.edge_df`
- added AnnData representations for tracking transcript abundance as well as automatically-calculated edge, tss, and tes abundance
- added options to represent complex metadata in the AnnData.obs tables
- added a single-cell option to SwanGraph initialization for data with individual cells as samples
- automatically calculates percent isoform use (pi) per dataset per gene (except in single-cell mode)
- added functions to add and store color palettes for different metadata colors
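
Percent isoform use (pi), mentioned in the list above, can be sketched for a single dataset as follows. This assumes pi is each transcript's share of its gene's total expression, expressed as a percentage; the function `percent_isoform` and its inputs are hypothetical and are not Swan's implementation, which works on AnnData objects.

```python
from collections import defaultdict


def percent_isoform(tpm_by_transcript, gene_of):
    """Return pi (0-100) per transcript for one dataset."""
    # Sum expression over all transcripts of each gene.
    gene_totals = defaultdict(float)
    for tid, value in tpm_by_transcript.items():
        gene_totals[gene_of[tid]] += value

    pi = {}
    for tid, value in tpm_by_transcript.items():
        total = gene_totals[gene_of[tid]]
        # Genes with no expression get pi = 0 rather than a division error.
        pi[tid] = 100.0 * value / total if total > 0 else 0.0
    return pi


# Toy expression values (made-up numbers).
tpm_values = {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0}
gene_of = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
pi = percent_isoform(tpm_values, gene_of)
print(pi)
```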

Analysis options update
- implemented a more statistically robust, published method for isoform switching testing (aka differential isoform expression [DIE] testing), as described by [Joglekar et al., 2021](https://www.nature.com/articles/s41467-020-20343-5)
- reworked differential gene and transcript expression testing to work smoothly with AnnData representation
- changed output type of intron retention / exon skipping analysis to be a more descriptive pandas DataFrame
- all analysis code will now automatically store results in the `SwanGraph.adata.uns` dictionary using an automatically-generated key that can be easily regenerated to facilitate different pairwise testing and accessing previous results
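
To give a feel for the effect size behind DIE testing, here is a minimal sketch of a gene-level change in percent isoform use (DPI) between two conditions. It covers only the effect size, not the statistical test, and it assumes one plausible reading of the approach: sum the two largest changes in each direction and report whichever direction is larger. The function `gene_dpi` is hypothetical and is not Swan's code.

```python
def gene_dpi(pi_cond1, pi_cond2):
    """Return (gene-level DPI, contributing isoforms) for one gene."""
    tids = sorted(set(pi_cond1) | set(pi_cond2))
    # Per-isoform change in percent isoform use (condition 2 minus condition 1).
    dpi = {t: pi_cond2.get(t, 0.0) - pi_cond1.get(t, 0.0) for t in tids}

    # Top two increasing and top two decreasing isoforms.
    pos = sorted((t for t in tids if dpi[t] > 0), key=lambda t: dpi[t], reverse=True)[:2]
    neg = sorted((t for t in tids if dpi[t] < 0), key=lambda t: dpi[t])[:2]

    pos_sum = sum(dpi[t] for t in pos)
    neg_sum = abs(sum(dpi[t] for t in neg))
    # Assumption: ties are resolved in favor of the positive direction.
    if pos_sum >= neg_sum:
        return pos_sum, pos
    return neg_sum, neg


# Toy pi values for one gene in two conditions (made-up numbers).
cond1 = {"isoA": 60.0, "isoB": 30.0, "isoC": 10.0}
cond2 = {"isoA": 20.0, "isoB": 35.0, "isoC": 45.0}
dpi_value, top_isoforms = gene_dpi(cond1, cond2)
print(dpi_value, top_isoforms)
```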

Gene report update
- removed `indicate_dataset` option
- added option to group datasets based on metadata columns (`groupby`)
- added option to include / exclude / order datasets based on metadata information (`datasets`)
- added option to represent datasets using color coded bars either derived from the dataset names or metadata columns (`metadata_cols`), as well as draw a legend for the colors
- added option to either plot TPM or pi (`layer`)
- added option to change the color palette the heatmap is plotted in (`cmap`)
- reworked options to indicate differentially-expressed transcripts in conjunction with how differential expression results are now stored (`include_qvals`, `q`, `log2fc`, `qval_obs_col`, `qval_obs_conditions`)
- added option to display values on top of each heatmap cell (`display_numbers`)
- added option to display transcript name as opposed to transcript ID
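
The `q` and `log2fc` options listed above suggest threshold-based flagging of differentially expressed transcripts. The sketch below shows one common way such thresholds are applied. It is an assumption about the general technique, not a description of how `SwanGraph.gen_report()` works internally; the function `flag_significant`, the default values, and the result layout are all hypothetical.

```python
def flag_significant(results, q=0.05, log2fc=1.0):
    """Return transcript IDs passing both the q-value and fold-change cutoffs."""
    return [
        r["tid"]
        for r in results
        # Keep transcripts that are significant and change strongly in either direction.
        if r["qval"] <= q and abs(r["log2fc"]) >= log2fc
    ]


# Toy differential expression results (made-up numbers).
results = [
    {"tid": "tx1", "qval": 0.001, "log2fc": 2.3},
    {"tid": "tx2", "qval": 0.2, "log2fc": 3.0},
    {"tid": "tx3", "qval": 0.01, "log2fc": -1.5},
    {"tid": "tx4", "qval": 0.01, "log2fc": 0.4},
]
significant = flag_significant(results)
print(significant)
```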

Other plotting changes
- removed `indicate_dataset` option
- added functions to change the colors of plotted Swan plots and browser plots

Other utilities
- added functions to output calculated edge, tss, or tes abundance along with details of genomic location
- added functions to calculate TPM or percent isoform use (pi) given specific metadata settings
- changed how SwanGraph saving and loading works

**Note**: Saved Swan objects that were generated with previous versions of Swan will *not* be compatible with 2.0!

1.0.3

Fixes a missing dependency required for allowing exon entries to appear in any order beneath their corresponding transcript entries when loading from a GTF.

1.0.2

Minor improvements

* Allows exon entries to be in any order when loading a GTF
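
Order-independent exon loading can be sketched by grouping exon records by transcript and sorting them by coordinate afterwards. This is a simplified illustration under the assumption of a standard nine-column GTF with a `transcript_id` attribute; the function `exons_by_transcript` is hypothetical and is not Swan's parser.

```python
import re
from collections import defaultdict

# Toy GTF with exons deliberately listed out of order.
GTF = (
    'chr1\ttest\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr1\ttest\texon\t700\t900\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr1\ttest\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr1\ttest\texon\t400\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'chr2\ttest\texon\t100\t200\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n'
    'chr2\ttest\texon\t700\t900\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n'
)


def exons_by_transcript(gtf_text):
    """Collect exon (start, end) pairs per transcript, ordered 5' to 3'."""
    exons = defaultdict(list)
    strands = {}
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tid = match.group(1)
        exons[tid].append((int(fields[3]), int(fields[4])))
        strands[tid] = fields[6]
    # Ascending coordinates for '+' transcripts, descending for '-'.
    return {
        tid: sorted(coords, reverse=(strands[tid] == "-"))
        for tid, coords in exons.items()
    }


ordered = exons_by_transcript(GTF)
print(ordered)
```

Because sorting happens after all records are read, the input order of exon lines no longer matters.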
