htsint

Figure: example functional module

htsint (High-Throughput Sequencing INTegrate) is a Python package for creating gene sets used in the study of high-throughput sequencing data. Its goal is to build functional modules by integrating heterogeneous types of data. These modules are currently based primarily on the Gene Ontology [Ashburner00]; additional data sources will be incorporated as the package matures. The resulting functional modules can then be tested for significant differential expression in RNA-Seq or microarray studies using gene set enrichment analysis [Subramian05]. Shown below is the placement of htsint in an example analysis pipeline.

Figure: example RNA-Seq pipeline
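To make the idea of testing a gene set concrete, here is a minimal sketch in plain Python. It does not use the htsint API: the gene identifiers, the `gene_sets` dictionary, and the helper names `hypergeom_pvalue` and `score_gene_sets` are all hypothetical. It also swaps in a simple hypergeometric over-representation test, which is a different and simpler technique than the gene set enrichment analysis cited above, and it assumes Python 3.8 or later for `math.comb`.

```python
from math import comb


def hypergeom_pvalue(n_universe, n_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X is the number of gene-set members among n_selected genes drawn
    without replacement from n_universe genes, n_set of which belong
    to the gene set.
    """
    upper = min(n_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def score_gene_sets(gene_sets, de_genes, universe):
    """Return an over-representation p-value for each gene set."""
    de_genes = de_genes & universe
    results = {}
    for name, members in gene_sets.items():
        members = members & universe  # ignore genes that were not measured
        overlap = len(members & de_genes)
        results[name] = hypergeom_pvalue(
            len(universe), len(members), len(de_genes), overlap
        )
    return results


# Hypothetical toy data: 20 measured genes, two functional modules,
# and five genes called differentially expressed.
universe = {f"g{i}" for i in range(1, 21)}
gene_sets = {
    "module_A": {"g1", "g2", "g3", "g4"},
    "module_B": {"g5", "g6", "g7"},
}
de_genes = {"g1", "g2", "g3", "g9", "g15"}

results = score_gene_sets(gene_sets, de_genes, universe)
for name, pvalue in sorted(results.items()):
    print(f"{name}: p = {pvalue:.4f}")
```

In this toy example, three of the four genes in `module_A` are differentially expressed, so it receives a small p-value, while `module_B` has no overlap and is not significant. A real analysis would also need a correction for multiple testing across gene sets.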

Who are the intended users?

In its current form, the software is an API library. Because high-throughput sequencing (HTS) pipelines have different goals, and many varied tasks are required to achieve them, htsint was written as a flexible library in a scripting language commonly used in bioinformatics. The target audience is developers who piece together HTS pipelines. That said, an important aim of the project is to provide both abstracted functions for non-Python programmers and convenient means for a high level of customization.

Features

  1. The data are stored and maintained locally in a database.
  2. The database is fully accessible and modifiable through SQLAlchemy.
  3. Visualization tools such as heatmaps and interaction networks are included.
  4. The user has complete control over the information used to generate gene sets.
  5. Easy-to-follow examples require only a basic knowledge of Python.

Citation

If you find htsint useful for your research, or if you want to learn more about the software, please refer to the following publication.

A. J. Richards, A. Herrel, & C. Bonneaud. htsint: a Python library for sequencing pipelines that combines data through gene set generation. BMC Bioinformatics, 2015, 16, 307.