Getting Started with DiseaseLand
As the name implies, DiseaseLand is a disease-focused public database containing carefully curated data from multiple data types (RNA-Seq, expression arrays, and more). We created these short videos to walk through the basic usage of immunology-focused data within DiseaseLand, but you will also find hundreds of projects related to cardiovascular, metabolic, and kidney diseases. If you have trouble viewing the videos, the full playlist may be found here. Written tutorials for OmicSoft products may be found here.
DiseaseLand Video Tutorials
A first look at DiseaseLand
DiseaseLand contains data from tens of thousands of curated samples related to immune disease. These samples come primarily from studies that compare disease with normal tissue, one disease with another, treatment with control, and so on. DiseaseLand is an ongoing project with quarterly updates. To begin, check the number and distribution of samples and comparisons in the current DiseaseLand release.
- Open DiseaseLand [00:33]
- Sample distribution view [00:48]
- Change view grouping [01:27]
- View controller in the task tab [01:41]
- Filter samples using metadata [02:00]
- Comparison distribution view [03:00]
Gene search and comparison view
DiseaseLand data are primarily focused on gene expression from microarray and NGS studies. Users can search for a gene of interest and interactively narrow the results to find relevant projects.
- Search gene (example: serpinb7) [00:08]
- Default comparison view: Disease vs Normal [00:25]
- Details on demand for each comparison [00:55]
- Filter comparisons (example: NGS projects) [01:30]
DiseaseLand microarray projects
Experimental designs vary widely among DiseaseLand projects, and batch effects in microarray projects are difficult to remove. OmicSoft therefore created project-specific views that display expression values according to the experimental design of each project.
- Microarray project view [00:21]
- Filter by project metadata [00:46]
- Customize view using options in task tab [00:58]
- Filter projects by disease areas [01:08]
DiseaseLand RNA-Seq Projects
To make RNA-Seq data in DiseaseLand as comparable as possible between studies, OmicSoft processes the data from each study through a common pipeline, starting from the fastq files. Expression values from RNA-Seq studies are reported as FPKM values with upper quantile normalization. DiseaseLand offers project-specific views as well as a merged view of all samples. Views display log-transformed FPKM values. Samples and projects can be filtered interactively to explore the data.
- Open RNA-Seq quantification project view [00:05]
- Filter projects [00:28]
- Open RNA-Seq quantification view across projects [00:58]
- Filter NGS samples to skin tissue [01:00]
- Change view grouping/profile to sample pathology [01:16]
- Filter by projects to psoriasis [02:00]
- Details on demand for samples [02:30]
- Change view symbol properties [02:45]
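The quantification steps described above (FPKM, upper quantile normalization, log transformation) can be illustrated with a minimal sketch. It is not OmicSoft's pipeline: the function names, the reading of "upper quantile" as the 75th percentile of nonzero values, the target value, and the pseudocount are all assumptions made for illustration.

```python
import math
from statistics import quantiles


def fpkm(read_count, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def upper_quartile_scale(values, target=10.0):
    """Scale one sample so the 75th percentile of its nonzero values equals
    `target` (the target value is an arbitrary assumption)."""
    expressed = [v for v in values if v > 0]
    q75 = quantiles(expressed, n=4, method="inclusive")[2]
    return [v * target / q75 for v in values]


def log2_transform(values, pseudocount=1.0):
    """Log2-transform values; the pseudocount (an assumption) avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


# Made-up expression values for a single sample
raw = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
scaled = upper_quartile_scale(raw)
logged = log2_transform(scaled)
```

Scaling every sample to the same upper quantile before the log transform is one common way to make samples from different studies easier to compare in a merged view.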
How to create and use SampleSets
The SampleSet is a powerful concept and tool that allows users to create custom sample groupings based on data in the Land or in imported tables. This video demonstrates several ways to build a SampleSet from data using selections and filters, then uses SampleSets in Land Analytics to scan the whole of DiseaseLand for differential splicing.
- Create a SampleSet from view selection [00:30]
- Create a SampleSet from filter [01:00]
- Re-search gene to load with SampleSet [01:42]
- Group a SampleSet based on selection [02:16]
- Use SampleSet in Land Analytics: sample grouping to splice [04:00]
- Open result set from Land Analytics [05:30]
- Verification of alternative splicing variants in the Land [05:50]
- Manage SampleSets [06:45]
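Conceptually, a SampleSet is a named set of sample IDs that can be split into groups. The sketch below shows that idea with plain Python data structures. The metadata fields, sample IDs, and helper functions are hypothetical and do not correspond to any DiseaseLand API.

```python
from collections import defaultdict

# Hypothetical sample metadata; field names and values are illustrative only.
samples = [
    {"sample_id": "S1", "tissue": "skin", "disease": "psoriasis", "platform": "RNA-Seq"},
    {"sample_id": "S2", "tissue": "skin", "disease": "normal control", "platform": "RNA-Seq"},
    {"sample_id": "S3", "tissue": "colon", "disease": "ulcerative colitis", "platform": "microarray"},
    {"sample_id": "S4", "tissue": "skin", "disease": "psoriasis", "platform": "microarray"},
]


def build_sample_set(samples, **criteria):
    """Return the IDs of samples whose metadata match every criterion."""
    return {
        s["sample_id"]
        for s in samples
        if all(s.get(key) == value for key, value in criteria.items())
    }


def group_sample_set(samples, sample_ids, field):
    """Split a sample set into named groups by one metadata field."""
    groups = defaultdict(set)
    for s in samples:
        if s["sample_id"] in sample_ids:
            groups[s[field]].add(s["sample_id"])
    return dict(groups)


skin_rnaseq = build_sample_set(samples, tissue="skin", platform="RNA-Seq")
by_disease = group_sample_set(samples, skin_rnaseq, "disease")
```

Here `skin_rnaseq` plays the role of a SampleSet built from a filter, and `by_disease` plays the role of the grouping that a downstream analysis would compare.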
Displaying isoform differential expression
DiseaseLand provides base-pair resolution results for RNA-Seq data. Users can visually inspect isoform differential expression at the transcript, exon, and exon junction levels.
- RNA-Seq gene RPKM/FPKM view [00:35]
- Change grouping [01:08]
- View selected samples in genome browser [01:15]
- Check BAS (BAM summary) files in the OmicSoft genome browser [01:41]
- View exon junctions in genome browser showing difference between isoforms [02:10]
- RNA-Seq transcript View [02:50]
- RNA-Seq exon detail View [03:30]
OmicSoft has manually curated statistical tests (called comparisons) for each project or study included in DiseaseLand. The comparison collection is useful for finding differential expression patterns or signatures shared between studies, such as between a microarray study and an NGS study, or for finding links between a knockout experiment and a compound treatment study.
- Distribution of comparisons in DiseaseLand [00:30]
- Search gene and view comparisons [01:20]
- Search a comparison and view details for all genes [03:30]
- Filter and then view selected comparisons in volcano plot, PCA, Venn diagram, etc. [04:45]
- Create and then search a gene set [07:25]
- DiseaseLand Views for multiple genes [08:30]
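The idea of a signature shared between two comparisons can be sketched as follows. This is only an illustration: the data layout (gene mapped to a log2 fold change and an adjusted p-value), the cutoffs, the placeholder gene names, and the values are assumptions, not DiseaseLand output.

```python
def significant_genes(comparison, min_abs_log2fc=1.0, max_adj_p=0.05):
    """Return {gene: direction} (+1 up, -1 down) for genes passing both cutoffs."""
    hits = {}
    for gene, (log2fc, adj_p) in comparison.items():
        if abs(log2fc) >= min_abs_log2fc and adj_p <= max_adj_p:
            hits[gene] = 1 if log2fc > 0 else -1
    return hits


def shared_signature(comp_a, comp_b, **cutoffs):
    """Genes significant in both comparisons and changing in the same direction."""
    a = significant_genes(comp_a, **cutoffs)
    b = significant_genes(comp_b, **cutoffs)
    return sorted(g for g in a.keys() & b.keys() if a[g] == b[g])


# Made-up comparison results: gene -> (log2 fold change, adjusted p-value)
microarray_cmp = {"GENE_A": (2.1, 0.001), "GENE_B": (-1.5, 0.01), "GENE_C": (0.2, 0.6)}
ngs_cmp = {"GENE_A": (1.8, 0.004), "GENE_B": (1.2, 0.02), "GENE_C": (1.4, 0.03)}

shared = shared_signature(microarray_cmp, ngs_cmp)
```

In this toy data only `GENE_A` is shared: `GENE_B` changes in opposite directions and `GENE_C` is not significant in the microarray comparison.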
The DiseaseLand GeneSet is a powerful tool for grouping and comparing genes, such as members of a gene family, a pathway, or co-regulated genes. These GeneSets can be used to discover DiseaseLand studies that share "genetic signatures" of common up- or down-regulated genes with your GeneSet.
- Search directly for multiple genes [00:24]
- GeneSet Analysis Plot [01:10]
- Create a GeneSet [01:40]
- Search with a GeneSet [02:45]
- GeneSets with differential expression metadata [03:15]
- Using GeneSet differential expression metadata in GeneSet Analysis [05:00]
Note: You can find more details on GeneSet Analysis at this wiki page.
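To make the idea of matching a GeneSet against study results concrete, the sketch below scores one comparison by the fraction of GeneSet genes that change in the expected direction. This is a simplified stand-in, not the GeneSet Analysis method used by DiseaseLand. The function name, gene names, and values are hypothetical.

```python
def signature_score(gene_set, comparison_log2fc):
    """Fraction of GeneSet genes measured in the comparison whose fold change
    has the expected sign (+1 expected up, -1 expected down)."""
    matched = 0
    measured = 0
    for gene, expected in gene_set.items():
        log2fc = comparison_log2fc.get(gene)
        if log2fc is None:
            continue  # gene not measured in this comparison
        measured += 1
        if log2fc * expected > 0:
            matched += 1
    return matched / measured if measured else 0.0


# Hypothetical GeneSet with expected direction of change per gene
gene_set = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1, "GENE_D": -1}

# Made-up log2 fold changes from one comparison (GENE_D not measured)
comparison = {"GENE_A": 2.0, "GENE_B": -0.5, "GENE_C": -1.2}

score = signature_score(gene_set, comparison)
```

Computing such a score for every comparison and sorting by it would be one simple way to rank studies by how well they match a GeneSet's direction metadata.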