Introduction to HumanDisease Content

From Array Suite Wiki



HumanDisease_B37 Land

DiseaseLand is an integrated disease genomics database and visualization tool that helps users explore public and private genomics datasets using OmicSoft's Land technology. It provides a user-friendly interface to functional genomics data for thousands of normal and disease samples, accelerating the discovery of new connections in disease research. In 2016, OmicSoft officially merged ImmunoLand and CVMLand into a single DiseaseLand. DiseaseLand is accessible via tiered subscriptions: users can subscribe to immunological diseases, metabolic & cardiovascular diseases, or both. Its datasets cover, but are not limited to, immunological, metabolic, and cardiovascular diseases. Content inherited from ImmunoLand covers immune-related diseases such as Asthma/Respiratory Diseases, Arthritis, Allergies, COPD, IBD, Psoriasis, SLE (systemic lupus erythematosus), Multiple Sclerosis, and Infectious Diseases. Projects from the former CVMLand provide data on cardiovascular diseases, diabetes mellitus, liver disease, lipid metabolism disorders, and nutrition disorders.

With a heavy focus on publicly available expression microarray and RNA-Seq data, HumanDisease_B37 Land lets users examine gene expression across many different projects, all processed through the same pipeline. It also provides exon-level expression and alternative splicing metrics, visualizations, and functions. OmicSoft's experienced data curation and processing team follows a systematic curation method; refer to our Curation Pipeline for details.

DiseaseLand also features Comparison Views, which allow users to easily search and visualize statistical contrasts between groups of samples using common queries: Treated vs Control, Disease vs Normal, Responder vs Non-Responder, etc. By searching for a gene, users can visualize its association with comparisons across thousands of projects and interactively narrow the results down to projects of interest. (Additional reading: ComparisonLand)

All data have been processed using the same genome build (Human.B37.3) and gene model (OmicsoftGene20130723).

Data Source





Data Types

  • RNA-Seq data
  • Microarray platforms (including Affymetrix and Illumina)
  • Copy number variation
  • Methylation450 BeadChip

Laboratory Methods

Refer to individual projects' clinical metadata for details of how data were generated.

Processing Methods

Expression Data: Omicsoft Affymetrix Microarray Preprocessing

RNA-Seq data: OmicScript Pipeline and Building Land From RNA-Seq Data

Key Meta Data Columns

HumanDisease_B37 is curated at the comparison, sample and project level, with hundreds of meta data columns available.

Comparison level:

  • Comparison Cutoffs: Sample size, fold change, p value and expression cutoffs for each comparison.
  • Comparison details: Comparison Category, Contrast, case and control sample IDs.

Sample level:

  • DiseaseCategory (controlled vocabulary) : Disease category of the sample, based on its detailed disease state. (Primary Grouping column)
  • TissueCategory (controlled vocabulary) : Tissue category such as skin, muscle, heart, kidney etc. (Secondary Grouping column)
  • DiseaseState (controlled vocabulary) : Curated at sample level from each project.
  • SampleSource (controlled vocabulary) : Either cell type or tissue information. When a sample has cell type information, cell type is used. Otherwise, tissue category is used.
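
The SampleSource rule above (cell type when available, otherwise tissue category) can be sketched as a simple fallback. This is an illustration only, not OmicSoft's curation code: the function name `derive_sample_source`, the `CellType` column name, and the sample records are hypothetical.

```python
from typing import Optional


def derive_sample_source(cell_type: Optional[str],
                         tissue_category: Optional[str]) -> Optional[str]:
    """Hypothetical helper mirroring the SampleSource rule described above."""
    # Prefer cell type; a missing or blank value counts as "no cell type information".
    if cell_type is not None and cell_type.strip():
        return cell_type.strip()
    # Otherwise fall back to the tissue category.
    if tissue_category is not None and tissue_category.strip():
        return tissue_category.strip()
    return None


# Made-up sample records; "CellType" is an assumed column name.
samples = [
    {"SampleID": "S1", "CellType": "CD4+ T cell", "TissueCategory": "blood"},
    {"SampleID": "S2", "CellType": "", "TissueCategory": "skin"},
    {"SampleID": "S3", "CellType": None, "TissueCategory": "heart"},
]

sources = {
    s["SampleID"]: derive_sample_source(s["CellType"], s["TissueCategory"])
    for s in samples
}
print(sources)
```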

Sample Distribution by DiseaseCategory:


Project level:

  • ProjectName: Name of the individual project from which the data originate.
  • TherapeuticArea: Specific clinical focus of individual project (can be multiple areas depending on project)

Key Views

Project View

Experimental designs in projects within DiseaseLand are quite different, and batch effects in microarray projects are difficult to remove. Omicsoft created project-specific views to display expression values based on experimental design within each project.


Comparison View

Nearly all projects in DiseaseLand include at least one comparison between subsets of samples. These comparisons are usually modeled on those in the source publication. Each comparison is curated into a "Comparison Type": Treated vs Control, Disease vs Normal, Responder vs Non-Responder, etc.

In DiseaseLand, you can search for a gene and view its expression in all samples or a single project, or you can visualize which comparisons detected up- or down-regulation of the gene. This way, you can identify projects of interest, and discover trends in your favorite gene's regulation.
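
The idea of finding which comparisons detected up- or down-regulation of a gene can be sketched outside the software as a filter over a table of comparison results. Everything below is an assumption for illustration: the row layout, the project and gene names, the cutoff values, and the function `regulation_by_project` are hypothetical and do not represent DiseaseLand's export format or defaults.

```python
from collections import defaultdict

# Made-up rows: (project, comparison type, gene, log2 fold change, p-value).
rows = [
    ("ProjectA", "Disease vs Normal", "GENE1", 2.1, 0.001),
    ("ProjectA", "Treated vs Control", "GENE1", -1.4, 0.01),
    ("ProjectB", "Disease vs Normal", "GENE1", 0.2, 0.60),
    ("ProjectB", "Disease vs Normal", "GENE2", 1.8, 0.02),
]


def regulation_by_project(rows, gene, min_abs_log2fc=1.0, max_p=0.05):
    """Group significant comparisons for one gene by project.

    The fold-change and p-value cutoffs are illustrative defaults only.
    """
    summary = defaultdict(list)
    for project, comparison_type, row_gene, log2fc, p_value in rows:
        if row_gene != gene:
            continue
        # Skip comparisons that do not pass both cutoffs.
        if abs(log2fc) < min_abs_log2fc or p_value > max_p:
            continue
        direction = "up" if log2fc > 0 else "down"
        summary[project].append((comparison_type, direction))
    return dict(summary)


hits = regulation_by_project(rows, "GENE1")
print(hits)
```

Projects that appear in the result are candidates for closer inspection; projects where the gene never passes the cutoffs are left out.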


Comparison Details Views

OmicSoft uses manually curated metadata to generate statistical tests (called comparisons) for each project/study included in DiseaseLand, generally following the comparisons in the original paper. The Comparison collection is useful for finding common differential expression patterns/signatures between studies, such as between a microarray and an NGS study, or for finding links between a gene knockout experiment and a compound treatment study.


Example views include Volcano Plot (upper left), Venn Diagram (upper right), Comparison Heatmap (bottom left) and Significant Genes (bottom right).
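
As a rough sketch of what a shared signature between two comparisons means, the snippet below intersects two sets of significant genes and counts the regions a two-set Venn diagram would show. The gene names, directions, and the function `shared_signature` are hypothetical; this is not how DiseaseLand computes its views.

```python
def shared_signature(sig_a, sig_b):
    """Return genes significant in both comparisons with the same direction."""
    return {
        gene: direction
        for gene, direction in sig_a.items()
        if sig_b.get(gene) == direction
    }


# Made-up significant genes (gene -> direction) from two comparisons,
# e.g. one microarray study and one RNA-Seq study.
microarray_sig = {"GENE1": "up", "GENE2": "down", "GENE3": "up"}
rnaseq_sig = {"GENE1": "up", "GENE2": "up", "GENE4": "down"}

common = shared_signature(microarray_sig, rnaseq_sig)

# Region sizes of a two-set Venn diagram, ignoring direction.
venn_counts = {
    "only_a": len(microarray_sig.keys() - rnaseq_sig.keys()),
    "both": len(microarray_sig.keys() & rnaseq_sig.keys()),
    "only_b": len(rnaseq_sig.keys() - microarray_sig.keys()),
}
print(common, venn_counts)
```

Requiring the same direction is a design choice in this sketch: a gene that is significant in both studies but regulated in opposite directions appears in the Venn overlap yet is excluded from the shared signature.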

Clinical Details View

The OmicSoft curation team carefully curates sample, comparison, and project meta data, including clinical details. As each project has its own key clinical variables, we recommend that users always look at Clinical Details for any specific project/comparison of interest.


Example Clinical Details view for project GSE45734, excluding non-clinical variables.


Related Articles