Introduction to GTEx Land Content
The Genotype-Tissue Expression (GTEx) project aims to create a comprehensive public atlas of gene expression and regulation across multiple human tissues. The GTEx project can help researchers understand the correlation between tissue-specific gene expression and human diseases. According to the GTEx Portal, “GTEx will help researchers to understand inherited susceptibility to disease and will be a resource database and tissue bank for many studies in the future.” The Land contains RNA-Seq and Affymetrix expression data for all normal tissues, providing high-quality normal control samples against which researchers can benchmark their patient or drug-response sample data.
The Land can be used in conjunction with other Lands (TCGA_B37, for instance) to create virtual Lands, and it supports comparisons across datasets because we use controlled vocabularies and process our expression and RNA-Seq data with our standard pipelines.
819 samples with Affymetrix Expression data (HuGene-1_1-st-v1)
9052 samples with RNA-Seq data (based on SRA files)
Affymetrix Expression Array
Illumina TruSeq RNA sequencing
Expression Data: Omicsoft Affymetrix Microarray Preprocessing
- Virus data: View viral sequence counts in Land RNA-seq Data
- 16S Microbial data: Bacterial counts from 16S rRNA
HLA (Class I) identification using the aligned RNA-Seq reads. The OptiType program aligns RNA-Seq reads to the HLA reference genome and then performs an optimization to determine the most likely HLA Class I alleles. See "OptiType - precision HLA typing from next-generation sequencing data.pdf" for a description of the algorithm. GTEx has classified this information as restricted access.
Key Meta Data Columns
Tissue: Tissue category such as brain, blood, heart, lung, kidney etc., using GTEx terminology
Tissue Detail Type: Sub-category within a tissue, such as Brain - Amygdala, Brain - Cortex, Brain - Hippocampus, Brain - Spinal cord (cervical c-1) etc., using GTEx terminology
Land Tissue: Curated by the Omicsoft Land curation team using Omicsoft's controlled vocabularies. Allows users to easily merge the data with other Lands.
Land Sample Type: Curated by the Omicsoft Land curation team using Omicsoft's controlled vocabularies. Allows users to easily merge the data with other Lands.
Tumor or Normal: Indicates whether a sample is a tumor sample or a normal sample. All GTEx samples are normal samples.
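The curated columns above are what make cross-Land merging practical. As a rough illustration only, the following Python sketch assumes that sample metadata from GTEx and from a second Land have been exported to tables. The sample IDs, tissue values, the `Land` column, and the name `OtherLand` are all hypothetical, and pandas is used here for illustration; it is not part of the Land software. The sketch stacks the two tables and uses the shared Land Tissue vocabulary to find tissues with both tumor and normal samples.

```python
import pandas as pd

# Hypothetical metadata exports; IDs and tissue values are made up for illustration.
gtex = pd.DataFrame({
    "Sample ID": ["GTEX-A", "GTEX-B"],
    "Land Tissue": ["lung", "liver"],
    "Tumor or Normal": ["Normal", "Normal"],  # all GTEx samples are normal
})
other = pd.DataFrame({
    "Sample ID": ["T-1", "T-2"],
    "Land Tissue": ["lung", "breast"],
    "Tumor or Normal": ["Tumor", "Tumor"],
})

# Stack the two tables, tagging each row with its (hypothetical) Land of origin.
combined = pd.concat(
    [gtex.assign(Land="GTEx"), other.assign(Land="OtherLand")],
    ignore_index=True,
)

# Because both tables share one vocabulary for Land Tissue, values line up directly.
counts = pd.crosstab(combined["Land Tissue"], combined["Tumor or Normal"])

# Tissues present in both tables, i.e. candidates for tumor-vs-normal comparison.
shared = sorted(set(gtex["Land Tissue"]) & set(other["Land Tissue"]))

print(counts)
print(shared)
```

Without a controlled vocabulary, the same tissue could appear under different spellings in each table and the overlap step would silently miss it.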
GTEx data are tissue-specific. One of the most common ways to visualize the data is to group the samples by Tissue.
If the user is interested in more detailed information within a tissue type, the data can be filtered for one or a few tissue types and then grouped by Tissue Detail Type.
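Outside the Land interface, the same two-step grouping can be approximated on an exported table. The sketch below is a minimal illustration under assumptions: the sample rows and the `Expression` column (standing in for one gene's expression value) are hypothetical, and pandas is not part of the Land software.

```python
import pandas as pd

# Hypothetical export: one row per sample, with one gene's expression value.
samples = pd.DataFrame({
    "Sample ID": ["S1", "S2", "S3", "S4", "S5"],
    "Tissue": ["Brain", "Brain", "Brain", "Heart", "Lung"],
    "Tissue Detail Type": [
        "Brain - Amygdala",
        "Brain - Cortex",
        "Brain - Cortex",
        "Heart - Left Ventricle",
        "Lung",
    ],
    "Expression": [1.0, 3.0, 5.0, 2.0, 4.0],  # illustrative values only
})

# Step 1: group all samples by Tissue and summarize expression per tissue.
by_tissue = samples.groupby("Tissue")["Expression"].median()

# Step 2: filter to one or a few tissues, then group by Tissue Detail Type.
tissues_of_interest = ["Brain"]
subset = samples[samples["Tissue"].isin(tissues_of_interest)]
by_detail = subset.groupby("Tissue Detail Type")["Expression"].median()

print(by_tissue)
print(by_detail)
```

The median is used here only as a simple per-group summary; a box plot per group would be closer to a typical expression view.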