CCLE GTEx TCGA Virtual Land
From Array Suite Wiki
Build a Virtual Land combining GTEx, CCLE, and TCGA
GTEx, CCLE, and TCGA are three of the most popular Lands: GTEx provides thousands of normal tissue samples, CCLE provides over 1000 cell line datasets, and TCGA provides tumor data from a premier oncology consortium. A Virtual Land combining these three Lands enables quick interrogation of expression, mutation, fusion, and other data across normal, tumor, and cell line samples.
In the Sample Distribution View, you will see all available samples for CCLE (cancer cell lines), GTEx (normal tissue), and TCGA (tumor samples).
In the Virtual Land window, provide a name such as "CCLE.GTEx.TCGA.B38", select CCLE_B38, GTEx_B38, and TCGA_B38, and then enter the parameters below. If you select different Lands, adjust the Land prefixes and column names in the parameters accordingly.
Description=TCGA + GTEx + CCLE Human B38 genome virtual land for cross land searching.
//Use TissueCategory to specify the default vertical grouping of samples
PrimaryGrouping=TissueCategory
PrimaryGroupingName=TissueCategory
//Secondary Grouping defines the subsetting of data across a Primary Grouping. Sample Type will be used as the "SampleTypeColumn" to pre-color. DiseaseCategory can also be useful here
SecondaryGrouping=Sample Type
SecondaryGroupingName=Sample Type
//Sample Type is a special column that automatically colors samples by Tumor or Normal status, especially useful for oncology databases. Normal Samples are defined by "ControlSampleLevels"
SampleTypeColumn=Sample Type
ControlSampleLevels=Blood Derived Normal,Bone Marrow Normal,Normal,Solid Tissue Normal,Cord blood,Venous blood,Control Analyte,Buccal Cell Normal,Other Normal,Cell Lines Normal,EBV Immortalized Normal
//For each Source Land, specify the mappings of the source columns to Primary and Secondary Grouping
CCLE_B38.PrimaryGrouping=TissueCategory
//In CCLE, the best column to map to SampleType is "OncoSampleType"
CCLE_B38.SecondaryGrouping=OncoSampleType
CCLE_B38.TissueColumn=Tissue
//In addition to TissueCategory and OncoSampleType, additional columns Tumor Or Normal, DiseaseState, and Tissue should be included from CCLE
CCLE_B38.VirtualColumns=Tumor Or Normal<-Tumor Or Normal,DiseaseState,Tissue
GTEx_B38.PrimaryGrouping=TissueCategory
//In GTEx, the best column to map to SampleType is "Land Sample Type". Notice the different source Land column naming does not matter since it is being remapped to SecondaryGrouping and we named SecondaryGroupingName to Sample Type
GTEx_B38.SecondaryGrouping=Land Sample Type
GTEx_B38.TissueColumn=Tissue
//Notice the remapping of TumorOrNormal from GTEx to Tumor Or Normal to match the formatting of the other two Lands.
//In addition to TissueCategory and Land Sample Type, additional columns Tumor Or Normal, DiseaseState, and Tissue should be included from GTEx
GTEx_B38.VirtualColumns=Tumor Or Normal<-TumorOrNormal,DiseaseState,Tissue
TCGA_B38.PrimaryGrouping=TissueCategory
TCGA_B38.SecondaryGrouping=Land Sample Type
TCGA_B38.TissueColumn=Tissue
//In addition to TissueCategory and Land Sample Type, additional columns Tumor Or Normal, DiseaseState, and Tissue should be included from TCGA
TCGA_B38.VirtualColumns=Tumor Or Normal<-Tumor Or Normal,DiseaseState,Tissue
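The parameters above are key=value settings with // comments, where a Land name followed by a dot scopes a setting to one source Land. To make that structure easier to check before pasting it into the Virtual Land window, here is a minimal Python sketch that reads such text into dictionaries. It is not the parser used by the software; the function names are hypothetical, and the reading of VirtualColumns (comma-separated entries, "Target<-Source" for a rename, a bare name mapped to itself) is an assumption based on the example above.

```python
def parse_virtual_land_params(text):
    """Split parameter text into global settings and per-Land settings."""
    global_settings, land_settings = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and // comments
        key, _, value = line.partition("=")
        land, dot, setting = key.strip().partition(".")
        if dot:
            # "CCLE_B38.TissueColumn" -> setting scoped to Land "CCLE_B38"
            land_settings.setdefault(land, {})[setting] = value.strip()
        else:
            global_settings[land] = value.strip()
    return global_settings, land_settings


def parse_virtual_columns(value):
    """Read 'Target<-Source,Name' into {target: source}.

    Assumption: a bare name (no '<-') keeps its source column name.
    """
    mapping = {}
    for item in value.split(","):
        target, arrow, source = item.partition("<-")
        mapping[target.strip()] = source.strip() if arrow else target.strip()
    return mapping


# Short excerpt of the parameters shown above
EXAMPLE = """\
//Use TissueCategory to specify the default vertical grouping of samples
PrimaryGrouping=TissueCategory
SecondaryGrouping=Sample Type
CCLE_B38.SecondaryGrouping=OncoSampleType
GTEx_B38.SecondaryGrouping=Land Sample Type
GTEx_B38.VirtualColumns=Tumor Or Normal<-TumorOrNormal,DiseaseState,Tissue
"""

global_settings, land_settings = parse_virtual_land_params(EXAMPLE)
gtex_columns = parse_virtual_columns(land_settings["GTEx_B38"]["VirtualColumns"])
print(gtex_columns)
```

Listing the per-Land settings side by side this way makes it easy to spot a Land that is missing a SecondaryGrouping or a column that is remapped inconsistently.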
Virtual Land use case
The key value of combining these Lands is to quickly check a gene's expression across cell lines in tissues of interest, in both tumor and normal samples.
For example, after searching for EGFR, you can switch to the Gene FPKM View, and filter for Tissue Categories of interest (e.g. respiratory system, central nervous system, and breast). Either view the data from all three Lands, or profile the columns by TissueCategory+SourceLand to clearly reveal the differences between normal EGFR expression and tumor EGFR expression, and to verify whether common cancer cell lines reflect tumor expression of the gene in your tissue of interest.
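Profiling by TissueCategory+SourceLand amounts to grouping samples on that pair of columns and comparing a summary value per group. The sketch below illustrates the idea outside the software, using only the Python standard library. The rows are invented placeholder numbers, not real expression values, and the function name is hypothetical; an exported table would replace the placeholder rows.

```python
from collections import defaultdict
from statistics import median

# Placeholder rows: (TissueCategory, SourceLand, FPKM).
# The numbers are invented for illustration and are not real data.
rows = [
    ("breast", "GTEx_B38", 1.0),
    ("breast", "GTEx_B38", 3.0),
    ("breast", "TCGA_B38", 4.0),
    ("breast", "TCGA_B38", 8.0),
    ("breast", "CCLE_B38", 5.0),
]


def median_by_group(rows):
    """Return the median FPKM for each (TissueCategory, SourceLand) pair."""
    groups = defaultdict(list)
    for tissue, land, fpkm in rows:
        groups[(tissue, land)].append(fpkm)
    return {key: median(values) for key, values in groups.items()}


summary = median_by_group(rows)
for (tissue, land), value in sorted(summary.items()):
    print(f"{tissue}\t{land}\t{value}")
```

Each output row corresponds to one column of the profiled view: one tissue category from one source Land.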