Introduction to CCLE Land Content
From Array Suite Wiki
CCLE_B37 and CCLE_B38
The Cancer Cell Line Encyclopedia (CCLE) project is an effort to conduct a detailed genetic characterization of a large panel of human cancer cell lines. OmicSoft's CCLE_B37 Land release provides analysis and visualization of DNA copy number, mRNA expression, mutation data, and more for 1000 cancer cell lines. With Land Measurement Queries, these data can also be used to link pharmacologic vulnerabilities to genomic and expression patterns.
|Land Version||Genome Build||Gene Model|
CCLE_DepMap_Preview_B37 and CCLE_DepMap_Preview_B38
Starting with the 2019R3 release, we integrated DepMap CRISPR and RNAi dependency data into CCLE Lands, which can be found in CCLE_DepMap_Preview_B37 and CCLE_DepMap_Preview_B38.
- CNV, based on segmented CNV files (downloaded)
- CNV Call, GISTIC2 calls
- Expression Intensity Probes (Affymetrix)
- RNA-Seq, including:
  - Single-end and Paired-end fusion calling
  - RNA-Seq somatic mutation, from matched tumor/normal pairs
  - Exon Junction and Exon Usage
  - Expression (Gene- and Transcript-level quantification)
- Gene Dependency (CCLE_DepMap_Preview)
- Affymetrix Expression Array (Affymetrix.HG-U133_Plus_2)
- Illumina HiSeq RNA sequencing (HiSeq 2000)
- Hybrid capture sequencing
Expression Data: Omicsoft Affymetrix Microarray Preprocessing
- Virus data: View viral sequence counts in Land RNA-seq Data
- 16S Microbial data: Bacterial counts from 16S rRNA
HLA (Class I) alleles are identified from the aligned RNA-seq reads. The OptiType program aligns RNA-seq reads to the HLA reference, and then performs an optimization to determine the most likely HLA Class I alleles. See OptiType - precision HLA typing from next-generation sequencing data.pdf for a description of the algorithm.
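As a rough illustration of the optimization step, the sketch below picks, for a single locus, the pair of alleles that explains the most reads. OptiType itself formulates this selection as an integer linear program; the brute-force search here is a simplified stand-in, and the read IDs, allele names, and the `best_allele_pair` helper are hypothetical.

```python
from itertools import combinations_with_replacement

# Hypothetical read-to-allele compatibility for one locus: each read ID maps
# to the set of alleles it aligns to. Allele names are placeholders.
read_hits = {
    "read1": {"A*01", "A*02"},
    "read2": {"A*02"},
    "read3": {"A*03"},
    "read4": {"A*02", "A*03"},
    "read5": {"A*02"},
}


def best_allele_pair(hits):
    """Return (pair, count) for the allele pair that explains the most reads.

    A read counts as explained if it is compatible with either allele of the
    pair. Homozygous pairs (the same allele twice) are also considered.
    """
    alleles = sorted(set().union(*hits.values()))
    best_pair, best_count = None, -1
    for pair in combinations_with_replacement(alleles, 2):
        explained = sum(1 for compatible in hits.values() if compatible & set(pair))
        if explained > best_count:
            best_pair, best_count = pair, explained
    return best_pair, best_count


pair, count = best_allele_pair(read_hits)
print(pair, count)
```

With this toy input, the pair A*02/A*03 explains all five reads, while every other pair explains four or fewer.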
OmicSoft does not reprocess other genomic data, but extracts it directly from the original datasets.
- CRISPR data: Achilles Gene Effect (2019R3)
- RNAi data: DEMETER2 Data v5 (combined)
DNA-seq mutation calls
OmicSoft mutation calls were extracted from the Broad DepMap 2018Q2 data, including variant allele counts (reference and alternative alleles), which are taken from sources in the following priority order:
- HC_AC (hybrid-capture)
- RD_AC ("RainDance")
- WES_CCLE (WXS)
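The priority order above can be sketched as a simple first-available lookup. This is a minimal illustration, not OmicSoft's implementation: the `alt:ref` string format, the dictionary layout, and the `pick_allele_counts` helper are assumptions made for the example.

```python
from typing import Optional, Tuple

# Priority order taken from the list above. The "alt:ref" value format and
# the record layout are assumptions for this illustration.
PRIORITY = ["HC_AC", "RD_AC", "WES_CCLE"]


def pick_allele_counts(record: dict) -> Optional[Tuple[str, int, int]]:
    """Return (source, alt_count, ref_count) from the highest-priority source
    that has a value, or None if none of the listed sources has one."""
    for source in PRIORITY:
        value = record.get(source)
        if not value:  # missing key, None, or empty string
            continue
        alt, ref = value.split(":")
        return source, int(alt), int(ref)
    return None


# Hypothetical variant record: no hybrid-capture counts, so RD_AC is used.
example = {"HC_AC": "", "RD_AC": "12:30", "WES_CCLE": "40:95"}
print(pick_allele_counts(example))
```

Here the hybrid-capture field is empty, so the counts come from RD_AC even though WES_CCLE also has a value.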
Key Meta Data Columns
- Primary Site: The body site from which the cell line sample was derived.
- Histology: Histological types of cancer, such as carcinoma, glioma and sarcoma.
- Land Tissue: The tissue from which the cell line was derived, using OmicSoft's curated controlled vocabulary.
- Land Sample Type: A detailed description of the cell type from which the cell line was derived, using OmicSoft's curated controlled vocabulary.
- Tumor or Normal: Indicates whether a sample is derived from tumor or normal tissue.
Sample Distribution by Primary Site
One of the most common ways to visualize gene expression data is a per-sample scatter plot (e.g. Gene FPKM), with samples grouped by Primary Site on the Y-axis and expression level plotted on the X-axis.
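The grouping behind such a plot can be sketched outside the Land interface as follows. The sample records, values, and the `group_by_site` helper are hypothetical and only illustrate how per-sample expression values are bucketed by Primary Site; they are not CCLE data or an OmicSoft API.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (sample ID, Primary Site, gene FPKM). Values are
# made up for illustration and are not taken from CCLE.
samples = [
    ("cell_line_A", "lung", 12.5),
    ("cell_line_B", "lung", 3.0),
    ("cell_line_C", "breast", 48.2),
    ("cell_line_D", "breast", 51.0),
    ("cell_line_E", "skin", 0.4),
]


def group_by_site(rows):
    """Map each Primary Site to the list of expression values of its samples."""
    groups = defaultdict(list)
    for _sample_id, site, fpkm in rows:
        groups[site].append(fpkm)
    return dict(groups)


groups = group_by_site(samples)
# Each Primary Site is one Y-axis category; its values are the X positions.
for site in sorted(groups):
    values = groups[site]
    print(f"{site}: n={len(values)} median={median(values)} values={values}")
```

The same grouping applies to any other metadata column, such as Histology, by swapping the field used as the key.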
Additional Views include transcript-level and exon-level views, pairwise comparison plots, and direct visualization of RNAseq coverage with the OmicSoft Genome Browser.
Multiple visualizations display frequency and locations of gene mutations in CCLE samples, including the Mutation Landscape View.
Copy Number Variation
Copy number data can be visualized for a gene of interest, grouped by any metadata column, such as Histology.
- Latest Tutorials
- CCLELand Introduction Video
- Introduction to TCGA Land Content
- Introduction to Land Content