Introduction to BeatAML Land

From Array Suite Wiki

Revision as of 23:30, 14 January 2020 by Joseph


BeatAML_B37 and BeatAML_B38

The BeatAML project is an effort to investigate in depth the recently discovered genetic classes of AML. OmicSoft's BeatAML_B37/B38 Land release provides analysis and visualization of DNA somatic mutations, mRNA expression, and more, for 672 tumor specimens collected from 562 acute myeloid leukemia patients.

These data can also link pharmacologic vulnerabilities to genomic and expression patterns through Land Measurement Queries. The drug response measurement data are located here: [1]

Land Version    Genome Build    Gene Model
BeatAML_B37     Human.B37.3     OmicsoftGene20130723
BeatAML_B38     Human.B38       OmicsoftGenCode.V24

Data Source: Vizome

Data Types

   DNASeq Mutation
   DNASeq Somatic Mutation
   RNA-Seq, including:
       Single-end and Paired-end fusion calling
       RNA-Seq somatic mutation, from matched tumor/normal pairs
       Exon Junction and Exon Usage
       Expression (Gene- and Transcript- level quantification) 

Laboratory Methods

   Illumina HiSeq RNA sequencing (HiSeq 2500)
   Illumina Nextera RapidCapture Exome capture probe sequencing 

Processing Methods

RNA-Seq data: OmicScript RNAseq Pipeline and Building Lands From RNA-Seq Data

OmicSoft does not reprocess other genomic data, but extracts data directly from original datasets.

Key Meta Data Columns

   DiseaseState: The type of leukemia the patient was diagnosed with.
   DiseaseStage: Whether the specimen was obtained at relapsed disease or at de novo disease, or whether the patient transformed from another heme malignancy before or at the time of specimen collection. Three options are available: isRelapse|isDenovo|isTransformed; "NA" means the subject was false for all three.
   Histology: Histological types of cancer, such as carcinoma, glioma and sarcoma.
   Tissue: The tissue from which the sample was derived, using OmicSoft's curation Controlled Vocabulary
   Sample Type: A detailed description of the cell type from which the sample was derived, using OmicSoft's curation Controlled Vocabulary
   Tumor or Normal: Indicates whether a sample is from a tumor or normal sample. 

Note: "Unknown" values have been defined by the study authors as "Unknown = not enough information to determine classification"
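
The DiseaseStage column described above is merged from three flags in the BeatAML metadata. A minimal Python sketch of such a merge is shown here for orientation only; it is not OmicSoft's actual curation code. The function name `merge_disease_stage`, the example rows, and the choice to join multiple true flags with "|" are assumptions made for illustration.

```python
from typing import Mapping

# Flag column names as listed in the DiseaseStage description.
STAGE_FLAGS = ("isRelapse", "isDenovo", "isTransformed")


def merge_disease_stage(row: Mapping[str, bool]) -> str:
    """Collapse the three boolean flags into a single DiseaseStage value.

    Assumption: if more than one flag is true, the names are joined with "|".
    "NA" is returned when the subject is false for all three flags.
    """
    active = [flag for flag in STAGE_FLAGS if row.get(flag, False)]
    return "|".join(active) if active else "NA"


# Hypothetical metadata rows, not taken from the BeatAML downloads.
rows = [
    {"isRelapse": True, "isDenovo": False, "isTransformed": False},
    {"isRelapse": False, "isDenovo": False, "isTransformed": False},
]
stages = [merge_disease_stage(r) for r in rows]
print(stages)  # ['isRelapse', 'NA']
```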

Primary Grouping: Disease State

Sample Distribution by DiseaseState

BeatAMLSampleDistribution.jpg

Key Views

Gene Expression

One of the most common ways to visualize gene expression data is a per-sample Scatter plot (e.g. Gene FPKM), with each sample grouped by DiseaseState on the Y-axis, and expression level plotted on the X-axis:

File:BeatAML B37 RNASeqView.jpg

Additional Views include transcript-level and exon-level views, pairwise comparison plots, and direct visualization of RNAseq coverage with the OmicSoft Genome Browser.

DNA Mutation

Whole exome DNA sequencing was performed on all samples. Multiple visualizations display the frequency and locations of gene mutations in BeatAML samples, including the Mutation Landscape View. Many individual genes do not contain DNA mutations in AML, and will display the message "No data is available for charting" because there are no deviations from the wild-type sequence.

To filter down to samples containing only the data type of interest, use the Data filters in the Sample Metadata window.

File:Data filters.jpg

The numeric "Data" filters let the user filter samples by characteristics of the searched gene. For example, the RNAseq expression filter hides any sample that does not pass the defined (linear-scale) threshold. Similarly, the DNAseq mutation filter removes any sample that does not have the requested number of mutations in the given gene. These filters can be used to clean up a plot with a few outliers, or to plot expression by the number of mutations in the gene of interest. In effect, they act as a quicker -omic data query for simple tasks.
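
The filtering logic described above can be sketched in Python. This is only an illustration of the idea, not the Land implementation: the `Sample` record, the function names, and the example values are hypothetical, and the mutation filter is assumed to be a minimum count (the actual filter may accept a range or an exact value).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Sample:
    sample_id: str        # hypothetical identifier
    fpkm: float           # linear-scale expression of the searched gene
    mutation_count: int   # mutations called in the searched gene


def apply_data_filters(
    samples: Iterable[Sample],
    min_fpkm: Optional[float] = None,
    min_mutations: Optional[int] = None,
) -> List[Sample]:
    """Keep only the samples that pass every filter that is set."""
    kept = []
    for s in samples:
        if min_fpkm is not None and s.fpkm < min_fpkm:
            continue  # fails the linear expression threshold
        if min_mutations is not None and s.mutation_count < min_mutations:
            continue  # too few mutations in the gene of interest
        kept.append(s)
    return kept


def expression_by_mutation_count(samples: Iterable[Sample]) -> Dict[int, List[float]]:
    """Group expression values by the number of mutations in the gene."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for s in samples:
        groups[s.mutation_count].append(s.fpkm)
    return dict(groups)


# Made-up values for illustration only.
samples = [
    Sample("S1", 12.5, 0),
    Sample("S2", 0.3, 1),
    Sample("S3", 48.0, 2),
]
expressed = apply_data_filters(samples, min_fpkm=1.0)
mutated = apply_data_filters(samples, min_mutations=1)
print([s.sample_id for s in expressed])  # ['S1', 'S3']
print([s.sample_id for s in mutated])    # ['S2', 'S3']
```

Leaving a filter unset (`None`) means it is not applied, which mirrors using only the filter for the data type of interest.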

File:BeatAML B37 DNASeqView.jpg

Where do I find the data presented in the paper?

  • Disease_type (Figure 3) - a term used in the paper, but not in the official BeatAML metadata downloads. The most likely definition of this term, as referenced in BeatAML Land, is DiseaseStage, which was merged from a combination of three columns in the BeatAML metadata: isRelapse|isDenovo|isTransformed
  • Cytogenetics (Figure 3) - this information is contained in the WHO_Fusion column, located in the Clinical Data metadata table.

Additional Notes

Drug response measurement data for 119 samples whose IDs were not listed in the BeatAML metadata table are included in a separate table here: [2]
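
When working with exported tables, one way to see which drug response samples lack a metadata entry is a simple ID comparison. The sketch below assumes the sample IDs have already been read into Python lists; the function name `split_by_metadata` and the example IDs are hypothetical and do not come from the BeatAML files.

```python
from typing import Iterable, List, Tuple


def split_by_metadata(
    response_ids: Iterable[str], metadata_ids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split drug response sample IDs into those listed in the metadata
    table and those that are not, preserving the input order."""
    known = set(metadata_ids)
    listed: List[str] = []
    unlisted: List[str] = []
    for sample_id in response_ids:
        (listed if sample_id in known else unlisted).append(sample_id)
    return listed, unlisted


# Hypothetical IDs for illustration only.
listed, unlisted = split_by_metadata(
    response_ids=["A-001", "A-002", "A-003"],
    metadata_ids=["A-001", "A-003", "A-004"],
)
print(listed)    # ['A-001', 'A-003']
print(unlisted)  # ['A-002']
```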

Benchmark Paper [3]