Introduction to BeatAML Land
From Array Suite Wiki
BeatAML_B37 and BeatAML_B38
The BeatAML project is an effort to investigate in depth the various genetic classes of AML which have recently been discovered. OmicSoft's BeatAML_B37/B38 Land release provides analysis and visualization of DNA somatic mutations, mRNA expression, and more, for 672 tumor specimens collected from 562 acute myeloid leukemia patients.
With Land Measurement Queries, these data can also link pharmacologic vulnerabilities to genomic and expression patterns. The drug response measurement data are located here: http://omicsoft.com/downloads/land/BeatAML/STable8_Drug_Response.2019R2.txt
|Land Version||Genome Build||Gene Model|
Data Source: Vizome
- DNASeq Mutation
- DNASeq Somatic Mutation
- RNA-Seq, including:
  - Single-end and Paired-end fusion calling
  - RNA-Seq somatic mutation, from matched tumor/normal pairs
  - Exon Junction and Exon Usage
  - Expression (Gene- and Transcript-level quantification)
- Illumina HiSeq RNA sequencing (HiSeq 2500)
- Illumina Nextera RapidCapture Exome capture probe sequencing
RNA-Seq data: OmicScript RNAseq Pipeline and Building Lands From RNA-Seq Data
OmicSoft does not reprocess other genomic data, but extracts data directly from original datasets.
Key Meta Data Columns
- DiseaseState: The type of leukemia the patient was diagnosed with.
- DiseaseStage: Whether the specimen was obtained at the time of relapsed disease or de novo disease, or whether the patient transformed from another heme malignancy before or at the time of specimen collection. Three options are available: isRelapse|isDenovo|isTransformed; "NA" means the subject was false for all three.
- Histology: Histological types of cancer, such as carcinoma, glioma and sarcoma.
- Tissue: The tissue from which the specimen was derived, using OmicSoft's curated Controlled Vocabulary.
- Sample Type: A detailed description of the cell type from which the specimen was derived, using OmicSoft's curated Controlled Vocabulary.
- Tumor or Normal: Indicates whether a sample is from a tumor or normal sample.
Note: "Unknown" values have been defined by the study authors as "Unknown = not enough information to determine classification"
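As an illustration of how a single DiseaseStage value can be derived from the three source flags, here is a minimal Python sketch. The function name `disease_stage` is hypothetical, and joining several true flags with "|" is an assumption; the Land's actual merge rule for specimens with more than one true flag is not documented here.

```python
def disease_stage(is_relapse: bool, is_denovo: bool, is_transformed: bool) -> str:
    """Collapse the three BeatAML flags into one DiseaseStage label.

    Returns "NA" when all three flags are false. Joining multiple true
    flags with "|" is an assumption made for this sketch.
    """
    flags = [
        ("isRelapse", is_relapse),
        ("isDenovo", is_denovo),
        ("isTransformed", is_transformed),
    ]
    active = [name for name, value in flags if value]
    return "|".join(active) if active else "NA"


# Hypothetical specimens used only to exercise the function.
stage_denovo = disease_stage(False, True, False)
stage_none = disease_stage(False, False, False)
```

Here `stage_denovo` holds the label for a de novo specimen, and `stage_none` holds the "NA" label for a specimen that was false for all three flags.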
Primary Grouping: Disease State
Sample Distribution by DiseaseState
One of the most common ways to visualize gene expression data is a per-sample Scatter plot (e.g. Gene FPKM), with each sample grouped by DiseaseState on the Y-axis, and expression level plotted on the X-axis:
Additional Views include transcript-level and exon-level views, pairwise comparison plots, and direct visualization of RNAseq coverage with the OmicSoft Genome Browser.
Whole exome DNA sequencing was performed on all samples. Multiple visualizations display the frequency and locations of gene mutations in BeatAML samples, including the Mutation Landscape View. Many individual genes carry no DNA mutations in AML; searching for such a gene displays the message "No data is available for charting" because there are no deviations from the wild-type sequence.
To filter down to samples containing only the data type of interest, use the Data filters in the Sample Metadata window.
The numeric "Data" filters let the user filter samples by characteristics of the searched gene. For example, the RNAseq expression filter hides any sample whose expression does not pass the defined threshold (on a linear scale). Similarly, the DNAseq mutation filter hides any sample that does not have the requested number of mutations in the gene. These filters can be used to clean up a plot with a few outliers, or to plot expression by the number of mutations in the gene of interest. For simple tasks, they act as a quicker alternative to an -omic data query.
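The filtering logic described above can be sketched outside the Land as follows. This is not OmicSoft's implementation: the `SampleRecord` class, the `apply_data_filters` function, and the sample values are hypothetical, and treating the mutation filter as a minimum count is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleRecord:
    sample_id: str
    fpkm: float          # linear-scale expression of the searched gene
    mutation_count: int  # number of mutations found in the searched gene


def apply_data_filters(
    samples: List[SampleRecord],
    min_fpkm: Optional[float] = None,
    min_mutations: Optional[int] = None,
) -> List[SampleRecord]:
    """Keep only samples that pass every filter that was supplied."""
    kept = []
    for sample in samples:
        if min_fpkm is not None and sample.fpkm < min_fpkm:
            continue  # fails the expression threshold
        if min_mutations is not None and sample.mutation_count < min_mutations:
            continue  # fails the mutation-count threshold (assumed to be a minimum)
        kept.append(sample)
    return kept


# Made-up records for illustration only.
toy_samples = [
    SampleRecord("S1", fpkm=0.2, mutation_count=0),
    SampleRecord("S2", fpkm=15.0, mutation_count=1),
    SampleRecord("S3", fpkm=42.5, mutation_count=0),
]
expressed = apply_data_filters(toy_samples, min_fpkm=1.0)
mutated = apply_data_filters(toy_samples, min_mutations=1)
```

With these toy records, `expressed` drops the low-expression sample and `mutated` keeps only the sample carrying a mutation.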
Ex Vivo Drug Sensitivity Data
A key strength of the BeatAML dataset is the collection of drug sensitivity measurements performed on ex-vivo samples from the patients. This allows discovery of correlations between mutation status, expression patterns, and drug sensitivity.
The OmicSoft Server administrator will first need to add the drug measurement data as an orthogonal data type, which will make the data available for analysis.
After this, you can search for any drug ID in the same way you would search for a gene ID, and plot the sensitivity of each sample:
With the measurement data, you can identify gene mutations or expression patterns that correlate with drug sensitivity, using tools like Measurement to Expression/Measurement to Mutation integration, or by defining cohorts with Omic data queries and using Sample Grouping to Expression/Sample Grouping to Mutation.
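To make the idea of relating a measurement to expression concrete, the sketch below computes a Spearman rank correlation between per-sample expression and per-sample drug sensitivity. Spearman correlation is chosen here only for illustration; the statistic used by the Land's Measurement to Expression tool is not stated in this page. All function names and values are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def correlate_by_sample(expression: Dict[str, float],
                        sensitivity: Dict[str, float]) -> float:
    """Spearman correlation over the samples present in both inputs."""
    shared = sorted(set(expression) & set(sensitivity))
    if len(shared) < 2:
        raise ValueError("need at least two shared samples")
    x = [expression[s] for s in shared]
    y = [sensitivity[s] for s in shared]
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values keyed by sample ID; S5 has no expression value and is ignored.
toy_expression = {"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 4.0}
toy_sensitivity = {"S1": 10.0, "S2": 20.0, "S3": 30.0, "S4": 40.0, "S5": 5.0}
rho = correlate_by_sample(toy_expression, toy_sensitivity)
```

Only samples present in both dictionaries contribute, which mirrors the practical situation where not every specimen has both expression and drug measurements.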
Where do I find the data presented in the paper?
- Disease_type (Figure 3) - a term used in the paper, but not in the official BeatAML metadata downloads. The most likely definition of this term, as referenced in BeatAML Land, is DiseaseStage, which was merged from a combination of three columns in the BeatAML metadata: isRelapse|isDenovo|isTransformed
- Cytogenetics (Figure 3) - this information is contained in the WHO_Fusion column, located in the Clinical Data metadata table.
Drug response measurement data for 119 samples whose IDs were not listed in the BeatAML metadata table are included in a separate table here: http://omicsoft.com/downloads/land/BeatAML/STable8_Drug_Response.2019R2.txt
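If you download the table, one way to check which drug response samples lack a metadata entry is sketched below. The file layout is an assumption: the sketch treats it as tab-delimited text with a header row, and the column names `SampleID`, `Drug`, and `Response` are hypothetical placeholders, so inspect the real header before adapting it.

```python
import csv
import io
from typing import Dict, Iterable, List


def read_tab_delimited(text: str) -> List[Dict[str, str]]:
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def samples_missing_from_metadata(
    drug_rows: List[Dict[str, str]],
    metadata_ids: Iterable[str],
    id_column: str = "SampleID",  # hypothetical column name
) -> List[str]:
    """Return sorted sample IDs that have drug rows but no metadata entry."""
    drug_ids = {row[id_column] for row in drug_rows}
    return sorted(drug_ids - set(metadata_ids))


# Made-up rows for illustration; they are not taken from the real file.
toy_text = (
    "SampleID\tDrug\tResponse\n"
    "S1\tdrugA\t0.5\n"
    "S2\tdrugA\t0.9\n"
    "S3\tdrugB\t0.1\n"
)
toy_rows = read_tab_delimited(toy_text)
missing = samples_missing_from_metadata(toy_rows, metadata_ids={"S1", "S3"})
```

For the real file, replace `toy_text` with the contents read from your downloaded copy and pass the sample IDs from your Land metadata export.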
Other terms for searching: beat aml, beataml