RPKM Transcript

From Array Suite Wiki

Transcript-Level Expression Quantification

Calculates Reads Per Kilobase of transcript model per Million mapped reads (RPKM) at the transcript level.

This method normalizes for differences in the number of mapped reads between samples. In addition to the number of mapped reads, transcript length is taken into account, which makes it possible to compare expression levels of different transcripts.

A read is counted if it overlaps an exon or an exon junction. Reads that do not overlap any known exon model are not counted.

The formula is:

RPKM = (Read Counts of a transcript * 1,000 * 1,000,000) / (Length of transcript * Total Count)
Read Counts of a transcript = the number of reads assigned to the transcript. A uniquely mapped single-end read that aligns within an existing transcript (i.e. within an exon region or an annotated exon junction region) is counted as 1; otherwise it is counted as 0.
Total Count = the number of mapped reads for a sample. A read that maps to multiple locations is counted only once toward Total Count.
Length of transcript = the length of the transcript in nucleotides. If the "Automatically trim UTRs" option is selected in the Omicsoft Report Gene/Transcript Count function, this is the adjusted transcript length (after trimming UTRs).
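
As a rough illustration of the calculation, the Python sketch below applies the per-kilobase factor (1,000) and the per-million factor (1,000,000) to a single transcript. The function name `rpkm` and the example numbers are hypothetical and are not part of Array Suite; the transcript length is assumed to be in nucleotides and the total count to be the number of mapped reads in the sample.

```python
def rpkm(read_count: float, transcript_length: int, total_count: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    read_count        -- reads assigned to the transcript
    transcript_length -- transcript length in nucleotides (assumed unit)
    total_count       -- number of mapped reads in the sample
    """
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    per_kilobase = 1_000      # nucleotides -> kilobases
    per_million = 1_000_000   # reads -> millions of reads
    return (read_count * per_kilobase * per_million) / (transcript_length * total_count)


# Hypothetical example: 500 reads on a 2,000 nt transcript,
# in a sample with 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
print(example_value)  # 25.0
```

In this example, 500 reads over 2 kb is 250 reads per kilobase, and dividing by 10 (million mapped reads) gives 25.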

In contrast to Transcripts Per Million (TPM), the RPKM/FPKM values in a sample do not necessarily add up to 1,000,000, so a transcript's RPKM/FPKM expression level can be affected by the average length of the transcripts expressed in that sample. To compensate for this, RPKM values can be rescaled to TPM or quantile-scaled.
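
One common way to rescale RPKM to TPM is to divide each value by the sample's RPKM sum and multiply by 1,000,000, so that the values in the sample add up to 1,000,000. The sketch below assumes that approach; the function name `rpkm_to_tpm` and the input values are hypothetical, and it does not reproduce any specific Array Suite function or cover quantile scaling.

```python
from typing import List


def rpkm_to_tpm(rpkm_values: List[float]) -> List[float]:
    """Rescale the RPKM values of one sample so that they sum to 1,000,000."""
    total = sum(rpkm_values)
    if total <= 0:
        raise ValueError("the sum of RPKM values must be positive")
    return [value / total * 1_000_000 for value in rpkm_values]


# Hypothetical RPKM values for three transcripts in one sample.
sample_rpkm = [25.0, 50.0, 25.0]
sample_tpm = rpkm_to_tpm(sample_rpkm)
print(sample_tpm)  # [250000.0, 500000.0, 250000.0]
```

The rescaling has to be done per sample, because the RPKM sum differs from sample to sample.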
