RPKM, FPKM and TPM, clearly explained

from StatQuest

It used to be that when you did RNA-seq, you reported your results in RPKM (Reads Per Kilobase Million) or FPKM (Fragments Per Kilobase Million). However, TPM (Transcripts Per Million) is now becoming quite popular. Since there seems to be a lot of confusion about these terms, I thought I’d use a StatQuest to clear everything up.

These three metrics attempt to normalize for sequencing depth and gene length. Here’s how you do it for RPKM:

  1. Count up the total reads in a sample and divide that number by 1,000,000 – this is our “per million” scaling factor.
  2. Divide the read counts by the “per million” scaling factor. This normalizes for sequencing depth, giving you reads per million (RPM).
  3. Divide the RPM values by the length of the gene, in kilobases. This gives you RPKM.
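The three steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `rpkm`, the toy counts, and the gene lengths are made up for the example, and the sketch assumes that the total reads in the sample is simply the sum of the per-gene counts.

```python
def rpkm(counts, lengths_kb):
    """Return RPKM per gene from raw read counts and gene lengths in kb."""
    # Step 1: "per million" scaling factor from the total reads in the sample
    # (assumed here to be the sum of the per-gene counts).
    per_million = sum(counts) / 1_000_000
    # Step 2: reads per million (RPM) normalizes for sequencing depth.
    rpm = [c / per_million for c in counts]
    # Step 3: divide RPM by gene length in kilobases to get RPKM.
    return [r / length for r, length in zip(rpm, lengths_kb)]


# Hypothetical toy sample: three genes that are 2 kb, 4 kb and 1 kb long.
counts = [10, 20, 30]
lengths_kb = [2.0, 4.0, 1.0]
rpkm_values = rpkm(counts, lengths_kb)
print(rpkm_values)
```

With so few reads the numbers look huge, because the “per million” factor is tiny; real samples have millions of reads.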

FPKM is very similar to RPKM. RPKM was made for single-end RNA-seq, where every read corresponds to a single fragment that was sequenced. FPKM was made for paired-end RNA-seq. With paired-end RNA-seq, two reads can correspond to a single fragment, or, if one read in the pair did not map, one read can correspond to a single fragment. The only difference between RPKM and FPKM is that FPKM takes into account that two reads can come from one fragment (and so it doesn’t count this fragment twice).
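Here is a rough sketch of the difference between counting reads and counting fragments. The record layout (a fragment ID plus the gene a mapped read landed on) and the names `read_counts` and `fragment_counts` are assumptions made for this example, not the format of any particular aligner.

```python
from collections import defaultdict

# Hypothetical mapped reads from a paired-end run: (fragment_id, gene).
mapped_reads = [
    ("frag1", "geneA"),  # both mates of frag1 mapped
    ("frag1", "geneA"),
    ("frag2", "geneA"),  # only one mate of frag2 mapped
    ("frag3", "geneB"),  # both mates of frag3 mapped
    ("frag3", "geneB"),
]


def read_counts(mapped_reads):
    """Count every mapped read (what RPKM starts from)."""
    counts = defaultdict(int)
    for _fragment_id, gene in mapped_reads:
        counts[gene] += 1
    return dict(counts)


def fragment_counts(mapped_reads):
    """Count each fragment once, however many of its mates mapped (FPKM)."""
    fragments = defaultdict(set)
    for fragment_id, gene in mapped_reads:
        fragments[gene].add(fragment_id)
    return {gene: len(ids) for gene, ids in fragments.items()}


reads_per_gene = read_counts(mapped_reads)          # geneA: 3, geneB: 2
fragments_per_gene = fragment_counts(mapped_reads)  # geneA: 2, geneB: 1
print(reads_per_gene, fragments_per_gene)
```

After this counting step, the depth and length normalization is the same as for RPKM.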

TPM is very similar to RPKM and FPKM. The only difference is the order of operations. Here’s how you calculate TPM:

  1. Divide the read counts by the length of each gene in kilobases. This gives you reads per kilobase (RPK).
  2. Count up all the RPK values in a sample and divide this number by 1,000,000. This is your “per million” scaling factor.
  3. Divide the RPK values by the “per million” scaling factor. This gives you TPM.
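The TPM steps can be sketched the same way. Again, the function name `tpm` and the toy numbers are made up for illustration.

```python
def tpm(counts, lengths_kb):
    """Return TPM per gene from raw read counts and gene lengths in kb."""
    # Step 1: reads per kilobase (RPK) normalizes for gene length first.
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    # Step 2: "per million" scaling factor from the sum of the RPK values.
    per_million = sum(rpk) / 1_000_000
    # Step 3: divide RPK by the scaling factor to get TPM.
    return [r / per_million for r in rpk]


# Hypothetical toy sample: three genes that are 2 kb, 4 kb and 1 kb long.
counts = [10, 20, 30]
lengths_kb = [2.0, 4.0, 1.0]
tpm_values = tpm(counts, lengths_kb)
print(tpm_values, sum(tpm_values))
```

Because the scaling factor is built from the RPK values themselves, the TPM values in a sample add up to one million by construction.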

So you see, when calculating TPM, the only difference is that you normalize for gene length first, and then normalize for sequencing depth second. However, the effects of this difference are quite profound.
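One way to see what the change in order does: within a single sample, TPM is just RPKM rescaled so that the values sum to one million. The sketch below checks this numerically on made-up data; the helper name `tpm_from_rpkm` is hypothetical.

```python
def rpkm(counts, lengths_kb):
    """Depth first, then length (total reads assumed to be sum of counts)."""
    per_million = sum(counts) / 1_000_000
    return [c / per_million / length for c, length in zip(counts, lengths_kb)]


def tpm(counts, lengths_kb):
    """Length first, then depth."""
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1_000_000
    return [r / per_million for r in rpk]


def tpm_from_rpkm(rpkm_values):
    """Rescale one sample's RPKM values so that they sum to one million."""
    total = sum(rpkm_values)
    return [v / total * 1_000_000 for v in rpkm_values]


# Hypothetical toy sample.
counts = [10, 20, 30]
lengths_kb = [2.0, 4.0, 1.0]

direct = tpm(counts, lengths_kb)
rescaled = tpm_from_rpkm(rpkm(counts, lengths_kb))
print(direct)
print(rescaled)
```

The rescaling factor depends on the sample, so the two units only start to disagree when you compare values across samples.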

When you use TPM, the sum of all TPMs in each sample is the same. This makes it easier to compare the proportion of reads that mapped to a gene in each sample. In contrast, with RPKM and FPKM, the sum of the normalized reads in each sample may be different, and this makes it harder to compare samples directly.

Here’s an example. If the TPM for gene A in Sample 1 is 3.33 and the TPM in Sample 2 is 3.33, then I know that the exact same proportion of total reads mapped to gene A in both samples. This is because the sum of the TPMs in each sample always adds up to the same number (so the denominator required to calculate the proportions is the same, regardless of which sample you are looking at).

With RPKM or FPKM, the sum of normalized reads in each sample can be different. Thus, if the RPKM for gene A in Sample 1 is 3.33 and the RPKM in Sample 2 is 3.33, I would not know if the same proportion of reads in Sample 1 mapped to gene A as in Sample 2. This is because the denominator required to calculate the proportion could be different for the two samples.
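To make the point about sums concrete, here is a small sketch with two made-up samples. The counts and gene lengths are hypothetical toy values; the only thing the example is meant to show is that the TPM totals match while the RPKM totals do not.

```python
def rpkm(counts, lengths_kb):
    """Depth first, then length (total reads assumed to be sum of counts)."""
    per_million = sum(counts) / 1_000_000
    return [c / per_million / length for c, length in zip(counts, lengths_kb)]


def tpm(counts, lengths_kb):
    """Length first, then depth."""
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1_000_000
    return [r / per_million for r in rpk]


# Hypothetical toy data: four genes (2 kb, 4 kb, 1 kb, 10 kb), two samples.
lengths_kb = [2.0, 4.0, 1.0, 10.0]
sample1 = [10, 20, 5, 0]
sample2 = [30, 60, 15, 1]

rpkm_sum1 = sum(rpkm(sample1, lengths_kb))
rpkm_sum2 = sum(rpkm(sample2, lengths_kb))
tpm_sum1 = sum(tpm(sample1, lengths_kb))
tpm_sum2 = sum(tpm(sample2, lengths_kb))

print("RPKM totals:", rpkm_sum1, rpkm_sum2)  # the two totals differ
print("TPM totals: ", tpm_sum1, tpm_sum2)    # both are one million
```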

Source – StatQuest

23 comments

  1. Attention : “However, TPM (Transcripts Per Kilobase Million)” instead of “However, TPM (Transcripts Per Million)”

    • I was just pointed here by a colleague to help me understand the benefit of TPM over RPKM (this is not my field) and i think this correction is mistaken. TPM is measuring the transcription frequency of a specific gene; the length of the gene is absorbed into the calculation and shouldn’t appear in the units. I’m not sure if TeX will render here, but i’ll give it a shot:

      In the TPM calculation, $RPK_A$ (the reads per kilobase for gene A) is $n_A/\ell_A$, where $n_A$ is the number of reads that map to gene A and $\ell_A$ is the length of gene A. This measures the transcription rate for gene A. Then the scaling factor is $(\sum_i RPK_i) / (10^6)$, where $i$ ranges over all of the genes, including A. So the final value TPM of gene A is $(n_A/\ell_A)/(\sum_i n_i/\ell_i)\times 10^6$, which measures the *relative* rate of transcription of gene A (with the decimal point moved 6 spaces to the right). Both $\ell_A$ and $\ell_i$ have kilobase units, which cancel out.

      (I thought this might just be an accident of terminology, but other sources seem to expand TPM as “transcripts per million” as well. I also just realized that this comment might be trying to make the same correction i am; i’m not sure if the post itself was changed in response to this comment or remains as it was posted, so apologies for any misplaced attribution.)

  2. If you search for the page “What the FPKM? A review of RNA-Seq expression units” and read what it says about TPM, it states that you should never compare TPM between samples and that it is only for within-sample comparisons. Please comment.

    • I tend to agree with “you should never compare TPM between samples and that it’s only for within sample comparisons”

      Otherwise, I don’t think “Count up all the RPK values in a sample and divide this number by 1,000,000.” is interpretable across samples anymore. This is basically a sum of the sequencing depths of all genes, which is fundamentally different from the total number of mapped reads.

      In my opinion, RPKM and TPM seem to be for different purposes.

    • Daniel J McGoldrick

      The authors actually state, “TPM is probably the most stable unit across experiments, though you still shouldn’t compare it across experiments.” You munged the quote and its meaning. There is no “never”, and the implication that RPKM or FPKM would be better is certainly false.

  3. I think you need to replace ‘gene’ with ‘transcript’. It’s not gene length that counts, but transcript length.

  4. Hi,
    I currently work with qPCR, but just recently was introduced to RNA-Seq ways to report results when a paper about the whole transcriptome of the organism I work with came out. It happens that I would like to compare my qPCR results with the results reported in the transcriptome paper (which are in RPKM) and I don’t know if I can. I’m reading a lot of papers, trying to understand it better, but I couldn’t come to a conclusion yet. Could you give me a hand, please?

    Is it possible to compare qPCR results to RNA-Seq results?

    Many thanks!

    • You can’t compare the “numbers”, but you can compare the results of the analysis or use the evidence to support your claims.

  5. “TPM is very similar to RPKM and FPKM. The only difference is the order of operations. Here’s how you calculate TPM.” If the only difference is the order of operations, then isn’t TPM always equal to RPKM? Why do we need TPM at all?

    • This is similar to why the order of operations like multiplication, addition, brackets matters. Look at the toy examples. The resulting RPKMs and TPMs are not multiples of each other, but yield different proportions for genes/samples.

  6. Is there any paper that discusses the efficacy of TPM (like your video does)?

  7. RPKM is obtained by first normalizing for sequencing depth and then for gene length in kb. Is there any reason why it wasn’t called RPMK instead? RPMK seems to reflect the order of operations more accurately.

  8. Can we directly correlate FPKM/TPM with expression? In other words, do high FPKM/TPM values mean high expression for a gene?

    Or how do I find the highly expressed genes within an RNA-seq sample?

  9. Hi
    Could you tell me why we need gene-length-based normalization (as in TPM and RPKM/FPKM)?
    Is there any biological explanation?

    Thanks in advance.

    • Dear Singha,

      If you do not consider the gene length, then you will consider a gene with more reads mapped to be expressed higher, while that may not be the case.

      For example, gene A has 30 reads mapped, while gene B has 50 reads mapped. Irrespective of the length, on an absolute scale, one could say that gene B is expressed higher, but if gene A is 1 kb long and gene B is 2 kb long, gene A has 30 reads per kb, while gene B has 25 reads per kb, which makes gene A expressed more on a relative scale.
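The arithmetic in this reply can be checked with a couple of lines of Python. The variable names are made up; the numbers are the ones from the reply.

```python
# Read counts and gene lengths from the example above.
reads = {"geneA": 30, "geneB": 50}
length_kb = {"geneA": 1.0, "geneB": 2.0}

# Reads per kilobase: divide each gene's reads by its length in kb.
reads_per_kb = {gene: reads[gene] / length_kb[gene] for gene in reads}

most_reads = max(reads, key=reads.get)                      # geneB
highest_per_kb = max(reads_per_kb, key=reads_per_kb.get)    # geneA
print(reads_per_kb, most_reads, highest_per_kb)
```

Gene B has more reads in total, but gene A has more reads per kilobase, so the ranking flips once length is taken into account.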

  10. Hello,

    I wonder if the gene length used to calculate statistics for RNAseq includes introns? Since this is RNAseq, would it be more reasonable to sum up just the total lengths of all exons?

    Thanks!

  11. “Thus, if the RPKM for gene A in Sample 1 is 3.33 and the RPKM in Sample 2 is 3.33, I would not know if the same proportion of reads in Sample 1 mapped to gene A as in Sample 2. ”
    I disagree. The proof that they are the same is easier than for TPM.
    Let Ci be the count of mapped reads for transcript i, let Si be its size, and let me not care whether things are per million or per kilobase, since those are just constants. Let Sum(Ci) be the sum over i of the counts. Then “RPKM” for transcript i is Yi = (Ci/Sum(Ci))/Si. If the transcript is the same size in two different samples, then two samples have the same Yi exactly when Ci/Sum(Ci) is the same, which is what I mean when I say proportion of mapped reads.
    TPM is trying to not let bigger transcripts have more say just because they are big, even though we have more data there. All three are letting highly expressed transcripts have more say.
    Try this example in Excel. There are 11 genes, the first 10 of size 100kb, and 11th of size 1kb. Counts for first 10 genes for sample A are 500, and for B are 1000. Last gene has 10000 for A and 5000 for B. Total counts are 15000 for both samples, so RPKM has no effect, and thinks first 10 genes are 2-fold higher for B than A, and 11th is 2-fold higher in A (goes exactly as the proportion of mapped reads). TPM listens almost entirely to gene 11 since it is small and abundant. It thinks sample A has about 4-fold lower expression for the first 10 genes, and about the same expression for 11th gene (nothing like what the proportion of mapped reads say).
    A person from the 2D-gel or microarray world would note “over 90% of the transcripts are almost exactly 2 fold higher in B, so I think they are not really different in A and B, but that transcript 11 is 4-fold higher in A seems more likely in my experience of biology”. They might add “I’m not sure I trust transcript 11 since it has incredibly high reads per kb”.
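For anyone who would rather not open Excel, the 11-gene example in this comment can be reproduced with a short Python sketch. The function names are made up, and the total reads per sample is assumed to be the sum of the gene counts, as in the comment.

```python
def rpkm(counts, lengths_kb):
    """Depth first, then length (total reads assumed to be sum of counts)."""
    per_million = sum(counts) / 1_000_000
    return [c / per_million / length for c, length in zip(counts, lengths_kb)]


def tpm(counts, lengths_kb):
    """Length first, then depth."""
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1_000_000
    return [r / per_million for r in rpk]


# Ten 100 kb genes followed by one 1 kb gene, as described in the comment.
lengths_kb = [100.0] * 10 + [1.0]
counts_a = [500] * 10 + [10_000]
counts_b = [1_000] * 10 + [5_000]

rpkm_a, rpkm_b = rpkm(counts_a, lengths_kb), rpkm(counts_b, lengths_kb)
tpm_a, tpm_b = tpm(counts_a, lengths_kb), tpm(counts_b, lengths_kb)

# Fold changes for gene 1 (B over A) and gene 11 (A over B).
rpkm_fold_gene1 = rpkm_b[0] / rpkm_a[0]
rpkm_fold_gene11 = rpkm_a[10] / rpkm_b[10]
tpm_fold_gene1 = tpm_b[0] / tpm_a[0]
tpm_fold_gene11 = tpm_a[10] / tpm_b[10]

print("RPKM:", rpkm_fold_gene1, rpkm_fold_gene11)
print("TPM: ", tpm_fold_gene1, tpm_fold_gene11)
```

The numbers agree with the comment: RPKM gives a 2-fold change in both directions, while TPM gives roughly a 4-fold change for the first ten genes and almost no change for gene 11. Which of the two readings is closer to the biology is the commenter's argument, not something the code can settle.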

  12. Thank you for the kind explanation. I have a question (this field is so new to me).

    Assume that we have three RNA samples, treated with reagent 1 (sample 1), reagent 2 (sample 2) and reagent 3 (sample 3), and that we sequenced them to look at the expression of transcripts A, B and C (assume they have the same length) in each condition.

    The raw data is like this (in sample 1, 2, 3 order); A (30, 30, 30) / B (20, 20, 20) / C (1, 1, 100), and they are calculated into RPKM or TPM.

    I think the data will tell us that transcripts A and B are decreased in sample 3 compared to samples 1 and 2 (even though they are not actually decreased in the raw data), because the increased level of transcript C increases the total reads.

    I think I may have misunderstood something. Please help me!
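The scenario in this question can be checked numerically. The sketch below assumes every transcript is 1 kb long (the question only says the lengths are equal) and uses a made-up `tpm` helper that follows the steps in the post.

```python
def tpm(counts, lengths_kb):
    """Length first, then depth, following the TPM steps in the post."""
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1_000_000
    return [r / per_million for r in rpk]


# Transcripts A, B, C; equal lengths, assumed to be 1 kb for illustration.
lengths_kb = [1.0, 1.0, 1.0]
sample1 = [30, 20, 1]
sample3 = [30, 20, 100]

tpm1 = tpm(sample1, lengths_kb)
tpm3 = tpm(sample3, lengths_kb)

# Transcript A has 30 raw reads in both samples, yet its TPM differs.
a_in_sample1 = tpm1[0]
a_in_sample3 = tpm3[0]
print(a_in_sample1, a_in_sample3)
```

So the expectation in the question holds for these numbers: the TPM for A is much lower in sample 3 even though its raw count is unchanged. TPM (and RPKM) measure each transcript relative to everything else in the sample, so a large increase in C pushes the relative values of A and B down.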

  13. Sorry if this offends, but I feel like you have mixed everything up.

    The difference between RPKM (RPM) and TPM is a basic assumption about how to measure the total quantity in each sample: by total read counts or by total transcript counts.

    I guess people choose one of the two depending on how they understand that assumption, and for anyone who knows what they are doing, one of them is definitely the better choice before any calculation or presentation.

    RPM normalizes the samples by read counts and is defined for comparing the expression of the same gene between different samples. RPKM is additionally defined for comparing the transcription of different genes within the same sample.
    The first step of TPM converts the read counts into transcript counts for each gene in every sample. The second step then normalizes the samples by transcript counts, for comparing the same gene across samples.

    Go back to your calculation.
    If you just want a good number for a gene’s percentage in different samples in the RPKM case, why not simply look at RPM?
    It doesn’t make any sense to normalize the samples by read counts and then present the proportions within each sample by transcript counts.
    If you insist on doing this with RPKM, then compare the numbers directly to get the fold change and don’t apply any additional normalization, because that is exactly abandoning RPKM and calculating TPM.

    • I think both of your explanations make sense. As you said, when doing DEG analysis, we only need to compare one gene across different samples, so we only need to normalize for library size (i.e. sequencing depth), and CPM/RPM is enough. Then, if we want to compare the expression of different genes within one sample, we just need to divide CPM or RPM by the gene length in kilobases. But we usually have biological replicates and want to compare them together, and we would like to know whether the proportional distribution of genes in one genotype/treatment is the same as in the others. From my understanding, this is why TPM is used. What do you think?

  14. I have two datasets. One is provided in RPKM and the other in FPKM. Are the absolute values from the two comparable since FPKM doesn’t count the same fragment twice?

    Thanks!

  15. Hi, I am new to RNA-seq analysis. I want to compare the gene expression of a control sample at different time points. I have computed the counts with featureCounts. Which measure is suitable for getting the gene expression at each time point: TPM, RPKM or FPKM? I do not need DEGs at this point.
