RNA-Seq to Measure Protein Abundance

Within the proteomics community there is substantial interest in developing novel label-free quantitative proteomic strategies.

One approach takes advantage of the growing number of studies that integrate gene and protein expression data generated with newer technologies such as next-generation transcriptome sequencing (RNA-Seq) and highly sensitive mass spectrometry (MS) instrumentation. With these more recent datasets available, it is worth revisiting the correlation between gene and protein expression to determine whether gene expression data can serve as an indirect benchmark for comparing protein-level quantification methods.

A team of researchers from the University of Michigan and the Chinese Academy of Sciences used publicly available mouse data to perform a joint analysis of genomic and proteomic data obtained from the same organism. First, on the proteomic side, they compared different label-free protein quantification methods (intensity-based and spectral count-based, with various associated data normalization steps) using several software tools. Similarly, on the genomic side, they performed a correlative analysis of gene expression data derived from microarray and RNA-Seq methods. Finally, they investigated the correlation between gene and protein expression data, along with various factors affecting the accuracy of quantitation at both levels.

The researchers observed that spectral count-based protein abundance metrics, which are easy to extract from any published data, are comparable to intensity-based measures with respect to correlation with gene expression data. The results of this work should be useful for designing robust computational pipelines for the extraction and joint analysis of gene and protein expression data in integrative studies.
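To make the comparison concrete, here is a minimal sketch of how a spectral count-based abundance estimate might be correlated with RNA-Seq expression values. It is not the authors' pipeline. It assumes one common length-normalized metric (the normalized spectral abundance factor, NSAF) and a Spearman rank correlation, and the gene names, counts, lengths, and expression values are hypothetical toy data invented for illustration.

```python
import math


def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor: (count / length), scaled to sum to 1."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (not taken from the study).
spectral_counts = {"geneA": 120, "geneB": 15, "geneC": 40, "geneD": 3}
protein_lengths = {"geneA": 400, "geneB": 300, "geneC": 500, "geneD": 250}
rna_expression = {"geneA": 85.0, "geneB": 35.0, "geneC": 30.1, "geneD": 1.2}

abundance = nsaf(spectral_counts, protein_lengths)

# Only genes measured at both the protein and transcript level can be compared.
shared = sorted(set(abundance) & set(rna_expression))
rho = spearman(
    [abundance[g] for g in shared],
    [rna_expression[g] for g in shared],
)
print(f"Spearman rho over {len(shared)} genes: {rho:.3f}")
```

A rank-based correlation is used here because protein and transcript abundances typically span several orders of magnitude. A Pearson correlation on log-transformed values would be a reasonable alternative.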

  • Ning K, Fermin D, Nesvizhskii AI. (2012) Comparative analysis of different label-free mass spectrometry based protein abundance estimates and their correlation with RNA-Seq gene expression data. J. Proteome Res. [Epub ahead of print]. [abstract]