RNA sequencing (RNA-Seq) and mass spectrometry-based shotgun proteomics are powerful high-throughput technologies for identifying and quantifying RNA transcripts and proteins, respectively. With the increasing affordability of these technologies, many projects have started to apply both to the same samples to achieve a more comprehensive understanding of biological systems. A major analytical challenge for such integrative projects is how to effectively leverage the complementary nature of RNA-Seq and shotgun proteomics data. RNA-Seq provides comprehensive information on mRNA abundance, alternative splicing, nucleotide variation and structural alterations. Sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in cell and tissue samples and thus improve protein identification. Meanwhile, proteomics data provide essential confirmation of the validity and functional relevance of novel findings from RNA-Seq data. At the quantitative level, mRNA and protein levels are only modestly correlated, suggesting strong involvement of post-transcriptional regulation in controlling gene expression.
Here the authors review recent studies at the interface of RNA-Seq and proteomics data. They discuss goals, accomplishments and challenges in RNA-Seq-based proteogenomics. They also examine the current status and future potential of parallel transcriptome and proteome quantification in revealing post-transcriptional regulatory mechanisms.