Incorporating RNA-Seq data to improve sensitivity of protein identification by tandem MS

Tandem mass spectrometry (MS/MS) followed by database search is the method of choice for protein identification in proteomic studies. Database search methods employ spectral matching algorithms and statistical models to identify and quantify proteins in a sample. In general, these methods use only the spectral data for protein identification. However, given the wealth of orthogonal data available for many biological systems, analysis methods can incorporate such information to improve the sensitivity of protein identification.

In this study, researchers from the University of Michigan present a method that uses GPMDB identification frequencies and RNA-Seq transcript abundances to adjust the confidence scores of protein identifications. The method is particularly useful for samples with low to moderate proteome coverage (i.e., fewer than 2000-3000 proteins), where the authors observe up to an 8% improvement in the number of proteins identified at a 1% false discovery rate.
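To make the "number of proteins identified at a 1% false discovery rate" metric concrete, the following is a minimal sketch of how such a count is commonly obtained with a target-decoy estimate. It is an illustration only: the summary above does not describe how the authors estimate FDR or how they adjust scores, so the decoy-based estimate, the function name `count_at_fdr`, and the `(score, is_decoy)` input layout are all assumptions made for this example.

```python
from typing import List, Tuple


def count_at_fdr(hits: List[Tuple[float, bool]], max_fdr: float = 0.01) -> int:
    """Count target identifications accepted at an estimated FDR <= max_fdr.

    hits: (score, is_decoy) pairs, where a higher score means more confidence.
    ASSUMPTION: FDR at a score cutoff is estimated as decoys / targets above
    that cutoff (a simple target-decoy estimate, not necessarily the paper's).
    """
    # Walk down the list from the most to the least confident identification.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the largest target count whose estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            best = targets
    return best


def relative_gain(before: int, after: int) -> float:
    """Fractional improvement in identifications, e.g. 0.08 for an 8% gain."""
    return (after - before) / before
```

Under these assumptions, a rescoring step would be evaluated by calling `count_at_fdr` once on the original scores and once on the adjusted scores, then comparing the two counts with `relative_gain`.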


Shanmugam AK, Yocum AK, Nesvizhskii AI. (2014) Utility of RNAseq and GPMDB protein observation frequency for improving sensitivity of protein identification by tandem MS. J Proteome Res [Epub ahead of print].