Mass spectrometry-based proteomic analysis underestimates proteomic variation because standard protein databases lack variant peptides and posttranslational modifications (PTMs). Each individual carries thousands of missense mutations that produce single amino acid variants, but the resulting peptides go unidentified because they are absent from generic proteomic search databases. Myriad types of PTMs play essential roles in biological processes, yet most remain undetected because variable modification searches that consider many PTM types suffer from increased false discovery rates.
Researchers at the University of Wisconsin-Madison address these two fundamental shortcomings of bottom-up proteomics with two recently developed software tools:
The first consists of workflows in Galaxy that mine RNA sequencing data to generate sample-specific databases containing variant peptides and products of alternative splicing events.
The second tool applies a new strategy that alters the variable modification approach to consider only curated PTMs at specific positions, thereby avoiding the combinatorial explosion that traditionally leads to high false discovery rates.
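The effect of restricting modifications to curated sites can be illustrated with a small counting sketch. It is not the G-PTM implementation: the modification table, the function names, and the example peptide are hypothetical, and the unrestricted count ignores the per-peptide modification caps that real search engines usually apply. It only shows how the number of candidate modified forms per peptide shrinks when a modification is allowed solely at annotated positions.

```python
from math import prod

# Hypothetical variable modifications and the residues each one may occupy.
VARIABLE_MODS = {
    "phospho": set("STY"),
    "acetyl": set("K"),
    "methyl": set("KR"),
}


def count_unrestricted_forms(peptide, mods=VARIABLE_MODS):
    """Count candidate forms when every compatible residue may be modified.

    Each residue is either unmodified or carries one compatible modification,
    so the per-residue choices multiply across the peptide.
    """
    return prod(
        1 + sum(residue in targets for targets in mods.values())
        for residue in peptide
    )


def count_curated_forms(peptide, start, curated_sites):
    """Count candidate forms when only curated sites may be modified.

    start is the 1-based protein position of the peptide's first residue;
    curated_sites maps a 1-based protein position to the set of modification
    names annotated at that position.
    """
    return prod(
        1 + len(curated_sites.get(start + offset, ()))
        for offset in range(len(peptide))
    )


peptide = "SAKTYR"
unrestricted = count_unrestricted_forms(peptide)
curated = count_curated_forms(peptide, start=10, curated_sites={12: {"acetyl"}})
print(unrestricted, curated)  # 48 2
```

In this toy case a six-residue peptide has 48 candidate forms under an unrestricted search but only 2 when a single curated acetylation site is considered, which is the kind of search-space reduction the strategy relies on.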
Using RNA-sequencing-derived databases with this Global Post-Translational Modification (G-PTM) search strategy revealed hundreds of single amino acid variant peptides, tens of novel splice junction peptides, and several hundred posttranslationally modified peptides of around thirty-five different types in each of ten human cell lines.
Using the G-PTM search strategy with an RNA-Seq proteogenomic workflow allows the identification of many sequence variant and PTM-containing peptides. RNA-Seq data are used to identify sequence variants and construct sequence variant peptide databases for each of ten human cell lines using the Galaxy-P computational interface. MS proteomic data for the same cell lines are then searched using the G-PTM strategy against a sample-specific database that includes single amino acid variant (SAV) peptides, novel splice junction (NSJ) peptides, and UniProt protein sequences annotated with curated site-specific PTMs.
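As a rough illustration of how an SAV peptide entry for such a database could be derived, the sketch below applies one missense substitution to a protein sequence and keeps the tryptic peptides that do not occur in the reference digest. This is an assumption-laden toy, not the Galaxy-P workflow: the function names and the example sequence are hypothetical, the digest uses the simple rule of cleaving after K or R unless the next residue is P, and missed cleavages are ignored.

```python
def tryptic_peptides(sequence):
    """Digest a protein in silico with a simplified trypsin rule.

    Cleaves after K or R unless the next residue is P; no missed cleavages.
    Returns a list of peptides in sequence order.
    """
    peptides = []
    start = 0
    last = len(sequence) - 1
    for i, residue in enumerate(sequence):
        if i == last or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def apply_sav(sequence, position, ref, alt):
    """Substitute one residue at a 1-based position, checking the reference."""
    if sequence[position - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + alt + sequence[position:]


def sav_peptides(sequence, position, ref, alt):
    """Return tryptic peptides of the variant protein absent from the reference."""
    reference = set(tryptic_peptides(sequence))
    variant = apply_sav(sequence, position, ref, alt)
    return [pep for pep in tryptic_peptides(variant) if pep not in reference]


# Hypothetical protein sequence with a Y5C substitution.
protein = "MKTAYIAKQRQISFVK"
print(tryptic_peptides(protein))         # ['MK', 'TAYIAK', 'QR', 'QISFVK']
print(sav_peptides(protein, 5, "Y", "C"))  # ['TACIAK']
```

Only the variant-containing peptide is new relative to the reference digest, so it is the only sequence that would need to be appended to the sample-specific database in this simplified picture.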