University of Washington - Department of Statistics
Tandem mass spectrometry experiments generate thousands to millions of spectra that can be used to identify the proteins present in complex samples. In this work, we propose a new method to identify peptides based on clustered tandem mass spectrometry data. In contrast to previously proposed approaches, which identify one representative spectrum for each cluster using traditional database search algorithms, our method scores all the spectra in a cluster against candidate peptides using Bayesian model selection. This approach not only reduces database search time but also uses the information in each cluster more efficiently, leading to more accurate peptide and protein identification. We validate our model selection approach using data from a mixture of seven standard proteins. We also compare our method to a popular algorithm for peptide identification.
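To make the idea of scoring a whole cluster concrete, the following is a minimal sketch of generic Bayesian model selection over candidate peptides. It is an illustration under stated assumptions, not the model developed in this work: the function name `posterior_over_peptides`, the per-spectrum log-likelihood values, the uniform prior, and the assumption that spectra in a cluster are conditionally independent given the peptide are all hypothetical choices made for the example.

```python
import math


def posterior_over_peptides(log_lik, log_prior=None):
    """Posterior probability of each candidate peptide given a cluster.

    log_lik maps a candidate peptide to a list of log-likelihoods, one per
    spectrum in the cluster (hypothetical inputs; how they are computed is
    outside this sketch). Spectra are assumed conditionally independent
    given the peptide, so their log-likelihoods add.
    """
    peptides = list(log_lik)
    if log_prior is None:
        # Assumption: uniform prior over the candidate peptides.
        log_prior = {p: -math.log(len(peptides)) for p in peptides}

    # Unnormalized log posterior: log prior plus summed log-likelihoods.
    joint = {p: log_prior[p] + sum(log_lik[p]) for p in peptides}

    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(joint.values())
    log_norm = top + math.log(sum(math.exp(v - top) for v in joint.values()))
    return {p: math.exp(v - log_norm) for p, v in joint.items()}


# Made-up log-likelihoods for a cluster of three spectra and two candidates.
example = {
    "PEPTIDEK": [-1.0, -1.2, -0.9],
    "PEPTLDEK": [-2.0, -2.5, -1.8],
}
posterior = posterior_over_peptides(example)
best = max(posterior, key=posterior.get)
```

Because every spectrum in the cluster contributes to the score, weak but consistent evidence across spectra accumulates, which is the intuition behind using the full cluster rather than a single representative spectrum.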