Master’s Thesis Presentation • Bioinformatics • Predicting the Spectrum Quality and Digestive Enzyme for Shotgun Proteomics

Thursday, April 28, 2022 10:00 am - 10:00 am EDT (GMT -04:00)

Please note: This master’s thesis presentation will be given online.

Soroosh Gholamizoj, Master’s candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Bin Ma

In proteomics, database search programs are routinely used to identify peptides from tandem mass spectrometry data. However, many low-quality spectra cannot be interpreted by any program. Meanwhile, some high-quality spectra go unidentified because of an incomplete database, a software failure, or suboptimal search parameters. Spectrum quality assessment tools are therefore useful: they can eliminate poor-quality spectra before the database search and highlight high-quality spectra that were not identified in the initial search. Such spectra may be valuable candidates for further analysis.

We propose SPEQ, a spectrum quality assessment tool that uses a deep neural network to classify spectra as either high-quality, meaning they are worthy candidates for interpretation, or low-quality, meaning they lack sufficient information for identification. SPEQ was compared with a few other prediction models and demonstrated improved prediction accuracy.

Furthermore, we propose a statistical model that automatically detects the enzyme used for digestion in a proteomics experiment by analyzing the distribution of amino acids in peptides that were de novo sequenced with a nonspecific enzyme setting. Results demonstrate that this algorithm can accurately identify the correct enzyme.
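
To make the idea of enzyme detection from amino acid distributions concrete, here is a minimal Python sketch. It is not the statistical model from the thesis, which the abstract does not specify. It assumes a much simpler heuristic: for each candidate enzyme, compare how often its cleavage residues appear at peptide C-termini with how often they appear overall. The names `ENZYME_RULES`, `score_enzymes`, and `detect_enzyme` are hypothetical, and the cleavage rules are deliberately simplified.

```python
from collections import Counter

# Simplified, illustrative cleavage rules (assumption): each enzyme is
# described only by the residues it cleaves after (C-terminal side).
ENZYME_RULES = {
    "trypsin": {"K", "R"},
    "chymotrypsin": {"F", "W", "Y"},
    "glu-c": {"E"},
}


def score_enzymes(peptides):
    """Score each enzyme as observed minus expected C-terminal frequency.

    observed: fraction of peptides ending in one of the enzyme's residues.
    expected: fraction of all residues (any position) that belong to the
              enzyme's residue set, used here as a crude background.
    """
    peptides = [p for p in peptides if p]
    if not peptides:
        raise ValueError("need at least one non-empty peptide")

    c_term = Counter(p[-1] for p in peptides)   # last residue of each peptide
    background = Counter("".join(peptides))     # residues at all positions
    n_peptides = len(peptides)
    n_residues = sum(background.values())

    scores = {}
    for enzyme, residues in ENZYME_RULES.items():
        observed = sum(c_term[r] for r in residues) / n_peptides
        expected = sum(background[r] for r in residues) / n_residues
        scores[enzyme] = observed - expected
    return scores


def detect_enzyme(peptides):
    """Return the enzyme whose residues are most enriched at C-termini."""
    scores = score_enzymes(peptides)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy peptides, made up for illustration only.
    toy_peptides = ["AGLK", "PEPTIDER", "SAMPLEK", "VFWYR"]
    print(detect_enzyme(toy_peptides))
```

A real implementation would also need to handle enzymes that cleave on the N-terminal side, de novo sequencing errors, and enzymes with overlapping specificity; this sketch ignores all three.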