Peptide identification is a core challenge in proteomics, the study of proteins and their structures and functions. Proteomics is far more complex than genomics, which examines an organism’s genetic information: the proteome, the complete set of proteins produced or modified by a cell or system, varies not only across different cell types but also over time.
“DNA sequencing is relatively straightforward,” says Yonghan Yu, a PhD candidate at the Cheriton School of Computer Science. “In contrast, sequencing proteins is far more challenging. Protein molecules must first be broken down into smaller fragments known as peptides, which are then analyzed using mass spectrometry. This analytical tool separates peptides based on their mass and electrical charge, generating data that computational methods use to infer the protein’s amino acid sequence.”
DeepSearch, a novel deep learning–based end-to-end database search method developed by Yonghan and University Professor Ming Li, addresses this challenge.
Read the full article from Computer Science to learn more.