GraphNovo: Graph-based deep learning model may lead to highly personalized medicine to treat cancer and infectious diseases

Tuesday, November 28, 2023

Computer scientists at the Cheriton School of Computer Science are using a graph-based deep learning model to analyze proteins on the surface of cells, which could lead to personalized medicine to treat cancer and infectious diseases.  

The researchers developed GraphNovo, a new program that more accurately determines the sequences of peptides, the linear chains of amino acids found in cells.

In a healthy person, the immune system correctly identifies the peptides of irregular or foreign cells, such as cancer cells or harmful bacteria, and then targets those cells for destruction. But for people whose immune systems are struggling, the promising field of immunotherapy is working to retrain those systems to identify these dangerous invaders.

“What scientists want to do is sequence those peptides between the normal tissue and the cancerous tissue to recognize the differences,” said Zeping Mao, a PhD candidate in the Cheriton School of Computer Science who developed GraphNovo under the guidance of University Professor Ming Li.

This sequencing process is particularly difficult for novel illnesses or cancer cells, which may not have been analyzed before. While scientists can draw on an existing peptide database when analyzing diseases or organisms that have previously been studied, each person’s cancer and immune system are unique.

To quickly build a profile of the peptides in an unfamiliar cell, scientists have been using a method called de novo peptide sequencing, which uses mass spectrometry to rapidly analyze a new sample. 

But this process can leave gaps: parts of a peptide's sequence may be incomplete or missing entirely. GraphNovo significantly improves the accuracy of peptide sequence identification by using the precise mass of the peptide sequence to fill these gaps. Such a leap in accuracy will likely be immensely beneficial in a variety of medical areas, especially in treating cancer and in creating vaccines against infectious agents such as Ebola and SARS-CoV-2, the virus that causes COVID-19.
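
To give a rough sense of why precise mass helps with such gaps, here is a minimal Python sketch. It is not GraphNovo's method, which the article describes only as a graph-based deep learning model. It simply enumerates which amino acids could account for an unexplained mass in a spectrum. The function name `candidates_for_gap`, its parameters, and the 0.02 Da tolerance are assumptions made for illustration; the residue masses are the standard monoisotopic values.

```python
from itertools import combinations_with_replacement

# Standard monoisotopic residue masses in daltons.
# Isoleucine has the same mass as leucine, so only L is listed.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def candidates_for_gap(gap_mass, max_residues=2, tol=0.02):
    """Hypothetical helper: list amino acid compositions (order ignored)
    whose total residue mass matches gap_mass within tol daltons."""
    matches = []
    for n in range(1, max_residues + 1):
        for combo in combinations_with_replacement(sorted(RESIDUE_MASS), n):
            total = sum(RESIDUE_MASS[aa] for aa in combo)
            if abs(total - gap_mass) <= tol:
                matches.append("".join(combo))
    return matches


if __name__ == "__main__":
    # A gap of about 128.06 Da fits one glutamine (Q) or alanine plus glycine (AG).
    print(candidates_for_gap(128.05858))
```

In this sketch, a gap of 128.05858 Da matches both a single glutamine and an alanine-glycine pair. The mass narrows the possibilities but does not settle the answer by itself, which is the kind of ambiguity a learned model would need to resolve.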


For more information about this research, see Zeping Mao, Ruixue Zhang, Lei Xin and Ming Li. Mitigating the missing-fragmentation problem in de novo peptide sequencing with a two-stage graph-based deep learning model. Nature Machine Intelligence 5, 1250–1260 (2023).