Fatema Tuz Zohora, PhD candidate
David R. Cheriton School of Computer Science
Liquid chromatography with tandem mass spectrometry (LC-MS/MS) based proteomics is a well-established research field with major applications such as disease biomarker identification, drug discovery, and drug design and development. A typical analysis workflow begins with detecting and quantifying peptide features in the LC-MS map.
We are the first to propose a deep learning based model, DeepIso, that combines recent advances in Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to detect peptide features of different charge states and estimate their abundance by scanning the LC-MS map. Existing tools rely on a limited set of engineered features and domain-specific parameters that are rarely updated despite the large volume of newly generated proteomic data. In contrast, DeepIso, which consists of two separate deep learning based modules, learns multiple levels of representation of the high-dimensional data itself through many layers of neurons and is adaptable to newly acquired data.
The high-confidence features (in terms of abundance) reported by our model match 89.32% of the MS/MS identifications in a benchmark dataset, which is higher than the match rate of several widely used tools, including MaxQuant. Our results demonstrate that novel deep learning tools are desirable to advance the state of the art in protein identification and quantification.