A Study of Random Duplication Graphs and Degree Distribution Pattern of Protein-Protein Interaction Networks
This study aims to explain the degree distribution pattern of protein-protein interaction networks. Because this pattern arises through a long evolutionary process of gene duplication, we introduce the random duplication graph model to describe protein-protein interaction networks mathematically. Specifically, we study the degree distribution function of random duplication graphs, and we derive the degree distribution function of protein-protein interaction networks by modeling them as a special case of random duplication graphs.
The random duplication graph model mimics the behavior of gene duplication. In a random duplication graph, one vertex is chosen uniformly at random to duplicate at every timestep t, and the new vertex inherits all the edges of the original vertex. We derive the expected degree distribution function of the model from the master equation. Furthermore, we learn from the Erdős-Rényi random graph model that the convergence of the degree distribution function is difficult to tackle in a single random duplication graph. Consequently, we define the n-fold random duplication graph, a combination of n independent random duplication graphs, for which we are able to prove that the degree distribution function converges.
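The duplication step described above can be illustrated with a minimal simulation sketch. It is not the authors' implementation: the function names (`duplicate_step`, `random_duplication_graph`, `degree_counts`), the adjacency-set representation, and the integer vertex labels 0..n-1 are all assumptions made for illustration. The sketch also assumes pure duplication, in which the copy is not linked to the original vertex and no edges are lost or added afterwards.

```python
import random
from collections import Counter


def duplicate_step(adj, rng):
    """Duplicate one vertex chosen uniformly at random (assumed pure duplication).

    `adj` maps each vertex label (assumed to be 0..n-1) to the set of its
    neighbours. The new vertex receives a copy of the original's neighbour set.
    """
    original = rng.choice(list(adj))
    new = len(adj)  # next unused integer label
    adj[new] = set(adj[original])
    for neighbour in adj[new]:
        adj[neighbour].add(new)  # keep the adjacency symmetric
    return new


def random_duplication_graph(initial_adj, steps, seed=None):
    """Grow a graph from `initial_adj` by `steps` duplication events."""
    rng = random.Random(seed)
    adj = {v: set(nbrs) for v, nbrs in initial_adj.items()}  # do not mutate input
    for _ in range(steps):
        duplicate_step(adj, rng)
    return adj


def degree_counts(adj):
    """Return a Counter mapping degree -> number of vertices with that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


if __name__ == "__main__":
    # Sparse initial graph: a single edge between vertices 0 and 1.
    graph = random_duplication_graph({0: {1}, 1: {0}}, steps=1000, seed=1)
    for degree, count in sorted(degree_counts(graph).items()):
        print(degree, count)
```

Averaging `degree_counts` over many independent runs gives an empirical estimate of the expected degree distribution. How the n independent graphs are combined in the n-fold construction is not specified here, so the sketch does not attempt it.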
Finally, we model protein-protein interaction networks as a special case of the random duplication graph with a sparse initial graph, and we derive the corresponding degree distribution function. We compare this degree distribution function with experimental data and show that it fits the data adequately. Our model gives a theoretical analysis of the self-organization process of protein-protein interaction networks. Moreover, we show that it is this self-organization process that leads to the distinctive degree distribution pattern of protein-protein interaction networks, in which the degree distribution function is monotonically decreasing: the majority of proteins are sparsely connected, while highly connected proteins also exist. Our analysis yields a further prediction: as the self-organization process continues, more and more high-degree proteins will be produced.