Reto Caluori
How artificial intelligence learns from complex networks
Deep neural networks have achieved remarkable results across science and technology, but it remains largely unclear what makes them work so well. A new study sheds light on the inner workings of deep learning models that learn from relational datasets, such as those found in biological and social networks.
Graph Neural Networks (GNNs) are artificial neural networks designed to represent entities—such as individuals, molecules, or cities—and the interactions between them. These networks have practical applications in various domains; for example, they predict traffic flows in Google Maps and accelerate the discovery of new antibiotics within computational drug discovery pipelines.
GNNs are also used in AlphaFold, an acclaimed AI system that tackles the protein folding problem in biology. Despite these achievements, the principles underlying the success of GNNs are poorly understood.
The study shows how these AI algorithms extract knowledge from complex networks and identifies ways to improve their performance in various applications.
From better understanding to better performance
According to the study, modern deep learning models with millions or billions of parameters exhibit a strange behavior known as "double descent", where adding more data can paradoxically degrade performance. GNNs, however, seemed to defy this trend.
The research team, led by Professor Ivan Dokmanić from the University of Basel, used analytical tools from statistical mechanics to show that double descent is, in fact, ubiquitous in GNNs. They pinpointed a key determinant of how GNNs learn: whether the datasets and networks exhibit homophily (as in social networks, where like-minded people connect) or heterophily (as in protein interaction networks, where complementary proteins interact).
The team’s research shows that the level of homophily or heterophily in a network significantly affects the ability of a machine learning model to generalize to unseen data.
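To make the distinction concrete, one common way to quantify homophily is the edge homophily ratio: the fraction of connections that link two nodes of the same class. The short Python sketch below is only an illustration. The function name, the toy graphs and the labels are made up, and the study may well use a different measure.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical "social" graph: two tight groups joined by a single edge.
social_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
social_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Hypothetical bipartite graph: every edge joins nodes of different types.
bipartite_labels = {0: "x", 1: "x", 2: "y", 3: "y"}
bipartite_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]

print(edge_homophily(social_edges, social_labels))        # 6 of 7 edges match
print(edge_homophily(bipartite_edges, bipartite_labels))  # no edge matches
```

A ratio close to 1 indicates a strongly homophilic network, while a ratio close to 0 indicates a strongly heterophilic one.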
Different data processing behavior
The researchers also identified reasons why GNNs behave differently when processing homophilic and heterophilic data. This finding is crucial for designing and training better GNNs, especially for problems with heterophilic data, which arise in drug or biomarker repurposing research.
“Our results expand our fundamental understanding of how AI learns from complex networks but also provide practical guidelines for developing better deep neural networks for complex real-world data,” explained Ivan Dokmanić. “Such insights impact numerous fields, from drug discovery to social network analysis.”
The study, published in the journal PNAS, exploited an analogy between GNNs and disordered physical systems known as spin glasses to derive a theory of generalization in GNNs.
Original publication
Cheng Shi, Liming Pan, Hong Hu, and Ivan Dokmanić
Homophily modulates double descent generalization in graph convolution networks
Proceedings of the National Academy of Sciences (2024), doi: 10.1073/pnas.2309504121