Combating Over-Smoothing in Graph Models with Half-Hop for Molecular Property Prediction
Abstract
Natural products (NP), which are derived from plants, marine organisms, and microorganisms, are abundant resources for the development of small-molecule drugs. Researchers in the pharmaceutical industry use graph-based models to identify non-toxic, drug-like small molecules to address environmental challenges, thereby contributing to a sustainable future. This work proposes a novel approach, Half-Hop (HH) message passing in a Deep Graph Convolution Attention Network (HH-DGCAN), applied at the Atomic Coordinates (AC), Substring (SS), Molecular Graphs (IMG), and Hybrid (HYB) representation levels, to overcome the over-smoothing problem in graph models. Slow nodes are added between a source node and its target node, which upsamples the edges. Non-toxic, drug-like compounds are predicted from Simplified Molecular Input Line Entry System (SMILES) input taken from the COCONUT dataset. At the HYB granularity level, the proposed HH-DGCAN model outperformed the other methods, with an accuracy of 91.50%, an F1-score of 93.26%, and an AUC of 95.38%, in comparison with baseline models and datasets. Extensive empirical analysis, visualization plots for over-smoothing, and Explainable Artificial Intelligence (XAI) techniques demonstrate significant performance enhancements and help to understand the decisions made by the model and to interpret its bias and fairness. © 2024 IEEE.
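The edge upsampling described in the abstract can be illustrated with a minimal sketch. This is not the HH-DGCAN implementation: the function name `half_hop`, the sampling probability `p`, the seed, and the choice to route each selected edge through exactly two new edges (source to slow node, slow node to target) are all assumptions made for illustration. Node features and the attention layers are omitted.

```python
import random


def half_hop(num_nodes, edges, p=1.0, seed=0):
    """Insert a slow node on each directed edge with probability p.

    Hypothetical helper, not the paper's code. Returns a tuple
    (new_num_nodes, new_edges, slow_nodes), where slow_nodes maps each
    new slow-node index to the (source, target) edge it splits.
    """
    rng = random.Random(seed)
    new_edges = []
    slow_nodes = {}
    next_id = num_nodes  # slow nodes are numbered after the original nodes
    for src, dst in edges:
        if rng.random() < p:
            slow = next_id
            next_id += 1
            slow_nodes[slow] = (src, dst)
            # The direct edge src -> dst becomes a two-hop path.
            new_edges.append((src, slow))
            new_edges.append((slow, dst))
        else:
            # Edge left untouched.
            new_edges.append((src, dst))
    return next_id, new_edges, slow_nodes


# Toy 3-atom chain; each bond is stored as two directed edges.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
n, new_edges, slow = half_hop(3, edges, p=1.0)
print(n, new_edges)
```

With `p=1.0` every directed edge is split, so a message needs two propagation steps to travel between atoms that were originally adjacent. Slowing propagation in this way is the intuition behind using slow nodes against over-smoothing. With `p=0.0` the graph is returned unchanged.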