Network-based anomaly detection algorithm reveals proteins with major roles in human tissues

Dima Kagan
Juman Jubran
Esti Yeger-Lotem
Michael Fire

This article has been Reviewed by the following groups

Listed in

Evaluated articles (GigaScience)

Abstract

Background

Proteins act through physical interactions with other molecules to maintain organismal health. Protein–protein interaction (PPI) networks have proved to be a powerful framework for obtaining insight into protein functions, cellular organization, response to signals, and disease states. In multicellular organisms, protein content varies between tissues, influencing tissue morphology and function. Weighted PPI networks, reflecting the likelihood of interactions in specific tissues, offer insights into tissue-specific processes and disease mechanisms. We hypothesized that detecting anomalous nodes in these networks could reveal proteins with key tissue-specific functions.

Results

Here, we introduce Weighted Graph Anomalous Node Detection (WGAND), a novel machine-learning algorithm to identify anomalous nodes in weighted graphs. WGAND estimates expected edge weights and uses deviations to generate anomaly detection features, which are then used to score network nodes. We applied WGAND to weighted PPI networks of 17 human tissues. High-ranking anomalous nodes were enriched for proteins associated with tissue-specific diseases and tissue-specific biological processes, such as neuron signaling in the brain and spermatogenesis in the testis. WGAND outperformed other methods in terms of area under the ROC curve and precision at K, highlighting its effectiveness in uncovering biologically meaningful anomalies.
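The idea of comparing observed edge weights against expected ones can be illustrated with a minimal sketch. This is not the WGAND implementation and does not use the `wgand` library's API. It assumes a deliberately naive stand-in estimator (the expected weight of an edge is the mean weight of the other edges touching its two endpoints) and a hand-picked scoring rule (the average of a node's mean and maximum absolute deviation). The function name `node_anomaly_scores` and the toy graph are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def node_anomaly_scores(edges):
    """Score nodes of an undirected weighted graph given as (u, v, weight) triples.

    Assumption: a naive leave-one-out neighbourhood mean replaces the
    learned edge-weight estimator described in the abstract.
    """
    incident = defaultdict(set)  # node -> indices of edges touching it
    for i, (u, v, _) in enumerate(edges):
        incident[u].add(i)
        incident[v].add(i)

    global_mean = mean(w for _, _, w in edges)
    deviations = defaultdict(list)  # node -> |observed - expected| per incident edge
    for i, (u, v, w) in enumerate(edges):
        # Expected weight: mean of all other edges sharing an endpoint with this edge.
        others = (incident[u] | incident[v]) - {i}
        expected = mean(edges[j][2] for j in others) if others else global_mean
        deviations[u].append(abs(w - expected))
        deviations[v].append(abs(w - expected))

    # Two deviation features per node (mean and max), combined by simple averaging.
    return {n: (mean(d) + max(d)) / 2 for n, d in deviations.items()}


# Toy graph: a ring A-B-C-D with uniform weights, plus node X with unusually heavy edges.
edges = [
    ("A", "B", 0.5), ("B", "C", 0.5), ("C", "D", 0.5), ("D", "A", 0.5),
    ("X", "A", 1.0), ("X", "C", 1.0),
]
scores = node_anomaly_scores(edges)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, node `X` ranks first because both of its edges are heavier than their neighbourhoods would suggest. A learned estimator and a trained anomaly model would replace both simplifications in practice.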

Conclusions

Our findings demonstrate WGAND’s potential as a powerful tool for detecting anomalous proteins with significant biological roles. By identifying proteins involved in critical tissue-specific processes and diseases, WGAND offers valuable insights for discovering novel biomarkers and therapeutic targets. Its versatile algorithm is suitable for any weighted graph and is broadly applicable across various fields. The WGAND algorithm is available as an open-source Python library at https://github.com/data4goodlab/wgand.

GigaScience
Apr 28, 2025
Background

Anomaly detection in graphs is critical in various domains, notably in medicine and biology, where anomalies often encapsulate pivotal information. Here, we focused on network analysis of molecular interactions between proteins, which is commonly used to study and infer the impact of proteins on health and disease. In such a network, an anomalous protein might indicate its impact on the organism’s health.

Results

We propose Weighted Graph Anomalous Node Detection (WGAND), a novel machine learning-based method for detecting anomalies in weighted graphs. WGAND is based on the observation that edge patterns of anomalous nodes tend to deviate significantly from expected patterns. We quantified these deviations to generate features, and utilized the resulting features to model the anomaly of nodes, resulting in node anomaly scores. We created four variants of the WGAND methods and compared them to two previously-published (baseline) methods. We evaluated WGAND on data of protein interactions in 17 human tissues, where anomalous nodes corresponded to proteins with major roles in tissue contexts. In 13 of the tissues, WGAND obtained higher AUC and P@K than baseline methods. We demonstrate that WGAND effectively identified proteins that participate in tissue-specific processes and diseases.

Conclusion

We present WGAND, a new approach to anomaly detection in weighted graphs. Our results underscore its capability to highlight critical proteins within protein-protein interaction networks. WGAND holds the promise to enhance our understanding of intricate biological processes and might pave the way for novel therapeutic strategies targeting tissue-specific diseases. Its versatility ensures its applicability across diverse weighted graphs, making it a robust tool for detecting anomalous nodes.

Competing Interest Statement

The authors have declared no competing interest.
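For readers unfamiliar with the two metrics named in the Results (AUC and P@K), the following is a minimal sketch of how they can be computed from node anomaly scores and binary labels. It is not the authors' evaluation code. The function names, scores, and labels are illustrative assumptions; here a label of 1 simply marks a node treated as a true positive.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_k(scores, labels, k):
    """Fraction of positives among the k highest-scoring items."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    return sum(1 for _, y in ranked[:k] if y) / k


# Illustrative data only: six nodes, three of them labelled positive.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
p_at_3 = precision_at_k(scores, labels, 3)
```

The pairwise formulation of AUC is quadratic in the number of nodes, which is acceptable for a small illustration. A rank-based computation would be the usual choice for full-size networks.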

Reviewer 2. Dan Shao

This manuscript provides an approach to highlight critical proteins within protein-protein interaction networks by Weighted Graph Anomalous Node Detection (WGAND). I see a lot of serious issues, as follows.

Overall, the author submitted the article to GigaScience, so the problem he needs to solve should be the protein-disease relationship rather than anomaly detection in graphs. However, from the Abstract to the Introduction, the article always introduces the methods and applications of anomaly detection.

Also, the logic of the whole article is confusing. There is a repetition of the specific method design in Methods (2.1 and 2.2). The overall program lacks method diagrams or flowcharts for explanation. In addition, the results should be in Results and not in Methods.

The results do not go to the significant achievements and cannot fully reflect the superiority of the methods.

Conclusion is missing from the text.

The use of the English language is very awkward at times.

The font in some panels of some Figures (e.g., 6) is way too small.

Re-review: Comments to the Authors

The manuscript "Network-based anomaly detection algorithm reveals proteins with major roles in human tissues" triggered a positive initial impression, regarding abstract, introduction and figures, but going deeper, I see a lot of serious issues, as follows.

Methods and Results are very hard to read at times.

In many cases, where tools or parameters are used without further justification, the impression is given that various choices were tried extensively until some setup gave plausible results.

In this study, the authors treated an anomaly as a node that behaves differently from most of the nodes in the network. However, the basis for this assumption requires further substantiation. The authors' research is fundamentally rooted in this premise, yet it is not adequately verified in the article.

In the evaluation, the authors employed non-standard parameters to validate the effectiveness of the model. For example, they used the value of 24% associated with Mendelian disease among the top 10 proteins calculated by WGAND to compare with results obtained from other models. However, is this method of comparison credible?

Results contain a lot of details that I would expect to be part of Methods.

Details of the model are missing in Methods.

The use of the English language is very awkward at times.

Minor, nice to have:

The font in some panels of some Figures (e.g., 2) is way too small.

If a Figure consists of more than one part, e.g. A part, B part, each part should be explained separately.

In the explanatory part of Figure 5, (a) (b) ... should be replaced by (A) (B) .... to maintain consistency with the figure.

This work has been peer reviewed in GigaScience (https://doi.org/10.1093/gigascience/giaf034), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

Reviewer 1. Yong Zhang

This study introduces the WGAND method, an innovative weighted graph anomaly detection algorithm to identify key anomalous proteins in human tissues using machine learning techniques. Given the critical role of abnormal proteins in disease prediction and treatment, this research area is pivotal for understanding complex systems' dynamic behaviors, especially in bioinformatics. In general, this article contributes to weighted graph anomaly detection. While this study provides valuable insights and demonstrates the WGAND method's good performance and practicality, here are some suggestions and potential directions for improvement:

Building on existing research, conducting a detailed performance comparison analysis between the WGAND algorithm and similar cutting-edge methods (such as OddBall, Yagada, etc.) is recommended, explicitly highlighting WGAND's advantages in anomaly detection accuracy. A series of standard metrics should be used, including but not limited to precision, recall, F1 score, and AUC curve, to quantify WGAND's effectiveness and superiority rigorously.

While AUC and P@K are valuable as main evaluation metrics, introducing additional metrics such as recall, precision, and F1 score for anomaly detection tasks can provide a more comprehensive assessment of model performance.

Delve into optimizing the selection of node embedding methods and edge weight estimators based on different application scenarios and explore more systematic model selection and hyperparameter optimization strategies.

Investigate strategies for dynamically setting thresholds to allow the WGAND method to adapt to changes in the data environment and various task demands.

Discuss the applicability of WGAND across different types of weighted graphs (such as undirected and directed graphs) and assess its generality and adaptability.
Video seminar held on Cassyni
Apr 8, 2025

Seminar title: Network-based anomaly detection algorithm reveals proteins with major roles in human tissues
Version published to 10.1093/gigascience/giaf034
Jan 1, 2025
Version published to 10.1101/2023.12.19.572354 on bioRxiv
Dec 20, 2023

Related articles

Edge-Based Execution of Graph Neural Networks for Protein Interaction Network Analysis in Clinical Oncology

This article has 1 author:
1. Swapin Vidya
This article has no evaluations
Latest version Jan 21, 2026

A Comprehensive Review on Graph-Based Anomaly Detection: Approaches for Intrusion Detection

This article has 4 authors:
1. Nimesha Dilini
2. Nan Sun
3. Sky Miao
4. Nour Moustafa
This article has no evaluations
Latest version Jan 20, 2026

Uncovering miRNA–Disease Associations Through Graph Based Neural Network Representations

This article has 1 author:
1. Alessandro Orro
This article has no evaluations
Latest version Jan 28, 2026
