SE-GCL: Structure-Aware Graph Clustering with Entropy Minimization
Abstract
Unsupervised graph clustering has become an important method for uncovering the latent community structure of graph nodes; however, existing methods struggle to capture both the local structural differences and the global organizational patterns in graph data at the same time. Mainstream deep graph clustering approaches stack numerous network layers to capture global features, and therefore generally suffer from massive parameter counts and computational redundancy. Meanwhile, although traditional structural entropy clustering can quantify topological disorder, structural entropy is non-differentiable, which makes it difficult to integrate into end-to-end deep learning frameworks for joint optimization. To address these issues, this paper proposes SE-GCL, an unsupervised graph clustering framework based on structural entropy and probabilistic optimization. By removing the requirement of traditional clustering algorithms that the number of clusters be specified in advance, SE-GCL jointly optimizes adaptive cluster discovery and node cluster assignment. The method consists of a Structural Entropy Guidance (SEO) module and a Clustering Search (CSM) module. The SEO module uses a second-order global structural entropy metric to characterize the uncertainty and structural contribution of nodes within the graph structure. To address the non-differentiability of structural entropy, this paper designs a parameter-free indirect optimization strategy that simultaneously captures local inter-node associations and global topological distribution patterns.
Combined with the CSM, the SEO module performs probabilistic inference and iterative updates of cluster labels, enhancing the stability of complex graph representations. The CSM module introduces a temperature-coefficient-based probabilistic update mechanism to dynamically optimize node cluster assignments, so that the clustering process better aligns with the graph’s intrinsic structural characteristics. Experiments on multiple public graph datasets demonstrate that the method not only adaptively discovers optimal cluster partitions but also improves the efficiency of clustering tasks through its lightweight design. SE-GCL achieves 80%–96% of the performance of state-of-the-art models while using only 12%–55% of their parameters.
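The abstract does not give the formulas behind the two modules, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that the "second-order" metric corresponds to the standard two-dimensional structural entropy of a partitioned graph, and that the temperature-based update is a softmax over the entropy obtained by tentatively moving one node to each candidate cluster. The function names `structural_entropy_2d` and `assignment_probabilities`, the toy graph, and the brute-force recomputation are all hypothetical choices made for illustration.

```python
import math


def structural_entropy_2d(edges, labels):
    """Two-dimensional structural entropy of an undirected, unweighted graph.

    edges  : list of (u, v) pairs, each undirected edge listed once, no self-loops
    labels : dict mapping node -> cluster id
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total_volume = sum(degree.values())  # equals 2 * number of edges

    volume = {}  # sum of node degrees inside each cluster
    cut = {}     # number of edges with exactly one endpoint in each cluster
    for node, d in degree.items():
        volume[labels[node]] = volume.get(labels[node], 0) + d
    for u, v in edges:
        if labels[u] != labels[v]:
            cut[labels[u]] = cut.get(labels[u], 0) + 1
            cut[labels[v]] = cut.get(labels[v], 0) + 1

    entropy = 0.0
    # Node-level term: uncertainty of locating a node inside its cluster.
    for node, d in degree.items():
        entropy -= (d / total_volume) * math.log2(d / volume[labels[node]])
    # Cluster-level term: uncertainty of entering a cluster from outside.
    for c, vol in volume.items():
        entropy -= (cut.get(c, 0) / total_volume) * math.log2(vol / total_volume)
    return entropy


def assignment_probabilities(edges, labels, node, candidates, temperature):
    """Softmax over candidate clusters for one node (assumed update rule).

    Lower structural entropy after the tentative move gives higher probability;
    a smaller temperature makes the choice closer to greedy.
    """
    energies = []
    for c in candidates:
        trial = dict(labels)
        trial[node] = c
        energies.append(structural_entropy_2d(edges, trial))
    lowest = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - lowest) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Toy graph (hypothetical): two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
interleaved = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

h_triangle = structural_entropy_2d(edges, by_triangle)
h_interleaved = structural_entropy_2d(edges, interleaved)

# Probability that node 2 stays in cluster 0 or moves to cluster 1.
p_cold = assignment_probabilities(edges, by_triangle, 2, [0, 1], temperature=0.1)
p_hot = assignment_probabilities(edges, by_triangle, 2, [0, 1], temperature=10.0)
```

On this toy graph the partition that follows the two triangles has lower structural entropy than the interleaved one, and lowering the temperature concentrates probability on keeping node 2 in its own triangle. Recomputing the full entropy for every candidate move is quadratic in practice and is used here only for clarity; an efficient method would update the affected cluster terms incrementally.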