Het-node2vec: second order random walk sampling for heterogeneous graph embedding
Abstract
Many real-world problems are naturally modeled as heterogeneous graphs, where nodes and edges represent multiple types of entities and relations. Existing learning models for heterogeneous graph representation usually depend on the computation of specific, user-defined heterogeneous paths, or on the application of large, and often non-scalable, deep neural network architectures. We propose Het-node2vec, an extension of the node2vec algorithm, designed to embed heterogeneous graphs by capturing the topological and structural characteristics of the graph and the semantic information underlying the different types of nodes and edges; this is performed by introducing a simple stochastic node-type switching strategy in second-order random walk processes. Empirical results on synthetic graphs, as well as on benchmark and real-world biomedical graphs, show that Het-node2vec achieves comparable or superior performance to state-of-the-art methods for heterogeneous graphs in node label prediction tasks.
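The abstract describes the core idea only in words, so the following is a minimal illustrative sketch of how a node-type switching bias could be folded into a node2vec-style second-order random walk. It is not the authors' implementation. The return and in-out parameters `p` and `q` follow the usual node2vec convention, while the switching parameter `s`, the function names, and the toy gene/disease graph are all assumptions made for illustration; the exact weighting scheme used by Het-node2vec may differ.

```python
import random


def transition_weight(graph, node_type, prev, cur, nxt, p=1.0, q=1.0, s=1.0):
    """Unnormalized weight of stepping cur -> nxt after arriving from prev."""
    if nxt == prev:
        w = 1.0 / p            # return to the previous node
    elif nxt in graph[prev]:
        w = 1.0                # stay in the neighborhood of the previous node
    else:
        w = 1.0 / q            # move further away from the previous node
    if node_type[nxt] != node_type[cur]:
        w /= s                 # hypothetical switching factor: s < 1 favors a type change
    return w


def walk(graph, node_type, start, length, p=1.0, q=1.0, s=1.0, rng=None):
    """Sample one biased walk with at most `length` nodes."""
    rng = rng or random.Random()
    path = [start]
    while len(path) < length:
        cur = path[-1]
        nbrs = sorted(graph[cur])          # sorted for reproducibility
        if not nbrs:
            break                          # dead end: stop early
        if len(path) == 1:
            nxt = rng.choice(nbrs)         # first step has no previous node
        else:
            prev = path[-2]
            weights = [
                transition_weight(graph, node_type, prev, cur, n, p, q, s)
                for n in nbrs
            ]
            nxt = rng.choices(nbrs, weights=weights, k=1)[0]
        path.append(nxt)
    return path


# Toy undirected heterogeneous graph (hypothetical data).
graph = {
    "g1": {"g2", "d1"},
    "g2": {"g1", "d1"},
    "d1": {"g1", "g2", "d2"},
    "d2": {"d1"},
}
node_type = {"g1": "gene", "g2": "gene", "d1": "disease", "d2": "disease"}

path = walk(graph, node_type, "g1", 10, p=1.0, q=2.0, s=0.5, rng=random.Random(0))
```

In a node2vec-style pipeline, many such walks would be sampled from every node and passed to a skip-gram model to produce the embeddings. In this sketch, setting `s = 1` recovers the plain second-order walk.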