COMET: An Interactive Framework for Efficient and Effective Community Search via Active Learning
Comet
Computer science
Astrobiology
Physics
Authors
Jiawei Zhou, Kai Wang, Jianwei Wang, Kunpeng Zhang, Xuemin Lin
Source
Journal: INFORMS Journal on Computing
Date: 2025-09-09
Identifier
DOI: 10.1287/ijoc.2024.0834
Abstract
In recent years, substantial advancements in query-dependent community search (CS) have been driven by growing demands in downstream applications such as social network analysis, fraud detection, and bioinformatics. These applications require methods that identify structurally cohesive communities for specific queries. Learning-based interactive CS (ICS) models the search as a multiround process with human interaction, which makes it more practical. Nonetheless, learning-based approaches for ICS face two challenges. First, current methods for narrowing the search space rely on either query information or fixed topological structures, which makes them insufficiently robust when querying communities on large-scale graphs. Second, ICS lacks an effective interaction strategy, in which the algorithm offers users a choice of highly uncertain nodes to label so that search quality is refined iteratively. To address these issues, we propose COMET, an interactive community search framework designed for large-scale graphs. COMET consists of three key modules. First, a community-aware subgraph module, based on Personalized PageRank (PPR), builds a candidate subgraph tailored to each query by considering both query information and topological structure. Second, we formulate ICS as a series of binary classification tasks and employ a graph neural network (GNN) to propagate label information within the candidate subgraph in each round. Finally, a novel active learning–based node selection module uses entropy from the GNN and PPR from the subgraph module to dynamically select the most crucial nodes for labeling in each round. Extensive experimental evaluations demonstrate that COMET significantly outperforms state-of-the-art learning-based CS and ICS methods across eight real-world data sets.

History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.

Funding: K. Wang was supported by the National Natural Science Foundation of China [Grants 72221001 and 62302294].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0834 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2024.0834 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
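The abstract names two mechanisms that a small example may make more concrete: a PPR-based candidate subgraph around the query, and a node selection step that combines GNN entropy with PPR. The sketch below is illustrative only and is not the authors' implementation. All function names, the restart probability `alpha=0.15`, the top-`k` cutoff, and the toy graph are assumptions. The abstract does not say how entropy and PPR are combined, so multiplying them is an assumption as well, and the class probabilities are hand-written stand-ins for GNN outputs.

```python
import math

def personalized_pagerank(adj, query, alpha=0.15, iters=100):
    """Power iteration for PPR with restart on the query nodes.

    adj: dict mapping node -> list of neighbours (undirected, no isolated nodes).
    alpha: restart probability (an assumed value, not taken from the paper).
    """
    nodes = list(adj)
    restart = {v: (1.0 / len(query) if v in query else 0.0) for v in nodes}
    score = dict(restart)
    for _ in range(iters):
        nxt = {v: alpha * restart[v] for v in nodes}
        for u in nodes:
            # Each node spreads its non-restart mass evenly over its neighbours.
            share = (1.0 - alpha) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        score = nxt
    return score

def candidate_subgraph(adj, ppr, k):
    """Keep the k nodes with the highest PPR score and the edges among them."""
    keep = set(sorted(ppr, key=ppr.get, reverse=True)[:k])
    return {u: [v for v in adj[u] if v in keep] for u in keep}

def binary_entropy(p):
    """Entropy (in bits) of a binary prediction with positive probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def select_nodes(prob, ppr, labeled, budget):
    """Rank unlabeled nodes by entropy * PPR (the product is an assumption)."""
    scores = {v: binary_entropy(p) * ppr[v]
              for v, p in prob.items() if v not in labeled}
    return sorted(scores, key=scores.get, reverse=True)[:budget]

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
ppr = personalized_pagerank(adj, query={0})
sub = candidate_subgraph(adj, ppr, k=4)

# Hypothetical membership probabilities, standing in for GNN predictions.
prob = {0: 0.99, 1: 0.9, 2: 0.6, 3: 0.5}
picks = select_nodes(prob, ppr, labeled={0}, budget=2)
print(sorted(sub), picks)
```

In this toy run the candidate subgraph keeps the query's own triangle plus the bridge node, and the selection step prefers nodes that are both uncertain and close to the query. In an interactive loop, the user would label the selected nodes and the classifier would be updated before the next round.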