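Hypergraph
Computer science
Scale (ratio)
Artificial intelligence
Pattern recognition (psychology)
Correlation
Data mining
Mathematics
Cartography
Combinatorics
Geography
Geometry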
Authors
Xiangmin Han, Yubo Zhang, Shihui Ying, Yue Gao
Identifier
DOI:10.1109/tpami.2025.3603631
Abstract
High-order correlations, which capture complex interactions among multiple entities, extend beyond traditional graph representations and support a wider range of applications. However, existing neural network models for high-order correlations encounter scalability issues on large datasets due to the substantial computational complexity involved in processing large-scale structures. In addition, long-tailed distributions, which are common in real-world data, result in underrepresented categories and hinder the model's ability to learn effective high-order interaction patterns for rare instances. To address these issues, we introduce a novel framework known as HyperGraph-based High-order Correlation analysis (HGHC) for large-scale long-tailed data classification. Firstly, to tackle the long-tailed distribution problem, HGHC generates synthetic vertices and computes their attributed high-order correlations using an oversampling module inspired by SMOTE, termed HSMOTE, to enhance the representation of tail categories. Secondly, for efficient computational scaling, we treat the data as having two modalities: the structural modality capturing high-order relationships and the feature modality representing individual attributes. We perform computations on both CPU and GPU separately and then fuse the results to achieve a lightweight vertex transformation and aggregation scheme for high-order correlation data. Additionally, we contribute the first benchmark for large-scale long-tailed datasets involving high-order correlations, known as Amazon-LT, which includes multiple datasets with varying imbalance ratios. Our experimental results demonstrate that HGHC achieves state-of-the-art performance in handling high-order correlation analysis issues for large-scale, long-tailed data.
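The abstract describes HSMOTE only at a high level, so the following is a rough sketch of the general idea rather than the authors' method. It applies classic SMOTE interpolation to tail-class vertex features, then gives each synthetic vertex a row in the hypergraph incidence matrix. The function name `smote_like_oversample`, its parameters, and the rule that a synthetic vertex joins the union of its two parent vertices' hyperedges are all assumptions made for illustration.

```python
import numpy as np


def smote_like_oversample(features, incidence, tail_idx, n_new, k=3, seed=0):
    """Append n_new synthetic tail-class vertices (hypothetical helper).

    features:  (n_vertices, d) float array of vertex attributes
    incidence: (n_vertices, n_edges) 0/1 array; H[v, e] = 1 if v is in hyperedge e
    tail_idx:  indices of the vertices that belong to the tail class
    """
    rng = np.random.default_rng(seed)
    tail_idx = np.asarray(tail_idx)
    if n_new == 0 or len(tail_idx) < 2:
        # Nothing to interpolate between; return unchanged copies.
        return features.copy(), incidence.copy()

    tail = features[tail_idx]
    # Pairwise Euclidean distances inside the tail class.
    dists = np.linalg.norm(tail[:, None, :] - tail[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a vertex is not its own neighbour
    k = min(k, len(tail_idx) - 1)
    neighbors = np.argsort(dists, axis=1)[:, :k]

    new_x, new_h = [], []
    for _ in range(n_new):
        i = rng.integers(len(tail_idx))       # seed vertex
        j = neighbors[i, rng.integers(k)]     # one of its k nearest tail neighbours
        lam = rng.random()                    # interpolation weight in [0, 1)
        # Classic SMOTE step: a point on the segment between the two parents.
        new_x.append(tail[i] + lam * (tail[j] - tail[i]))
        # Assumption: the synthetic vertex joins every hyperedge of either parent.
        new_h.append(np.maximum(incidence[tail_idx[i]], incidence[tail_idx[j]]))

    return np.vstack([features, new_x]), np.vstack([incidence, new_h])
```

Taking the union of parent hyperedges is only one plausible way to attach a synthetic vertex to the structure; the paper may compute these correlations differently.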