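Ranking (information retrieval)
Computer science
Hash function
Similarity (geometry)
Scalability
Data mining
Set (abstract data type)
Pattern recognition (psychology)
Locality-sensitive hashing
Artificial intelligence
Mathematics
Hash table
Image (mathematics)
Database
Computer security
Programming language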
Authors
Lei Zhang,Yongdong Zhang,Xiaoguang Gu,Jinhui Tang,Qi Tian
Identifier
DOI:10.1109/tip.2014.2326010
Abstract
Hashing-based similarity search techniques are becoming increasingly popular for large data sets. To capture meaningful neighbors, the topology of a data set should be exploited; this topology represents the neighborhood relationships between its subregions and the relative proximities between the neighbors of each subregion, e.g., the relative neighborhood ranking of each subregion. However, most existing hashing methods are developed to preserve neighborhood relationships while ignoring the relative neighborhood proximities. Moreover, most hashing methods fail to provide a good result ranking, since many results often share the same Hamming distance to a query. In this paper, we propose a novel hashing method to solve these two issues jointly. The proposed method is referred to as topology preserving hashing (TPH). TPH differs from prior work by also preserving the neighborhood ranking. Based on this framework, we present three different TPH methods: linear unsupervised TPH, semisupervised TPH, and kernelized TPH. In particular, our unsupervised TPH is capable of mining semantic relationships between unlabeled data without supervised information. Extensive experiments on four large data sets demonstrate the superior performance of the proposed methods over several state-of-the-art unsupervised and semisupervised hashing techniques.
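The ranking problem mentioned in the abstract (many results sharing the same Hamming distance to a query) can be illustrated with a minimal sketch. It does not implement TPH. It assumes a generic random-hyperplane sign hash as a stand-in, and the names `sign_hash`, `hamming`, and `tie_histogram`, along with all parameter values, are hypothetical choices made for illustration only.

```python
import random
from collections import Counter

def sign_hash(point, planes):
    """Map a real-valued vector to a tuple of bits, one bit per random hyperplane.

    Generic random-projection hashing, used here only as a stand-in; not TPH.
    """
    return tuple(
        1 if sum(p * w for p, w in zip(point, plane)) >= 0 else 0
        for plane in planes
    )

def hamming(a, b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def tie_histogram(n_points=1000, dim=16, n_bits=8, seed=0):
    """Return a Counter mapping Hamming distance to the number of points at that distance."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    data = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_points)]
    query = [rng.gauss(0, 1) for _ in range(dim)]
    q_code = sign_hash(query, planes)
    return Counter(hamming(sign_hash(x, planes), q_code) for x in data)

hist = tie_histogram()
for distance in sorted(hist):
    print(distance, hist[distance])
```

With 8-bit codes there are only 9 possible Hamming distances (0 through 8), so 1000 points necessarily fall into large groups of tied results. Hamming distance alone cannot order the points within such a group, which is the limitation the abstract attributes to most hashing methods.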