Human-Algorithm Collaborative Truth Inference in Crowdsourcing

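Crowdsourcing; Inference; Machine learning; Computer science; Ground truth; Hyperparameter; Robustness (evolution); Benchmark (surveying); Inductive bias; Metric (unit); Aggregate (composite); Complementarity (molecular biology); Human intelligence; Artificial intelligence; Data mining; Aggregation problem; Crowdsourced software development; Synthetic data; Performance metric; Training set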

Authors

Xuan Wei, Mingyue Zhang, Qingpeng Zhang, Zhi Li, Daniel Zeng

Source

Journal: INFORMS Journal on Computing
Date: 2025-09-15

Identifiers

DOI: 10.1287/ijoc.2023.0440

Abstract

Crowdsourcing has become a pivotal strategy in gathering large-scale, high-quality labeled data, particularly in data-intensive applications powered by artificial intelligence. To aggregate the noisy crowd efforts, many studies have considered learning a predictive algorithm based on the noisy human annotations and subsequently integrating the learned knowledge back into the data aggregation process. However, it is unclear how to design such hybrid systems that maximize the complementary strengths of humans and algorithms. In response, we analyze the patterns of human and algorithm intelligence and propose that the inductive bias of algorithms can effectively mitigate inconsistencies in human labeling, thus complementing human efforts. Building on this premise, we propose a human-algorithm collaborative framework (HAC) to combine human labels with algorithmic predictions. By proposing a metric called hybrid complementarity score (HCS) to quantify human-algorithm complementarity, our framework can dynamically adjust the weight of each algorithm based on its complementarity, significantly enhancing the overall efficacy of the human-algorithm integration. To validate the effectiveness of our framework, we first instantiate it with several algorithms, including a high-complementarity algorithm building upon the inductive bias of clustering-aware design. We then benchmark our framework against leading baselines across eight real-world tasks. Our results not only demonstrate the superior performance of our proposed framework but also affirm its robustness across different algorithm selections (e.g., types and number of algorithms) and hyperparameter configurations. This research not only delivers a feasible and effective solution for truth inference in crowdsourcing but also contributes to the burgeoning community of human-algorithm collaboration.

History: Accepted by Ram Ramesh, Area Editor for Data Science and Machine Learning.

Funding: X. Wei is supported by the National Natural Science Foundation of China (NSFC) [Grants 72201167, 72192822, 72571175, 72331006, 72221001, and 72232005] and the Young Elite Scientists Sponsorship Program by CAST [Grant 2023QNRC001]. D. D. Zeng is supported by NSFC [Grant 72293575]. M. Zhang is supported by NSFC [Grant 72272101]. Q. Zhang is supported by the General Research Fund of the Research Grant Council of Hong Kong [Grant 17209225]. X. Wei also thanks the Science and Technology Commission of Shanghai Municipality [Grant 22JC1403600].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.0440) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2023.0440). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
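
Illustration (not from the paper): the abstract describes combining human labels with algorithmic predictions, with each algorithm weighted according to its complementarity. The sketch below shows only the general shape of such a weighted vote. It does not reproduce the HAC framework or the HCS formula, neither of which is specified in the abstract. The function name `aggregate_labels`, the per-algorithm weights, and the toy data are all hypothetical; the weights are supplied by hand here, whereas the abstract says the framework derives them from complementarity.

```python
from collections import defaultdict


def aggregate_labels(human_labels, algorithm_predictions, algorithm_weights,
                     human_weight=1.0):
    """Weighted vote over human labels and algorithm predictions.

    human_labels: {item: [label, ...]} from crowd workers.
    algorithm_predictions: {algorithm_name: {item: label}}.
    algorithm_weights: {algorithm_name: float}; a hand-set stand-in for a
        complementarity-based weight (assumption, not the paper's HCS).
    """
    scores = defaultdict(lambda: defaultdict(float))

    # Each human annotation contributes a fixed vote weight.
    for item, labels in human_labels.items():
        for label in labels:
            scores[item][label] += human_weight

    # Each algorithm contributes a vote scaled by its assigned weight.
    for name, predictions in algorithm_predictions.items():
        weight = algorithm_weights.get(name, 0.0)
        for item, label in predictions.items():
            scores[item][label] += weight

    # Pick the highest-scoring label; ties go to the alphabetically first label.
    return {item: max(sorted(votes), key=votes.get)
            for item, votes in scores.items()}


# Toy data (hypothetical): two items, one algorithm named "clf".
human = {"a": ["cat", "dog"], "b": ["dog", "dog", "cat"]}
algos = {"clf": {"a": "cat", "b": "cat"}}

low_weight = aggregate_labels(human, algos, {"clf": 0.5})
high_weight = aggregate_labels(human, algos, {"clf": 1.5})
```

With a low weight the algorithm only breaks the tie on item "a"; with a higher weight it also overrides the human majority on item "b". That is why the choice of weight matters in any scheme of this kind.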
