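Credibility
Computer science
Social media
Misinformation
Rumor
Data science
Ratio (proportion)
Bridge (graph theory)
Big data
Crowdsourcing
Information retrieval
World Wide Web
Data mining
Computer security
Political science
Internal medicine
Law
Physics
Medicine
Quantum mechanics
Public relations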
Authors
Tanushree Mitra, Éric Gilbert
Identifier
DOI:10.1609/icwsm.v9i1.14625
Abstract
Social media has quickly risen to prominence as a news source, yet lingering doubts remain about its ability to spread rumor and misinformation. Systematically studying this phenomenon, however, has been difficult due to the need to collect large-scale, unbiased data along with in-situ judgements of its accuracy. In this paper we present CREDBANK, a corpus designed to bridge this gap by systematically combining machine and human computation. Specifically, CREDBANK is a corpus of tweets, topics, events and associated human credibility judgements. It is based on the real-time tracking of more than 1 billion streaming tweets over a period of more than three months, computational summarizations of those tweets, and intelligent routings of the tweet streams to human annotators — within a few hours of those events unfolding on Twitter. In total CREDBANK comprises more than 60 million tweets grouped into 1049 real-world events, each annotated by 30 human annotators. As an example, with CREDBANK one can quickly calculate that roughly 24% of the events in the global tweet stream are not perceived as credible. We have made CREDBANK publicly available, and hope it will enable new research questions related to online information credibility in fields such as social science, data mining and health.
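The abstract notes that a figure such as the share of events not perceived as credible can be computed quickly from the corpus. A minimal sketch of that kind of calculation follows. It rests on assumptions not stated in the abstract: the rating scale (integers from -2 to 2), the rule that an event counts as credible when at least `min_agreement` of its annotators give the top rating, the function name `share_not_credible`, and the toy data are all hypothetical and are not taken from CREDBANK itself.

```python
from typing import Dict, List


def share_not_credible(
    ratings_by_event: Dict[str, List[int]],
    credible_rating: int = 2,      # assumed top rating on a hypothetical -2..2 scale
    min_agreement: float = 0.7,    # assumed agreement threshold, not from the abstract
) -> float:
    """Return the fraction of events whose share of top ratings is below the threshold."""
    if not ratings_by_event:
        raise ValueError("no events given")

    not_credible = 0
    for event_id, ratings in ratings_by_event.items():
        if not ratings:
            raise ValueError(f"event {event_id!r} has no ratings")
        # Share of annotators who gave this event the top credibility rating.
        agreement = sum(1 for r in ratings if r == credible_rating) / len(ratings)
        if agreement < min_agreement:
            not_credible += 1
    return not_credible / len(ratings_by_event)


# Toy data only: four invented events, 30 invented ratings each.
toy = {
    "event_a": [2] * 27 + [1] * 3,
    "event_b": [2] * 15 + [0] * 15,
    "event_c": [2] * 24 + [1] * 6,
    "event_d": [2] * 10 + [-1] * 20,
}

print(share_not_credible(toy))  # 0.5
```

With real data, the dictionary would be built from the corpus's per-event annotator ratings, and the threshold would follow whatever credibility definition the analysis adopts.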