A survey on fake news detection based on fact verification (基于事实信息核查的虚假新闻检测综述)
Submitted: 2023/12/20
DOI:
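Chinese keywords: 虚假新闻检测 (fake news detection); 深度学习 (deep learning); 事实核查 (fact verification); 谣言检测 (rumor detection)
English keywords: fake news detection; deep learning; fact verification; misinformation detection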
Funding: Joint Funds of the National Natural Science Foundation of China (U20B2051)
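Name Affiliation
Yang Yuzhou (杨昱洲) School of Computer Science, Fudan University
Zhou Yangming (周杨铭) School of Computer Science, Fudan University
Ying Qichao (应祺超) School of Computer Science, Fudan University
Qian Zhenxing (钱振兴) School of Computer Science, Fudan University
Zeng Dan (曾丹) School of Communication and Information Engineering, Shanghai University
Liu Liang (刘亮) Technology Application Department, Audiovisual New Media Center, China Media Group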
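Chinese abstract (translated):

Deep-learning-based fake news detection already has many pioneering methods that detect fake news automatically through feature extraction and detection; they typically use pre-trained models to extract features from news content and develop algorithms that use these features for detection. Many of these methods identify fake news by finding feature patterns common to fake news, such as writing style and frequently used words. However, the high performance of such models depends heavily on training with large amounts of high-quality annotated data. In real-world applications, not only is data difficult to obtain and annotate, but newly forged fake news also tends to avoid the writing style of earlier fake news, so the models lack temporal generalization ability. In recent years, advances in fact verification within fake news detection have offered new research ideas for solving these problems. Fake news detection based on factual information provides more reliable explainability; by checking the authenticity of events, the degree to which a description matches the facts, and so on, it largely overcomes the detection bias caused by earlier methods' reliance on textual style features. This paper reviews and summarizes current research on fact-based fake news detection from the perspectives of tasks and problems, algorithmic strategies, and datasets. First, it systematically sets out the task definition and core problems of fact-based fake news detection. Second, starting from algorithmic principles, it categorizes and summarizes existing detection methods. It then analyzes classic and newly proposed datasets in the field and summarizes the experimental results on each dataset. Finally, it gives an overview of the strengths and weaknesses of existing methods, identifies several challenges that methods in this field may face, and looks ahead to the next stage of research, in the hope of providing a reference for subsequent work in the field.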

English abstract:

Deep-learning-based fake news detection has produced many pioneering methods that detect fake news automatically through feature extraction followed by detection. A common framework extracts features from news content with pre-trained models and then develops algorithms that use these features for detection. Many such approaches identify fake news by learning feature patterns common to it, such as writing style and word usage. The performance of these models relies heavily on large, well-annotated datasets, but obtaining and annotating fake news data is laborious. Moreover, newly forged fake news often avoids the writing style of earlier fake news, so the models generalize poorly over time. Recent research on fake news detection based on fact verification offers new ideas for addressing these problems. Such approaches verify, among other things, the authenticity of the reported event and the match between the description and factual information, which makes detection more reliable and explainable and largely overcomes the bias of earlier methods that rely on semantic and writing-style features. In this paper, we review research on fact-based fake news detection from the perspectives of tasks and problems, algorithms, and datasets. First, we set out the task definition and core problems of fact-based fake news detection. Next, we summarize and organize existing approaches according to their algorithms. We then analyze classic and newly published datasets and summarize the experimental results reported on them. Finally, we discuss the strengths and weaknesses of existing approaches from an overall perspective, point out several substantial challenges facing the field, and outline expectations for future research. We hope this survey provides a reference for future work on fake news detection.
