Audio Deepfake Detection: A Survey
Submitted: 2024/4/1
Keywords: audio deepfake detection; global deepfake audio detection; local deepfake audio localization; deepfake audio source tracing
Funding: National Natural Science Foundation of China (62201524); National Key Research and Development Program of China (2021YFF0900504)
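Name Affiliation
Zhao Yikun (赵义堃) School of Art, Anhui University of Finance and Economics
Zhang Caiyun (张彩芸) School of Statistics and Applied Mathematics, Anhui University of Finance and Economics
Zhu Chuyan (朱楚颜) School of Music, Anhui Normal University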
Abstract:

With the rapid spread and development of generative artificial intelligence, social media has been flooded with deepfake audio produced by techniques such as speech synthesis and voice conversion. This highly natural deepfake audio makes it very difficult to distinguish genuine media content from forged content. To address this problem, a variety of deepfake audio detection challenges have been organized in China and abroad to advance the field of audio anti-spoofing. Unlike existing surveys, which are limited to binary classification of whole audio clips as real or fake, this survey goes beyond binary classification and comprehensively summarizes work on audio deepfake detection. Specifically, it divides the field into three sub-fields: global deepfake audio detection, local deepfake audio localization, and deepfake audio source tracing. For each sub-field, it reviews the existing datasets, open problems, and solution approaches. Finally, it outlines the challenges the field is likely to face and discusses directions for future research, with the aim of providing a reliable reference for researchers.
