Journal of Communication University of China (Science and Technology) (中国传媒大学学报自然科学版)

A Soundscape Keynote Sound Classification Method Based on Second-Order Difference MFCC Features and Deep Learning

Submission date:

2023/10/20

DOI:

Keywords:

soundscape; keynote sound; convolutional neural network; second-order difference MFCC

Funding:

Key Project of the Beijing Social Science Fund (22GLA014); General Program of the National Natural Science Foundation of China (41871130)

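Name	Affiliation
Deng Zhiyong (邓志勇)	College of Music, Capital Normal University
Zhang Wanyi (张万亿)	Department of Music Artificial Intelligence and Music Information Technology, Central Conservatory of Music
Liu Aili (刘爱利)	College of Resource Environment and Tourism, Capital Normal University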

Abstract:

This paper proposes a second-order difference MFCC feature suitable for classification with convolutional neural networks (CNNs), and uses it to address a subjective, "humanistically colored" task in soundscape studies: the binary classification of keynote versus non-keynote sounds. Taking a dataset of soundscape samples from the central axis of old Beijing as an example, a binary classifier trained on the second-order difference MFCC feature, with the network architecture designed in this paper, reaches a keynote recognition accuracy of 80.23%. This is far higher than the accuracy obtained with RMS or Mel spectrogram features alone, or with RMS combined with the second-order difference MFCC feature.
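As a rough illustration of what a second-order difference of MFCCs involves, the sketch below applies a regression-style difference along the time axis twice. The abstract does not say how the difference is computed, so the formula (the common HTK-style regression delta), the window `width`, the edge handling, and the function names `delta` and `second_order_delta` are all assumptions made for illustration. The input is assumed to be an already computed MFCC matrix, given as one list of coefficients per time frame.

```python
from typing import List

Frames = List[List[float]]  # one inner list of MFCC coefficients per time frame


def delta(frames: Frames, width: int = 2) -> Frames:
    """Regression-based difference along the time axis (assumed formula).

    d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the nearest edge frame.
    """
    if not frames:
        return []
    last = len(frames) - 1
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out: Frames = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def second_order_delta(frames: Frames, width: int = 2) -> Frames:
    """Second-order difference: the delta of the delta."""
    return delta(delta(frames, width), width)
```

For a single coefficient that grows quadratically over time (`c[t] = t * t`), this second-order difference equals 2 at frames far enough from either end, matching the second derivative; values near the edges are distorted by the clamping. Whether the resulting matrix is fed to the CNN alone or stacked with other features is not stated in the abstract.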

References: