Journal of Communication University of China (Science and Technology)

Subjective and Objective Evaluation of Sound Quality for Three-Dimensional Sound Binaural Rendering Algorithms

Submission date:

2023-08-20

DOI:

Keywords:

three-dimensional sound; binaural rendering algorithm; subjective evaluation; objective evaluation model; sound quality

Funding:

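Name	Affiliation
Huang Xinyi (黄心仪)	School of Music and Recording Arts, Communication University of China
Xie Lingyun (谢凌云)	School of Information and Communication Engineering, Communication University of China
Wang Xin (王鑫)	School of Music and Recording Arts, Communication University of China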


About the authors (* denotes the corresponding author):

Huang Xinyi (b. 2000), female, master's student, works mainly on music perception and spatial audio. E-mail: rubydiva@163.com. Xie Lingyun (b. 1977), male, Ph.D., associate researcher, works mainly on audio signal processing and psychoacoustics. E-mail: xiely@cuc.edu.cn. Wang Xin (b. 1978), female, Ph.D., professor, works mainly on music perception and musical acoustics. E-mail: metero_wx@cuc.edu.cn.

Abstract:

As three-dimensional (3D) sound is applied more widely, binaural rendering of 3D sound has become a new technical focus, and how to evaluate binaural rendering algorithms effectively has become a key problem. This paper reports a subjective evaluation experiment on the sound quality of six binaural rendering algorithms for 3D sound, with analysis of variance and regression analysis applied to the experimental data. Objective features were extracted and selected from binaural recordings of the experimental material and related to the subjective results by partial least squares regression, yielding an objective evaluation model for the overall sound quality dimension. The relationship between subjective perception and objective features was also explored. The subjective results indicate that binaural rendering degrades sound quality, but that algorithmic compensation of sound quality can partly offset the degradation caused by the rendering algorithm. The objective prediction model indicates that sound quality is highly correlated with time-frequency features, such as spectral flux and spectral rolloff, in the 2560 to 5120 Hz and 40 to 320 Hz bands. The low-frequency interaural cross-correlation coefficient and lateral sound energy ratio are also important features affecting the sound quality dimension.
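One of the objective features named in the abstract, the interaural cross-correlation coefficient (IACC), can be illustrated with a short sketch. The abstract does not say how the feature was computed, so the code below assumes the definition commonly used in room acoustics: the maximum absolute value of the normalized cross-correlation between the two ear signals for lags within ±1 ms. The function name `iacc` and its parameters are hypothetical, and the band-pass filtering that a low-frequency IACC would require beforehand is left out.

```python
import math
import random


def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Interaural cross-correlation coefficient of two ear signals.

    Assumed definition (not taken from the paper): the maximum absolute
    value of the normalized cross-correlation for lags within
    +/- max_lag_ms. Signals are plain sequences of floats.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    n = len(left)
    # Normalization term: geometric mean of the two signal energies.
    energy = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if energy == 0.0:
        return 0.0
    max_lag = int(round(max_lag_ms * 1e-3 * sample_rate))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[t] with right[t + lag] over the overlapping samples.
        start = max(0, -lag)
        stop = min(n, n - lag)
        acc = sum(left[t] * right[t + lag] for t in range(start, stop))
        best = max(best, abs(acc) / energy)
    return best


if __name__ == "__main__":
    fs = 8000
    tone = [math.sin(2 * math.pi * 500 * t / fs) for t in range(800)]
    # Same tone at both ears, right ear delayed by 4 samples (0.5 ms).
    delayed = [0.0] * 4 + tone[:-4]
    # Two independent noise signals as a decorrelated reference case.
    noise_l = [random.Random(0).gauss(0, 1) for _ in range(1)]  # placeholder
    rng_l, rng_r = random.Random(0), random.Random(1)
    noise_l = [rng_l.gauss(0, 1) for _ in range(2000)]
    noise_r = [rng_r.gauss(0, 1) for _ in range(2000)]
    print("delayed tone:", iacc(tone, delayed, fs))
    print("independent noise:", iacc(noise_l, noise_r, fs))
```

A value near 1 means the two ear signals are almost identical up to a small delay, while values near 0 indicate decorrelated ear signals. The pure-Python loops keep the sketch dependency-free; for real recordings an array library would be the more practical choice.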

References: