跨模态通信理论及关键技术初探
Preliminary Study on Theory and Key Technology of Cross-Modal Communications
Submission date: 2021/2/20
DOI:
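Chinese keywords: 跨模态通信;触觉编码;码流传输;信息重建
English keywords: cross-modal communications; haptic codecs; streaming transmission; information reconstruction
Funding: National Natural Science Foundation of China (61571240, 61671253); Priority Academic Program Development of Jiangsu Higher Education Institutions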
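Name: Affiliation
GAO Yun (高赟): College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications; Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education
WEI Xin (魏昕): College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications; Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education
ZHOU Liang (周亮): College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications; Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education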
Chinese abstract:

传统视听服务和新兴触觉服务的相互加持,必将为多媒体用户带来更为极致的互动感受和场景体验。针对音频、视频和触觉信号在物理特征、传输需求、呈现形式等维度上均存在本质差异大的问题,提出音-视-触跨模态通信构架,主要包括触觉信号编码、多模态异构码流传输、跨模态信息重建三个方面。首先,基于用户触觉感知机理介绍当前高效、鲁棒的触觉信号编码方案,为实现信号的压缩提供理论依据;其次,通过充分利用码流传输的时空特性,提出一种边缘智能赋能下的多模态异构码流传输策略,以满足超低时延、超高可靠、大容量的传输需求;随后,通过不同模态间语义层面的融合及共享,探索智能、完备的跨模态信息重建机制以提升用户的沉浸感体验;最后,指出跨模态通信仍然存在的挑战以及展望其未来发展方向。

English abstract:

The mutual reinforcement of traditional audio-visual services and emerging haptic services is bound to bring multimedia users a more compelling interactive and scene experience. Because audio, video, and haptic signals differ substantially in physical characteristics, transmission requirements, and presentation forms, an audio-visual-haptic cross-modal communication architecture is proposed, comprising three parts: haptic signal coding, multi-modal heterogeneous stream transmission, and cross-modal information reconstruction. First, current efficient and robust haptic signal coding schemes are introduced on the basis of the user's haptic perception mechanism, providing a theoretical basis for signal compression. Second, by fully exploiting the spatio-temporal characteristics of stream transmission, a multi-modal heterogeneous stream transmission strategy empowered by edge intelligence is proposed to meet the requirements of ultra-low latency, ultra-high reliability, and large capacity. Third, an intelligent and complete cross-modal information reconstruction mechanism is explored, based on semantic-level fusion and sharing among modalities, to improve users' sense of immersion. Finally, the challenges that remain in cross-modal communications are identified and future research directions are outlined.

References: