Journal of Communication University of China (Science and Technology)

一种多模态跨媒体检索的融媒影视系统

A Film and Television Media Convergence System Based on Multi-Modal Cross-Media Retrieval

Submission date:

2021-08-20

DOI:

Chinese keywords:

融媒体；跨媒体检索；字幕识别；人脸识别

English keywords:

Media convergence; Cross-media retrieval; Subtitle recognition; Face recognition

Funding:

Supported by the National Social Science Fund of China, Art Studies Project (18BC034)

Name	Affiliation
李春芳	School of Computer and Cyberspace Security, Communication University of China (中国传媒大学计算机与网络空间安全学院)
刘永久
王楷翔
杨睿
张凌飞
李敏
邓智铭
石民勇

Chinese abstract:

视频是最有影响力的传播媒介，然而其非线性检索仍然困难。本文集成创新性工作包括：基于图像识别提取字幕，基于卷积神经网络识别人脸，通过字幕和人脸解决了影视视频的非线性检索问题；从字幕文本提取重要实体，用海量知识库和电子书补全影视关联知识，构建了文本、电子书和视频融合的跨媒体应用；以字幕词云和人物实体词云，实现影视的概览理解和检索导航；以众包实现字幕、电子书、人脸和实体信息的修正。以近代史献礼电影、中国诗词大会和科技纪录片为例系统完整地实现了一个示范性融媒影视系统。

English abstract:

Video is the most influential communication medium, yet nonlinear retrieval of video content remains difficult. The integrated, innovative work of this paper includes the following. Subtitles are extracted by image recognition and faces are recognized by convolutional neural networks; together, subtitles and faces solve the nonlinear retrieval problem for film and television video. Important entities are extracted from the subtitle text, and their related knowledge is supplemented from large-scale knowledge bases and e-books, yielding a cross-media application that fuses text, e-books, and video. Word clouds of subtitles and of character entities support overview understanding and retrieval navigation. Crowdsourcing is used to correct subtitles, e-books, face recognition results, and entity information. Taking tribute films on modern history, the Chinese Poetry Conference, and science and technology documentaries as examples, a demonstration media convergence system for film and television is fully implemented.
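To illustrate what nonlinear retrieval through subtitles and faces could look like, the following is a minimal sketch and not the authors' implementation. It assumes the subtitle recognition and face recognition stages have already produced timestamped cues; the names `Cue`, `build_index`, and `search` are hypothetical. It also assumes whitespace-separated tokens, whereas Chinese subtitles would first need a word segmentation step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """One recognized subtitle line (hypothetical record layout)."""
    video_id: str
    start_s: float      # time in seconds at which the subtitle appears
    text: str           # recognized subtitle text, assumed pre-tokenized
    faces: tuple = ()   # names of people recognized on screen at that time


def build_index(cues):
    """Build an inverted index: term -> sorted list of (video_id, start_s)."""
    index = defaultdict(list)
    for cue in cues:
        terms = set(cue.text.lower().split())
        for name in cue.faces:
            terms.update(name.lower().split())
        for term in terms:
            index[term].append((cue.video_id, cue.start_s))
    for postings in index.values():
        postings.sort()
    return index


def search(index, query):
    """Return the positions whose cue contains every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)


# Usage with made-up cues: jump straight to the matching timestamps.
cues = [
    Cue("v1", 12.0, "the moon over the river", ("Alice",)),
    Cue("v1", 95.5, "river of stars", ("Bob",)),
    Cue("v2", 3.0, "moon festival"),
]
index = build_index(cues)
print(search(index, "river"))       # [('v1', 12.0), ('v1', 95.5)]
print(search(index, "moon alice"))  # [('v1', 12.0)]
```

Each hit is a (video, timestamp) pair, so a player can seek directly to the matching moment instead of scanning the video linearly.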
