A Lexicon Enhanced Named Entity Recognition Algorithm for Typical Cultural Relics
Submitted: 2023/4/20
Keywords: lexicon enhancement; domain lexicon; named entity recognition
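Funding: National Key R&D Program of China, project "Research on Engineering Methods and Data Processing Technologies for Cultural Resource Big Data Services" (2021TFF0901701)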
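Name Affiliation
Cui Xin (崔鑫) School of Computer Science, Beijing University of Posts and Telecommunications
Wang Yan (王琰) School of Artificial Intelligence, Beijing University of Posts and Telecommunications
Hou Xiaogang (侯小刚) School of Artificial Intelligence, Beijing University of Posts and Telecommunications
Zhou Yue (周月) School of Electronic Engineering, Beijing University of Posts and Telecommunications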



Named entity recognition (NER) for typical cultural relics extracts entities from sentences in categories such as relic name, dynasty, excavation site, and place of collection. Text about typical cultural relics follows domain-specific word-formation patterns, so applying existing NER methods to a typical cultural relics dataset leads to problems such as incorrect word boundary judgments. To address this, a lexicon-enhanced NER algorithm is proposed that introduces lexical information in both the input representation layer and the contextual encoding layer, so that the model can draw on domain-specific vocabulary. A lexicon of cultural heritage domain words is constructed and serves as the lexicon for this algorithm. The approach addresses the problem of incorrect word boundary judgments and achieves better results on the typical cultural relics dataset.
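As an illustration of how a domain lexicon can supply word boundary cues at the input representation layer, the sketch below tags each character with the lexicon words in which it occurs as the beginning (B), middle (M), end (E), or as a single-character word (S). This B/M/E/S word-set scheme is one common lexicon-enhancement technique and is an assumption here; the abstract does not state which matching scheme the algorithm uses. The function name `match_lexicon`, the toy lexicon, and the example sentence are all hypothetical.

```python
from typing import Dict, List, Set


def match_lexicon(
    sentence: str, lexicon: Set[str], max_word_len: int = 8
) -> List[Dict[str, Set[str]]]:
    """For each character, collect the lexicon words in which it is the
    Beginning, Middle, or End, or which it forms alone (Single)."""
    slots: List[Dict[str, Set[str]]] = [
        {"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence
    ]
    n = len(sentence)
    for start in range(n):
        # Try every substring starting here, up to max_word_len characters.
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                slots[start]["S"].add(word)
                continue
            slots[start]["B"].add(word)
            slots[end - 1]["E"].add(word)
            for mid in range(start + 1, end - 1):
                slots[mid]["M"].add(word)
    return slots


# Hypothetical toy domain lexicon (not taken from the paper).
relic_lexicon = {"后母戊鼎", "河南", "安阳", "河南安阳"}
sentence = "后母戊鼎出土于河南安阳"
features = match_lexicon(sentence, relic_lexicon)

# Show the matched word sets for each character.
for char, slot in zip(sentence, features):
    print(char, {tag: sorted(words) for tag, words in slot.items() if words})
```

In a full model, each word set would typically be converted to embeddings and combined with the character representation before contextual encoding. That step, and any lexicon use inside the contextual encoding layer, is omitted from this sketch.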
