English abstract:
Action recognition is a fundamental yet challenging problem in computer vision. In the past few years, many methods have been developed for action recognition from RGB videos and have achieved significant results. However, processing RGB videos can be very time-consuming. Another data modality, the human skeleton, which represents a person by the 3D coordinates of skeletal joints, has drawn much attention because of its lightweight representation and its robustness to variations in viewpoint, appearance, and surrounding distractions. However, skeleton-based action recognition faces two problems: noise in the skeleton data and dependence on data annotation. Noise in the skeleton data reduces the accuracy of the data, while annotation dependence means that training requires large amounts of labelled data. To address action analysis from the noisy skeletons that commonly appear in the real world, this paper proposes a noise-adaptation model that avoids both explicit modelling of skeleton noise and reliance on skeleton ground truth. Regression-based and generation-based adaptation models are developed, respectively, according to whether pairs of noisy skeletons are available. In addition, to reduce the dependence on data annotation, this paper applies self-supervised learning to human skeleton data and proposes an action recognition method based on multitask self-supervised learning.