Abstract

Human vision and behavior recognition are important academic research topics in computer science, visual technology, and artificial intelligence video technology. In the information age, daily life and work increasingly depend on devices with strong visual perception and motion recognition capabilities, which are central to advanced human-computer interaction, autonomous driving, and intelligent video surveillance. Demand is also growing for smart devices and artificial intelligence products such as positioning and satellite navigation and virtual reality. Research on human behavior recognition in artificial intelligence video technology therefore has significant academic value.

Current human behavior recognition methods, such as two-stream neural networks, 3D convolutional neural networks, and spatiotemporal convolutional neural networks, perform well on short videos. The input to these networks is either a randomly selected raw RGB image, a stack of densely sampled RGB images, or a set of optical flow maps. For long videos, however, a densely sampled group of frames cannot objectively represent global information. Building on the theory and practice of short-video recognition, this paper proposes a key frame segment network (KFSN), based on the fusion of local key-frame information, for human behavior recognition in long videos. In this method, the long video is divided into multiple equal-length segments, human behavior recognition is performed on each segment, and the segment-level recognition results are then fused. The network is based on the idea of long-term modeling and combines it with a sparse temporal sampling strategy, so that the whole action video can be learned efficiently.

The proposed method has been tested repeatedly on the public datasets UCF101 and HMDB51. The experimental results show that KFSN achieves good behavior recognition performance, with recognition rates of 95.0% on UCF101 and 70.1% on HMDB51, better than some existing behavior recognition networks.

Keywords: Behavior recognition; Key Frame Extraction; Local Information; Information Fusion
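The segment-then-fuse idea described in the abstract can be illustrated with a minimal sketch. It is not the KFSN implementation: the abstract does not specify how key frames are chosen or how segment results are fused, so the sketch assumes the middle frame of each segment as a stand-in for a key frame and plain score averaging as the fusion rule. The names `split_into_segments`, `pick_frame`, `fuse_scores`, `recognize`, and the `score_frame` callback are all hypothetical.

```python
from typing import Callable, List, Sequence


def split_into_segments(num_frames: int, num_segments: int) -> List[range]:
    """Split frame indices 0..num_frames-1 into equal-length segments.

    Assumption: trailing frames that do not fill a whole segment are dropped.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    seg_len = num_frames // num_segments
    return [range(k * seg_len, (k + 1) * seg_len) for k in range(num_segments)]


def pick_frame(segment: range) -> int:
    """Placeholder for key-frame selection: take the middle frame (assumption)."""
    return segment[len(segment) // 2]


def fuse_scores(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-segment class scores by averaging (assumed fusion rule)."""
    n = len(segment_scores)
    return [sum(column) / n for column in zip(*segment_scores)]


def recognize(
    num_frames: int,
    num_segments: int,
    score_frame: Callable[[int], Sequence[float]],
) -> int:
    """Return the index of the class with the highest fused score.

    score_frame is a hypothetical per-frame classifier that maps a frame
    index to a list of class scores.
    """
    segments = split_into_segments(num_frames, num_segments)
    scores = [score_frame(pick_frame(segment)) for segment in segments]
    fused = fuse_scores(scores)
    return max(range(len(fused)), key=fused.__getitem__)
```

Sampling one frame per segment keeps the cost independent of video length while still covering the whole video, which is the point of sparse temporal sampling. A real system would replace `score_frame` with a trained network and `pick_frame` with an actual key-frame criterion.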