Research and Application of a Visual Question Answering System Based on VGG and LSTM Networks


Published May 11, 11:57



Research and Application of a Visual Question Answering System Based on VGG and LSTM Networks

ABSTRACT

With the development of the Internet, the amount of data available to human beings has increased exponentially, and the knowledge we can obtain from that data has also grown greatly. Research on and application of artificial intelligence have been revitalized once again. As artificial intelligence applications have continued to develop, research on Visual Question Answering (VQA) has emerged in recent years and become a hot topic. A VQA task is a multi-domain, interdisciplinary task: it takes a picture and a free-form, open-ended natural language question about that picture as input, and it produces a natural language answer as output. Briefly, VQA is a question-and-answer session about a given picture. Building on the current state of VQA research and on deep learning theory, this design studies a VQA system based on a VGG+LSTM network. A VGG network extracts the features of the picture, and an LSTM network extracts the features of the question and generates the features of the system's output answer. The design ultimately reduces this complex artificial intelligence system to a multi-class classification problem, so that a picture can be questioned with a natural language sentence and answered with a natural language word. The main innovation of this design is to combine two directions of deep learning, Computer Vision and Natural Language Processing, and to transform the system's output into a classification problem, thereby achieving question answering.

Key words: VQA; Visual Q&A; VGG-Net; LSTM-Net; Deep learning
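To make the "VQA as multi-class classification" idea concrete, the following is a minimal sketch in plain Python, not the thesis's implementation. It assumes the image and question features have already been extracted (random vectors stand in for the VGG and LSTM outputs), fuses them by concatenation (an assumption; the abstract does not state the fusion method), and scores a small hypothetical answer vocabulary with an untrained linear layer followed by softmax. All names, dimensions, and weights are illustrative.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_answer(image_feat, question_feat, weights, bias, answers):
    """Pick one answer word from a fixed vocabulary.

    image_feat    : stand-in for features a VGG network would produce
    question_feat : stand-in for the final state an LSTM would produce
    weights, bias : one row / one bias per candidate answer (hypothetical)
    """
    # Assumed fusion step: simple concatenation of the two feature vectors.
    fused = image_feat + question_feat
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    best = max(range(len(answers)), key=probs.__getitem__)
    return answers[best], probs


# Illustrative setup: tiny dimensions and random, untrained parameters.
rng = random.Random(0)
answers = ["yes", "no", "red", "two"]                # hypothetical answer vocabulary
image_feat = [rng.gauss(0, 1) for _ in range(8)]     # placeholder image features
question_feat = [rng.gauss(0, 1) for _ in range(4)]  # placeholder question features
weights = [[rng.gauss(0, 0.1) for _ in range(12)] for _ in answers]
bias = [0.0] * len(answers)

answer, probs = classify_answer(image_feat, question_feat, weights, bias, answers)
print(answer, [round(p, 3) for p in probs])
```

Because the answer is chosen from a fixed vocabulary, the system can be trained with an ordinary classification loss, which is what lets a single natural language word serve as the output.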
