A Chinese Polysemous Word Sense Disambiguation Method Based on the AdaBoost.MH Algorithm

Abstract

Word sense disambiguation (WSD) plays an important role in many areas of natural language processing, such as machine translation, information retrieval, sentence analysis, and speech recognition. Research on WSD therefore has both theoretical and practical significance. The main work of this dissertation is to study supervised learning algorithms that acquire WSD knowledge from multiple kinds of resources, based on a large sense-tagged Chinese corpus.

An approach to Chinese word sense disambiguation based on the supervised AdaBoost.MH learning algorithm is presented. AdaBoost.MH is employed to learn WSD knowledge from multiple kinds of resources: it boosts the accuracy of weak decision-stump rules by repeatedly calling a weak learner and combining the resulting rules into a final, more accurate rule. A simple stopping criterion is also presented, with a view to the efficiency of learning and the utility of the system.

For Chinese WSD, in order to extract more contextual information, we introduce a new WSD knowledge source, semantic categorization, alongside two classical knowledge sources: the part of speech of neighboring words and local collocations. Experimental results show that semantic categorization knowledge improves both the learning efficiency of the algorithm and the accuracy of disambiguation.

Because building a broad-coverage semantically annotated corpus is a complex task, an approach that uses WWW search engines to automatically obtain an annotated corpus for Chinese WSD is also presented.

In open tests, the AdaBoost.MH algorithm achieves disambiguation accuracies of 85.75% for 6 typical polysemous Chinese words and 75.84% for 20 polysemous words from the SENSEVAL-3 Chinese corpus.

Key Words: Natural Language Processing; Word sense disambiguation; AdaBoost.MH algorithm; Multiple knowledge sources
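
The boosting procedure described in the abstract can be illustrated with a minimal Python sketch of AdaBoost.MH over decision stumps. It is not the dissertation's implementation. It assumes the standard real-valued formulation of AdaBoost.MH, in which each stump tests one binary context feature and emits a confidence score per sense. The feature strings (`w-1=...`, `pos+1=...`), the toy data, the smoothing constant, and the fixed number of rounds are all hypothetical. The abstract's stopping criterion is not specified there, so a fixed round count stands in for it.

```python
import math

def train_adaboost_mh(X, y, n_labels, n_rounds=5):
    """X: list of sets of active binary features; y: list of sense indices."""
    m = len(X)
    features = sorted(set().union(*X))
    eps = 1.0 / (m * n_labels)  # smoothing constant (an assumption of this sketch)
    # sign[i][l] is +1 if sense l is the correct sense of example i, else -1.
    sign = [[1 if y[i] == l else -1 for l in range(n_labels)] for i in range(m)]
    # D[i][l]: weight on the (example, sense) pair, initially uniform.
    D = [[1.0 / (m * n_labels)] * n_labels for _ in range(m)]
    model = []
    for _ in range(n_rounds):  # fixed round count stands in for a stopping rule
        best_z, best_f, best_W = float("inf"), None, None
        for f in features:
            # W[j][l] = [negative weight, positive weight] in block j
            # (j = 1 if feature f is present, 0 otherwise) for sense l.
            W = [[[0.0, 0.0] for _ in range(n_labels)] for _ in range(2)]
            for i in range(m):
                j = int(f in X[i])
                for l in range(n_labels):
                    W[j][l][int(sign[i][l] > 0)] += D[i][l]
            # Stump quality: smaller z means a better split.
            z = 2.0 * sum(math.sqrt(neg * pos) for block in W for neg, pos in block)
            if z < best_z:
                best_z, best_f, best_W = z, f, W
        # Smoothed confidence score for each (block, sense).
        c = [[0.5 * math.log((pos + eps) / (neg + eps)) for neg, pos in block]
             for block in best_W]
        model.append((best_f, c))
        # Reweight: pairs the stump handles badly gain weight.
        for i in range(m):
            j = int(best_f in X[i])
            for l in range(n_labels):
                D[i][l] *= math.exp(-sign[i][l] * c[j][l])
        total = sum(map(sum, D))
        D = [[w / total for w in row] for row in D]
    return model

def predict(model, x, n_labels):
    """Return the sense index with the highest summed stump score."""
    scores = [0.0] * n_labels
    for f, c in model:
        j = int(f in x)
        for l in range(n_labels):
            scores[l] += c[j][l]
    return max(range(n_labels), key=scores.__getitem__)

# Hypothetical toy data: three senses, features from the left word and right POS.
X_toy = [
    {"w-1=eat", "pos+1=n"}, {"w-1=eat", "pos+1=v"},
    {"w-1=buy", "pos+1=n"}, {"w-1=buy", "pos+1=u"},
    {"w-1=see", "pos+1=n"}, {"w-1=see", "pos+1=v"},
]
y_toy = [0, 0, 1, 1, 2, 2]
model = train_adaboost_mh(X_toy, y_toy, n_labels=3, n_rounds=5)
train_predictions = [predict(model, x, 3) for x in X_toy]
```

In this toy setting the left-word features separate the three senses, so the stumps built on them are selected early. Features for the knowledge sources named in the abstract (part of speech of neighboring words, local collocations, semantic categories) would be encoded as further binary feature strings in the same way.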