Design of a Search Engine System Based on an Incremental Crawler


Published May 11, 12:09




Beijing Institute of Technology, Zhuhai, 2020 Undergraduate Graduation Project

Design and Implementation of Search Engine System Based on Incremental Crawler

ABSTRACT

As society develops, information grows faster and faster, and we are confronted with a large amount of data. The more data there is, the harder it becomes to find and extract the information we need. Finding the required information quickly and accurately, and obtaining useful information from it, has therefore become an important technology. Compared with traditional search engines, the search engine developed in this project is targeted and frequently updated. It can crawl data in real time, so that the data users obtain is always the latest.

This project crawls network data with the Scrapy framework in Python on the Windows platform, and stores the crawled data both locally and in Redis for distributed storage. The popular search engine Elasticsearch is used for indexing and data access, and a search website is then built quickly with the Django framework. The thesis explains how search queries are exchanged between Django and Elasticsearch, and finally how Scrapy is deployed online through Scrapyd. Users can then search and query information in a targeted manner. A series of experiments verifies that the crawled data is stored correctly both locally and in Redis, demonstrating the advantages of incremental crawling compared with traditional search engines.

Keywords: Scrapy, search engine, incremental, Django framework, crawler
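The abstract does not say how the crawler decides which pages are new, so the following is only a minimal sketch of one common way to make a crawl incremental: fingerprint each page and skip those already recorded. All names here (`fingerprint`, `IncrementalFilter`, `crawl_pass`) are hypothetical and are not taken from the thesis. A plain Python set stands in for the Redis set a deployed crawler would more likely use.

```python
import hashlib


def fingerprint(url, body):
    """Return a stable SHA-1 fingerprint of a page's URL and content."""
    digest = hashlib.sha1()
    digest.update(url.encode("utf-8"))
    digest.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    digest.update(body.encode("utf-8"))
    return digest.hexdigest()


class IncrementalFilter:
    """Tracks fingerprints of pages that have already been stored.

    Assumption: an in-memory set replaces Redis for this illustration.
    With Redis, the membership check and the insert could be combined
    into one set-add operation on a shared key.
    """

    def __init__(self):
        self._seen = set()

    def is_new(self, url, body):
        """Record the page and return True only the first time it is seen."""
        fp = fingerprint(url, body)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


def crawl_pass(pages, page_filter):
    """Return only the (url, body) pairs that are new or changed in this pass."""
    return [(url, body) for url, body in pages if page_filter.is_new(url, body)]
```

Because the fingerprint covers the content as well as the URL, a page whose body has changed since the last pass counts as new and is stored again, while an unchanged page is skipped.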
