The Design and Implementation of a Website Crawler for Tencent Recruitment Information Based on Distributed Scrapy

Abstract: With the rapid growth of information technology, network data has become an increasingly important resource. How to search, extract, and analyze data quickly and efficiently is now a research hotspot, and the Scrapy framework can be used to design web crawlers for this purpose. This paper first introduces the use of Scrapy in Python, then analyzes the target website, designs the corresponding expressions to extract the required data, and finally saves the data to a file for persistent storage.

The project is a distributed crawler for Tencent recruitment data, built on Scrapy in Python. It is intended to provide data support for downstream applications such as job search and recommendation systems. Based on analysis of the web pages, json.load() is used to convert the downloaded page data into Python data for extraction. The system uses a Redis database for distribution and a MySQL database for data storage. In this way, a distributed crawler system for Tencent's recruitment and job data was designed and implemented.

Keywords: Crawler, Scrapy, Tencent recruitment
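The JSON conversion step described in the abstract can be illustrated with a minimal sketch. The response layout and the field names (`data`, `posts`, `title`, `location`, `category`) are assumptions made for illustration and are not taken from the actual Tencent recruitment site. The sketch uses `json.loads`, which parses a string, whereas `json.load` reads from a file-like object.

```python
import json

# Hypothetical response body; the real site's structure and field names may differ.
SAMPLE_BODY = """
{
  "data": {
    "posts": [
      {"title": "Backend Engineer", "location": "Shenzhen", "category": "Technology"},
      {"title": "Product Manager", "location": "Beijing", "category": "Product"}
    ]
  }
}
"""


def extract_jobs(body: str) -> list:
    """Parse a JSON response body and return one flat dict per job posting."""
    payload = json.loads(body)  # JSON text -> Python dicts and lists
    # Fall back to empty containers so a missing key yields no jobs, not an error.
    posts = payload.get("data", {}).get("posts", [])
    return [
        {
            "title": post.get("title", ""),
            "location": post.get("location", ""),
            "category": post.get("category", ""),
        }
        for post in posts
    ]


jobs = extract_jobs(SAMPLE_BODY)
for job in jobs:
    print(job["title"], "|", job["location"], "|", job["category"])
```

In a Scrapy spider, the same function could be applied to `response.text` inside the parse callback, with each returned dict yielded as an item for the storage pipeline.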