Design and Implementation of a Python-Based News Crawler System

ABSTRACT

Driven by Internet technology, online news has become one of people's everyday concerns. Internet news spreads rapidly, has a wide range of influence, and reaches a broad audience, but some online news is fabricated or of poor quality. Because the quality of online news varies, users do not always get the reading experience they should. Collecting real, accurate, and structured online news data has therefore become a focus of research.

When network information is used to evaluate content resources, the main task is to obtain network data. To obtain more comprehensive and complete network data, this paper designs a data collection method that differs from those used for traditional Internet and mobile Internet information. The system is a crawler implemented in Python, a language whose ranking among programming languages is currently rising and which has good development prospects. The experiment mainly crawled and visualized data from the industry channel of China News, with the technology industry as its theme. Web crawlers capture traditional Internet data according to rules. To adapt crawlers to varied website structures and to overcome the limitations of individual sites, a more general and more scalable web news crawling method is designed and implemented.

Key words: Internet News; Scrapy; General Crawler