Research on the Model of Corporate Dishonesty Recognition Based on Data Mining

Beijing Institute of Technology, Zhuhai: Undergraduate Thesis, Class of 2020

Research on the Model of Corporate Dishonesty Recognition Based on Data Mining

Abstract

How to detect whether an enterprise has broken the law has become a pressing problem in the era of big data. This paper uses data mining methods to predict whether an enterprise is dishonest. First, we use Python's pandas package to compute the missing-value rate of each indicator and remove the indicators whose missing rate exceeds 30%. We then remove a further 19 indicators that have no impact on enterprise dishonesty, which leaves 12 indicators. The remaining missing values are filled using the DMwR package in R, which is based on the KNN algorithm. Second, we carry out a visual analysis of four indicators: enterprise type, registration authority, enterprise status and jurisdiction authority. Finally, a decision tree model, a random forest model and a gradient boosting decision tree (GBDT) model are selected to build the enterprise dishonesty identification model. The models are evaluated by accuracy, recall, the confusion matrix and the ROC curve. The prediction accuracies of the decision tree, random forest and gradient boosting decision tree are 90.9%, 92% and 92.57%, respectively. However, the AUC values of the three tree-based models lie between 0.51 and 0.61, so their predictive performance is not stable enough. We therefore also try an MLP model and find that its prediction accuracy is 92% and its AUC is stable at 0.89.

Keywords: Corporate discredit; Decision tree; Data mining; Receiver Operating Characteristic
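The abstract evaluates the models by accuracy, recall, the confusion matrix and the ROC curve, and reports accuracy and AUC figures that diverge sharply. The following is a minimal sketch of how these four quantities can be computed, written in plain Python so it runs without extra packages. The labels, scores, the 0.5 threshold and all function names are hypothetical illustrations and are not taken from the thesis, whose data and code are not shown in this excerpt.

```python
# Sketch of the evaluation metrics named in the abstract.
# All data below is made up for illustration; 1 = dishonest, 0 = honest.

def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def recall(y_true, y_pred):
    """Fraction of truly positive samples that were predicted positive."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def auc(y_true, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a random positive is scored above a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy sample: 8 honest enterprises followed by 2 dishonest ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 0.6, 0.3, 0.35, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

if __name__ == "__main__":
    print("confusion (tp, fp, fn, tn):", confusion_matrix(y_true, y_pred))
    print("accuracy:", accuracy(y_true, y_pred))
    print("recall:", recall(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
```

Accuracy and recall depend on the chosen threshold, while AUC depends only on how the scores rank positives against negatives, so the two kinds of metric need not agree. This is a general property of the metrics and not a diagnosis of the thesis's results.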