Abstract

This thesis first introduces the research background and value of DGA domain name detection, along with the basic definition and characteristics of DGA domain names. It then pairs current mainstream learning algorithms (XGBoost, Naive Bayes, Multilayer Perceptron, and Recurrent Neural Network) with several feature extraction methods (the N-gram model, a statistical domain name feature model, and a character sequence model) and evaluates each pairing experimentally. The results are compared and analyzed to identify the better combinations of feature extraction method and algorithm. In these experiments, the Multilayer Perceptron with the 2-gram feature model performs best at DGA domain name detection.

Although mainstream detection methods achieve good results on DGA domain names, several major problems remain: detection capability still has room for improvement, evolutionary training data is lacking, and the detection models themselves are not secured against attack. Building on the Multilayer Perceptron with the 2-gram feature model, this thesis compares the important hyperparameters of that combination to obtain a model with higher detection ability. Finally, to address the lack of evolutionary training data and the security of the detection model itself, this thesis proposes an improved training set, produced by using an improved WGAN character-level domain name generator to generate adversarial domain names. The adversarial domain names generated by this method are more in line with human naming habits than those generated by traditional GAN models. In turn, adding training sets that contain these adversarial samples raises the model's detection rate on unknown domain names, thereby strengthening the model's own defense capabilities.

Key words: DGA; Machine Learning; Deep Learning; WGAN
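
As a rough illustration of the 2-gram feature model named above, the sketch below turns a domain name into a vector of character 2-gram counts, the kind of input a Multilayer Perceptron could be trained on. It is a minimal sketch under stated assumptions, not the thesis's implementation: the helper names (`char_bigrams`, `build_vocabulary`, `vectorize`), the choice to use only the leftmost label of the domain, and the use of raw counts are all hypothetical.

```python
from collections import Counter


def char_bigrams(domain: str) -> list[str]:
    """Return the overlapping character 2-grams of a domain's leftmost label.

    Assumption: only the leftmost label is used (e.g. "google" in "google.com").
    """
    label = domain.lower().split(".")[0]
    return [label[i:i + 2] for i in range(len(label) - 1)]


def build_vocabulary(domains: list[str]) -> dict[str, int]:
    """Map every 2-gram seen in the given domains to a column index."""
    grams = sorted({g for d in domains for g in char_bigrams(d)})
    return {g: i for i, g in enumerate(grams)}


def vectorize(domain: str, vocab: dict[str, int]) -> list[int]:
    """Build a 2-gram count vector; 2-grams outside the vocabulary are ignored."""
    vec = [0] * len(vocab)
    for gram, count in Counter(char_bigrams(domain)).items():
        if gram in vocab:
            vec[vocab[gram]] = count
    return vec


# Toy data for illustration only: one benign-looking and one random-looking name.
vocab = build_vocabulary(["google.com", "xkqzvw.net"])
features = vectorize("gogo.org", vocab)
```

Vectors built this way, one per domain and labeled benign or DGA, would form the training matrix for the classifier. In practice the counts are often normalized or weighted before training, which this sketch omits.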