Back close

Improved DGA Domain Names Detection and Categorization Using Deep Learning Architectures with Classical Machine Learning Algorithms

Publication Type : Book Chapter

Publisher : Springer International Publishing, Cham

Source : Cybersecurity and Secure Information Systems: Challenges and Solutions in Smart Environments, Springer International Publishing, Cham, p.161–192 (2019)

Url :

ISBN : 9783030168377

Keywords : Botnet, Classical machine learning, Deep learning, Domain generation algorithms (DGAs), Domain Name System (DNS)

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Electronics and Communication

Year : 2019

Abstract : Recent families of malware have largely adopted domain generation algorithms (DGAs). This is primarily due to the fact that the DGA can generate a large number of domain names after that utilization a little subset for real command and control (C&C) server communication. DNS blacklist based on blacklisting and sink-holing is the most commonly used approach to block DGA C&C traffic. This is a daunting task because the network admin has to continuously update the DNS blacklist to control the constant updating behaviors of DGA. Another significant direction is to predict the domain name as DGA generated by intercepting the DNS queries in DNS traffic. Most of the existing methods are based on identifying groupings based on clustering, statistical properties are estimated for groupings and classification is done using statistical tests. This approach takes larger time-window and moreover can't be used in real-time DGA domain detection. Additionally, these techniques use passive DNS and NXDomain information. Integration of all these various information charges high-cost and in some case is highly impossible to obtain all these information because of real-time constraints. Detecting DGA on per domain basis is an alternative approach which requires no additional information. The existing methods on detecting DGA per domain basis is based on machine learning. This approach relies on feature engineering which is a time-consuming process and can be easily circumvented by malware authors. In recent days, the application of deep learning is leveraged for DGA detection on per domain basis. This requires no feature engineering and easily can't be circumvented. In all the existing studies of DGA detection, the deep learning architectures performed well in comparison to the classical machine learning algorithms (CMLAs). Following, in this chapter we propose a deep learning based framework named as I-DGA-DC-Net, which composed of Domain name similarity checker and Domain name statistical analyzer modules. The Domain name similarity checker uses deep learning architecture and compared with the classical string comparison methods. These experiments are run on the publically available data set. Following, the domains which are not detected by similar are passed into statistical analyzer. This takes the raw domain names as input and captures the optimal features implicitly by passing into character level embedding followed by deep learning layers and classify them using the CMLAs. Moreover, the effectiveness of the CMLAs are studied for categorizing algorithmically generated malware to its corresponding malware family over fully connected layer with \$\$\backslashtextit{softmax}\$\$non-linear activation function using AmritaDGA data set. All experiments related deep learning architectures are run till 100 epochs with learning rate 0.01. The experiments with deep learning architectures-CMLs showed highest test accuracy in comparison to deep learning architectures-\$\$\backslashtextit{softmax}\$\$model. This is due to the reason that the deep learning architectures are good at obtaining high level features and SVM good at constructing decision surfaces from optimal features. SVM generally can't learn complicated abstract and invariant features whereas the hidden layers in deep learning architectures facilitate to capture them.

Cite this Research Publication : R. Vinayakumar, Dr. Soman K. P., Prabaharan Poornachandran, Akarsh, S., and Elhoseny, M., “Improved DGA Domain Names Detection and Categorization Using Deep Learning Architectures with Classical Machine Learning Algorithms”, in Cybersecurity and Secure Information Systems: Challenges and Solutions in Smart Environments, A. Ella Hassanien and Elhoseny, M., Eds. Cham: Springer International Publishing, 2019, pp. 161–192.

Admissions Apply Now