Publication Type : Journal Article
Publisher : Journal of Cyber Security and Mobility
Source : Journal of Cyber Security and Mobility, River Publishers, Volume 8, Number 2, p.189-240 (2018)
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Computer Science, Electronics and Communication
Year : 2018
Abstract : A computer virus or malware is a computer program, but with the purpose of causing harm to the system. This year has witnessed the rise of malware and the loss caused by them is high. Cyber criminals have continually advancing their methods of attack. The existing methodologies to detect the existence of such malicious programs and to prevent them from executing are static, dynamic and hybrid analysis. These approaches are adopted by anti-malware products. The conventional methods of were only efficient till a certain extent. They are incompetent in labeling the malware because of the time taken to reverse engineer the malware to generate a signature. When the signature becomes available, there is a high chance that a significant amount of damage might have occurred. However, there is a chance of detecting the malicious activities quickly by analyzing the events of DNS logs, Emails, and URLs. As these unstructured raw data contains rich source of information, we explore how the large volume of data can be leveraged to create cyber intelligent situational awareness to mitigate advanced cyber threats. Deep learning is a machine learning technique largely used by researchers in recent days. It avoids feature engineering which served as a critical step for conventional machine learning algorithms. It can be used along with the existing automation methods such as rule and heuristics based and machine learning techniques. This work takes the advantage of deep learning architectures to classify and correlate malicious activities that are perceived from the various sources such as DNS, Email, and URLs. Unlike conventional machine learning approaches, deep learning architectures don't follow any feature engineering and feature representation methods. They can extract optimal features by themselves. Still, additional domain level features can be defined for deep learning methods in NLP tasks to enhance the performance. The cyber security events considered in this study are surrounded by texts. To convert text to real valued vectors, various natural language processing and text mining methods are incorporated. To our knowledge, this is the first attempt, a framework that can analyze and correlate the events of DNS, Email, andURLsat scale to provide situational awareness against malicious activities. The developed framework is highly scalable and capable of detecting the malicious activities in near real time. Moreover, the framework can be easily extended to handle large volume of other cyber security events by adding additional resources. These characteristics have made the proposed framework stand out from any other system of similar kind. © 2018 the Author(s).
Cite this Research Publication : R. Vinayakumar, Dr. Soman K. P., Poornachandran, P., Mohan, V. S., and Kumar, A. D., “ScaleNet: Scalable and Hybrid Framework for Cyber Threat Situational Awareness based on DNS, URL, and Email data Analysis”, Journal of Cyber Security and Mobility, vol. 8, pp. 189-240, 2018.