Methods for Low Footprint Intrusion Detection Using Ensemble Learning

Shafieian, Saeed
Intrusion Detection, Ensemble Learning, Machine Learning, Anomaly Detection, Low Footprint Intrusion
Machine learning has rapidly become the state-of-the-art solution to problems in many areas of computing, such as vision and natural language processing. In the intrusion detection domain, machine learning-based techniques have also been used in academia and industry to detect anomalies in network traffic. Unlike in some other domains, however, there are practical limitations to using machine learning techniques in real-world intrusion detection systems. In this thesis, we present methods for low footprint intrusion detection using ensemble learning. We identify the cloud attributes that can be exploited to exacerbate intrusions on the cloud. We define low footprint intrusions as specific attacks that do not transfer volumetric data to or from a target machine and may be exacerbated by the cloud. Because they are stealthier than volumetric attacks, low footprint intrusions can go under the radar of traditional intrusion detection systems. This research analyzes different methods of ensemble learning and presents ensemble models that achieve very high accuracy and very low error rates in detecting low footprint intrusions. We show that these models combine base machine learning classifiers that individually do not perform on par with the ensemble learners. However, by bringing more diversity, the base learners enable the ensemble model to achieve high performance. This research shows that among hundreds of ensemble models built from a number of base learners, only a few multi-layer stacking ensemble models satisfy strict classification performance criteria. This is achieved by carefully crafting the ensemble models, considering different weights, the choice of base and meta learners, hyperparameters, placement of learners, combination methods, and architectures. We simulate and launch low footprint intrusions from virtual machines on Amazon Web Services (AWS). We show that low footprint intrusions can be easily launched from public clouds against targets outside of the cloud. We have implemented our data processing, machine learning models, and evaluation techniques using open-source machine learning libraries in Java (Weka) and Python (scikit-learn).
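For readers unfamiliar with the technique, the following is a minimal sketch of a two-layer stacking ensemble in scikit-learn. It is illustrative only and does not reproduce the models from the thesis: the choice of base and meta learners, the hyperparameters, the helper names `build_stacking_model` and `run_demo`, and the synthetic imbalanced dataset standing in for labeled network traffic are all assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(random_state=0):
    """Build a two-layer stacking ensemble (hypothetical learner choices)."""
    # Layer 1: diverse base learners, each possibly weak on its own.
    layer_one = [
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=random_state)),
        ("nb", GaussianNB()),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    # Layer 2: a second stacking stage trained on the layer-1 predictions.
    layer_two = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=25, random_state=random_state)),
            ("lr", LogisticRegression(max_iter=1000)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),  # meta learner
        cv=3,
    )
    # Nesting a StackingClassifier as the final estimator yields multiple layers.
    return StackingClassifier(estimators=layer_one, final_estimator=layer_two, cv=3)


def run_demo(random_state=0):
    """Fit the ensemble on synthetic data and return (model, test accuracy)."""
    # Synthetic stand-in for labeled traffic: about 10% "intrusion" samples.
    X, y = make_classification(
        n_samples=600,
        n_features=20,
        n_informative=8,
        weights=[0.9, 0.1],
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )
    model = build_stacking_model(random_state)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


if __name__ == "__main__":
    _, accuracy = run_demo()
    print(f"Held-out accuracy: {accuracy:.3f}")
```

Accuracy alone can be misleading on imbalanced intrusion data, so a fuller evaluation would also report error rates such as false positives and false negatives.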