Investigating the effect of traffic sampling on machine learning-based network intrusion detection approaches

Title
Investigating the effect of traffic sampling on machine learning-based network intrusion detection approaches
Authors
Alikhanov J.; Jang R.; Abuhamad M.; Mohaisen D.; Nyang D.; Noh Y.
Ewha Authors
양대헌 (Nyang D.)
SCOPUS Author ID
양대헌
Issue Date
2022
Journal Title
IEEE Access
ISSN
2169-3536
Citation
IEEE Access vol. 10, pp. 5801 - 5823
Keywords
CNN; Deep learning; Flow information export; Intrusion detection; Machine learning; Network traffic sampling
Publisher
Institute of Electrical and Electronics Engineers Inc.
Indexed
SCIE; SCOPUS
Document Type
Article
Abstract
Machine Learning (ML) based Network Intrusion Detection Systems (NIDSs) operate on flow features obtained from flow-exporting protocols (e.g., NetFlow). Recent successful ML and Deep Learning (DL) based NIDS solutions assume that such flow information (e.g., average packet size) is obtained from all packets of the flow. In practice, however, the flow exporter is often deployed on commodity devices where packet sampling is inevitable. As a result, the applicability of such ML-based NIDS solutions in the presence of sampling (i.e., when flow information is obtained from a sampled set of packets instead of the full traffic) is an open question. In this study, we explore the impact of packet sampling on the performance and efficiency of ML-based NIDSs. Unlike previous work, our proposed evaluation procedure is immune to different settings of the flow export stage. Hence, it can provide a robust evaluation of NIDSs even in the presence of sampling. Through sampling experiments, we established that malicious flows with shorter size (i.e., number of packets) are likely to go unnoticed even with mild sampling rates such as 1/10 and 1/100. Next, using the proposed evaluation procedure, we investigated the impact of various sampling techniques on NIDS detection rate and false alarm rate. Detection rate and false alarm rate are computed for three sampling rates (i.e., 1/10, 1/100, 1/1000), four different sampling techniques, and three classifiers (two tree-based, one deep learning based). Experimental results show that the systematic linear sampler SketchFlow performs better than non-linear samplers such as Sketch Guided Sampling and Fast Filtered Sampling. We also found that the random forest classifier with SketchFlow sampling performed better than other sampler-classifier combinations, showing a higher detection rate and a lower false alarm rate across multiple sampling rates. Our results are consistent across multiple sampling rates; the exception is Sketch Guided Sampling (SGS), which caused a drastic performance drop when the sampling rate was changed from 1/100 to 1/1000. Our results provide valuable insights for network practitioners and researchers into how packet sampling affects ML-based NIDS performance.
© This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
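As an illustration of why short flows are easy to miss, the sketch below computes the chance that a flow contributes no sampled packet at all. It assumes simple independent random sampling at a fixed rate, which is not one of the samplers evaluated in the article (SketchFlow, Sketch Guided Sampling and Fast Filtered Sampling behave differently). The function names are hypothetical, and the detection rate and false alarm rate formulas are the conventional confusion-matrix definitions, which may differ in detail from the article's.

```python
def miss_probability(flow_packets: int, sampling_rate: float) -> float:
    """Chance that no packet of a flow is picked.

    Assumption: each packet is sampled independently with probability
    sampling_rate (uniform random sampling, not the article's samplers).
    """
    return (1.0 - sampling_rate) ** flow_packets


def detection_rate(tp: int, fn: int) -> float:
    """Conventional definition: detected attacks / all attacks."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_alarm_rate(fp: int, tn: int) -> float:
    """Conventional definition: benign flows flagged / all benign flows."""
    return fp / (fp + tn) if (fp + tn) else 0.0


if __name__ == "__main__":
    # Hypothetical flow sizes, chosen only to contrast short and long flows.
    for rate in (1 / 10, 1 / 100, 1 / 1000):
        for size in (5, 50, 500):
            p = miss_probability(size, rate)
            print(f"rate={rate:g} packets={size} P(miss)={p:.3f}")
```

Under this simplified model, a 5-packet flow is missed entirely more often than not at a rate of 1/10, whereas a 500-packet flow is almost always represented by at least one packet at that rate.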
DOI
10.1109/ACCESS.2021.3137318
Appears in Collections:
College of Artificial Intelligence > Department of Cyber Security > Journal papers