Selection of Online Network Traffic Discriminators for on-the-Fly Traffic Classification
Copyright (c) 2020 Revista Ingenierías Universidad de Medellín
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Several techniques exist for selecting a set of traffic features for traffic classification. However, most studies ignore the domain knowledge of the environment where traffic analysis or classification is performed, and they do not account for the fact that the information carried by networks is constantly in motion. This paper describes a process for selecting online network-traffic discriminators. We obtained 24 traffic features that can be processed on the fly and propose them as a base attribute set for future domain-aware online analysis, processing, or classification. To select the set of traffic discriminators while avoiding the shortcomings mentioned above, we carried out three steps. The first step is a manual selection, based on context knowledge, of traffic features that can be obtained on the fly from the flow. The second step analyzes the quality of the previously selected attributes to ensure that each one is relevant when performing traffic classification. In the third step, several incremental learning algorithms were implemented to verify the usefulness of these attributes in online traffic classification.
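To make the idea of a feature "obtained on the fly from the flow" concrete, the following minimal sketch updates a handful of per-flow statistics one packet at a time, in constant memory, using Welford's running-variance method. The class name `OnlineFlowStats`, its fields, and the features it reports are illustrative assumptions; they are not the 24 discriminators selected in the paper, which are not listed in this abstract.

```python
import math
from dataclasses import dataclass


@dataclass
class OnlineFlowStats:
    """Hypothetical per-flow statistics, updated one packet at a time."""

    packets: int = 0
    total_bytes: int = 0
    mean_size: float = 0.0
    m2: float = 0.0  # running sum of squared deviations (Welford)
    min_size: int = 0
    max_size: int = 0

    def update(self, size: int) -> None:
        """Fold one packet size (in bytes) into the statistics in O(1)."""
        self.packets += 1
        self.total_bytes += size
        if self.packets == 1:
            self.min_size = self.max_size = size
        else:
            self.min_size = min(self.min_size, size)
            self.max_size = max(self.max_size, size)
        # Welford's update: no packet history needs to be stored.
        delta = size - self.mean_size
        self.mean_size += delta / self.packets
        self.m2 += delta * (size - self.mean_size)

    def features(self) -> dict:
        """Return the current feature vector for this flow."""
        variance = self.m2 / self.packets if self.packets else 0.0
        return {
            "packets": self.packets,
            "total_bytes": self.total_bytes,
            "mean_size": self.mean_size,
            "std_size": math.sqrt(variance),  # population standard deviation
            "min_size": self.min_size,
            "max_size": self.max_size,
        }


# Usage: feed packet sizes as they arrive; features are available at any time.
flow = OnlineFlowStats()
for packet_size in (60, 1500, 60, 1500):
    flow.update(packet_size)
current = flow.features()
```

Because each update touches only a few counters, a feature vector like `current` could be handed to an incremental learner after every packet, or after the first few packets of a flow, without waiting for the flow to end.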