SELECTION OF ONLINE NETWORK TRAFFIC DISCRIMINATORS FOR ON-THE-FLY TRAFFIC CLASSIFICATION
Several techniques exist for selecting the traffic features used in traffic classification. However, most studies ignore knowledge of the domain in which traffic analysis or classification is performed, and they do not consider that the information carried by networks is constantly in motion. This paper describes a selection process for online network traffic discriminators that yields 24 traffic features that can be processed on the fly, and proposes them as a base set of attributes for future domain-aware online analysis, processing, or classification. To select the features while avoiding the shortcomings mentioned above, we carried out three steps. The first step is a manual selection, based on context knowledge, of traffic attributes that can be obtained on the fly while the flow is still in progress. The second step analyzes the quality of the previously selected attributes to ensure that each one is relevant for traffic classification. The third step verifies the usefulness of the discriminators in online traffic classification by implementing several incremental learning algorithms.
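To illustrate what "obtained on the fly" can mean for a traffic attribute, the following is a minimal Python sketch, not the authors' implementation. It keeps per-flow statistics that are updated one packet at a time in constant memory, using Welford's online algorithm for the mean and variance of packet size. The class name `OnlineFlowStats`, its method names, and the specific attributes shown are hypothetical choices made for illustration; they are not claimed to be among the 24 discriminators selected in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class OnlineFlowStats:
    """Hypothetical per-flow statistics, updated one packet at a time."""

    packets: int = 0
    total_bytes: int = 0
    mean_size: float = 0.0
    m2: float = 0.0  # running sum of squared deviations (Welford)
    min_size: Optional[int] = None
    max_size: Optional[int] = None
    first_ts: Optional[float] = None
    last_ts: Optional[float] = None

    def update(self, size: int, timestamp: float) -> None:
        """Fold one packet (size in bytes, arrival time in seconds) into the state."""
        self.packets += 1
        self.total_bytes += size

        # Welford's update: no packet history is stored.
        delta = size - self.mean_size
        self.mean_size += delta / self.packets
        self.m2 += delta * (size - self.mean_size)

        self.min_size = size if self.min_size is None else min(self.min_size, size)
        self.max_size = size if self.max_size is None else max(self.max_size, size)

        if self.first_ts is None:
            self.first_ts = timestamp
        self.last_ts = timestamp

    @property
    def size_variance(self) -> float:
        """Population variance of packet size seen so far."""
        return self.m2 / self.packets if self.packets else 0.0

    @property
    def duration(self) -> float:
        """Seconds between the first and the most recent packet."""
        if self.first_ts is None or self.last_ts is None:
            return 0.0
        return self.last_ts - self.first_ts

    def as_features(self) -> Dict[str, float]:
        """Snapshot of the current attributes, usable at any point in the flow."""
        return {
            "packets": float(self.packets),
            "total_bytes": float(self.total_bytes),
            "mean_size": self.mean_size,
            "size_variance": self.size_variance,
            "min_size": float(self.min_size or 0),
            "max_size": float(self.max_size or 0),
            "duration": self.duration,
        }


# Usage: feed packets as they arrive and read the features at any time.
stats = OnlineFlowStats()
for size, ts in [(100, 0.0), (200, 0.5), (300, 1.5)]:
    stats.update(size, ts)
features = stats.as_features()
```

Because each attribute depends only on a small running state, a snapshot from `as_features()` could be passed to an incremental learner after every packet instead of waiting for the flow to end. Attributes that need the complete flow, such as a median over all packets, do not fit this pattern without approximation.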
Copyright (c) 2020 Revista Ingenierías Universidad de Medellín
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.