INTERNET TRAFFIC ANALYSIS: MAPREDUCE BASED TRAFFIC FLOW CLASSIFICATION IN HADOOP ENVIRONMENT
Abstract
The Internet is the global network that interconnects entities all over the world, and it has become an essential part of everyday life. In recent years, Internet traffic has grown as the number of flows has increased. This growing traffic is increasingly flooded with DDoS flows from multiple attackers. When DDoS flows enter the network, resource utilization rises drastically and legitimate traffic does not receive proper service. To address these issues, this paper proposes an approach that classifies Internet traffic as either normal traffic flow or DDoS traffic flow. A huge volume of traffic flows is analyzed and the results are presented. MapReduce is used for the classification: flow features are mapped and then reduced into the appropriate traffic type. Each incoming flow is classified into one of three categories: Web Traffic, DDoS Traffic (Heavy User) or DDoS Traffic (Spoofed IP). The main objective of this paper is to classify both structured and unstructured data from IP, TCP, HTTP and NetFlow analysis. The experiments were carried out in a Hadoop 2.7.2 environment. The dataset is obtained from Wireshark and consists of traffic flows based on the latest traffic patterns. The Hadoop Distributed File System (HDFS) and MapReduce components of Hadoop are evaluated using three metrics: Work Completion Time, Throughput and Accuracy.
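The map-and-reduce classification described in the abstract can be illustrated with a minimal, in-memory Python sketch. It is not the authors' implementation and does not use Hadoop. The record layout (source IP plus a handshake-completed flag), the `HEAVY_USER_THRESHOLD` value, and the decision rules are all assumptions made for illustration; the abstract does not state which flow features or thresholds the paper uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical flow record: (source IP, whether the TCP handshake completed).
FlowRecord = Tuple[str, bool]
# Value emitted by the mapper: (flow count, completed-handshake count).
Counts = Tuple[int, int]

# Assumed cutoff for a "heavy user"; not taken from the paper.
HEAVY_USER_THRESHOLD = 100


def map_flow(record: FlowRecord) -> Iterator[Tuple[str, Counts]]:
    """Map step: emit one (source IP, counts) pair per flow record."""
    src_ip, handshake_done = record
    yield src_ip, (1, 1 if handshake_done else 0)


def shuffle(pairs: Iterable[Tuple[str, Counts]]) -> Dict[str, List[Counts]]:
    """Group mapper output by key, as the framework does between phases."""
    grouped: Dict[str, List[Counts]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_flows(src_ip: str, values: List[Counts],
                 threshold: int = HEAVY_USER_THRESHOLD) -> Tuple[str, str]:
    """Reduce step: aggregate counts for one source and assign a traffic type."""
    total = sum(v[0] for v in values)
    completed = sum(v[1] for v in values)
    if completed == 0:
        # Assumed rule: a source that never completes a handshake is treated as spoofed.
        label = "DDoS Traffic (Spoofed IP)"
    elif total > threshold:
        # Assumed rule: a source exceeding the flow threshold is a heavy user.
        label = "DDoS Traffic (Heavy User)"
    else:
        label = "Web Traffic"
    return src_ip, label


def classify(records: Iterable[FlowRecord],
             threshold: int = HEAVY_USER_THRESHOLD) -> Dict[str, str]:
    """Run map, shuffle and reduce in memory and return a label per source IP."""
    pairs = (pair for record in records for pair in map_flow(record))
    grouped = shuffle(pairs)
    return dict(reduce_flows(ip, vals, threshold) for ip, vals in grouped.items())
```

In a real Hadoop job the map and reduce functions would run as separate distributed tasks over data stored in HDFS, and the shuffle would be handled by the framework; here `shuffle` stands in for that step so the logic can be followed end to end.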