A High Performance Model with Combinational AI approach for Identifying Cascading Connection of Malicious Activities

Published: 20th DEC 2022

1. Introduction

Data security has become a top priority on the internet in recent years. Intruders break into a system to gain unauthorised access to it or to obtain information from the network. An incursion may be an attack, a hacking attempt, a packet sniffing attempt, or a data halt. Attacks are intended to compromise the privacy of systems or networks in order to extort money, obtain vital records, or serve other malevolent purposes (Kaloudi & Li, 2020). An intrusion alters a computer's programs, data, or logic by introducing malicious code, which causes several problems, including the exposure of an institution's sensitive data to cybercriminals.

Cyber-attacks include data hacking, denial of service, malware, phishing, and theft (Procopiou & Komninos, 2019). The number of cyber attackers and illegal actions is increasing worldwide, and cyber security defenders face growing threats from them. These attacks are likely to have a vast and significant impact on human life, necessitating the implementation of security measures, and an intrusion detection system (IDS) can help with these efforts. Intrusion detection is accomplished by collecting data packets, processing them, and identifying any unwanted, suspicious, or malicious activity in the traffic so that the administrator can be notified (Khan et al., 2020).

Signature-based and anomaly-based detection approaches are commonly utilised. Types of IDS include network intrusion detection systems, host-based intrusion detection systems, perimeter intrusion detection systems, and virtual machine intrusion detection systems. Intrusion detection helps prevent the loss of data, information, and other assets due to an attack.

However, from an application standpoint, most of these works still have shortcomings (Porambage et al., 2019). To detect and identify attackers in a network, many academics have recommended machine learning (ML) techniques, including Convolutional Neural Networks (CNN), Naive Bayes (NB), Support Vector Machines (SVM) and Long Short-Term Memory (LSTM). The classical machine learning methods among these have a higher computational cost and are shallow learners that do not learn deep representations of their data. They also raise false alarms, that is, they send out warnings that are not entirely accurate. Unlike traditional machine learning, the current approach, known as deep learning, has demonstrated state-of-the-art performance on various problems, including intrusion detection (Sarker et al., 2021).

Deep learning offers automated deep feature extraction techniques. It provides a more accurate data representation, allowing for more advanced models. In current intrusion detection research, the recurrent neural network (RNN) has become one of the most extensively used deep learning algorithms for performing classification and other assessments on data sequences (Truong et al., 2020). The RNN is also a valuable strategy for achieving excellent results in sequential learning and improving anomaly detection in a network system. In addition, this study discusses the use of a bidirectional Long Short-Term Memory based RNN model, known as BiDLSTM, for network intrusion detection.

2. Intrusion Detection System

An Intrusion Detection System (IDS) is a cyber-security application that operates behind a firewall, antivirus software, and data loss prevention (DLP). Firewalls can perform only limited analysis of online traffic. An IDS, on the other hand, can track and analyse all network traffic. When a particular event or cyber-attack occurs in the system, an IDS can raise alarms and warn system administrators or information security professionals (Edamadaka et al., 2020).

In host-based systems, the software on the user's computer and the user's actions are reviewed to detect an attack. Depending on the detection approach, an IDS can be signature-based or anomaly-based. Signature-based systems keep a database of known attacks; if observed network activity matches a stored signature, the system raises an alert and identifies a cyberattack (Hasan et al., 2019). However, if the system is not properly modelled, the number of false-positive alerts can be extremely high (Gopalan et al., 2021). The types of intrusion detection systems are presented in Figure 1.

Figure 1: Types of intrusion detection systems
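
The signature-based approach described above can be illustrated with a minimal sketch. It assumes that a signature is simply a byte pattern searched for in a packet payload; the `Signature` class, the two example rules, and the function names are hypothetical and are not taken from any particular IDS product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str       # label reported in the alert
    pattern: bytes  # byte sequence that identifies a known attack


# Hypothetical signature database; a real IDS holds many thousands of rules.
SIGNATURES = [
    Signature("sql-injection", b"' OR '1'='1"),
    Signature("path-traversal", b"../../"),
]


def match_signatures(payload, signatures=SIGNATURES):
    """Return the names of all known signatures found in the payload."""
    return [s.name for s in signatures if s.pattern in payload]


def inspect(payload):
    """Raise an alert string if any signature matches, otherwise report 'ok'."""
    hits = match_signatures(payload)
    return "ALERT: " + ", ".join(hits) if hits else "ok"


print(inspect(b"GET /index.html"))        # no signature matches
print(inspect(b"GET /../../etc/passwd"))  # matches the path-traversal rule
```

A matcher of this kind can only recognise attacks that are already in its database, which is the usual motivation for pairing it with anomaly-based detection.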

3. Anomaly detection with LSTM networks

Anomalies occur in various domains and are thus the topic of extensive investigation in multiple sectors, including networks, the internet of things, medicine, and manufacturing processes. Common to all disciplines is the understanding of an anomaly as a deviation from the rule or an irregularity that is not deemed part of normal system behaviour. This is consistent with definitions that describe anomalies as abnormalities, deviants, or outliers. Anomalous dynamics are mostly unknown and arise unexpectedly, leading to instabilities and, as a result, increased inefficiencies and system defects (Yao et al., 2021).

Mohanty and Vyas (2018) explained that the structure of a feed-forward neural network is the foundation for deep learning models. The typical components of a deep learning model are the input layer, one or more hidden layers, and the output layer. Hidden layers are used to infer features. A feature vector in the input layer represents the item to be classified. The output layer generates the class vector associated with the input vector (Devi & Mohankumar, 2019). The LSTM approach minimises the cost function and completes the learning process by using backpropagation to change the weight values. The system starts with an input vector and initial weights, and the error is calculated from the difference between the actual output and the desired result. The error is then reduced by using backpropagation to adjust the weights (Abdulqadder et al., 2020).
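
The weight-update loop described above can be sketched on the smallest possible case. The example below is an illustration only: it trains a single sigmoid neuron rather than an LSTM, and the squared-error cost, learning rate, starting weights, and input values are all assumptions chosen for readability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(w, b, x, target, lr=0.5):
    """One forward pass and one backpropagation update for a single sigmoid neuron."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = sigmoid(z)
    error = 0.5 * (y - target) ** 2
    # Chain rule: dE/dz = (y - target) * sigmoid'(z), where sigmoid'(z) = y * (1 - y)
    delta = (y - target) * y * (1.0 - y)
    new_w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    new_b = b - lr * delta
    return new_w, new_b, error


# Assumed starting weights, input vector, and desired output.
w, b = [0.1, -0.2], 0.0
x, target = [1.0, 0.5], 1.0

errors = []
for _ in range(50):
    w, b, err = train_step(w, b, x, target)
    errors.append(err)

print(f"first error {errors[0]:.4f}, last error {errors[-1]:.4f}")
```

Each step computes the error from the difference between the output and the desired result, then moves the weights against the gradient, so the recorded error shrinks over the iterations.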

3.1 Convolutional Neural Networks

Tabassum et al. (2019) discussed network data captured in real time and suggested a CNN-based layer to separate anomalies from normal data. The method entails using the data directly to learn about the changes that anomalous data introduce. The critical problem was that normal and abnormal flows are not very dissimilar; thus, the CNN had to be driven to discover anomalies and changes.

Powell (2020) also suggested that, to capture these changes, the proposed layer should learn prediction error filters exclusively. The feature maps are then linked to prediction error fields, which are used as low-level traces of abnormality. As a result, while CNNs have the potential to improve object detection by learning visual information, they are not currently well suited to anomaly detection. In a CNN, the content of normal input flows is easily learned, but anomalies show only minor variations from this normal data, so a dedicated algorithm is needed to detect these minor differences. IDS researchers are therefore investigating whether CNNs can learn to see these unusual patterns as well as typical content properties. However, the accuracy and efficiency of anomaly detection do not yet meet users' expectations.

3.2 Naïve Bayes based approach

Jabbar et al. (2017) proposed a novel method for identifying an intrusion in the network and compared it with other existing methods using Naïve Bayes. The NB algorithm assumes that attributes are independent and is very sensitive to the selection of many features, which can lower its performance and accuracy. In practice, however, the attributes are often interrelated.

Mehri et al. (2018) also proposed a method using NB, which is widely used in network intrusion detection systems (IDS) because it can improve the efficiency and accuracy of network intrusion detection and achieve low false positive rates. However, NB is weak because it assumes independent attributes and is very sensitive to a large selection of features, which interferes with performance. Because NB accuracy can be low, research is needed to improve its performance. The accuracy achieved with this method is 90.51%, and the false alarm rate is 0.14. The presented method must therefore be validated with other datasets for better accuracy.
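
The independence assumption discussed in this section can be made concrete with a small sketch. It is a generic categorical Naïve Bayes classifier with Laplace smoothing, not the method of either cited study; the function names and the six toy records (protocol and connection flag) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-feature value frequencies for each class."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[label][i][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict_nb(model, row):
    """Return the class with the highest log-posterior under the independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # log P(label) + sum over features of log P(feature_i | label)
        score = math.log(n / total)
        for i, v in enumerate(row):
            count = feature_counts[label][i][v]
            score += math.log((count + 1) / (n + len(values[i])))  # Laplace smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (protocol, connection flag) -> label
rows = [("tcp", "SF"), ("tcp", "SF"), ("udp", "SF"),
        ("tcp", "S0"), ("tcp", "S0"), ("icmp", "S0")]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]

model = train_nb(rows, labels)
print(predict_nb(model, ("tcp", "SF")))
print(predict_nb(model, ("icmp", "S0")))
```

Because the per-feature probabilities are simply multiplied, correlated features are counted as if they were separate evidence, which is one way the weakness noted above shows up in practice.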

3.3 SVM-based approach

Xu et al. (2018) discussed supervised approaches, which work with a fully labelled dataset and use it to train the SVM. Although this method is the most accurate at detecting intrusions, finding a completely labelled dataset in practical security applications is difficult, if not impossible. In some schemes, the SVM parameters are tuned with metaheuristic algorithms using the training data of benchmark datasets to improve the accuracy and efficacy of the IDS. Some SVM-based systems use clustering methods in conjunction with the SVM to handle massive datasets and improve IDS performance. Furthermore, several IDS systems combine SVM with additional classifiers such as artificial neural networks, decision trees, and naive Bayes to improve detection rates. Many IDSs use an SVM to categorise computer network traffic as normal or abnormal (Alqahtani et al., 2020). However, the limitation is that the classification performance can still be improved.

3.4 LSTM-based approach

Ding and Zhai (2018) utilised RNNs with long short-term memory (LSTM), a special kind of RNN. LSTM was developed as an improvement after the RNN was introduced. Although the RNN is utilised in a variety of domains, including language processing and word prediction, it is a framework with flaws: it is not very good at retaining information over the long term. Gates in the LSTM can add and remove information (Chen & Subramanian, 2018). These three gates are called the input gate, the output gate and the forget gate.

The primary purpose of the LSTM is to overcome the vanishing gradient problem, which arises during gradient descent, the optimization process for determining artificial neural network weights, and thereby to minimise long-term dependency issues. This study used the LSTM to detect anomalies. The LSTM is implemented using a high-level neural networks API written in Python that can run on top of TensorFlow. The trained LSTM model has a ten-neuron input layer corresponding to the ten features, a six-neuron hidden layer, and a five-neuron output layer (Hossain et al., 2020). In addition, Long Short-Term Memory (LSTM) is a recurrent neural network that learns long-term dependencies using a gating mechanism.

It solves the vanishing gradient issue that occurs during the training of standard RNNs. LSTM models use multiple gates that allow them to skip units and, as a result, remember over longer time spans. The LSTM architecture features memory units called cells that accept the current and past states as input (Chawla et al., 2019). These cells decide what to preserve in and delete from memory, then combine the present and past states to form the next input. In this way they can capture long-term dependencies. Because of this advantage over ordinary RNNs, LSTMs have recently received much attention.

Because of their ability to learn temporal relations and capture them in a low-dimensional state representation, LSTM networks are well suited to identifying contextual anomalies. These relations can involve stationary and non-stationary dynamics, as well as short-term and long-term dependencies. LSTM networks are particularly well suited to modelling multivariate time series and time-variant systems (Lindemann et al., 2021). As a result, anomaly detection can be based on the difference between the actual system outputs and the expected outputs predicted by the network. The ability to detect anomalies using LSTM-based techniques has been demonstrated, for example, by Malhotra et al. (2015), who present a stacked LSTM architecture for detecting anomalies in time series data.

In contrast to evaluating each time step separately, the novelty consists of evaluating multiple one-step-ahead prediction errors. LSTM networks improve detection accuracy by modelling stationary and non-stationary time dependencies in advance. As a result, effective detection of temporal anomaly structures is possible. Lee et al. (2020) proposed a real-time detection approach based on two LSTM networks: one models short-term characteristics and can detect single upcoming anomalous data points within a time series, and the other controls the detection based on long-term thresholds (Zenati et al., 2018).
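
The idea of detecting anomalies from the gap between actual and predicted outputs can be sketched without a trained network. In the example below, a three-point moving average stands in for the LSTM predictor, and the threshold rule (mean plus three standard deviations of the errors on an initial stretch assumed to be normal) is an assumption made for illustration; neither is taken from the cited studies.

```python
from statistics import mean, stdev


def predict_next(history):
    """Stand-in for a trained LSTM: predict the next value as the mean of the last three."""
    window = history[-3:]
    return sum(window) / len(window)


def detect_anomalies(series, warmup=3, k=3.0, calibration=10):
    """Flag indices whose one-step-ahead prediction error exceeds a threshold.

    The threshold is mean + k * std of the errors on the first `calibration`
    predictions, which are assumed to cover normal behaviour only.
    """
    steps = range(warmup, len(series))
    errors = [abs(series[t] - predict_next(series[:t])) for t in steps]
    calib = errors[:calibration]
    threshold = mean(calib) + k * stdev(calib)
    return [t for t, e in zip(steps, errors) if e > threshold]


# Hypothetical sensor-like series with one injected spike at index 14.
series = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.2, 10.1, 9.9, 10.0,
          10.1, 9.9, 10.0, 10.2, 25.0, 10.1, 9.9, 10.0]

flagged = detect_anomalies(series)
print(flagged)
```

With this simple predictor, the points immediately after the spike are also flagged, because the spike stays in the averaging window for three steps. A second, longer-term threshold of the kind described above is one way to control such follow-on alerts.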

Table 1: Overview of surveyed regular LSTM with the approaches

Althubiti et al. (2018) explained that there can be lags of unknown duration between important events in a time series, which is why LSTM networks are helpful for classifying, analysing, and making predictions based on such data. LSTMs were developed to address the vanishing gradient problem that can occur when training traditional RNNs. In many cases, LSTM outperforms RNNs, hidden Markov models, and other sequence learning algorithms due to its relative insensitivity to gap length. The cell structure of the LSTM has four stages, as seen in Figure 2. First, the cell state is shown by the uppermost horizontal line running through the cell. Second, the LSTM can either delete information from or add information to the cell state, which is carefully controlled by structures known as gates. Third, the cell has a chain-like design.

Step 1: The initial stage in the LSTM is to identify information that is not required and to decide what information will be discarded from the cell state. The sigmoid layer known as the "forget gate layer" is responsible for this decision.

$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$$

where $h_{t-1}$ is the output from the previous time step, $x_t$ is the new input, $w_f$ is the weight matrix of the forget gate, and $b_f$ is its bias.

Figure 2

Step 2: The next stage is to decide what new information to store in the cell state. There are two components to this: the first is a sigmoid layer called the "input gate layer," which determines which values will be updated, and the second is a tanh layer, which generates a vector of new candidate values that can be added to the state. The next step combines these two to update the state.

$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$$

$$\tilde{C}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c)$$

Step 3: It is time to update the old cell state $C_{t-1}$ to the new cell state $C_t$. The previous steps have already determined what should be done; all that is left is to carry it out. First, the old state is multiplied by $f_t$, forgetting what we previously decided to forget. Then the term $i_t * \tilde{C}_t$ is added to the cell state. This is the new set of candidate values, scaled by how much we decided to update each state value (Tariq et al., 2020).

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$

Step 4: The output stage is the last. The output is based on the cell state, but it is a filtered version of it. First, a sigmoid layer determines which parts of the cell state will be output. The cell state is then passed through tanh (to constrain the values to between -1 and 1) and multiplied by the output of the sigmoid gate, so that only the selected parts are output (Boukhalfa et al., 2020).

$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t * \tanh(C_t)$$
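
Steps 1 to 4 can be combined into a single time-step function. The sketch below uses plain Python lists so that each gate stays visible; the helper names, the parameter dictionary layout, and the all-zero weights in the usage example are assumptions made for illustration, and the cell state update uses the previous cell state $C_{t-1}$, as in the standard LSTM formulation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, v, b):
    """Compute w @ v + b for a weight matrix stored as a list of rows."""
    return [sum(wj * vj for wj, vj in zip(row, v)) + bi for row, bi in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; returns the new hidden state h_t and cell state c_t."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["w_f"], z, p["b_f"])]        # Step 1: forget gate
    i_t = [sigmoid(a) for a in affine(p["w_i"], z, p["b_i"])]        # Step 2: input gate
    c_tilde = [math.tanh(a) for a in affine(p["w_c"], z, p["b_c"])]  # Step 2: candidate values
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]  # Step 3
    o_t = [sigmoid(a) for a in affine(p["w_o"], z, p["b_o"])]        # Step 4: output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]               # Step 4: filtered output
    return h_t, c_t


# Usage with hypothetical sizes and all-zero parameters, so every gate equals 0.5.
hidden, inputs = 2, 1
params = {}
for gate in "fico":
    params[f"w_{gate}"] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
    params[f"b_{gate}"] = [0.0] * hidden

h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With zero weights the forget and input gates both evaluate to 0.5 and the candidate values to 0, so the new cell state is simply half of the old one. In a trained network these gates vary per unit and per time step.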

4. Summary

Several types of attacks can be identified among the malicious operations carried out by attackers, including Denial of Service (DoS), disclosure, manipulation, impersonation, and repudiation. The umbrella term "incursion" can be used to group these types. Intrusion detection systems (IDS) have been created to deal with the emergence of increasingly sophisticated incursions (Lee et al., 2018).

As the Internet of Things (IoT) progresses, so do the security concerns. The Internet of Things benefits society only when it is secure, and artificial intelligence can help provide that security. For detection, the approach employs both binary and multiclass classification. The multiclass categories are divided into four primary classes, all present in the NSL-KDD dataset. The need for very precise attack detection can be met by employing a deep learning method for intrusion detection (Idrissi et al., 2021). The vast capacity of recurrent neural networks (RNNs) to recognise complicated patterns in sequential data such as text and to generate similar patterns can be used for this purpose. For example, the LSTM RNN can handle cyber-attacks by ensuring that a constant error flow is maintained. This allows the RNN to learn over many time steps and to associate causes and their effects across long intervals. This feature allows it to retain details of attacks learned during training and make detection decisions based on this information (Mirza & Cosan, 2018).

The gates of an LSTM, unlike computer memory, are analogue rather than digital. Finally, we want to emphasise that while the applications of RNNs have been extensively examined and recognised in the machine learning literature, their ability to assist with intrusion detection should be investigated further (Kasongo & Sun, 2021). This study paves the way for a more thorough investigation of this cybersecurity capability. The LSTM models proved to be the best because the embedding technique captures accurate information, which is crucial for attack detection.

References

Abdulqadder, I.H., Zhou, S., Zou, D., Aziz, I.T. & Akber, S.M.A. 2020. Multi-layered intrusion detection and prevention in the SDN/NFV enabled cloud of 5G networks using AI-based defense mechanisms. Computer Networks. (179). pp. 107364.

Alqahtani, A., Gazzan, M. & Sheldon, F.T. 2020. A proposed Crypto-Ransomware Early Detection(CRED) Model using an Integrated Deep Learning and Vector Space Model Approach. In: 2020 10th Annual Computing and Communication Workshop and Conference (CCWC). January 2020, IEEE, pp. 0275–0279.

Althubiti, S.A., Jones, E.M. & Roy, K. 2018. LSTM for Anomaly-Based Network Intrusion Detection. In: 2018 28th International Telecommunication Networks and Applications Conference (ITNAC). November 2018, IEEE, pp. 1–3.

Bontemps, L., Cao, V.L., McDermott, J. & Le-Khac, N.-A. 2016. Collective Anomaly Detection Based on Long Short-Term Memory Recurrent Neural Networks. In: pp. 141–152.

Boukhalfa, A., Abdellaoui, A., Hmina, N. & Chaoui, H. 2020. LSTM deep learning method for network intrusion detection system. International Journal of Electrical and Computer Engineering (IJECE). (10)3. pp. 3315.

Chawla, A., Lee, B., Fallon, S. & Jacob, P. 2019. Host Based Intrusion Detection System with Combined CNN/RNN Model. In: pp. 149–158.

Chen, Z. & Subramanian, D. 2018. An unsupervised approach to detect spam campaigns that use botnets on twitter. arXiv preprint arXiv:1804.05232.

Devi, R.S. & Mohankumar, M. 2019. Digital Forensics and Artificial Intelligence for Cyber Security. (13).

Ding, S., Morozov, A., Vock, S., Weyrich, M. & Janschek, K. 2020. Model-Based Error Detection for Industrial Automation Systems Using LSTM Networks. In: pp. 212–226.

Ding, Y. & Zhai, Y. 2018. Intrusion Detection System for NSL-KDD Dataset Using Convolutional Neural Networks. In: Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence - CSAI ’18. 2018, New York, New York, USA: ACM Press, pp. 81–85.

Edamadaka, G., Chowdary, C.S., Kumar, M.J. & Sai, N.R. 2020. Hybrid Learning Method to Detect the Malicious Transactions in Network Data. In: IOP Conference Series: Materials Science and Engineering. 2020, IOP Publishing, pp. 22032.

Hasan, K., Shetty, S. & Ullah, S. 2019. Artificial Intelligence Empowered Cyber Threat Detection and Protection for Power Utilities. In: 2019 IEEE 5th International Conference on Collaboration and Internet Computing (CIC). December 2019, IEEE, pp. 354–359.

Hossain, M.D., Inoue, H., Ochiai, H., Fall, D. & Kadobayashi, Y. 2020. LSTM-Based Intrusion Detection System for In-Vehicle Can Bus Communications. IEEE Access. (8). pp. 185489–185502.

Idrissi, I., Boukabous, M., Azizi, M., Moussaoui, O. & El Fadili, H. 2021. Toward a deep learning-based intrusion detection system for IoT against botnet attacks. IAES International Journal of Artificial Intelligence (IJ-AI). (10)1. pp. 110.

Jabbar, M.A., Aluvalu, R. & Reddy S, S.S. 2017. RFAODE: A Novel Ensemble Intrusion Detection System. Procedia Computer Science. (115). pp. 226–234.

Kaloudi, N. & Li, J. 2020. The AI-Based Cyber Threat Landscape. ACM Computing Surveys. (53)1. pp. 1–34.

Kasongo, S.M. & Sun, Y. 2021. A Deep Gated Recurrent Unit based model for wireless intrusion detection system. ICT Express. (7)1. pp. 81–87.

Khan, A.Y., Latif, R., Latif, S., Tahir, S., Batool, G. & Saba, T. 2020. Malicious Insider Attack Detection in IoTs Using Data Analytics. IEEE Access. (8). pp. 11743–11753.

Lee, B., Amaresh, S., Green, C. & Engels, D. 2018. Comparative study of deep learning models for network intrusion detection. SMU Data Science Review. (1)1. pp. 8.

Lee, M.-C., Lin, J.-C. & Gan, E.G. 2020. ReRe: A Lightweight Real-Time Ready-to-Go Anomaly Detection Approach for Time Series. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). July 2020, IEEE, pp. 322–327.

Lindemann, B., Muller, T., Vietz, H., Jazdi, N. & Weyrich, M. 2021. A survey on long short-term memory networks for time series prediction. Procedia CIRP. (99). pp. 650–655.

Malhotra, P., Vig, L., Shroff, G. & Agarwal, P. 2015. Long short term memory networks for anomaly detection in time series. In: Proceedings. 2015, pp. 89–94.

Mehri, V.A., Ilie, D. & Tutschku, K. 2018. Privacy and DRM Requirements for Collaborative Development of AI Applications. In: Proceedings of the 13th International Conference on Availability, Reliability and Security. 27 August 2018, New York, NY, USA: ACM, pp. 1–8.

Mirza, A.H. & Cosan, S. 2018. Computer network intrusion detection using sequential LSTM Neural Networks autoencoders. In: 2018 26th Signal Processing and Communications Applications Conference (SIU). May 2018, IEEE, pp. 1–4.

Mohanty, S. & Vyas, S. 2018. Cybersecurity and AI. In: How to Compete in the Age of Artificial Intelligence. Berkeley, CA: Apress, pp. 143–153.

Porambage, P., Kumar, T., Liyanage, M., Partala, J., Lovén, L., Ylianttila, M. & Seppänen, T. 2019. Sec-EdgeAI: AI for edge security Vs security for edge AI. The 1st 6G Wireless Summit (Levi, Finland).

Powell, B.A. 2020. Detecting malicious logins as graph anomalies. Journal of Information Security and Applications. (54). pp. 102557.

Procopiou, A., Komninos, N. & Douligeris, C. 2019. ForChaos: Real Time Application DDoS detection using Forecasting and Chaos Theory in Smart Home IoT Network. Wireless Communications and Mobile Computing. (8469410).

Sarker, I.H., Furhad, M.H. & Nowrozy, R. 2021. AI-Driven Cybersecurity: An Overview, Security Intelligence Modeling and Research Directions. SN Computer Science. (2)3. pp. 173.

Zenati, H., Foo, C.S., Lecouat, B., Manek, G. & Chandrasekhar, V.R. 2018. Efficient gan-based anomaly detection. arXiv preprint arXiv:1802.06222.