diff --git a/img/PCA.png b/img/PCA.png
new file mode 100644
index 0000000..11a01f1
Binary files /dev/null and b/img/PCA.png differ
diff --git a/img/pattern-anomaly.png b/img/pattern-anomaly.png
new file mode 100644
index 0000000..caf6b26
Binary files /dev/null and b/img/pattern-anomaly.png differ
diff --git a/paper.tex b/paper.tex
index 89dacd5..be90edb 100644
--- a/paper.tex
+++ b/paper.tex
@@ -54,7 +54,7 @@ Another factor for these models is the network topology. In a non-static WSN, a
 \subsection{Problem definition}
-An anomaly is a collection of one or more temporally correlated measurements in a given dataset that seem to be inconsistent with expected results. These measurements can originate from different sensors, and in the context of WSNs even from different nodes. Bosman et al. \cite{bosman2013} and others distinguish between four different kinds of anomalies (c.f. Figure~\ref{fig:noisetypes}):
+An anomaly is a collection of one or more temporally correlated measurements in a given dataset that seems to be inconsistent with expected results. These measurements can originate from different sensors, and in the context of WSNs even from different nodes. Bosman et al. \cite{bosman2013} and others distinguish between four different kinds of anomalies relevant in WSNs (cf. Figure~\ref{fig:noisetypes}):
 \begin{itemize}
 \item \emph{Spikes} are short changes with a large amplitude
@@ -69,12 +69,20 @@ Detecting constant type anomalies isn't very difficult, as they can simply be cl
 A noise anomaly is not the same as a noisy sensor: working with noisy data is a problem in WSNs, but we will not focus on methods of cleaning noisy data, as they are not in the scope of this survey. Elnahrawy et al. \cite{elnahrawy2003} and Barcelo et al. \cite{barcelo2019} are good starting points for a survey in that direction.
+In the general field of anomaly detection, more advanced definitions of anomalies can include patterns (cf.
Figure~\ref{fig:patternanomaly}) and other contextual phenomena, but these are much rarer in WSNs, as such networks mostly measure less complex data, such as vibrations, temperature, etc. Therefore most approaches discussed in this survey won't take these anomalies into account and instead focus on the ones discussed above.
+
 \begin{figure}
     \includegraphics[width=8.5cm]{img/anomaly_types.png}
     \caption{Spike, noise, constant and drift type anomalies in noisy linear data, image from Bosman et al. \cite{bosman2013}}
     \label{fig:noisetypes}
 \end{figure}
+\begin{figure}
+    \includegraphics[width=8.5cm]{img/pattern-anomaly.png}
+    \caption{Pattern-based anomaly corresponding to an atrial premature contraction in an electrocardiogram, image from Chandola et al. \cite{chandola2009}}
+    \label{fig:patternanomaly}
+\end{figure}
+
 The terms outlier and anomaly are often used interchangeably, but actually refer to slightly different phenomena. While an anomaly falls into one of these four categories, only spikes, noise, and some types of drifts are considered \emph{outliers} \cite{chandola2009}, as they are the only ones that produce data outside of the considered ``norm''.
@@ -91,15 +99,16 @@ The problem of outlier detection in WSNs is the creation of a model which can us
 \subsection{Structure}
-At first we will look into sensor self-calibration, a method of improving sensor accuracy. Calibrating a sensor will remove constant offsets, enabling nodes to compare measurements between one another more easily. If a sensor is in use for a prolonged length of time, it might needs to be recalibrated, to remove sensor drift.
+After the introduction and coverage of related work, we will look into sensor self-calibration, a method of improving sensor accuracy. Calibrating a sensor will remove constant offsets, enabling nodes to compare measurements between one another more easily.
If a sensor is in use for a prolonged length of time, it might need to be recalibrated to remove sensor drift.
+
+Then we will look into conventional, model-based approaches to outlier detection, such as statistical or density-based models, followed by the more recent machine-learning-based models. Finally, all presented models are summarized in a table and evaluated based on their properties and requirements.
-Then we will look into conventional, model based approaches to outlier detection, such as statistical models, or density based models.
-We will first look into sensor self-calibration, which aims to remove or reduce drift and constant offsets. Then we will look into conventional model based techniques for outlier detection, such as probabilistic models, or density based models. At last we will look into machine learning based approaches to building these models.
+\section{Related Work}
+Chandola et al. \cite{chandola2009} provide a very comprehensive survey on outlier detection in general, not just focused on WSNs. They introduce many key concepts and definitions, but focus more on outliers than on anomalies in general.
+
-\section{Related work}
-Chandola et al. \cite{chandola2009} provide a very comprehensive survey on outlier detection in general, not just focused on WSN. They introduce many key concepts and definitions, but focus more on outliers than anomalies in general.
 McDonald et al. \cite{mcdonald2013} survey methods of finding outliers in WSNs, with a focus on distributed solutions. They go into a moderate amount of detail on most solutions, but skip over several methods, such as principal component analysis and support vector machines, which were already maturing at that point in time.
@@ -199,11 +208,11 @@ After the update phase, we obtain $\hat{x}_{k|k}$, which is our best approximati
 Sirisanwannakul et al. take the computed Kalman gain and compare its bias. In normal operation, the gain is biased towards the measurement.
If the sensor malfunctions, the bias is towards the prediction. But if the gain's bias lies between prediction and measurement, the system assumes sensor drift and corrects it automatically. Since this approach lacks a ground-truth measurement, it cannot recalibrate the sensor, but the paper shows that the accumulated error can be reduced by more than 50\%.
-\section{Anomaly detection - model-based approaches}
-A centralized WSN is defined by the existence of a central entity, called the \emph{base station} or \emph{fusion centre}, where all data is delivered to and analyzed. It is often assumed, that the base station does not have limits on its processing power or storage. Centralized approaches are not optimal in hostile environments, but that is not our focus here. Since central anomaly detection is closely related to the general field of anomaly detection, we will not go into much detail on these solution, instead focusing on covering solutions more specific to the field of WSN.
+\section{Outlier detection - Classical Approaches}
+We consider a classical approach to be anything that uses conventional (non-machine learning) models or algorithms to perform outlier detection. This chapter will first look at statistical approaches, followed by distance-based approaches, and finally principal component analysis.
 \subsection{Statistical Analysis}
-Classical Statistical analysis is done by creating a model of the expected data and then finding the probability for each recorded data point. Improbable data points are then deemed outliers. The problem for many statistical approaches is finding this model of the expected data, as it is not always feasible to create it in advance. It also bears the problem of bad models or changes in the environment \cite{mcdonald2013}, requiring frequent update of the existing model.
+Classical statistical analysis is done by creating a statistical model of the expected data and then finding the probability of each recorded data point. Improbable data points are then deemed outliers.
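As a minimal illustration of this scheme (a sketch only: it assumes a univariate Gaussian model fitted directly to a window of readings and an illustrative 2.5-sigma cutoff, neither of which is taken from a specific surveyed paper):

```python
import math

def gaussian_outliers(readings, threshold=2.5):
    """Flag readings that are improbable under a Gaussian model fitted
    to the data (illustrative z-score test, not a surveyed algorithm)."""
    n = len(readings)
    mean = sum(readings) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in readings) / n)
    # A reading is deemed an outlier if it lies more than
    # `threshold` standard deviations away from the fitted mean.
    return [x for x in readings if std > 0 and abs(x - mean) / std > threshold]

# A spike among otherwise stable temperature readings is flagged:
data = [19.8, 20.1, 20.0, 19.9, 20.2, 40.0, 20.1, 19.8, 20.0, 20.1]
print(gaussian_outliers(data))  # -> [40.0]
```

Note that a single spike inflates the fitted standard deviation itself, which is one reason fixed thresholds like this need retuning when the data changes.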
The problem for many statistical approaches is finding this model of the expected data, as it is not always feasible to create one in advance, either because the nature of the phenomenon is not well known beforehand or because the expected data is too complex. It is also not very robust to changes in the environment \cite{mcdonald2013}, requiring frequent updates to the model if the environment changes in ways not foreseen by the model.
 Sheng et al. \cite{sheng2007} propose an approach to global outlier detection, meaning a data point is only regarded as an outlier if its value differs significantly from all values collected over a given time, not just from those of sensors near the measured one. They propose that the base station requests bucketed histograms of each node's sensor data distribution to reduce the amount of data transmitted. These histograms are polled, combined, and then used to detect outliers by looking at the maximum distance a data point can be away from its nearest neighbors. This method bears some problems, as it fails to account for non-Gaussian distributions. Another problem is the use of fixed parameters for outlier detection, requiring prior knowledge of the data collected and the anomaly density. These fixed parameters also need to be updated whenever the underlying data distribution changes. Due to the histograms used, this method cannot be used in a shifting network topology.
@@ -223,23 +232,46 @@ Outliers can be selected by looking at the density of points as well. Breuning e
 Papadimitriou et al. \cite{papadimitriou2003} introduce a parameterless approach. They formulate a method using a local correlation integral (LOCI), which does not require parametrization. It uses a multi-granularity deviation factor (MDEF), which is the relative deviation for a point $p$ within a radius $r$. The MDEF relates the number of points in a small neighborhood of $p$ to the average count over the same-size neighborhoods of $p$'s $r$-neighbors.
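The neighborhood counting behind the MDEF can be sketched as follows, reduced to one dimension for readability (the point set, the radius $r$, and the choice $\alpha = 0.5$ are illustrative; actual LOCI works on multi-dimensional data and selects its radii automatically):

```python
def n_count(points, p, r):
    """Number of points within distance r of p (p itself included)."""
    return sum(1 for q in points if abs(q - p) <= r)

def mdef(points, p, r, alpha=0.5):
    """Multi-granularity deviation factor of p: one minus the ratio of
    p's own alpha*r-neighborhood count to the average such count over
    all points in p's r-neighborhood (1-D illustrative sketch)."""
    sampling = [q for q in points if abs(q - p) <= r]
    avg = sum(n_count(points, q, alpha * r) for q in sampling) / len(sampling)
    return 1.0 - n_count(points, p, alpha * r) / avg

# A point inside a dense cluster gets an MDEF near (or below) zero,
# while an isolated point gets an MDEF close to one:
points = [1.0, 1.1, 1.2, 1.3, 1.4, 5.0]
print(round(mdef(points, 1.2, r=6.0), 2))  # -> -0.15
print(round(mdef(points, 5.0, r=6.0), 2))  # -> 0.77
```

Flagging then amounts to thresholding the MDEF against its local standard deviation, which is the parameter-free part of LOCI.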
LOCI provides an automated way to select good parameters for the MDEF and can detect outliers and outlier clusters with performance comparable to other statistical approaches. They also formulate aLOCI, a linear approximation of LOCI, which also gives accurate results while reducing runtime. This approach can be used in a centralized, decentralized, or clustered fashion, depending on the scale of the event of interest. aLOCI even seems suitable for running on the sensor nodes themselves, as it has relatively low computational complexity.
+\subsection{Distance Based Approaches}
+
+
 \subsection{Principal Component Analysis}
+
+
+\begin{figure*}[ht]
+    \includegraphics[width=0.8\textwidth]{img/PCA.png}
+    \caption{An example of reducing a three-dimensional dataset to two dimensions using PCA while minimizing the loss of information. The principal component vectors are marked in red.}
+    \label{fig:pca}
+\end{figure*}
+
 Principal components of a point cloud in $\R^n$ are $n$ vectors $p_i$, where $p_i$ defines a line with minimal average square distance to the point cloud while lying orthogonal to all $p_j, j