Anomaly Detection via Latent Variables Learned by Variational Autoencoders

Authors

Branch, Richard

Type

thesis

Language

eng

Keyword

Anomaly Detection, Variational Autoencoders, Latent Variables, Deep Learning, Expected Latent Representation, Latent Dimensions, Reconstruction Error of Expected Latent Representation, Subsets of Latent Dimensions, Approximate Expected Reconstruction Error, Anomaly Detection Evaluation Guidelines, Individual Latent Dimensions

Abstract

Modeling normality is a popular approach to anomaly detection. Learning good models of normality and choosing appropriate ways of identifying differences from normality are critical tasks for effective anomaly detection. Variational autoencoders are state-of-the-art modeling techniques that incorporate latent variables: hidden variables that are not directly observed but instead inferred from observed variables. Approaches to anomaly detection via variational autoencoders either adopt reconstruction error as the sole anomaly detection metric, ignoring the potential of latent variables in the anomaly detection task, or consider the latent space as a whole. However, reconstruction error on its own paints an incomplete picture of the latent representation, while individual latent dimensions often fail to represent useful features for anomaly detection, dragging down overall performance. I focus on making subtle choices in how latent variables discovered via variational autoencoders are used to find anomalies, from two different perspectives. The first leverages the stochastic nature of the latent variables, since each point in the latent space is sampled from probability distributions whose parameters are learned during training. I develop the concept of the expected latent representation, which I use for anomaly detection by evaluating differences between the expected latent representation and the prior. Additionally, I extend the expected latent representation to reconstruction error, adopting the reconstruction error of the expected latent representation as an anomaly detection measure. Evaluations on benchmark datasets show similar performance or incremental improvements over unmodified variational autoencoders, and results competitive with comparable anomaly detection techniques. The second perspective focuses on finding latent dimensions that contain valuable features for anomaly detection, for which I explore subsets of dimensions. Evaluations on benchmark datasets suggest that subsets, and even individual dimensions, can significantly outperform the entire latent space. I propose a heuristic method for carefully selecting subsets of latent dimensions in a supervised manner. Although the gains are dataset dependent, relative performance improves by as much as twofold. Together, these two perspectives demonstrate that subtle choices in crafting models of normality and in measuring differences from normality can significantly improve anomaly detection across a variety of datasets.
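To make the first perspective concrete, the sketch below shows both scores under common assumptions: a Gaussian VAE whose encoder outputs a posterior mean and log-variance, with the expected latent representation read as the posterior mean E[z|x] = mu. The linear encoder and decoder are placeholders standing in for a trained model; none of this is taken from the thesis's actual implementation.

```python
# Sketch of anomaly scoring with the "expected latent representation".
# Assumptions (illustrative, not from the thesis): the expected latent
# representation is E[z|x] = mu, the mean of the approximate posterior
# q(z|x) = N(mu, diag(exp(logvar))); the encoder/decoder are placeholder
# linear maps standing in for a trained VAE.
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_LATENT = 20, 4

# Placeholder "trained" weights; a real VAE would supply these.
W_enc = rng.normal(size=(D_IN, 2 * D_LATENT)) * 0.1
W_dec = rng.normal(size=(D_LATENT, D_IN)) * 0.1

def encode(x):
    """Map input to posterior parameters (mu, logvar)."""
    h = x @ W_enc
    return h[..., :D_LATENT], h[..., D_LATENT:]

def decode(z):
    """Map a latent code back to input space."""
    return z @ W_dec

def latent_prior_score(x):
    """Deviation of q(z|x) from the N(0, I) prior via the closed-form KL
    divergence. Larger values suggest x is less like the training data."""
    mu, logvar = encode(x)
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=-1)

def expected_recon_score(x):
    """Reconstruction error of the expected latent representation:
    decode mu directly instead of a stochastic sample z ~ q(z|x)."""
    mu, _ = encode(x)
    return np.sum((x - decode(mu)) ** 2, axis=-1)

x = rng.normal(size=(5, D_IN))   # a small batch of inputs
print(latent_prior_score(x))     # latent-side anomaly scores
print(expected_recon_score(x))   # reconstruction-side scores
```

Decoding mu rather than a random sample z ~ q(z|x) makes the reconstruction score deterministic, which removes sampling noise from the resulting anomaly ranking.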
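For the second perspective, one way a supervised subset-selection heuristic might look is a greedy forward search that keeps adding latent dimensions while AUROC on a labeled validation set improves. The per-dimension scores and the AUROC criterion are illustrative assumptions, not the exact heuristic from the thesis.

```python
# Sketch of a supervised greedy search over latent-dimension subsets.
# Assumptions (illustrative): each dimension d contributes a per-dimension
# anomaly score s[:, d] (e.g. its term of the closed-form KL divergence),
# and subsets are ranked by AUROC on a labeled validation set.
import numpy as np
from sklearn.metrics import roc_auc_score

def greedy_subset(per_dim_scores, labels):
    """Forward selection: repeatedly add the dimension that most improves
    validation AUROC of the summed score; stop when nothing helps."""
    n_dims = per_dim_scores.shape[1]
    chosen, best_auc = [], 0.0
    while True:
        candidates = [d for d in range(n_dims) if d not in chosen]
        if not candidates:
            break
        aucs = [
            roc_auc_score(labels, per_dim_scores[:, chosen + [d]].sum(axis=1))
            for d in candidates
        ]
        best = int(np.argmax(aucs))
        if aucs[best] <= best_auc:
            break  # no remaining dimension improves the subset
        best_auc, chosen = aucs[best], chosen + [candidates[best]]
    return chosen, best_auc

# Toy demonstration: dimensions 0 and 2 carry signal, the rest are noise.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=500)
scores = rng.normal(size=(500, 6))
scores[:, 0] += labels * 1.5
scores[:, 2] += labels * 1.0
subset, auc = greedy_subset(scores, labels)
print(subset, round(auc, 3))
```

Forward selection is only one plausible search strategy, but the toy setup illustrates the abstract's point: when many latent dimensions are uninformative, a small well-chosen subset can outscore the full latent space.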

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
