Anomaly Detection via Latent Variables Learned by Variational Autoencoders

dc.contributor.author: Branch, Richard
dc.contributor.supervisor: Skillicorn, David B.
dc.contributor: Queen's University at Kingston
dc.description.abstract: Modeling normality is a popular approach to anomaly detection. Learning good models of normality and choosing appropriate ways of identifying deviations from normality are critical for effective anomaly detection. Variational autoencoders are state-of-the-art modeling techniques that incorporate latent variables: hidden variables that are not directly observed but are instead inferred from observed variables. Existing approaches to anomaly detection with variational autoencoders either adopt reconstruction error as the sole anomaly metric, ignoring the potential of the latent variables, or treat the latent space as a whole. However, reconstruction error on its own gives an incomplete picture of the latent representation, while individual latent dimensions often fail to capture features useful for anomaly detection, dragging down overall performance. I focus on making subtle choices about how latent variables discovered by variational autoencoders are used to find anomalies, from two different perspectives. The first leverages the stochastic nature of the latent variables: each point in the latent space is sampled from probability distributions whose parameters are learned during training. I develop the concept of the expected latent representation, which I use for anomaly detection by evaluating differences between the expected latent representation and the prior. I also extend the expected latent representation to reconstruction error, adopting the reconstruction error of the expected latent representation as an anomaly detection measure. Evaluations on benchmark datasets show performance similar to, or incrementally better than, unmodified variational autoencoders, and competitive with comparable anomaly detection techniques.
The second perspective focuses on finding latent dimensions that carry valuable features for anomaly detection, for which I explore subsets of dimensions. Evaluations on benchmark datasets suggest that subsets, and even individual dimensions, can significantly outperform the entire latent space. I propose a heuristic method for carefully selecting subsets of latent dimensions in a supervised manner. Although the gains are dataset dependent, relative performance improves by as much as twofold. Together, these two perspectives demonstrate that subtle choices in crafting models of normality, and in measuring differences from normality, can significantly improve anomaly detection across a variety of datasets.
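The abstract's two scoring ideas can be illustrated in miniature. The sketch below is not the thesis's actual implementation; it assumes a diagonal-Gaussian encoder that outputs a per-dimension mean `mu` (the expected latent representation) and log-variance `log_var`, scores a point by its KL divergence from the standard-normal prior, and allows restricting the score to a chosen subset of latent dimensions. All function names are illustrative.

```python
import math

def expected_latent_kl_score(mu, log_var):
    """Anomaly score: KL divergence between the diagonal-Gaussian posterior
    q(z|x) = N(mu, exp(log_var)) and the standard-normal prior N(0, I),
    summed over latent dimensions. Larger scores mean the expected latent
    representation sits farther from the prior."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )

def subset_score(mu, log_var, dims):
    """Score using only a chosen subset of latent dimensions, mirroring the
    idea that some dimensions carry more anomaly-relevant features."""
    return expected_latent_kl_score(
        [mu[d] for d in dims], [log_var[d] for d in dims]
    )
```

A point whose expected representation exactly matches the prior (zero mean, unit variance in every dimension) scores 0; deviating in only one dimension contributes only that dimension's KL term, which is what makes per-subset scoring meaningful.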
dc.relation.ispartofseries: Canadian theses
dc.rights: Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
dc.rights: ProQuest PhD and Master's Theses International Dissemination Agreement
dc.rights: Intellectual Property Guidelines at Queen's University
dc.rights: Copying and Preserving Your Thesis
dc.rights: This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
dc.subject: Anomaly Detection
dc.subject: Variational Autoencoders
dc.subject: Latent Variables
dc.subject: Deep Learning
dc.subject: Expected Latent Representation
dc.subject: Latent Dimensions
dc.subject: Reconstruction Error of Expected Latent Representation
dc.subject: Subsets of Latent Dimensions
dc.subject: Approximate Expected Reconstruction Error
dc.subject: Anomaly Detection Evaluation Guidelines
dc.subject: Individual Latent Dimensions
dc.title: Anomaly Detection via Latent Variables Learned by Variational Autoencoders
Original bundle: 1 file, Adobe Portable Document Format, 12.02 MB
License bundle: 1 file, 2.25 KB, item-specific license agreed upon at submission