2022: A Computational Odyssey - Towards a Deeper Understanding of Clustering Streaming Human Activity Recognition Data
Authors
Woo, Martin
Type
thesis
Language
eng
Keyword
stream clustering, LSTM, CNN, HAR, Autoencoder, visualization, machine learning
Abstract
The global integration of 22 billion network-connected devices, such as smartphones and smartwatches, into everyday life has inadvertently created various research initiatives, with a plethora of data streams readily available. Human Activity Recognition (HAR) is one such example, and understanding the hierarchy of human movements can have positive implications for elderly care, physiotherapy, surveillance, and general healthcare, among others. However, the typical modality for deep analysis, purely supervised learning, is limited to the classes provided during training and is incapable of relaying any hidden knowledge residing outside of the trained labels. Unsupervised learning methods, such as clustering, are not only well equipped to handle high-density data streams and address the challenges inherent in this medium of data, but are also capable of automatically extracting varying levels of abstraction from the data, represented by the relationships within and between clusters.
This thesis presents a journey that begins with the hypothesis that time-series IoT data can be clustered to effectively identify human activity. However, experimental results ultimately evolve the question into: what else is needed to allow effective clustering of HAR data for identifying human activity? A subsequent exploration tries to answer two additional research questions: can deep learning approaches add a higher level of insight for time-dependent human movement sequences and uncover unintuitive knowledge regarding the actions, and can visualization allow for a better understanding of the nature of human activities?
Starting with experiments applying renowned stand-alone clustering algorithms to a HAR data stream, we developed hybrid algorithms to extract important features for human activity recognition from IoT sensor data, and concluded with an analysis of visualization algorithms that can be used to improve the interpretation of the clustering methods. Our work on clustering IoT HAR sensor data using extracted temporal features is novel. Not only do our machine learning pipelines meet or exceed the state-of-the-art results for HAR sequences, but we also establish the first metric baseline for clustering. Our hybrid temporal-based architectures achieved the best results, extracting and returning exceptional temporal features for forming distinct clusters. The visualization of the temporally-reduced clusters affirms the necessity of supervised-learning-based temporal feature extraction to supplement unsupervised feature learning algorithms for clustering highly complex temporal-based HAR data. We demonstrate that our Temporal Feature Extraction architecture, in combination with unsupervised learning models, is capable of achieving classification accuracy and clustering scores comparable to the state of the art: 93.8% accuracy with 0.78 NMI and 0.548 ARI on the MobiAct v2.0 dataset, and 93.5% accuracy with 0.99 NMI/ARI on the UCI HAR dataset.
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution 3.0 United States