2022: A Computational Odyssey - Towards a Deeper Understanding of Clustering Streaming Human Activity Recognition Data

Woo, Martin
stream clustering, LSTM, CNN, HAR, Autoencoder, visualization, machine learning
The global integration of 22 billion network-connected devices, such as smartphones and smartwatches, into everyday life has inadvertently created various research initiatives, with a plethora of data streams now readily available. Human Activity Recognition (HAR) is one such example, and understanding the hierarchy of human movements can have positive implications in elderly care, physiotherapy, surveillance, and general healthcare, among others. However, the typical modality for deep analysis, purely supervised learning, is limited to the classes provided during training and is incapable of relaying any hidden knowledge residing outside of the trained labels. Unsupervised learning methods, such as clustering, are not only well equipped to handle high-density data streams and address the challenges inherent in this medium of data, but are also capable of automatically extracting varying levels of abstraction from the data, represented by the relationships within and between clusters. This thesis presents a journey that begins with the hypothesis that time-series IoT data can be clustered to effectively identify human activity. However, experimental results ultimately evolve the question into "what else is needed to allow effective clustering of HAR data for identifying human activity?" A subsequent exploration tries to answer two additional research questions: can deep learning approaches add a higher level of insight for time-dependent human movement sequences and uncover unintuitive knowledge regarding the actions, and can visualization allow for a better understanding of the nature of human activities?
Starting with experiments applying well-known stand-alone clustering algorithms to a HAR data stream, we developed hybrid algorithms to extract important features for human activity recognition from IoT sensor data, and concluded with an analysis of visualization algorithms that can be used to improve the interpretation of the clustering methods. Our work on clustering IoT HAR sensor data using extracted temporal features is novel. Not only do our machine learning pipelines meet or exceed the state-of-the-art results for HAR sequences, but we also establish the first metric baseline for clustering. Our hybrid temporal-based architectures achieved the best results, extracting exceptional temporal features for forming distinct clusters. The visualization of the temporally-reduced clusters underscores the necessity for supervised-learning-based temporal feature extraction to supplement unsupervised feature learning algorithms for clustering highly complex temporal-based HAR data. We demonstrate that our Temporal Feature Extraction architecture, in combination with unsupervised learning models, is capable of achieving classification accuracy and clustering scores comparable to the state of the art: 93.8% accuracy, 0.78 NMI, and 0.548 ARI for the MobiAct v2.0 dataset, and 93.5% accuracy with 0.99 NMI/ARI for the UCI HAR dataset.