Impact of Reinforcement Learning Enabled Smart Homes on Humans
Smart homes are becoming increasingly popular as a result of advances in machine learning and cloud computing. Devices such as smart thermostats and speakers can now learn from user feedback and adaptively adjust their settings to human preferences. These devices might, in turn, affect human behavior. To investigate the potential impacts of smart homes on human behavior, we simulate a series of hierarchical reinforcement learning-based human models capable of performing various activities, namely setting temperature and humidity for thermal comfort, inside a Q-learning-based smart home model. We then investigate whether the human models' behaviors are altered as the smart home and the human model adapt to one another. To do so, we integrate our human model into the environment along with the smart home model and run experiments covering scenarios with a single human model and with two human models. Our experiments show that, with the smart home, the human model can exhibit unexpected behaviors such as frequent changes of activity and an increase in the time required to modify its thermal preferences. With two human models, we observe that certain combinations of models result in normal behaviors, while other combinations exhibit the same unexpected behaviors as those observed in the single-human experiment. Because the smart home model learns the user's thermal preferences from parameters such as temperature, humidity, and activity, it lacks personalization for the individual occupant. To address this, we modify our smart home model to obtain a partially observable smart home system based on the partially observable Markov decision process (POMDP), in which the state of the human model is unknown. The smart home maintains a belief over each possible human state and uses the thermal parameters to approximate the hidden human state. It then learns the personalized thermal preferences of multiple human models. We demonstrate that our improved model can reasonably approximate the hidden state for up to 5 occupants and learn their thermal preferences with incomplete state information. In our simulated experiments, it achieves classification accuracies of 0.98, 0.91, 0.76, and 0.67 for 2, 3, 4, and 5 human occupants, respectively.
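The belief mentioned in the abstract can be pictured as a probability distribution over candidate occupants that is revised each time a thermal setting is observed. The following is only a minimal sketch of one such Bayesian update, not the thesis's implementation: the Gaussian observation model, the `sigma` parameter, the example preferred temperatures, and the function names are all assumptions made for illustration.

```python
import math


def gaussian_likelihood(observed, preferred, sigma=1.0):
    """Unnormalized likelihood of an observed setpoint given a preferred one.

    Assumption: observations scatter around each occupant's preferred
    temperature with a Gaussian spread `sigma` (hypothetical model).
    """
    return math.exp(-0.5 * ((observed - preferred) / sigma) ** 2)


def update_belief(belief, observed_temp, preferred_temps, sigma=1.0):
    """One Bayes step: posterior is proportional to likelihood times prior."""
    unnormalized = [
        prior * gaussian_likelihood(observed_temp, preferred, sigma)
        for prior, preferred in zip(belief, preferred_temps)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        # Every hypothesis has zero weight under this model; keep the prior.
        return list(belief)
    return [weight / total for weight in unnormalized]


# Hypothetical example: two occupants who prefer 20.0 and 24.0 degrees.
preferred_temps = [20.0, 24.0]
belief = [0.5, 0.5]  # uniform prior over who is acting
belief = update_belief(belief, observed_temp=23.5, preferred_temps=preferred_temps)
```

After observing a setpoint of 23.5, the belief shifts toward the second occupant, whose assumed preference is closer to the observation. Repeating the update over successive observations is one way a system could attribute thermal settings to a hidden occupant before learning per-occupant preferences.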
URI for this record: http://hdl.handle.net/1974/29501