Impact of Reinforcement Learning Enabled Smart Homes on Humans

Suman, Shashi
Reinforcement Learning, Smart Homes, Human Model, User Recognition
Smart homes are becoming increasingly popular as a result of advances in machine learning and cloud computing. Devices such as smart thermostats and speakers can now learn from user feedback and adaptively adjust their settings to human preferences. Nonetheless, these devices might in turn impact human behavior. To investigate the potential impacts of smart homes on human behavior, we simulate a series of hierarchical reinforcement learning-based human models that perform various activities, namely setting temperature and humidity for thermal comfort, inside a Q-learning-based smart home model. We investigate whether the human models' behaviors are altered as the smart home and the human model adapt to one another. To do so, we integrate our human model into the environment along with the smart home model and run experiments across scenarios involving a single human model and two human models. Our experiments show that, with the smart home, the human model can exhibit unexpected behaviors such as frequent changes of activity and an increase in the time required to modify the thermal preferences. With two human models, we observe that certain combinations of models result in normal behaviors, while other combinations exhibit the same unexpected behaviors as those observed in the single-human experiment. Because the smart home model learns the user's thermal preferences from parameters such as temperature, humidity, and activity, it lacks personalization for the individual occupant. To address this, we modify our smart home model to obtain a partially observable smart home system based on the partially observable Markov decision process, in which the state of the human model is unknown. The smart home maintains a belief over each possible human state to approximate the hidden human state using its thermal parameters. Subsequently, it learns the personalized thermal preferences of multiple human models. We demonstrate that our improved model can reasonably approximate up to 5 occupants and also learn their thermal preferences with incomplete state information. Our simulated experiments achieve classification accuracies of 0.98, 0.91, 0.76, and 0.67 for 2, 3, 4, and 5 human occupants in the smart home, respectively.
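The idea of maintaining a belief over hidden human states can be illustrated with a small Bayesian update. The following is a minimal sketch, not the thesis's implementation: it assumes that the hidden state is simply which occupant is present, that each occupant has a known preferred temperature, and that observed setpoints are Gaussian around that preference. The names `update_belief`, `preferred_temps`, and `std`, as well as all numeric values, are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief, observed_temp, preferred_temps, std=1.0):
    """One Bayes step: posterior is proportional to prior times likelihood.

    Assumption (not from the source): each hidden occupant i emits
    temperature setpoints distributed as N(preferred_temps[i], std^2).
    """
    unnormalized = [
        prior * gaussian_pdf(observed_temp, mean, std)
        for prior, mean in zip(belief, preferred_temps)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        # Observation is implausible under every hypothesis; keep the prior.
        return list(belief)
    return [u / total for u in unnormalized]


# Hypothetical setup: three candidate occupants and a uniform prior.
preferred_temps = [20.0, 23.0, 26.0]
belief = [1.0 / 3.0] * 3

# Hypothetical setpoints observed over three time steps.
for observed in (22.8, 23.1, 23.4):
    belief = update_belief(belief, observed, preferred_temps)

# The most probable occupant under the current belief.
most_likely = max(range(len(belief)), key=lambda i: belief[i])
```

In this sketch the belief concentrates on the occupant whose assumed preference lies closest to the observed setpoints. As the number of candidate occupants grows and their preferences overlap, such a belief becomes harder to separate, which is in line with the decreasing accuracies reported above.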