Aligning Planning Models with Real-World Observations

Morgan, Ella
machine learning, computer vision, planning
While planning models are symbolic and precise, the real world is noisy and unstructured. This work aims to bridge that gap by aligning visualizations of planning states with the underlying state-space structure. We do so in the presence of noise and augmentations that simulate a commonly overlooked property of real environments: multiple variations of semantically equivalent states. First, we create a dataset that visualizes states for several common planning domains; each state is rendered in a way that introduces variability or noise, e.g., objects changing in location or appearance while preserving semantic meaning. Then, we train a contrastive learning model to predict the underlying states from the images. Next, we evaluate how the predictions for a given sequence of visualized states can be aligned with the problem's reachable state space, exploiting the known structure to improve predictions. We compare three alignment methods: a greedy algorithm that considers only the top-n best predictions for each image; beam search, which maintains several candidate prediction sequences; and the Viterbi algorithm, which finds the optimal prediction sequence with respect to the given state-space transition model and the probabilities provided by the state prediction model. We evaluate these methods empirically and explore their trade-offs. The results demonstrate that alignment can correct errors made by the prediction model and significantly improve predictive accuracy. Furthermore, we find that in many cases the beam-search alignment performs on par with or better than the Viterbi algorithm, frequently finding the same solution or better while running in significantly less time.
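To illustrate the kind of alignment the abstract describes, the following is a minimal sketch of Viterbi decoding over a toy reachable state space. The state set, transition relation, and per-image probability distributions here are hypothetical stand-ins, not the paper's actual domains or model outputs: each frame's distribution plays the role of the contrastive model's state predictions, and the transition relation restricts which state sequences are reachable.

```python
import math

# Hypothetical toy state space: 3 symbolic states, with the planning
# problem's reachable transitions given as state -> successor states.
transitions = {0: {0, 1}, 1: {1, 2}, 2: {2, 0}}

# Per-image state probabilities from a (hypothetical) prediction model,
# one distribution per observed frame. In frame 2 the model prefers
# state 2, but 0 -> 2 is not a legal transition.
obs_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.1, 0.8, 0.1],
]

def viterbi_align(obs_probs, transitions):
    """Most likely state sequence consistent with the transition model."""
    n_states = len(obs_probs[0])
    # Log-probability of the best path ending in each state so far.
    score = [math.log(p) if p > 0 else -math.inf for p in obs_probs[0]]
    back = []  # backpointers, one list per frame after the first
    for probs in obs_probs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor with a legal transition into state s.
            cands = [(score[p], p) for p in range(n_states)
                     if s in transitions[p] and score[p] > -math.inf]
            if cands and probs[s] > 0:
                best, p = max(cands)
                new_score.append(best + math.log(probs[s]))
                pointers.append(p)
            else:
                new_score.append(-math.inf)
                pointers.append(None)
        back.append(pointers)
        score = new_score
    # Backtrack from the best final state.
    s = max(range(n_states), key=lambda i: score[i])
    path = [s]
    for pointers in reversed(back):
        s = pointers[s]
        path.append(s)
    return path[::-1]

print(viterbi_align(obs_probs, transitions))  # -> [0, 1, 1]
```

A purely per-frame argmax would output the unreachable sequence 0, 2, 1; the aligned decoding corrects the noisy middle frame to state 1, which is the error-correction effect the abstract reports.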