An Analysis of Motion Smoothness in Video Object Detection
Machine learning has become an invaluable part of computer vision research on object detection and object tracking. While current research aims to improve how closely predictions match human-annotated ground truth bounding boxes, to our knowledge very little work analyzes the smoothness of bounding box paths. Bounding box path smoothness is useful because it can contribute to improving machine vision, and it provides another metric by which researchers can assess the capabilities and quality of video object detection systems. In this work, we investigate the problem of object bounding box path smoothness in video object detection systems. We begin by studying convolutional neural networks for object detection and smoothness metrics from biokinematics research. Two smoothness metrics from the latter field, Log Dimensionless Jerk (LDLJ) and Spectral Arc Length (SAL), are adapted for use on object bounding box paths, and an analysis is done to justify the adaptations made. An in-depth analysis of two bounding box proposal generation systems is carried out using the two adapted smoothness metrics and validated against the ground truth bounding box paths. The analysis showed that both LDLJ and SAL can differentiate between all tested object bounding box path generation systems. Additional experiments demonstrate that the human annotations are the smoothest bounding box paths; however, the paths from the tested object detection systems can be improved naively by applying a moving average over the proposed paths. Finally, we adapt the smoothness metrics as loss functions in a video object detection system to analyze whether they can be used as regularizers for video object detection with convolutional neural networks. We propose, train, and analyze a model for video object detection with 7 training regimens that vary only in the regularizer.
We found that using a smoothness regularizer can improve object path smoothness by a small amount, and we conclude with a list of possible directions for future work.
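
To make the naive moving-average smoothing and the jerk-based metric concrete, the following is a minimal sketch and not the adaptation developed in this work. It assumes a path is given as an (N, 2) array of bounding box centers sampled at a fixed frame rate, and it uses the standard biokinematics form of LDLJ (negative log of squared jerk integrated over time, scaled by duration cubed over peak speed squared). The function names, the `window` and `fps` parameters, and the choice of box centers as the path are all illustrative assumptions.

```python
import numpy as np


def moving_average(path, window=5):
    """Centered moving average over an (N, 2) path; `window` must be odd."""
    if window % 2 != 1:
        raise ValueError("window must be odd")
    path = np.asarray(path, dtype=float)
    kernel = np.ones(window) / window
    pad = window // 2
    # Repeat the first and last points so the output keeps N samples.
    padded = np.pad(path, ((pad, pad), (0, 0)), mode="edge")
    columns = [
        np.convolve(padded[:, d], kernel, mode="valid")
        for d in range(path.shape[1])
    ]
    return np.stack(columns, axis=1)


def log_dimensionless_jerk(path, fps=30.0):
    """Standard LDLJ of a path; larger (less negative) means smoother.

    Assumption: this is the textbook biokinematics definition applied to
    box centers, not the adapted metric described in the text.
    """
    path = np.asarray(path, dtype=float)
    dt = 1.0 / fps
    # Finite-difference velocity, then jerk as the second derivative of velocity.
    velocity = np.gradient(path, dt, axis=0)
    jerk = np.gradient(np.gradient(velocity, dt, axis=0), dt, axis=0)
    peak_speed = np.linalg.norm(velocity, axis=1).max()
    duration = (len(path) - 1) * dt
    # Rectangle-rule approximation of the integral of squared jerk magnitude.
    jerk_integral = np.sum(np.linalg.norm(jerk, axis=1) ** 2) * dt
    dimensionless_jerk = duration ** 3 / peak_speed ** 2 * jerk_integral
    return -np.log(dimensionless_jerk)
```

Under these assumptions, one would expect a jittery detector path to score lower than its moving-average-smoothed version. Stationary paths and perfectly straight, constant-velocity paths are degenerate for this sketch (zero peak speed or zero jerk) and would need special handling.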