Radial Voting Based Keypoint and 6DoF Pose Estimation
Abstract
A novel keypoint voting scheme based on intersecting spheres, named radial voting, is proposed. It is more accurate than existing schemes and works with a smaller set of sparse keypoints. Radial voting forms the voting stage of the proposed RCVPose method for RGB-D based 6 DoF pose estimation of 3D objects. A CNN is trained to estimate, for each pixel of the depth field, the distance between the corresponding 3D point and each of a set of $3$ sparse keypoints defined based on the bounding box of the object. At inference, a discrete sphere is rendered for each such 3D point, centered at that point and with a radius equal to the estimated distance. The voxels on the surfaces of these spheres cast votes that increment a 3D accumulator space, the peak of which indicates the estimated keypoint location. Radial voting is more precise than existing vector and offset voting schemes, and is also robust to keypoint sparsity and to the scale of the target object. Experiments demonstrate that RCVPose is highly accurate and competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and YCB-Video (97.2%) datasets, and notably scoring 7.9% higher than previous leading methods on the challenging Occlusion LINEMOD dataset (71.1%).
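The voting stage described above can be illustrated with a minimal sketch. It is not the RCVPose implementation: the function name `radial_vote`, its parameters, and the brute-force loop over a dense voxel grid are assumptions made for illustration, and the per-point radii that the CNN would regress are simply passed in as an array.

```python
import numpy as np

def radial_vote(points, radii, grid_min, grid_shape, voxel_size):
    """Accumulate sphere-surface votes and return the peak voxel centre.

    points     : (N, 3) array of 3D scene points (hypothetical input).
    radii      : (N,) array of estimated point-to-keypoint distances.
    grid_min   : minimum corner of the accumulator volume.
    grid_shape : number of voxels along each axis, e.g. (21, 21, 21).
    voxel_size : edge length of a cubic voxel.
    """
    # Centres of all voxels in the accumulator, shape (X, Y, Z, 3).
    axes = [grid_min[i] + (np.arange(grid_shape[i]) + 0.5) * voxel_size
            for i in range(3)]
    centres = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

    acc = np.zeros(grid_shape, dtype=np.int32)
    for p, r in zip(points, radii):
        # A voxel lies on the discrete sphere surface if its centre is
        # within half a voxel of the sphere of radius r centred at p.
        dist = np.linalg.norm(centres - p, axis=-1)
        acc += (np.abs(dist - r) <= 0.5 * voxel_size).astype(np.int32)

    # The accumulator peak is taken as the estimated keypoint location.
    peak = np.unravel_index(np.argmax(acc), acc.shape)
    return centres[peak], acc

# Usage with synthetic data: six points placed around a known keypoint,
# with exact distances standing in for the network's estimates.
keypoint = np.array([10.5, 10.5, 10.5])
points = keypoint + 6.0 * np.vstack([np.eye(3), -np.eye(3)])
radii = np.linalg.norm(points - keypoint, axis=1)
estimate, accumulator = radial_vote(
    points, radii, grid_min=(0.0, 0.0, 0.0),
    grid_shape=(21, 21, 21), voxel_size=1.0)
```

Because each vote depends only on a scalar distance, every sphere passes through the true keypoint regardless of where its centre lies, so the surfaces intersect there and the accumulator peaks at that voxel. A practical implementation would rasterize only the sphere shells rather than testing every voxel for every point.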