Image servoing is an indispensable technique in robotic applications for achieving high-precision positioning. The intermediate representation of an image servoing policy matters both for abstracting the sensor input and for guiding the policy output. Classical approaches achieve high precision but require clean keypoint correspondences, and suffer from a limited convergence basin or weak robustness to feature errors. Recent learning-based methods achieve moderate precision and a large convergence basin on specific scenes, but struggle to generalize to novel environments. In this paper, we encode keypoints and their correspondences into a graph and use a graph neural network as the controller architecture. This design combines the advantages of both paradigms: a generalizable intermediate representation from keypoint correspondence and the strong modeling ability of neural networks. We further propose realistic data generation, feature clustering, and distance decoupling to improve efficiency, precision, and generalization. Experiments in simulation and the real world verify the effectiveness of our method in speed (up to 40 fps including the observer), precision (<0.3° and sub-millimeter accuracy), and generalization (sim-to-real without fine-tuning).
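For context, the classical baseline referenced above is the standard image-based visual servoing (IBVS) control law v = -λ L⁺ e, which maps keypoint errors to a camera velocity through the stacked interaction matrix. The sketch below is a minimal illustration assuming calibrated, normalized point features with estimated depths; the function name and gain are illustrative, not from the paper.

```python
import numpy as np

def ibvs_velocity(s, s_star, Z, lam=0.5):
    """Classical IBVS: v = -lam * pinv(L) @ (s - s_star).

    s, s_star: (N, 2) current/desired normalized image points.
    Z: (N,) estimated depths of the points; lam: control gain.
    Returns a 6-DoF camera velocity (vx, vy, vz, wx, wy, wz).
    """
    rows = []
    for (x, y), z in zip(s, Z):
        # Interaction (feature Jacobian) matrix rows for one point feature.
        rows.append([-1 / z, 0, x / z, x * y, -(1 + x**2), y])
        rows.append([0, -1 / z, y / z, 1 + y**2, -x * y, -x])
    L = np.asarray(rows)                 # (2N, 6) stacked interaction matrix
    e = (s - s_star).reshape(-1)         # (2N,) feature error
    return -lam * np.linalg.pinv(L) @ e  # pseudoinverse-based velocity command
```

This law converges well near the desired pose but depends on clean correspondences and depth estimates, which is exactly the limitation the learned controller targets.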
We encode matched keypoints into a graph and build a neural controller with graph neural networks. This allows us to model an arbitrary number of detected keypoints with time-varying graph structure.
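The exact architecture is not spelled out here, so the following is a hedged sketch of the general idea: each matched keypoint pair becomes a node whose feature concatenates its current and desired coordinates, edges come from a k-nearest-neighbor graph over the current keypoints, and pooled node embeddings are decoded into a 6-DoF velocity. It uses `torch_geometric` (with `knn_graph` from torch-cluster); the layer types, sizes, and k are illustrative choices, not the authors' configuration.

```python
import torch
from torch import nn
from torch_geometric.nn import GCNConv, global_mean_pool, knn_graph

class GraphServoPolicy(nn.Module):
    """Maps a variable-size keypoint-correspondence graph to a velocity command."""

    def __init__(self, hidden=128):
        super().__init__()
        # Node feature: (x, y) in the current image + (x, y) in the desired image.
        self.conv1 = GCNConv(4, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = nn.Linear(hidden, 6)  # 6-DoF camera velocity

    def forward(self, cur_kp, des_kp):
        # cur_kp, des_kp: (N, 2) matched keypoints; N may change every frame.
        x = torch.cat([cur_kp, des_kp], dim=-1)          # (N, 4) node features
        edge_index = knn_graph(cur_kp, k=8, loop=False)  # time-varying topology
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        batch = torch.zeros(x.size(0), dtype=torch.long)  # single graph
        return self.head(global_mean_pool(h, batch))      # (1, 6) velocity
```

Because the graph is rebuilt at every frame, the controller naturally tolerates keypoints appearing and disappearing as matching quality changes.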
We benchmark our method in two simulation environments and one real-world environment.
We visualize several success cases of IBVS and CNS in real-world experiments. In each case, we plot the overall trajectories and zoomed-in views around the desired pose. Next to the 3D trajectories are the images captured at the initial and desired poses, and grayscale images showing the photometric error between the desired image and the final images reached under IBVS and CNS guidance.
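For reference, the grayscale error images can be reproduced as a simple per-pixel intensity difference; this is a minimal sketch assuming 8-bit grayscale inputs, not the exact rendering pipeline used for the figure.

```python
import cv2
import numpy as np

def photometric_error(desired_path, final_path):
    """Per-pixel absolute intensity difference between two grayscale images."""
    desired = cv2.imread(desired_path, cv2.IMREAD_GRAYSCALE).astype(np.int16)
    final = cv2.imread(final_path, cv2.IMREAD_GRAYSCALE).astype(np.int16)
    return np.abs(desired - final).astype(np.uint8)  # darker = better alignment
```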
@misc{chen2023cns,
      title={CNS: Correspondence Encoded Neural Image Servo Policy},
      author={Anzhe Chen and Hongxiang Yu and Yue Wang and Rong Xiong},
      year={2023},
      eprint={2309.09047},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}