Visuo-Tactile Object Pose Estimation for a Multi-Finger Robot Hand with Low-Resolution In-Hand Tactile Sensing

Accurate 3D pose estimation of grasped objects is an important prerequisite for robots to perform assembly or in-hand manipulation tasks, but object occlusion by the robot's own hand greatly increases the difficulty of this perceptual task. Here, we propose that combining visual information with binary, low-resolution tactile contact measurements from across the interior surface of an articulated robotic hand can mitigate this issue. The visuo-tactile object-pose-estimation problem is formulated probabilistically in a factor graph. The pose of the object is optimized to align with the two kinds of measurements using a robust cost function to reduce the influence of outlier readings. The advantages of the proposed approach are first demonstrated in simulation: a custom 15-DOF robot hand with one binary tactile sensor per link grasps 17 YCB objects while observed by an RGB-D camera. This low-resolution in-hand tactile sensing significantly improves object-pose estimates under high occlusion and also high visual noise. We also show these benefits through grasping tests with a preliminary real version of our tactile hand, obtaining reasonable visuo-tactile estimates of object pose at approximately 12.9 Hz on average.
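The idea of optimizing a pose against two kinds of measurements under a robust cost can be illustrated with a deliberately simplified sketch. It is not the paper's factor graph. It assumes that the pose reduces to a 3D translation, that correspondences between model points and measured points are already known, and that a Huber loss minimized by iteratively reweighted least squares stands in for the robust cost. The names `huber_weights` and `estimate_translation`, the threshold `delta`, and all data values are hypothetical.

```python
import numpy as np


def huber_weights(norms, delta):
    """IRLS weights for the Huber loss: 1 inside delta, delta / r outside."""
    norms = np.asarray(norms, dtype=float)
    weights = np.ones_like(norms)
    far = norms > delta
    weights[far] = delta / norms[far]
    return weights


def estimate_translation(model_pts, observed_pts, delta=0.05, iters=20):
    """Robustly estimate t such that model_pts + t matches observed_pts.

    Assumption: translation-only pose with known correspondences.
    """
    offsets = np.asarray(observed_pts, float) - np.asarray(model_pts, float)
    t = np.zeros(offsets.shape[1])
    for _ in range(iters):
        # Residual length of every correspondence at the current estimate.
        norms = np.linalg.norm(offsets - t, axis=1)
        w = huber_weights(norms, delta)
        # Weighted mean of the offsets is the reweighted least-squares update.
        t = (w[:, None] * offsets).sum(axis=0) / w.sum()
    return t


# Hypothetical data: 8 visual points (one gross outlier) and 3 tactile contacts.
t_true = np.array([0.10, -0.20, 0.05])
model_vis = np.array([[x, y, z] for x in (0.0, 0.1)
                      for y in (0.0, 0.1) for z in (0.0, 0.1)])
obs_vis = model_vis + t_true
obs_vis[0] += np.array([1.0, 0.0, 0.0])  # simulated outlier depth reading
model_tac = np.array([[0.05, 0.0, 0.05], [0.0, 0.05, 0.05], [0.05, 0.1, 0.05]])
obs_tac = model_tac + t_true

# Both measurement types enter one joint optimization.
model_all = np.vstack([model_vis, model_tac])
obs_all = np.vstack([obs_vis, obs_tac])

t_robust = estimate_translation(model_all, obs_all)
t_naive = (obs_all - model_all).mean(axis=0)  # plain least squares
print("robust error:", np.linalg.norm(t_robust - t_true))
print("naive error: ", np.linalg.norm(t_naive - t_true))
```

In this toy setup the single outlier pulls the plain least-squares estimate well away from the true translation, while the Huber-weighted estimate stays close to it. A full implementation would also have to optimize rotation and establish correspondences.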

Author(s): Lukas Mack, Felix Grüninger, Benjamin A. Richardson, Regine Lendway, Katherine J. Kuchenbecker, and Joerg Stueckler
Book Title: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)
Year: 2025
Month: May
Bibtex Type: Conference Paper (inproceedings)
Address: Atlanta, USA
State: Accepted

BibTeX

@inproceedings{Mack25-ICRA-Visuo,
  title = {Visuo-Tactile Object Pose Estimation for a Multi-Finger Robot Hand with Low-Resolution In-Hand Tactile Sensing},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  abstract = {Accurate 3D pose estimation of grasped objects is an important prerequisite for robots to perform assembly or in-hand manipulation tasks, but object occlusion by the robot's own hand greatly increases the difficulty of this perceptual task. 
  Here, we propose that combining visual information with binary, low-resolution tactile contact measurements from across the interior surface of an articulated robotic hand can mitigate this issue. The visuo-tactile object-pose-estimation problem is formulated probabilistically in a factor graph. The pose of the object is optimized to align with the two kinds of measurements using a robust cost function to reduce the influence of outlier readings. The advantages of the proposed approach are first demonstrated in simulation: a custom 15-DOF robot hand with one binary tactile sensor per link grasps 17 YCB objects while observed by an RGB-D camera. 
  This low-resolution in-hand tactile sensing significantly improves object-pose estimates under high occlusion and also high visual noise. 
  We also show these benefits through grasping tests with a preliminary real version of our tactile hand, obtaining reasonable visuo-tactile estimates of object pose at approximately 12.9 Hz on average.},
  address = {Atlanta, USA},
  month = may,
  year = {2025},
  slug = {mack25-icra-visuo},
  author = {Mack, Lukas and Gr{\"u}ninger, Felix and Richardson, Benjamin A. and Lendway, Regine and Kuchenbecker, Katherine J. and Stueckler, Joerg},
  month_numeric = {5}
}