OpenHEART: Opening Heterogeneous Articulated Objects with a Legged Manipulator

ICRA'26
1KAIST, Electrical Engineering  2KRAFTON  *Corresponding author.

Abstract

Legged manipulators offer high mobility and versatile manipulation. However, robust interaction with heterogeneous articulated objects, such as doors, drawers, and cabinets, remains challenging because of the diverse articulation types of the objects and the complex dynamics of the legged robot. Existing reinforcement learning-based approaches often rely on high-dimensional sensory inputs, leading to sample inefficiency. In this paper, we propose a robust and sample-efficient framework for opening heterogeneous articulated objects with a legged manipulator. In particular, we propose Sampling-based Abstracted Feature Extraction (SAFE), which encodes handle and panel geometry into a compact low-dimensional representation, improving cross-domain generalization. Additionally, the Articulation Information Estimator (ArtIEst) is introduced to adaptively mix proprioception with exteroception to estimate the opening direction and range of motion for each object. The proposed framework was deployed to manipulate a variety of heterogeneous articulated objects in simulation and on a real-world robot system.
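To make the SAFE idea concrete, the following is a minimal sketch of reducing sampled handle-surface points to a compact, fixed-size descriptor. The specific features (centroid, axis-aligned extents, elongation ratio) and the function name are illustrative assumptions, not the paper's actual feature design:

```python
import random

def safe_features(points):
    """Hypothetical SAFE-style abstraction: reduce sampled surface
    points of a handle/panel to a compact 7-D descriptor made of the
    centroid, axis-aligned extents, and an elongation ratio."""
    random.seed(0)  # deterministic subsampling for this sketch
    # Subsample a fixed number of points for a size-independent input.
    pts = random.sample(points, min(64, len(points)))
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    cz = sum(p[2] for p in pts) / n
    ex = max(p[0] for p in pts) - min(p[0] for p in pts)
    ey = max(p[1] for p in pts) - min(p[1] for p in pts)
    ez = max(p[2] for p in pts) - min(p[2] for p in pts)
    # Elongation hints at handle orientation (e.g., horizontal bar).
    elong = max(ex, ey, ez) / (min(ex, ey, ez) + 1e-6)
    return [cx, cy, cz, ex, ey, ez, elong]

# A horizontally elongated handle: 0.2 m long in x, thin in y and z.
handle = [(0.002 * i, 0.0, 0.0) for i in range(101)]
feat = safe_features(handle)
```

Such a low-dimensional summary is what allows the policy input to stay compact across objects with very different raw geometry.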

Heterogeneous Articulated Object Opening

Real-world Demonstration

Diverse Object Opening

Articulation Information Estimator (ArtIEst)

Articulation Information

The articulation information is defined as the joint direction of the object and the distance between the handle and the joint axis. It is required to estimate the grasping pose as well as the direction and range of opening motion.
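As a concrete illustration of why this information suffices, the sketch below computes handle waypoints for a revolute object (door or cabinet) from the hinge-axis location and the handle-to-axis distance. The planar, vertical-axis setup and the function name are simplifying assumptions for illustration:

```python
import math

def handle_waypoints(handle_pos, hinge_pos, radius, open_angle, steps=5):
    """Hypothetical sketch: for a revolute object, the handle moves on
    a circular arc of the given radius around the hinge axis.
    Planar case: axis is vertical, motion lies in the x-y plane."""
    # Initial angular position of the handle around the hinge axis.
    theta0 = math.atan2(handle_pos[1] - hinge_pos[1],
                        handle_pos[0] - hinge_pos[0])
    waypoints = []
    for k in range(steps + 1):
        theta = theta0 + open_angle * k / steps
        waypoints.append((hinge_pos[0] + radius * math.cos(theta),
                          hinge_pos[1] + radius * math.sin(theta)))
    return waypoints

# Door hinged at the origin, handle 0.6 m from the axis, opened by 90 deg.
wps = handle_waypoints((0.6, 0.0), (0.0, 0.0), 0.6, math.pi / 2)
```

For a prismatic object (drawer), the same two quantities degenerate to a straight-line motion along the joint direction, which is why the joint direction and handle-to-axis distance together determine both the grasp pose and the opening trajectory.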

Visual Ambiguity in Articulation Information

Although the exteroception-based estimator can infer articulation information before manipulation, estimation can be ambiguous when visual features suggest multiple candidates.

E.g., a cabinet with a horizontally elongated handle at the upper center may appear to open to the left, to the right, or downward.

To resolve such visual ambiguities, the proprioception-augmented estimator incorporates proprioceptive information with exteroceptive information during manipulation.

ArtIEst Framework

ArtIEst predicts articulation information by adaptively combining exteroception and proprioception. The estimator is split according to whether proprioception is included as an input: while the robot is still approaching the object, proprioception alone provides no meaningful cues about the opening direction, so the exteroception-based estimator relies solely on the object's appearance. Once contact is made, the proprioception-augmented estimator additionally incorporates proprioceptive feedback.
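The contact-gated mixing described above can be sketched as follows. The scalar gate standing in for a learned mixing weight, and the unit-direction representation of the estimate, are assumptions made for this illustration:

```python
def mixed_estimate(extero_dir, proprio_dir, in_contact, gate):
    """Hypothetical sketch of ArtIEst-style adaptive mixing: before
    contact only the exteroception-based estimate is usable; after
    contact, a gate (here a plain scalar in [0, 1] standing in for a
    learned weight) blends in the proprioception-augmented estimate."""
    if not in_contact:
        return list(extero_dir)  # proprioception carries no cue yet
    w = max(0.0, min(1.0, gate))
    mixed = [(1 - w) * e + w * p for e, p in zip(extero_dir, proprio_dir)]
    # Renormalize so the result remains a unit direction.
    norm = sum(c * c for c in mixed) ** 0.5
    return [c / norm for c in mixed]

# Visually ambiguous case: vision suggests "rightward" opening, but
# contact forces during manipulation indicate "upward".
est = mixed_estimate([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], True, 0.9)
```

With a high gate value after contact, the mixed estimate is dominated by the proprioception-augmented direction, which is how the visual ambiguity in the cabinet example gets resolved during manipulation.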

Estimation Demonstration

Estimation Error Transition

Estimation error transition of the exteroception-based, proprioception-augmented, and mixed estimations during manipulation. The progress of the opening motion is indicated by the object's open angle normalized by the joint limits. (a) From the object's appearance, it could plausibly open either rightward or upward, while the correct direction is upward, producing visual ambiguity in the exteroception-based estimation. (b) In contrast, no visual ambiguity exists in this case.

Auto-Retrying Behavior

Additionally, the robot exhibited an emergent auto-retrying behavior. When the initial grasp was unstable due to a misaligned gripper, the robot autonomously regrasped the handle and successfully opened the drawer. Such auto-retrying behaviors are crucial for adapting to unexpected changes in the environment and ensuring successful task completion.

License

This template is provided under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. You are free to use and modify the code in your project as long as you include a link to this GitHub repository in your footer.