RecoverFormer: End-to-End Recovery Policy for Humanoid Robots Learns to Switch Among Multiple Stabilization Strategies

arXiv cs.RO (Robotics) · Apr 28, 2026 · 1 min read

3 Key Points

Researchers present RecoverFormer, a fully end-to-end humanoid recovery policy that learns when and how to switch among recovery behaviors—including compensatory stepping, hand-environment contact, and center-of-mass reshaping—while maintaining robust performance under model mismatch.
The architecture combines a causal transformer over a 50-step observation history with a latent recovery mode (enabling smooth transitions among distinct recovery strategies) and a contact affordance head (predicting which environmental surfaces like walls, railings, and table edges are beneficial for stabilization).
When trained only on open floor and tested on a Unitree G1 humanoid in MuJoCo, RecoverFormer transfers zero-shot to walled environments, achieving 100% recovery success across 100–300 N pushes and wall distances of 0.25–1.4 m. Under zero-shot dynamics mismatch, it reaches 75.5% at +25% mass, 89% under 30 ms latency, 91.5% at low friction, and 99% under compound friction, latency, and mass perturbation.
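
To make the architecture in the second point more concrete, here is a minimal sketch in plain Python of how a causal pass over a 50-step observation history could feed a latent recovery mode and a contact affordance head. It is not the authors' implementation. The observation size, the number of modes, the single attention head, the random untrained weights, and the names `policy_heads`, `W_mode`, and `W_contact` are all assumptions made for illustration, and the action output of the real policy is left out.

```python
import math
import random
from collections import deque

HISTORY_LEN = 50   # from the summary: 50-step observation history
OBS_DIM = 8        # assumption: toy observation size
NUM_MODES = 3      # assumption: one mode each for stepping, hand contact, CoM reshaping
SURFACES = ("wall", "railing", "table_edge")  # surface types named in the summary


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def causal_attention(history):
    """Single-head dot-product attention where step t only attends to steps 0..t."""
    d = len(history[0])
    out = []
    for t, query in enumerate(history):
        # Scores are computed only for past and current steps (the causal mask).
        scores = [
            sum(q * k for q, k in zip(query, history[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * history[s][i] for s, w in enumerate(weights)) for i in range(d)]
        )
    return out


def linear(vec, weights):
    """Matrix-vector product with weights stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def make_weights(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
W_mode = make_weights(rng, NUM_MODES, OBS_DIM)          # hypothetical mode head
W_contact = make_weights(rng, len(SURFACES), OBS_DIM)   # hypothetical affordance head


def policy_heads(history):
    """Return (soft recovery-mode weights, per-surface affordance scores)."""
    feature = causal_attention(list(history))[-1]  # feature at the latest step
    mode_probs = softmax(linear(feature, W_mode))
    affordance = {
        name: sigmoid(z) for name, z in zip(SURFACES, linear(feature, W_contact))
    }
    return mode_probs, affordance


# Rolling buffer: once full, each new observation evicts the oldest one.
buffer = deque(maxlen=HISTORY_LEN)
for _ in range(60):
    buffer.append([rng.gauss(0.0, 1.0) for _ in range(OBS_DIM)])

mode_probs, affordance = policy_heads(buffer)
print("mode weights:", [round(p, 3) for p in mode_probs])
print("affordance:", {k: round(v, 3) for k, v in affordance.items()})
```

The sketch uses a soft mixture over modes because a continuous weighting is one plausible way to get smooth transitions among strategies; the summary does not say how the paper parameterizes the latent mode.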
