
RecoverFormer: End-to-End Recovery Policy for Humanoid Robots Learns to Switch Among Multiple Stabilization Strategies

arXiv cs.RO (Robotics) · April 28, 2026

AI Summary

  • Researchers present RecoverFormer, a fully end-to-end humanoid recovery policy that learns when and how to switch among recovery behaviors (compensatory stepping, hand-environment contact, and center-of-mass reshaping) while maintaining robust performance under model mismatch.
  • The architecture combines a causal transformer over a 50-step observation history with a latent recovery mode (enabling smooth transitions among distinct recovery strategies) and a contact affordance head (predicting which environmental surfaces like walls, railings, and table edges are beneficial for stabilization).
  • Trained only on open floor and evaluated on a Unitree G1 humanoid in MuJoCo, RecoverFormer transfers zero-shot to walled environments, achieving 100% recovery success across 100–300 N pushes and wall distances of 0.25–1.4 m. Under zero-shot dynamics mismatch, it reaches 75.5% at +25% mass, 89% under 30 ms latency, 91.5% at low friction, and 99% under compound friction, latency, and mass perturbation.
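
To make the "causal transformer over a 50-step observation history" concrete, here is a minimal Python sketch of the two pieces of bookkeeping that phrase implies: a rolling buffer holding the last 50 observations, and a causal mask that lets each step attend only to itself and earlier steps. This is an illustration under stated assumptions, not the authors' implementation. The names `ObservationHistory` and `causal_mask`, the zero-padding at start-up, and the 3-dimensional toy observation are all hypothetical; the summary does not describe the observation layout or how the history is initialized.

```python
from collections import deque

HISTORY_LEN = 50  # from the summary: 50-step observation history


class ObservationHistory:
    """Fixed-length rolling buffer of the most recent observations (hypothetical helper)."""

    def __init__(self, obs_dim, length=HISTORY_LEN):
        self.obs_dim = obs_dim
        self.length = length
        # Assumption: pad with zeros until real observations arrive.
        self.buffer = deque(
            ([0.0] * obs_dim for _ in range(length)), maxlen=length
        )

    def push(self, obs):
        if len(obs) != self.obs_dim:
            raise ValueError("unexpected observation size")
        # With maxlen set, appending drops the oldest entry automatically.
        self.buffer.append(list(obs))

    def as_sequence(self):
        # Oldest observation first, newest last.
        return list(self.buffer)


def causal_mask(length):
    """mask[i][j] is True when step i may attend to step j, i.e. j <= i."""
    return [[j <= i for j in range(length)] for i in range(length)]


# Toy usage: 60 control steps with a made-up 3-dimensional observation.
history = ObservationHistory(obs_dim=3)
for t in range(60):
    history.push([float(t)] * 3)

seq = history.as_sequence()      # the 50 most recent observations
mask = causal_mask(HISTORY_LEN)  # 50 x 50 lower-triangular mask
```

In a full policy, `seq` would be embedded and passed through the transformer with `mask` applied to the attention scores, and the output at the newest step would feed the action output, the latent recovery mode, and the contact affordance head; those parts are omitted here because the summary gives no detail on them.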
