Designing reward functions for agile robotic maneuvers in reinforcement learning remains difficult, and demonstration-based approaches often require reference motions that are unavailable for novel platforms or extreme stunts.
We present LineRides, a line-guided learning framework that enables a custom bicycle robot to acquire diverse, commandable stunt behaviors from a user-provided spatial guideline and sparse key-orientations, without demonstrations or explicit timing.
LineRides handles physically infeasible guidelines using a tracking margin that permits controlled deviation, resolves temporal ambiguity by measuring progress via traveled distance along the guideline, and disambiguates motion details through position- and sequence-based key-orientations.
We evaluate LineRides on the Ultra Mobility Vehicle (UMV) and show that the policy trained with our methods supports seamless transitions between normal driving and stunt execution, enabling five distinct stunts on command: MiniHop, LargeHop, ThreePointTurn, Backflip, and DriftTurn.
MiniHop: A small hop reaching approximately 32 cm in height and covering about 50 cm in distance. Demonstrated on real hardware.
LargeHop: A larger hop reaching about 56 cm in height and covering about 80 cm in distance. Demonstrated on real hardware.
ThreePointTurn: A planar two-step pivot turn that requires precise balance adjustments. Demonstrated on real hardware.
Backflip: A full backward flip executed while moving forward. The guideline is generated via trajectory optimization with a simplified two-mass model.
DriftTurn: A planar left drift turn of roughly 90 degrees involving controlled rear-wheel slip.
To demonstrate the generality of our framework, we applied LineRides to a quadruped platform and trained five diverse stunt skills: MultiHop, TurnLeft, Circle, WallJump, and Backflip. The quadruped acquires all five skills using the same guideline-based specification, suggesting that the framework is not limited to bicycle robots.
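LineRides enables robots to learn stunt behaviors from a simple user-provided line (the spatial guideline) and a sequence of key-orientations. The framework addresses three key challenges: physically infeasible guidelines, handled with a tracking margin that permits controlled deviation; temporal ambiguity, resolved by measuring progress as traveled distance along the guideline; and underspecified motion details, disambiguated through position- and sequence-based key-orientations.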
The framework trains a single end-to-end policy that supports both driving mode and stunt mode, enabling seamless transitions between normal operation and stunt execution.
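As a rough illustration of the distance-based progress measure, tracking margin, and sequence-based key-orientations described above, the sketch below projects a robot position onto a piecewise-linear guideline. It is not the authors' implementation: the function names (`project_to_guideline`, `margin_penalty`, `next_key_index`), the polyline representation, and the linear penalty outside the margin are all assumptions made for illustration.

```python
import math

def project_to_guideline(points, p):
    """Find the closest point on a polyline guideline to position p.

    Returns (s, d): the traveled distance along the line up to the
    closest point, and the deviation of p from the line.
    Assumes consecutive guideline points are distinct.
    """
    best_s, best_d = 0.0, math.inf
    s_start = 0.0  # arc length at the start of the current segment
    for a, b in zip(points[:-1], points[1:]):
        ab = [bi - ai for ai, bi in zip(a, b)]
        ap = [pi - ai for ai, pi in zip(a, p)]
        seg_len = math.sqrt(sum(c * c for c in ab))
        # Parameter of the closest point on this segment, clamped to [0, 1].
        t = sum(x * y for x, y in zip(ap, ab)) / (seg_len * seg_len)
        t = min(1.0, max(0.0, t))
        closest = [ai + t * c for ai, c in zip(a, ab)]
        d = math.dist(closest, p)
        if d < best_d:
            best_s, best_d = s_start + t * seg_len, d
        s_start += seg_len
    return best_s, best_d

def margin_penalty(deviation, margin):
    """Zero inside the tracking margin, linear outside it (assumed shape)."""
    return max(0.0, deviation - margin)

def next_key_index(s, key_distances, current):
    """Step through key-orientations in order as progress s passes each one.

    key_distances holds the (hypothetical) distance along the guideline
    at which each key-orientation applies, in increasing order.
    """
    while current < len(key_distances) and s >= key_distances[current]:
        current += 1
    return current

# Usage: an L-shaped guideline and a position slightly off the second leg.
line = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
s, d = project_to_guideline(line, (1.3, 0.5))
print(s, d, margin_penalty(d, margin=0.1))
```

Because progress is measured as distance along the line rather than elapsed time, the same guideline can in principle be followed at any speed, and deviations smaller than the margin incur no penalty in this sketch.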