Reference-Grounded Skill Discovery
ICLR 2026


Seungeun Rho, Aaron Trinh, Danfei Xu, Sehoon Ha
School of Interactive Computing
Georgia Institute of Technology


Motivation

RGSD vs METRA

Existing unsupervised skill discovery methods struggle to scale to high-DoF agents, and the skills they learn are often unstructured and therefore lack semantic meaning. We propose Reference-Grounded Skill Discovery (RGSD), which leverages a reference motion dataset to ground the online skill discovery process on a semantically meaningful manifold. Through this reference grounding, RGSD discovers variations of existing motions for a 69-DoF humanoid character.

How it works

  • Using contrastive learning, we train an encoder q(𝑧|s) that maps each motion to a directional vector on the unit hypersphere.
  • We then use this pre-trained encoder as the discriminator within DIAYN.
  • When the policy is conditioned on the embedding 𝑧 of a reference motion, the DIAYN reward acts as an imitation reward.
  • When the policy is conditioned on any other 𝑧, the same reward encourages the discovery of novel yet related behaviors (see the sketch below).
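A minimal PyTorch sketch of this reward. The function name, encoder interface, and tensor shapes are illustrative assumptions, not the authors' code; on the unit hypersphere, cosine similarity plays the role of log q(𝑧|s) in the DIAYN reward.

```python
import torch
import torch.nn.functional as F

def skill_reward(encoder, state, z):
    """DIAYN-style reward computed with the frozen, pre-trained encoder.

    state: batch of states, shape (B, state_dim)
    z:     skill latents the policy was conditioned on, shape (B, D), unit norm
    """
    with torch.no_grad():                        # the encoder is not updated
        e = F.normalize(encoder(state), dim=-1)  # embed state onto the sphere
    # Cosine similarity is maximal when the rollout embeds exactly at z
    # (imitation) and otherwise rewards behaviors the encoder can still
    # attribute to z (discovery of related variations).
    return (e * z).sum(dim=-1)
```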

Video

1) Learned Skills from RGSD


RGSD can control the degree of diversity at test time by varying the sampling distribution of the latent variable 𝑧.
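One simple scheme that realizes this (an assumption for illustration, not necessarily the paper's exact sampler): perturb a reference motion's embedding with Gaussian noise and project back onto the sphere, so the noise scale controls diversity.

```python
import torch
import torch.nn.functional as F

def sample_skill(z_ref, noise_scale):
    """Sample a test-time latent near a reference motion's embedding.

    noise_scale = 0.0 reproduces the reference motion's embedding exactly;
    larger values broaden the distribution, yielding more diverse behaviors.
    """
    noise = noise_scale * torch.randn_like(z_ref)
    return F.normalize(z_ref + noise, dim=-1)  # back onto the unit sphere
```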


2) When trained with unsegmented reference motions


RGSD is applicable to unsegmented reference motions
(a single motion containing multiple skills)


3) Unconstrained motion generation


Latent vectors are sampled uniformly from the unit hypersphere.
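The standard way to sample uniformly from the hypersphere is to normalize an isotropic Gaussian, whose rotation invariance makes the result uniform on the sphere (`num_samples` and `latent_dim` below are placeholders):

```python
import torch
import torch.nn.functional as F

num_samples, latent_dim = 16, 64  # placeholder sizes

# An isotropic Gaussian is rotationally invariant, so normalizing its
# samples distributes them uniformly on the unit hypersphere.
z = F.normalize(torch.randn(num_samples, latent_dim), dim=-1)
```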


4) Downstream task evaluation with learned skills


Citation
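The following BibTeX entry is assembled from the information on this page (the citation key is a placeholder, not an official entry):

```bibtex
@inproceedings{rho2026reference,
  title     = {Reference-Grounded Skill Discovery},
  author    = {Rho, Seungeun and Trinh, Aaron and Xu, Danfei and Ha, Sehoon},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2026}
}
```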

Acknowledgements

We thank Jeonghwan Kim for helpful discussions.