Model Card for TRANSIC Policies

This model card accompanies the CoRL 2024 paper "TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction". It includes robot policies trained in simulation and transferred to the real world for complex, contact-rich manipulation tasks.

Model Details

Model Description

This model repository includes three parts: 1) teacher policies trained in simulation with reinforcement learning; 2) student policies distilled from successful trajectories generated by the teacher policies; and 3) residual policies learned in the real world to augment the simulation policies.

The first part can be found in the rl directory, which provides RL teacher policies for 8 different tasks. The second part can be found in the student directory, which provides 5 student policies corresponding to the 5 skills used in assembling the square table from FurnitureBench. The third part can be found in the residual directory. The residual policies augment those 5 simulation base policies (the student policies).
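As a rough illustration of the layout described above, the sketch below groups the files found under the three top-level directories of a locally downloaded copy of this repository. The helper name `group_checkpoints` is hypothetical, and the sketch deliberately makes no assumption about checkpoint file names, extensions, or subfolder structure, none of which are specified here.

```python
from pathlib import Path

# Directory names taken from the description above.
POLICY_DIRS = ("rl", "student", "residual")


def group_checkpoints(repo_root):
    """Return {directory name: sorted relative file paths} per policy type.

    Hypothetical helper: it only walks the folders and does not assume
    any particular checkpoint file name or extension.
    """
    root = Path(repo_root)
    grouped = {}
    for name in POLICY_DIRS:
        folder = root / name
        if folder.is_dir():
            # Collect every regular file below this directory.
            files = sorted(
                p.relative_to(root).as_posix()
                for p in folder.rglob("*")
                if p.is_file()
            )
        else:
            # A missing directory is reported as an empty list.
            files = []
        grouped[name] = files
    return grouped
```

Calling `group_checkpoints("path/to/local/copy")` returns a dictionary with the keys `rl`, `student`, and `residual`, which can then be inspected before loading any individual checkpoint.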

Developed by: Yunfan Jiang
Model type: PyTorch Checkpoints
License: MIT

Model Sources

Repositories: TRANSIC, TRANSIC-Envs
Paper: TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
Demo: Provided on our website

Uses & How to Get Started with the Model

Please see our codebase for detailed usage.

Training Details

Training Data

We provide training data in our 🤗Hugging Face data repository.

Training Procedure

Teacher policies are first trained from scratch in simulation with reinforcement learning.
We then roll out the teacher policies to generate successful trajectories.
The generated trajectories are used to train student policies through behavior cloning. Student policies take point-cloud and proprioceptive observations and output joint actions.
We then deploy the student policies on the real robot. A human operator monitors the execution, intervenes when necessary, and provides online correction through teleoperation. This teleoperation data is recorded.
We use the collected correction data to learn residual policies, which then augment the simulation policies for successful sim-to-real transfer.
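The last two steps can be sketched in a few lines. This is a minimal illustration, not the method as implemented in the codebase: it assumes that a correction can be expressed as an element-wise difference between the operator's action and the base policy's action, and that the residual is added back to the base action at deployment. The function names `residual_target` and `augment_action` are hypothetical, and the actual combination rule used by TRANSIC may differ.

```python
def residual_target(base_action, corrected_action):
    """Training target for a residual policy (assumed form).

    Assumption: the target is the element-wise difference between the
    human-corrected action and the base (student) policy's action.
    """
    if len(base_action) != len(corrected_action):
        raise ValueError("actions must have the same length")
    return [c - b for b, c in zip(base_action, corrected_action)]


def augment_action(base_action, residual_action, apply_residual=True):
    """Combine a base action with a predicted residual (assumed additive)."""
    if len(base_action) != len(residual_action):
        raise ValueError("actions must have the same length")
    if not apply_residual:
        # Fall back to the unmodified simulation policy output.
        return list(base_action)
    return [b + r for b, r in zip(base_action, residual_action)]


# Toy example with a 2-dimensional action.
base = [0.5, -0.25]
corrected = [0.75, -0.25]
target = residual_target(base, corrected)
restored = augment_action(base, target)
```

In this toy example, adding the residual target back to the base action recovers the operator's corrected action, which is the behavior a well-trained residual policy would approximate under these assumptions.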

Evaluation

Policies are evaluated both in simulation and in the real world, using task success rate as the metric.
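For reference, task success rate is the fraction of evaluation trials that end in success. The sketch below shows that computation; the function name `success_rate` and the representation of trial outcomes as booleans are assumptions made for illustration, not part of the released evaluation code.

```python
def success_rate(outcomes):
    """Fraction of trials that succeeded.

    `outcomes` is any iterable of truthy/falsy values, one per trial
    (an assumed representation of evaluation results).
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one trial")
    successes = sum(1 for o in outcomes if o)
    return successes / len(outcomes)


# Example: 3 successes out of 4 trials.
rate = success_rate([True, False, True, True])
```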

Citation

BibTeX:

@inproceedings{jiang2024transic,
  title     = {TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction},
  author    = {Yunfan Jiang and Chen Wang and Ruohan Zhang and Jiajun Wu and Li Fei-Fei},
  booktitle = {Conference on Robot Learning},
  year      = {2024}
}

Model Card Contact

Yunfan Jiang, email: yunfanj[at]cs[dot]stanford[dot]edu