LeRobot for SO-101 users
A friendly tour of LeRobot 0.4.4 → 0.5.0 for hobbyists who use the CLI but haven't dug into the code.
Pick a path below. Each card takes you to one focused page — no wall of source-code references, no assumed Python deep-dive.
If you train ACT or Diffusion policies on your SO-101 today, your code didn't change between 0.4.4 and 0.5.0 — it's byte-for-byte the same. The upgrade asks you to rebuild your Python environment for changes that won't help most hobbyists. Stay on 0.4.4 unless you specifically want the π0 / π0.5 refactor.
The 5 pieces you actually use, mapped in plain language. Start with the big picture before any code.
How LeRobot is organized →

What does lerobot-train actually do? A walkthrough of the training loop you've been kicking off, with the knobs that actually matter for SO-101.
Training a policy →

What changed between 0.4.4 and 0.5.0, who should care, and the one silent gotcha to watch for.
Should I upgrade? →

What "RL on top of imitation learning" means in LeRobot today, and what would need to be built to get there.
RL on top →

At a glance
| | |
| --- | --- |
| Versions compared | 0.4.4 (Feb 27, 2026) → 0.5.0 (Mar 9, 2026), 26 PRs apart |
| Policies you can train natively | 14 (ACT, Diffusion, SmolVLA, π0, π0.5, VQ-BeT, TDMPC, SAC, …) |
| Hardware supported | SO-100, SO-101, Koch, ALOHA, Reachy 2, Unitree G1, LeKiwi, OpenArm |
| Verdict for ACT / Diffusion users | Stay on 0.4.4. Your training code is unchanged. |
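If you decide to stay put, it's worth confirming which version your environment actually has before recording or training. A minimal stdlib-only check, assuming the package is installed under the distribution name `lerobot`:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_lerobot_version():
    """Return the installed lerobot version string, or None if absent."""
    try:
        return version("lerobot")
    except PackageNotFoundError:
        return None

print(installed_lerobot_version())
```

This reads the installed package metadata, so it reflects the environment you are running in (not whatever a stale shell alias might point at).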
Topic guide
Eight pages, one question each. Read in order or jump to what you need.
- The 5 pieces you actually use — robots, teleop, datasets, policies, scripts. Open →
- Leader-follower teleop, joint mirroring vs IK, and how little force feedback ships today. Open →
- What lerobot-train actually does, plus the knobs worth tuning for SO-101. Open →
- Episodes, parquet, MP4s, and the index card that ties them together. Open →
- How real-time inference works on your robot — including the GPU-on-the-side setup. Open →
- What's actually shipping for reinforcement learning, and the π0.6 question. Open →
- GR00T, V-JEPA, UMI, world models — what's already supported and what's still external. Open →
- 8 files added, 2 removed, 96 modified — what it means for your stack. Open →

Glossary
Terms you'll see across the site.
- Policy
- The trained brain that maps what the robot sees (camera frames + joint positions) to what it should do next (motor commands).
- Imitation learning (IL)
- You record human demonstrations, then train a policy to copy them. ACT, Diffusion, SmolVLA all work this way.
- Reinforcement learning (RL)
- The robot tries actions, gets a reward signal, and learns from trial and error. SAC and TDMPC are LeRobot's RL options.
- HIL-SERL
- "Human-in-the-loop SERL" — a human supervises and corrects the robot while it learns by RL. Shipped, but starts from scratch (no IL warm-start).
- Action chunk
- A short sequence of future motor commands the policy predicts at once (e.g. the next 50 steps), instead of one step at a time. Smoother and faster.
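The payoff is fewer inference calls per second of motion. A toy sketch (the chunk size and the dummy policy are made up for illustration, not LeRobot's API):

```python
def dummy_policy(observation, chunk_size=50):
    """Stand-in for a trained policy: returns a chunk of future
    motor commands instead of a single command."""
    # Here a "command" is just the observation nudged forward a bit.
    return [observation + 0.01 * step for step in range(chunk_size)]

def run(steps=120, chunk_size=50):
    observation = 0.0
    executed = []
    while len(executed) < steps:
        chunk = dummy_policy(observation, chunk_size)   # one inference call
        for command in chunk[: steps - len(executed)]:
            executed.append(command)                    # robot plays the chunk back
        observation = executed[-1]                      # then re-observe and re-plan
    return executed

commands = run()
print(len(commands))   # 120 commands from only 3 policy calls
```

With one-step prediction the same run would have cost 120 inference calls; chunking trades a little reactivity for a lot of throughput.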
- ACT
- Action Chunking Transformer. A small transformer policy that predicts action chunks. The bread-and-butter SO-101 baseline.
- Diffusion Policy
- A policy that generates action chunks by iteratively denoising random noise, the way image diffusion models generate pictures.
- SmolVLA, π0, π0.5, GR00T
- Vision-language-action models — bigger policies that take a text instruction plus camera frames and output actions. Mostly fine-tuned from Hub checkpoints.
- LeRobotDataset
- LeRobot's standard recording format: a folder of parquet tables (joint positions, actions) plus MP4 videos, with an index card describing the schema.
- Episode
- One continuous demonstration — from "start recording" to "stop recording." A dataset is a stack of episodes.
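The "index card" idea boils down to bookkeeping: frames from all episodes are stacked, and an index records where each episode starts and ends. A toy version (field names and episode lengths here are invented for illustration, not the real LeRobotDataset schema):

```python
# Each episode is one continuous demonstration; the dataset stacks
# their frames and keeps an index of each episode's frame range.
episode_lengths = [250, 310, 198]   # frames per recorded episode (made up)

episode_index = []
start = 0
for ep, length in enumerate(episode_lengths):
    episode_index.append({"episode": ep, "from": start, "to": start + length})
    start += length

def episode_of(frame):
    """Find which episode a global frame index belongs to."""
    for entry in episode_index:
        if entry["from"] <= frame < entry["to"]:
            return entry["episode"]
    raise IndexError(frame)

print(episode_of(0))     # 0
print(episode_of(400))   # 1, since 250 <= 400 < 560
```

This is why trimming or reordering episodes means rewriting the index, not just deleting files.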
- Hub
- The Hugging Face Hub. Where pretrained policies and public datasets live. lerobot-train can pull either by name.
- Teleoperator / leader arm
- The smaller arm a human moves by hand. The larger "follower" arm copies it. Demonstrations are recorded by teleoperating.
- Async inference
- Two programs talking over the network: one on the robot, one on a GPU box. Camera frames go one way, motor commands the other.
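The message flow can be sketched with two in-process queues standing in for the network link (everything below is a made-up simulation, not LeRobot's async-inference API):

```python
import queue

# In the real setup these are separate processes on separate machines;
# here two queues play the role of the network.
obs_to_server = queue.Queue()     # camera frames / joint states go this way
actions_to_robot = queue.Queue()  # motor commands come back the other way

def server_step():
    """GPU box: read one observation, reply with a chunk of actions."""
    observation = obs_to_server.get()
    actions_to_robot.put([observation + i for i in range(3)])  # fake policy

def robot_step(observation):
    """Robot side: ship the observation, wait for commands, execute them."""
    obs_to_server.put(observation)
    server_step()                  # in reality this runs remotely, concurrently
    return actions_to_robot.get()

print(robot_step(10))   # [10, 11, 12]
```

The robot-side loop only ever handles small observations and action chunks, which is what lets a weak onboard computer drive a policy that needs a big GPU.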
- RTC
- Real-Time Chunking. An inference trick that smooths the seam between consecutive action chunks for flow-matching policies (π0, π0.5, SmolVLA).
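LeRobot's RTC implementation works differently (it is tied to how flow-matching policies generate chunks); the snippet below is only a generic linear crossfade between two overlapping chunks, a made-up stand-in to show what "smoothing the seam" means:

```python
def crossfade(prev_tail, next_head):
    """Blend the end of the old chunk into the start of the new one.
    NOT the real RTC algorithm -- just the simplest possible seam
    smoother, for intuition."""
    n = len(prev_tail)
    assert len(next_head) == n
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)   # weight shifts from the old chunk to the new
        blended.append((1 - w) * prev_tail[i] + w * next_head[i])
    return blended

# Old chunk ended commanding 1.0, new chunk starts at 0.0:
# the blend steps down gradually instead of jumping.
print(crossfade([1.0, 1.0], [0.0, 0.0]))
```

Without some smoothing, the jump between the last command of one chunk and the first command of the next shows up as a visible jerk on the arm.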
- ONNX
- A portable format for trained neural networks. LeRobot uses an ONNX runtime for the Unitree G1 whole-body controller.