Start here

LeRobot for SO-101 users

A friendly tour of LeRobot 0.4.4 → 0.5.0 for hobbyists who use the CLI but haven't dug into the code.

Pick a path below. Each card takes you to one focused page — no wall of source-code references, no Python deep-dive assumed.

If you train ACT or Diffusion policies on your SO-101 today, nothing changed for you between 0.4.4 and 0.5.0: that code is byte-for-byte identical across the two releases. The upgrade asks you to rebuild your Python environment for changes that won't help most hobbyists. Stay on 0.4.4 unless you specifically want the π0 / π0.5 refactor.

New here
I'm new — what is LeRobot?

The 5 pieces you actually use, mapped in plain language. Start with the big picture before any code.

How LeRobot is organized →
Just use the CLI
What does lerobot-train actually do?

A walkthrough of the training loop you've been kicking off, with the knobs that actually matter for SO-101.

Training a policy →
Upgrade decision
Should I upgrade from 0.4.4?

What changed between 0.4.4 and 0.5.0, who should care, and the one silent gotcha to watch for.

Should I upgrade? →
Advanced
Can I add RL to my trained policy?

What "RL on top of imitation learning" means in LeRobot today, and what would need to be built to get there.

RL on top →

At a glance

Versions compared: 0.4.4 (Feb 27, 2026) → 0.5.0 (Mar 9, 2026), 26 PRs apart
Policies you can train natively: 14 (ACT, Diffusion, SmolVLA, π0, π0.5, VQ-BeT, TDMPC, SAC, …)
Hardware supported: SO-100, SO-101, Koch, ALOHA, Reachy 2, Unitree G1, LeKiwi, OpenArm
Verdict for ACT / Diffusion users: Stay on 0.4.4. Your training code is unchanged.

Topic guide

Eight pages, one question each. Read in order or jump to what you need.

Glossary

Terms you'll see across the site.

Policy
The trained brain that maps what the robot sees (camera frames + joint positions) to what it should do next (motor commands).
Imitation learning (IL)
You record human demonstrations, then train a policy to copy them. ACT, Diffusion, and SmolVLA all work this way.
Reinforcement learning (RL)
The robot tries actions, gets a reward signal, and learns from trial and error. SAC and TDMPC are LeRobot's RL options.
HIL-SERL
"Human-in-the-loop SERL" — a human supervises and corrects the robot while it learns by RL. Shipped, but starts from scratch (no IL warm-start).
Action chunk
A short sequence of future motor commands the policy predicts at once (e.g. the next 50 steps), instead of one step at a time. Smoother and faster. See the first sketch after this glossary.
ACT
Action Chunking Transformer. A small transformer policy that predicts action chunks. The bread-and-butter SO-101 baseline.
Diffusion Policy
A policy that generates action chunks by iteratively denoising random noise, the way image diffusion models generate pictures. See the second sketch after this glossary.
SmolVLA, π0, π0.5, GR00T
Vision-language-action models — bigger policies that take a text instruction plus camera frames and output actions. Mostly fine-tuned from Hub checkpoints.
LeRobotDataset
LeRobot's standard recording format: a folder of parquet tables (joint positions, actions) plus MP4 videos, with metadata files that serve as an index card for the schema. See the third sketch after this glossary.
Episode
One continuous demonstration — from "start recording" to "stop recording." A dataset is a stack of episodes.
Hub
The Hugging Face Hub. Where pretrained policies and public datasets live. lerobot-train can pull either by name.
Teleoperator / leader arm
The arm a human moves by hand. The "follower" arm copies its motion. Demonstrations are recorded by teleoperating.
Async inference
Two programs talking over the network: one on the robot, one on a GPU box. Camera frames go one way, motor commands the other. See the fourth sketch after this glossary.
RTC
Real-Time Chunking. An inference trick that smooths the seam between consecutive action chunks for flow-matching policies (π0, π0.5, SmolVLA).
ONNX
A portable format for trained neural networks. LeRobot uses an ONNX runtime for the Unitree G1 whole-body controller.
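
Four short Python sketches to make the glossary concrete. All four are illustrative stand-ins written for this page, not code from LeRobot itself; the names, keys, and shapes are assumptions that vary by robot and dataset.

First, what a policy and an action chunk look like, shape-wise. The observation keys and the stand-in function are made up, but the in/out shapes are the point:

    import torch

    CHUNK_SIZE = 50  # motor commands predicted per forward pass
    N_MOTORS = 6     # SO-101: five joints plus the gripper

    def stand_in_policy(obs: dict[str, torch.Tensor]) -> torch.Tensor:
        # A trained policy maps an observation to an action chunk.
        # This stand-in just returns zeros of the right shape.
        batch = obs["observation.state"].shape[0]
        return torch.zeros(batch, CHUNK_SIZE, N_MOTORS)

    obs = {
        "observation.images.front": torch.zeros(1, 3, 480, 640),  # one camera frame
        "observation.state": torch.zeros(1, N_MOTORS),            # current joint positions
    }

    chunk = stand_in_policy(obs)
    print(chunk.shape)  # torch.Size([1, 50, 6]): 50 future steps x 6 motors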
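
Second, the idea behind Diffusion Policy inference: start from noise shaped like an action chunk and repeatedly denoise it. The denoiser below is a zero-returning stand-in and the update rule is deliberately oversimplified; the real method uses a trained network and a proper noise schedule:

    import torch

    def stand_in_denoiser(chunk: torch.Tensor, step: int) -> torch.Tensor:
        # In a real Diffusion Policy, a trained network predicts
        # the noise still present in the chunk at this step.
        return torch.zeros_like(chunk)

    chunk = torch.randn(1, 50, 6)        # pure noise, shaped like an action chunk
    for step in reversed(range(10)):     # a handful of denoising iterations
        noise = stand_in_denoiser(chunk, step)
        chunk = chunk - 0.1 * noise      # toy update, not the real schedule
    # chunk is now the action sequence the robot would execute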
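
Third, loading a LeRobotDataset from the Hub by name — the same pull-by-name mechanism lerobot-train uses. This sketch is written against a recent release; the import path and attribute names have moved between versions, and the repo id is just an example:

    from lerobot.datasets.lerobot_dataset import LeRobotDataset

    # Pull a public dataset by name (works the same with your own repo id).
    ds = LeRobotDataset("lerobot/svla_so101_pickplace")

    print(ds.num_episodes, "episodes,", len(ds), "frames")

    frame = ds[0]                 # one timestep, returned as a dict of tensors
    print(sorted(frame.keys()))   # e.g. action, observation.state, camera streams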
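
Fourth, the shape of async inference. A real deployment puts these two loops on different machines with a network link between them; here two threads and two queues stand in for the network, purely to show the division of labor:

    import queue
    import threading

    obs_q: queue.Queue = queue.Queue()     # robot -> GPU box (frames, joint positions)
    action_q: queue.Queue = queue.Queue()  # GPU box -> robot (motor commands)

    def gpu_server():
        # Stand-in for the policy server on the GPU box.
        while (obs := obs_q.get()) is not None:
            action_q.put([0.0] * 6)        # "run the policy", send commands back

    threading.Thread(target=gpu_server, daemon=True).start()

    for _ in range(3):                     # stand-in for the robot-side loop
        obs_q.put({"state": [0.0] * 6})    # send the latest observation
        action = action_q.get()            # receive the next motor commands
    obs_q.put(None)                        # shut the server down
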
Where to go next →

New to LeRobot? Start with How LeRobot is organized — the 5 pieces you actually use.

Just here for the upgrade decision? Jump to Should I upgrade from 0.4.4?