LeRobot for SO-101 users
A friendly tour of LeRobot 0.4.4 → 0.5.0 for hobbyists who use the CLI but haven't dug into the code.
Pick a path below. Each card takes you to one focused page — no wall of source-code references, no assumed Python deep-dive.
If you train ACT or Diffusion policies on your SO-101 today, your code didn't change between 0.4.4 and 0.5.0 — it's byte-for-byte the same. The upgrade asks you to rebuild your Python environment for changes that won't help most hobbyists. Stay on 0.4.4 unless you specifically want the π0 / π0.5 refactor.
The 5 pieces you actually use, mapped in plain language. Start with the big picture before any code.
How LeRobot is organized →

What does lerobot-train actually do? A walkthrough of the training loop you've been kicking off, with the knobs that actually matter for SO-101.
Training a policy →

What changed between 0.4.4 and 0.5.0, who should care, and the one silent gotcha to watch for.
Should I upgrade? →

What "RL on top of imitation learning" means in LeRobot today, and what would need to be built to get there.
RL on top →

At a glance
| | |
| --- | --- |
| Versions compared | 0.4.4 (Feb 27, 2026) → 0.5.0 (Mar 9, 2026), 26 PRs apart |
| Policies you can train natively | 14 (ACT, Diffusion, SmolVLA, π0, π0.5, VQ-BeT, TDMPC, SAC, …) |
| Hardware supported | SO-100, SO-101, Koch, ALOHA, Reachy 2, Unitree G1, LeKiwi, OpenArm |
| Verdict for ACT / Diffusion users | Stay on 0.4.4. Your training code is unchanged. |
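If you decide to stay put, it's worth confirming which version your environment actually has before recording or training. A minimal stdlib-only check, assuming the package is installed under the distribution name `lerobot`:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_lerobot_version():
    """Return the installed lerobot version string, or None if absent."""
    try:
        return version("lerobot")
    except PackageNotFoundError:
        return None

print(installed_lerobot_version())
```

This reads the installed package metadata, so it reflects the environment you are running in (not whatever a stale shell alias might point at).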
Topic guide
Eight pages, one question each. Read in order or jump to what you need.
- The 5 pieces you actually use — robots, teleop, datasets, policies, scripts. Open →
- Leader-follower teleop, joint mirroring vs IK, and how little force feedback ships today. Open →
- What lerobot-train actually does, plus the knobs worth tuning for SO-101. Open →
- Episodes, parquet, MP4s, and the index card that ties them together. Open →
- How real-time inference works on your robot — including the GPU-on-the-side setup. Open →
- What's actually shipping for reinforcement learning, and the π0.6 question. Open →
- GR00T, V-JEPA, UMI, world models — what's already supported and what's still external. Open →
- 8 files added, 2 removed, 96 modified — what it means for your stack. Open →

Glossary
Terms you'll see across the site.
- Policy
- The trained brain that maps what the robot sees (camera frames + joint positions) to what it should do next (motor commands).
- Imitation learning (IL)
- You record human demonstrations, then train a policy to copy them. ACT, Diffusion, SmolVLA all work this way.
- Reinforcement learning (RL)
- The robot tries actions, gets a reward signal, and learns from trial and error. SAC and TDMPC are LeRobot's RL options.
- HIL-SERL
- "Human-in-the-loop SERL" — a human supervises and corrects the robot while it learns by RL. Shipped, but starts from scratch (no IL warm-start).
- Action chunk
- A short sequence of future motor commands the policy predicts at once (e.g. the next 50 steps), instead of one step at a time. Smoother and faster.
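The payoff is fewer inference calls per second of motion. A toy sketch (the chunk size and the dummy policy are made up for illustration, not LeRobot's API):

```python
def dummy_policy(observation, chunk_size=50):
    """Stand-in for a trained policy: returns a chunk of future
    motor commands instead of a single command."""
    # Here a "command" is just the observation nudged forward a bit.
    return [observation + 0.01 * step for step in range(chunk_size)]

def run(steps=120, chunk_size=50):
    observation = 0.0
    executed = []
    while len(executed) < steps:
        chunk = dummy_policy(observation, chunk_size)   # one inference call
        for command in chunk[: steps - len(executed)]:
            executed.append(command)                    # robot plays the chunk back
        observation = executed[-1]                      # then re-observe and re-plan
    return executed

commands = run()
print(len(commands))   # 120 commands from only 3 policy calls
```

With one-step prediction the same run would have cost 120 inference calls; chunking trades a little reactivity for a lot of throughput.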
- ACT
- Action Chunking Transformer. A small transformer policy that predicts action chunks. The bread-and-butter SO-101 baseline.
- Diffusion Policy
- A policy that generates action chunks by iteratively denoising random noise, the way image diffusion models generate pictures.
- SmolVLA, π0, π0.5, GR00T
- Vision-language-action models — bigger policies that take a text instruction plus camera frames and output actions. Mostly fine-tuned from Hub checkpoints.
- LeRobotDataset
- LeRobot's standard recording format: a folder of parquet tables (joint positions, actions) plus MP4 videos, with an index card describing the schema.
- Episode
- One continuous demonstration — from "start recording" to "stop recording." A dataset is a stack of episodes.
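The "index card" idea boils down to bookkeeping: frames from all episodes are stacked, and an index records where each episode starts and ends. A toy version (field names and episode lengths here are invented for illustration, not the real LeRobotDataset schema):

```python
# Each episode is one continuous demonstration; the dataset stacks
# their frames and keeps an index of each episode's frame range.
episode_lengths = [250, 310, 198]   # frames per recorded episode (made up)

episode_index = []
start = 0
for ep, length in enumerate(episode_lengths):
    episode_index.append({"episode": ep, "from": start, "to": start + length})
    start += length

def episode_of(frame):
    """Find which episode a global frame index belongs to."""
    for entry in episode_index:
        if entry["from"] <= frame < entry["to"]:
            return entry["episode"]
    raise IndexError(frame)

print(episode_of(0))     # 0
print(episode_of(400))   # 1, since 250 <= 400 < 560
```

This is why trimming or reordering episodes means rewriting the index, not just deleting files.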
- Hub
- The Hugging Face Hub. Where pretrained policies and public datasets live. lerobot-train can pull either by name.
- Teleoperator / leader arm
- The smaller arm a human moves by hand. The larger "follower" arm copies it. Demonstrations are recorded by teleoperating.
- Async inference
- Two programs talking over the network: one on the robot, one on a GPU box. Camera frames go one way, motor commands the other.
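The message flow can be sketched with two in-process queues standing in for the network link (everything below is a made-up simulation, not LeRobot's async-inference API):

```python
import queue

# In the real setup these are separate processes on separate machines;
# here two queues play the role of the network.
obs_to_server = queue.Queue()     # camera frames / joint states go this way
actions_to_robot = queue.Queue()  # motor commands come back the other way

def server_step():
    """GPU box: read one observation, reply with a chunk of actions."""
    observation = obs_to_server.get()
    actions_to_robot.put([observation + i for i in range(3)])  # fake policy

def robot_step(observation):
    """Robot side: ship the observation, wait for commands, execute them."""
    obs_to_server.put(observation)
    server_step()                  # in reality this runs remotely, concurrently
    return actions_to_robot.get()

print(robot_step(10))   # [10, 11, 12]
```

The robot-side loop only ever handles small observations and action chunks, which is what lets a weak onboard computer drive a policy that needs a big GPU.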
- RTC
- Real-Time Chunking. An inference trick that smooths the seam between consecutive action chunks for flow-matching policies (π0, π0.5, SmolVLA).
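LeRobot's RTC implementation works differently (it is tied to how flow-matching policies generate chunks); the snippet below is only a generic linear crossfade between two overlapping chunks, a made-up stand-in to show what "smoothing the seam" means:

```python
def crossfade(prev_tail, next_head):
    """Blend the end of the old chunk into the start of the new one.
    NOT the real RTC algorithm -- just the simplest possible seam
    smoother, for intuition."""
    n = len(prev_tail)
    assert len(next_head) == n
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)   # weight shifts from the old chunk to the new
        blended.append((1 - w) * prev_tail[i] + w * next_head[i])
    return blended

# Old chunk ended commanding 1.0, new chunk starts at 0.0:
# the blend steps down gradually instead of jumping.
print(crossfade([1.0, 1.0], [0.0, 0.0]))
```

Without some smoothing, the jump between the last command of one chunk and the first command of the next shows up as a visible jerk on the arm.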
- ONNX
- A portable format for trained neural networks. LeRobot uses an ONNX runtime for the Unitree G1 whole-body controller.