From Bits to the Physical World

A Full-Stack Robotics Guide for AI and Software Developers

01The Big Picture

What Is a Robot, Really?The Full-Stack Map of RoboticsWho's Who in the Industry?

02Hardware

SensorsActuatorsCompute Platforms

03Operating System

ROS 2DDS Deep Dive

04Algorithms

SLAMNav2MoveIt 2PerceptionBehavior Trees

05Simulation & Training

Why Simulation?NVIDIA Isaac Sim

06AI Meets Robotics

Reinforcement LearningImitation LearningFoundation Models

07Toolchain

Visualization & DebuggingDev Environment & DevOps

08The Harsh Reality of Deployment

Real-Robot DeploymentReliability EngineeringFleet Management

09Industry Reality

Business ModelsChina vs. GlobalCareer Advice

Part 3: Operating System · The robot's nervous system

Chapter 7

ROS 2: The Operating System of the Robot World

The first five chapters took us from sensors to actuators to compute platforms — we've basically laid out the robot's "body." Sensors are its eyes and ears, actuators are its muscles and joints, and Jetson is the physical hardware that houses its brain.

But eyes, muscles, and a brain aren't enough on their own — these organs need a nervous system to wire them all together. Your visual cortex spots a ball flying toward you, that signal has to reach your motor cortex, your motor cortex has to coordinate dozens of muscles to catch it. If any link in that chain breaks or lags, the ball hits you in the face.

Robots face the exact same problem. A robot runs dozens of software modules simultaneously: the camera driver is streaming images, the LiDAR driver is streaming point clouds, SLAM is building a map, navigation is planning paths, the arm controller is computing trajectories, the perception model is recognizing objects... These modules are all separate processes, often built by different teams or even different companies. They need a unified way to discover each other, communicate, and coordinate.

ROS 2 (Robot Operating System 2) is that nervous system.


What ROS Actually Is — The Name Is Misleading

ROS has "Operating System" in its name, but it is not an operating system. Your robot still runs Ubuntu Linux underneath. ROS 2 is a software framework that sits on top of Linux.

A more accurate way to think about it: ROS 2 = an inter-process communication framework + a massive ecosystem of open-source packages + a development paradigm.

All three matter, but most people underestimate the last two.

The communication framework is straightforward — it defines how modules send messages to each other. But the real killer feature of ROS 2 is the ecosystem. Nav2 (navigation), MoveIt 2 (arm planning), slam_toolbox (mapping), ros2_control (motor driver abstraction), image_pipeline (image processing)... virtually every functional module you can think of in robotics already exists as an open-source ROS 2 package. You don't need to write a SLAM system from scratch. You run apt install ros-humble-slam-toolbox, tweak a few parameters, launch it, and it starts mapping.

The development paradigm is about the industry-wide convention ROS 2 has established: when you build a robot module, you wrap it as a ROS 2 node, expose standard topic and service interfaces, and anyone else can plug it right in. It's like how the web world settled on REST APIs — not because REST is technically perfect, but because everyone uses it, so it became the standard.

If you're coming from web development, here's a handy mapping:

Web WorldROS 2 WorldWhy They're Similar
npm / PyPIrosdistro (ROS package index)Package management and distribution
MicroservicesNodesIndependent processes, each doing one thing
Kafka / RabbitMQTopicsPublish/subscribe messaging
REST APIServicesRequest/response pattern
gRPC streamingActionsLong-running tasks with progress feedback
docker-composeLaunch filesSpin up a fleet of processes in one command
KubernetesLifecycle managementOrchestrating node startup/pause/shutdown

Why the Entire Industry Is Locked Into ROS

The answer is simple: network effects.

The core pain point of robotics development isn't "building one module from scratch is hard" — it's hard, sure, but doable. The real pain is that building every module from scratch is unrealistic. A robot that can autonomously navigate and pick items in a warehouse involves SLAM, navigation, arm planning, perception, behavior scheduling, sensor drivers, motor control, safety monitoring... Each one is its own specialized domain, each one represents months to years of engineering effort.

The ROS 2 open-source ecosystem lets you stand on everyone else's shoulders. Use slam_toolbox for mapping, Nav2 for navigation, MoveIt 2 for arm planning, ros2_control to interface with motor drivers. You only need to focus your energy on what actually differentiates you — say, your perception algorithm for a specific use case, or your task scheduling logic.

And this ecosystem is self-reinforcing: more people use ROS 2 → more people contribute high-quality packages → newcomers have even less reason not to use ROS 2 → more people use ROS 2. Unitree's quadruped SDK is built on ROS 2, NVIDIA Isaac Sim's ROS bridge connects to ROS 2, Foxglove's visualization tools speak ROS 2 message formats. The entire toolchain is built around it.

You can absolutely skip ROS 2, but you'd be walking away from an entire ecosystem. Some projects (like Dimensional OS) are experimenting with a Python-first, ROS-2-optional developer experience to lower the barrier to entry. But the package ecosystem advantage — thousands of battle-tested open-source packages — has no substitute in the near term.


ROS 1 vs. ROS 2 — There's Only One Thing You Need to Know

When you search for ROS tutorials, you'll find a mountain of ROS 1 content. Ignore it.

ROS 1 was designed in 2007 for research labs. Fine for running demos, but it had a centralized master process — if that thing went down, the entire system went down with it. In a lab, who cares? Just restart it. Would you run a warehouse robot that way?

ROS 2 rewrote the communication layer, switching to the industrial-grade DDS (Data Distribution Service) protocol — decentralized, configurable message reliability, automatic multi-robot discovery. DDS has been used in defense and aerospace for over a decade. This isn't some bleeding-edge experiment.

If you're starting now, go straight to Humble (LTS, supported until 2027) or Jazzy (newer, supported until 2029), paired with Ubuntu 22.04 or 24.04. When you run into ROS 1 tutorials, APIs, or tools — just skip them.


Core Concepts — The Quick Tour

Here are six concepts you'll use every single day. Don't try to memorize them — they'll come up again and again in every chapter on navigation, SLAM, and arm planning. This is just a first pass to build some mental scaffolding. When you actually start writing code, you'll have that "oh, that's what that is" moment.

Node — One Process, One Job

A Node is the smallest unit of execution in ROS 2. A node is an independent process (or thread) responsible for one well-defined task. For example:

  • One node reads LiDAR data and publishes point clouds
  • One node runs the SLAM algorithm
  • One node handles path planning
  • One node controls the chassis motors

Why split things up this finely? Same logic as microservices: fault isolation, independent development, swappability. Your SLAM node crashes, but the navigation node keeps running (though without fresh map data it'll hold position and wait). Want to swap slam_toolbox for your own SLAM implementation? As long as your new node publishes the same topic format, nothing else needs to change.

Here's the simplest possible ROS 2 node in Python:

import rclpy
from rclpy.node import Node
 
class MyNode(Node):
    def __init__(self):
        super().__init__('my_node')  # 节点名
        self.get_logger().info('Hello from my_node!')
 
def main():
    rclpy.init()
    node = MyNode()
    rclpy.spin(node)  # 保持节点运行,监听消息
    rclpy.shutdown()

That's it. Inherit the Node class, give it a name, call rclpy.spin() to keep it running. Every ROS 2 node, no matter how complex, has this same skeleton.

Topic — Broadcast Channels, Pub/Sub Style

A Topic is the most common way nodes communicate. The pattern is publish/subscribe — a publisher sends messages to a topic, and every node subscribed to that topic receives them.

Think of it like a Slack channel at work. The LiDAR driver node keeps posting messages to the /scan channel (10-20 per second), and both the SLAM node and the navigation node are subscribed, each pulling data for their own purposes. The LiDAR node doesn't need to know who's listening, and the SLAM node doesn't need to know where the data comes from — loose coupling.

Messages on a topic have a strict type definition (called a message type). For instance, sensor_msgs/msg/LaserScan is the standard format for 2D LiDAR data, and geometry_msgs/msg/Twist is the standard format for velocity commands. These standard message types are another critical convention in the ROS 2 ecosystem — no matter who wrote the LiDAR driver, the output format is the same, so any SLAM package can plug right in.

Service — Request, Response, Done

A Service is basically the robot equivalent of an HTTP API — request/response pattern. You send a request: "compute a path from A to B." The other side computes it and sends the result back. One question, one answer, synchronously.

The distinction between topics and services is intuitive: sensor data uses topics (continuous broadcast, whoever wants it can listen), one-off requests use services (I ask, you answer).

Action — Long-Running Tasks With a Progress Bar

An Action solves a specific problem: some tasks don't complete in a split second.

You tell the robot "navigate to shelf 3" — that might take 30 seconds. You don't want to just sit there waiting. You want to know where it is, whether it's stuck, whether you can cancel mid-way. Actions are designed for exactly this: send a goal → receive continuous progress feedback → get the final result (or cancel partway through).

Nav2's navigation interface NavigateToPose and MoveIt 2's motion execution MoveGroup are both Actions. Basically, any task you want to "kick off, track, and possibly abort mid-flight" uses an Action.

Parameter — Runtime-Tunable Configuration

A Parameter is a node's configuration value that can be changed dynamically at runtime.

Navigation max speed, SLAM resolution, perception model confidence threshold — you don't want to hardcode these, because different scenarios need different values. ROS 2's Parameter system lets you set initial values in a launch file and adjust them at runtime via the command line or programmatically.

# 运行时改参数
ros2 param set /nav2_controller max_vel_x 0.5

Simple concept, but critically important. A huge chunk of robotics development is just tuning parameters — you'll feel this deeply in the Nav2 and MoveIt chapters.

Launch File — The Orchestration Script

A typical robot needs to start dozens of nodes simultaneously. You're not going to manually ros2 run them one by one, are you?

A Launch file is ROS 2's orchestration script. It uses Python (or XML/YAML) to define "which nodes to start, what parameters to pass, how to remap topics." A single ros2 launch command brings up every software module on the entire robot.

# launch/robot_bringup.launch.py
from launch import LaunchDescription
from launch_ros.actions import Node
 
def generate_launch_description():
    return LaunchDescription([
        Node(package='slam_toolbox', executable='async_slam_toolbox_node',
             parameters=[{'resolution': 0.05}]),
        Node(package='nav2_bringup', executable='bringup_launch'),
        Node(package='my_perception', executable='detector_node'),
    ])

Its role is very similar to a docker-compose.yml — declaratively defining the startup configuration for a group of services.


The Analogy: If ROS 2 Were a Company

Let's tie all these concepts together by imagining ROS 2 as the way a company operates:

  • Node = Employee. Each employee owns one job — someone handles vision, someone handles navigation, someone handles the arm.
  • Topic = Slack channels. The vision team posts detection results to #perception-results, and whoever needs them subscribes. No DMs needed — information is open to all subscribers.
  • Service = A ticketing system. You file a ticket: "compute a path for me." The path planning team picks it up, computes it, and sends the result back. One request, one response, done.
  • Action = Project management. You kick off a project: "move the goods from A to B." The execution team accepts it and gets to work, sending you daily progress updates (feedback), and delivering the final result when it's done. You can kill the project at any time (cancel).
  • Parameter = Company policy settings. "Travel reimbursement cap: $500" — it's in the handbook, but the CEO can bump it to $800 any time without reincorporating the company.
  • Launch file = The opening-up-shop routine. 9 AM, one button: security at the door, front desk online, HVAC running, all departments up. No need to call each person individually.

URDF — The Robot's Blueprint

ROS 2's communication framework solves the "how do modules talk to each other" problem, but there's another fundamental question left unanswered: how does ROS 2 know what your robot looks like?

How many joints does it have, how are they connected, where are the sensors mounted, what are the dimensions and mass distribution of the entire body — this information is essential for SLAM, navigation, and arm planning. MoveIt needs to compute arm trajectories, so it needs to know how many joints the arm has, each joint's rotation limits, and the arm segment lengths. Nav2 needs to do obstacle avoidance, so it needs to know the robot's outer footprint.

URDF (Unified Robot Description Format) is the standard format for storing all of this — an XML file that describes the robot's complete physical structure.

URDF defines two basic elements:

  • Link — a rigid segment of the robot body (e.g., upper arm, forearm, chassis)
  • Joint — the movable connection between two links (e.g., shoulder joint, elbow joint)

Each joint has a type (revolute, prismatic, fixed), rotation limits, and position. Each link has mass, inertia, a collision model, and a visual model. Chain all the links and joints together and you get a "kinematic tree."

<robot name="my_arm">
  <link name="base_link">
    <visual>
      <geometry><cylinder radius="0.05" length="0.1"/></geometry>
    </visual>
  </link>
  <joint name="shoulder" type="revolute">
    <parent link="base_link"/>
    <child link="upper_arm"/>
    <limit lower="-3.14" upper="3.14" effort="10" velocity="1.0"/>
  </joint>
  <link name="upper_arm">
    <!-- ... -->
  </link>
</robot>

URDF is one of the most fundamental "interface contracts" in the ROS 2 ecosystem. You buy a Unitree G1, and Unitree provides its URDF. You swap in a new gripper, and you need to attach the gripper's URDF to the end of the arm. You want to simulate your robot in Isaac Sim — step one is importing the URDF.


Where ROS 2 Sits in the Overall System

ROS 2 itself doesn't "do" anything — it doesn't build maps, it doesn't navigate, it doesn't pick things up. It's infrastructure. It lets the modules that actually do the work communicate and coordinate.

Every chapter that follows — SLAM, navigation, arm planning, perception — covers application-layer software that runs on top of ROS 2. They exchange data through topics, coordinate tasks through actions, and adjust behavior through parameters. Having a rough mental model of these concepts is enough for now. The real understanding will arrive the first time you ros2 launch an entire robot and watch everything come alive.

← Previous06. Compute PlatformsHardwareNext→08. DDS Deep Dive