Stretch 5 Planning

Lessons learned, competitive landscape, technology trends, AI/ML roadmap, and architecture decisions for the next generation

Strategic Priorities

Priority	Initiative	Rationale
P0	Jetson Orin compute upgrade	Enables VLA models, future-proofs AI stack
P1	SAMD51 + dedicated I2C buses	Eliminates critical firmware reliability risk (I2C reentrancy)
P2	ESP32-S3 with defined UART protocol	Clean WiFi/BLE/OTA architecture
P3	eFuse power protection	Safety + FCC compliance
P4	FCC Class B design-in	Required for consumer market
P5	Enhanced Dex Teleop + LeRobot pipeline	Data flywheel is the long-term moat

Key Strategic Insight

Stretch's competitive moat is its AI ecosystem, not its hardware. With 216+ GitHub stars on stretch_ai, the largest mobile manipulation research community, and integration into Open X-Embodiment, Hello Robot is the default platform for home robot AI research. Stretch 5 should double down by upgrading compute (Jetson Orin) and making the AI developer experience frictionless. The defensible position is to be the best-integrated AI platform for home manipulation research.

Competitive Landscape

Platform	Company	Price	Status	Threat	Differentiator
Stretch 3	Hello Robot	$24,950	Shipping	low	Lightest, most affordable research platform
TIAGo	PAL Robotics	~$80K+	Shipping	low	Full humanoid torso, industrial-grade, ROS
Mobile ALOHA	Stanford (open-source)	~$30K BOM	Research	high	Bimanual, learning from demonstration focus
Unitree G1/H1	Unitree	$16K–$90K	Shipping	high	Humanoid, legged locomotion, Chinese supply chain

Technology Trends

Vision-language-action (VLA) models are single neural networks that take images and language instructions as input and output robot actions directly. π₀ (Physical Intelligence) and RT-2/RT-X (Google DeepMind) represent the leading edge.

π₀ (pi-zero), RT-2 / RT-X, Diffusion Policy

Implication for Stretch 5

Stretch 5 compute must support 3B+ parameter VLA models at >5 Hz inference. Requires dedicated GPU (Jetson Orin).

AI/ML Integration Roadmap

Capability	Current	With Foundation Models
Object Grasping	OWL-v2 detection → heuristic grasp	VLA model (π₀-style) generates grasp trajectories end-to-end
Task Planning	LLM prompt → fixed operation sequence	LLM with affordance grounding: plans only feasible actions
Navigation	A* / RRT on voxel map	Learned navigation policies for dynamic obstacles + social norms
Voice Interaction	Whisper STT → GPT-4o → Piper TTS	On-device multimodal model (Gemma/Qwen) for low-latency, private interaction
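
To make the affordance-grounding idea for task planning more concrete, here is a minimal sketch. It assumes the planner proposes candidate steps with a usefulness score and that the robot supplies a feasibility (affordance) score for each. Multiplying the two follows the general pattern popularized by SayCan. The class, function, threshold, and all scores are hypothetical and are not taken from stretch_ai.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    action: str
    llm_score: float   # hypothetical: how useful the LLM rates this step (0..1)
    affordance: float  # hypothetical: how feasible the robot rates it (0..1)


def pick_feasible(candidates: Sequence[Candidate],
                  min_affordance: float = 0.2) -> Optional[Candidate]:
    """Drop infeasible steps, then pick the best llm_score * affordance."""
    feasible = [c for c in candidates if c.affordance >= min_affordance]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.llm_score * c.affordance)


candidates = [
    # The LLM likes this step, but the mug is not in view, so it is infeasible.
    Candidate("pick up the mug", llm_score=0.9, affordance=0.05),
    Candidate("drive to the counter", llm_score=0.6, affordance=0.9),
    Candidate("open the drawer", llm_score=0.3, affordance=0.8),
]
best = pick_feasible(candidates)
print(best.action if best else "no feasible action")
```

The point of the sketch is only that the planner never commits to a step the robot cannot currently execute; where the affordance scores come from (value functions, detectors, reachability checks) is left open.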

Compute Requirements for AI Stack

Model Class	Example	Parameters	Min VRAM	Speed
Vision encoder	SigLIP-so400m	400M	2 GB	30+ FPS
Object detector	OWL-v2 large	300M	3 GB	10+ FPS
Segmentation	SAM2-base	90M	2 GB	15+ FPS
LLM (local)	Qwen2.5-7B	7B	6 GB	20 tok/s
VLA model	π₀-small	3B	8 GB	5+ Hz
Full stack	Perception + LLM + VLA	—	12–16 GB	Pipelined

Recommendation: Jetson Orin NX 16 GB as the baseline and Jetson AGX Orin 64 GB for the flagship. This future-proofs Stretch 5 for the VLA model wave.
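
As a rough sanity check on the table above, the sketch below estimates weight storage from parameter count and precision, and converts an inference rate into a per-step time budget. It is back-of-the-envelope arithmetic only: it ignores activations, KV cache, and runtime overhead, and the precisions (FP16, 4-bit) are assumptions, since the table does not state them.

```python
def weight_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes)."""
    return params_billion * bits_per_param / 8


def period_ms(rate_hz: float) -> float:
    """Time available per inference step at the given rate."""
    return 1000.0 / rate_hz


vla_fp16 = weight_gb(3, 16)  # 3B-parameter VLA at FP16 (assumed precision)
llm_fp16 = weight_gb(7, 16)  # 7B LLM at FP16 (assumed precision)
llm_int4 = weight_gb(7, 4)   # 7B LLM with 4-bit weights (assumed precision)
budget = period_ms(5)        # per-step budget at the 5 Hz target

print(f"VLA 3B @ FP16:  {vla_fp16:.1f} GB")
print(f"LLM 7B @ FP16:  {llm_fp16:.1f} GB")
print(f"LLM 7B @ 4-bit: {llm_int4:.1f} GB")
print(f"5 Hz budget:    {budget:.0f} ms per step")
```

Under these assumptions, a 3B VLA at FP16 needs about 6 GB for weights alone, which is consistent with the 8 GB minimum listed. The 6 GB figure for a 7B LLM only works with quantized weights, because FP16 weights alone would take about 14 GB.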

Executive Summary

Why Rust: the body server sits in the real-time bridge between AI software and six SAMD21 boards. The proposal removes Python GC pauses from the 1 kHz coordination path, removes GIL contention from multi-axis concurrency, and replaces runtime protocol checks with compile-time validation.

Key finding for leadership: Hello Robot is the only vendor still running Python directly in the real-time control path. The migration keeps the Python user API through a PyO3 FFI layer while moving transport, stepper, sensor fusion, and server internals to Rust.

4-Phase Timeline

Program window: 11 months implementation + 6 months parallel support

Transport → Stepper → Sensor Fusion → Robot Server

Phase 1 (Months 1–3): Transport, 3 months. Next: Stepper
Phase 2 (Months 4–6): Stepper, 3 months. Next: Sensor Fusion
Phase 3 (Months 7–8): Sensor Fusion, 2 months. Next: Robot Server
Phase 4 (Months 9–11): Robot Server, 3 months

Rust migration phase plan
Phase	Scope	Duration	Risk	Key deliverables
Phase 1: Transport (Months 1–3)	Replace TransportPySerial, TransportCSerial, libtransport.so, and CobbsFraming with a Rust stretch-transport crate exposed through PyO3.	3 months	high	COBS framing + CRC-16 Modbus in Rust transport crate; PyO3 Transport.do_rpc compatibility wrapper; 72-hour soak test over all stepper and PowerPeriph RPC paths
Phase 2: Stepper (Months 4–6)	Rewrite core/stepper.py and subsystem motion wrappers so the hot path no longer crosses Python in the control loop.	3 months	high	Type-safe RPC command/status structs for P7/P8/P9; Rust trajectory and calibration logic for stepper devices; PyO3 Arm/Lift/OmniBase compatibility layer
Phase 3: Sensor Fusion (Months 7–8)	Migrate power_periph and IMU fusion pipeline into Rust to remove unchecked reply parsing in sensor status handling.	2 months	medium	Rust PowerPeriph with full RPC coverage; IMU config/status pack-unpack and sync trigger coordination; Deterministic DeviceTimestamp rollover handling
Phase 4: Robot Server (Months 9–11)	Replace robot_server, robot, and client_server with a tokio-driven Rust server and keep Python API compatibility through a PyO3 client bridge.	3 months	medium	Async command dispatch + status broadcast replacing ZMQ REQ/REP; Graceful shutdown and process lock management; Production-ready static binary + Python client compatibility
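
Phase 1 centers on COBS framing with a CRC-16 Modbus check. The sketch below is a small Python reference model of that combination, of the kind that could serve as a test oracle for a Rust transport crate. The CRC parameters are the standard CRC-16/MODBUS ones (initial value 0xFFFF, reflected polynomial 0xA001). The frame layout (payload, then CRC in little-endian order, COBS-encoded, terminated by a zero byte) and the helper names are assumptions for illustration, not the documented Stretch wire format.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def cobs_encode(data: bytes) -> bytes:
    """Encode so that the output contains no zero bytes."""
    out = bytearray([0])      # placeholder for the first code byte
    code_idx, code = 0, 1
    for i, byte in enumerate(data):
        if byte == 0:
            out[code_idx] = code
            code_idx, code = len(out), 1
            out.append(0)
        else:
            out.append(byte)
            code += 1
            # A full 254-byte block needs a new code byte if data remains.
            if code == 0xFF and i + 1 < len(data):
                out[code_idx] = code
                code_idx, code = len(out), 1
                out.append(0)
    out[code_idx] = code
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Inverse of cobs_encode; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1:i + code]
        if code == 0 or len(block) != code - 1 or 0 in block:
            raise ValueError("malformed COBS data")
        out += block
        i += code
        if code != 0xFF and i < len(encoded):
            out.append(0)     # an implicit zero separated the blocks
    return bytes(out)


def build_frame(payload: bytes) -> bytes:
    """Assumed layout: COBS(payload + CRC16 little-endian) + 0x00 delimiter."""
    crc = crc16_modbus(payload).to_bytes(2, "little")
    return cobs_encode(payload + crc) + b"\x00"


def parse_frame(frame: bytes) -> bytes:
    """Check the delimiter, undo COBS, verify the CRC, return the payload."""
    if not frame.endswith(b"\x00"):
        raise ValueError("missing frame delimiter")
    body = cobs_decode(frame[:-1])
    if len(body) < 2:
        raise ValueError("frame too short for CRC")
    payload, received = body[:-2], int.from_bytes(body[-2:], "little")
    if crc16_modbus(payload) != received:
        raise ValueError("CRC mismatch")
    return payload


frame = build_frame(b"\x10\x20\x00\x30")
assert parse_frame(frame) == b"\x10\x20\x00\x30"
```

Running the same vectors through the Python model and the Rust crate during the soak test would give a cheap cross-check that both sides agree on framing before any board is attached.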

PyO3 Bridge Architecture

Python SDK

↓

PyO3 FFI bridge (stretch-ffi)

↓

Rust server on tokio runtime

↓

Transport + Stepper + Sensor Fusion crates

↓

SAMD21 motor and power firmware boards

PyO3 bridge architecture layers
Layer	Component	Interface	Responsibility
Python API	Existing robot.arm.move_to() scripts	Python methods	Keep researcher and customer workflows unchanged during migration.
Compatibility Bridge	stretch-ffi (PyO3)	PyO3 FFI boundary	Expose Rust classes with Python-compatible signatures and status dictionaries.
Rust Control Plane	stretch-server on tokio	Async channels + RPC dispatch	Run command orchestration, status publication, and lifecycle control without GIL contention.
Rust Device Crates	stretch-transport, stretch-stepper, stretch-power	COBS + CRC-16 framed USB RPC	Provide low-latency deterministic control for transport, motion, and sensor fusion.
Firmware	SAMD21 motor + power boards	RPC V0/V1 frames	Execute 1 kHz board loops with synchronized motor start and status feedback.
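
One way to picture the compatibility bridge is as a thin Python facade that keeps the existing call shape (for example robot.arm.move_to()) and forwards to whatever backend sits behind it. The sketch below uses a pure-Python stand-in where a PyO3-exported class would go. Every name other than robot.arm.move_to, including the backend interface and the "pos" status key, is hypothetical.

```python
from typing import Protocol


class ArmBackend(Protocol):
    """Hypothetical interface a PyO3-exported Rust class would satisfy."""

    def move_to(self, position_m: float) -> None: ...
    def status(self) -> dict: ...


class FakeRustArm:
    """Pure-Python stand-in for a class exported by the Rust extension."""

    def __init__(self) -> None:
        self._pos = 0.0

    def move_to(self, position_m: float) -> None:
        self._pos = float(position_m)

    def status(self) -> dict:
        # Status stays a plain dict so existing scripts keep working.
        return {"pos": self._pos}


class Arm:
    """Facade that keeps the user-facing call shape unchanged."""

    def __init__(self, backend: ArmBackend) -> None:
        self._backend = backend

    def move_to(self, x_m: float) -> None:
        self._backend.move_to(x_m)

    @property
    def status(self) -> dict:
        return self._backend.status()


class Robot:
    def __init__(self, arm_backend: ArmBackend) -> None:
        self.arm = Arm(arm_backend)


robot = Robot(FakeRustArm())
robot.arm.move_to(0.25)
print(robot.arm.status)
```

With this shape, swapping the stand-in for the real extension class would not require changes to user scripts, which is the property the table attributes to the bridge layer.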

Summary Stats

3,455 total quantified issues across memory, GIL, latency, type safety, and error handling categories.

Category	Count	Severity
Type safety gaps	2,804	critical
Memory safety issues	259	high
GIL contention points	205	high
Latency-critical paths	148	medium
Error handling issues	39	moderate

Issue Category Breakdown

Category	Share	Severity	Notes
Type safety gaps	81.2%	critical	Dict chain access and bare RPC constants dominate the codebase and hide interface breakage until runtime.
Memory safety issues	7.5%	high	Unchecked reply indexing and unpack_slice patterns are concentrated in firmware communication modules.
GIL contention points	5.9%	high	Threading and sleep-heavy control paths serialize status and command work in Python.
Latency-critical paths	4.3%	medium	Byte-wise COBS/CRC processing and busy waits exist directly in transport and control-loop paths.
Error handling issues	1.1%	moderate	Bare except and sys.exit in library code create fail-stop behavior instead of recoverable flows.
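
The category shares can be reproduced from the counts in Summary Stats. The short check below recomputes them; the variable names are illustrative only.

```python
counts = {
    "Type safety gaps": 2804,
    "Memory safety issues": 259,
    "GIL contention points": 205,
    "Latency-critical paths": 148,
    "Error handling issues": 39,
}

total = sum(counts.values())
# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(f"total = {total}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```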

Top 10 Highest-Risk Files

Top audit hotspot files by issue count category
File	Memory	GIL	Latency	Type	Errors	Total	Notes
core/stepper.py	99	8	1	357	0	465	Largest concentration of unchecked firmware reply parsing.
subsystem/power_periph.py	83	2	2	249	0	336	IMU and power status unpacking path has high reply-index risk.
robot/robot.py	0	41	0	63	0	104	Status thread orchestration heavily contends on Python threading.
core/transport/transport_pyserial.py	30	1	45	23	0	99	Primary transport bottleneck for framing, CRC, and RPC dispatch.
robot/robot_client.py	0	16	0	64	1	81	Sleep-heavy request flow with runtime-typed status access.
robot/robot_server.py	0	0	0	45	0	45	Dispatch and status paths rely on runtime tuple and dict shape assumptions.
core/transport/transport_util.py	18	0	18	3	0	39	High-frequency struct pack and unpack in the transport hot path.
core/feetech/protocol_packet_handler.py	20	4	0	0	0	24	Servo packet parsing uses unchecked indexing and busy polling loops.
core/feetech/feetech_SM_chain.py	3	16	0	0	2	21	Threaded servo polling plus bare except handling.
core/transport/cobbs_framing.py	0	0	20	0	0	20	Byte-wise framing and CRC loops run in Python in every RPC cycle.

Counts shown are the quantified file-level findings explicitly called out in PYTHON-AUDIT.md.

Key Findings

Memory safety hotspots: stepper.py (99), power_periph.py (83), and transport_pyserial.py (30) account for 212 of 259 memory safety issues.
GIL contention is concentrated in status polling and transport paths, including robot.py threading and transport serial locking.
Latency-critical path concentration: 85 transport latency points in 1,515 lines, dominated by COBS and CRC processing in Python.
Type safety is the largest risk class with 2,804 findings, mostly dynamic dict access and raw RPC constants.

Lessons from Stretch 3 → 4

V1 PIMU: Added INA228 power monitor, pre-charge circuit, safe PC shutdown, system shutdown mode (103 µA vs 12–30 mA), brake button float mode, IMU+Mag, better USB hub ICs
Charging improved: 10 A (2.5 hrs vs 4.5 hrs)
Requires 36 V 8 A adapter specifically
BMS comms limited to 9600 bps
Blocking I2C at 1 kHz was problematic
EMC: USB hub re-enumeration caused by ESD fixed with new ICs
Grounding still critical
Stretch 3 passed Class A EMC in both operational and charging modes

Development Roadmap

Phase 1: Architecture & Schematic

Define power architecture (eFuse, INA228 alerts)
Consolidate 3V3 rails
Master/slave UART protocol spec
Select Class B EMC-friendly USB hub ICs
Select Jetson Orin NX vs AGX for compute
Design SAMD51 with dedicated I2C buses (IMU/Mag/INA228 separated)

Phase 2: Prototype & Bring-up

In-house pre-compliance EMC scans
Actuator protection validation
Non-blocking I2C implementation + DMA transfers
Reverse current path testing
Jetson Orin integration + AI stack validation
ESP32-S3 UART protocol bring-up

Phase 3: Validation & Compliance

FCC Class B formal testing
Automated RDK test suite complete
SOC accuracy validation (INA228 vs BMS)
System grounding audit
VLA model inference benchmarks on Orin
Dex Teleop + LeRobot pipeline validation

Phase 4: Production Readiness

Final BOM review & cost optimization
Manufacturing test fixtures
Firmware OTA update pipeline (via ESP32-S3)
Documentation & handoff
'Contribute data' toggle for fleet learning
