GPU setup
GPU-backed runners host Gazebo Harmonic worlds, GPU-rendered Webots scenes, and any test that exercises CUDA. The platform discovers GPU capability from the runner’s declared capabilities, but the runner only declares what the host can actually serve. This page covers the host-side work.
Supported hardware
| Class | Examples | Notes |
|---|---|---|
| Data-centre | T4, A10, L4, L40, A100, H100 | Recommended for production GPU pools |
| Workstation | RTX 3090, 4090, 5090, A6000 | Fine for dev pools; check driver compatibility |
| Embedded | Jetson Orin AGX, Orin NX | Supported for self-hosted ARM64 runners; use the aarch64 binary |
Prerequisites on the host
Install nvidia-container-toolkit
This is what lets Docker pass /dev/nvidia* and CUDA libraries into the test container.
Declare GPU capability
In runner.yaml:
Reload the runner (rbtk-runner reload, or restart the service). The dashboard will show the GPU capability after the next heartbeat.
Verify with doctor
A ✗ in the doctor output flags one of the host-side prerequisites; fix it before relying on the runner for GPU jobs.
Multi-GPU pools
Two patterns work.
One runner, multiple GPUs
If you have a 4-GPU box and want the runner to schedule one GPU per job, declare all four GPUs on a single runner. The runner sets NVIDIA_VISIBLE_DEVICES per job so each container sees exactly one GPU. Jobs that declare gpu_count: 2 get two.
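The per-job GPU assignment described above can be pictured with a small allocator. This is a minimal sketch, not the rbtk-runner implementation: the GpuAllocator class, its methods, and the job IDs are hypothetical names introduced for illustration. The only grounded detail is that NVIDIA_VISIBLE_DEVICES accepts a comma-separated list of GPU indices.

```python
from __future__ import annotations


class GpuAllocator:
    """Tracks free GPU indices on one host (illustrative, hypothetical class)."""

    def __init__(self, gpu_total: int) -> None:
        self.free = list(range(gpu_total))
        self.leased: dict[str, list[int]] = {}

    def acquire(self, job_id: str, gpu_count: int = 1) -> dict[str, str] | None:
        """Reserve gpu_count GPUs and return the env for the job's container.

        Returns None when too few GPUs are free, so the caller can hold the job.
        """
        if gpu_count > len(self.free):
            return None
        picked, self.free = self.free[:gpu_count], self.free[gpu_count:]
        self.leased[job_id] = picked
        return {"NVIDIA_VISIBLE_DEVICES": ",".join(map(str, picked))}

    def release(self, job_id: str) -> None:
        """Return a finished job's GPUs to the free list."""
        self.free = sorted(self.free + self.leased.pop(job_id, []))


alloc = GpuAllocator(gpu_total=4)
print(alloc.acquire("job-a"))               # {'NVIDIA_VISIBLE_DEVICES': '0'}
print(alloc.acquire("job-b", gpu_count=2))  # {'NVIDIA_VISIBLE_DEVICES': '1,2'}
```

A job that asks for more GPUs than are currently free gets None in this sketch; how the real runner handles that case is not covered here.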
One runner per GPU
If you want hard isolation (one process per GPU), run multiple rbtk-runner instances on the same host with disjoint GPU sets.
Routing for GPU jobs
A test config requests a GPU like this:
job_router filters to runners whose declared gpu block satisfies all three. If multiple runners match, it picks the least-loaded one. If none match, the job queues until one comes online (or times out per project policy).
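The filter-then-pick-least-loaded behaviour can be sketched in a few lines. This is an illustration under stated assumptions, not job_router's actual code: the Runner class, the field names count and memory_gb, and the use of active job count as the load measure are all hypothetical stand-ins for the real request fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Runner:
    name: str
    gpu: dict         # declared gpu block; field names here are hypothetical
    active_jobs: int  # stand-in for "load" (an assumption)


def satisfies(gpu: dict, request: dict) -> bool:
    """True if the runner meets or exceeds every requested numeric field."""
    return all(gpu.get(field, 0) >= minimum for field, minimum in request.items())


def route(runners: list, request: dict) -> Optional[Runner]:
    """Return the least-loaded matching runner, or None so the job can queue."""
    candidates = [r for r in runners if satisfies(r.gpu, request)]
    return min(candidates, key=lambda r: r.active_jobs, default=None)


pool = [
    Runner("a", {"count": 4, "memory_gb": 24}, active_jobs=3),
    Runner("b", {"count": 1, "memory_gb": 16}, active_jobs=0),
    Runner("c", {"count": 2, "memory_gb": 48}, active_jobs=1),
]
chosen = route(pool, {"count": 2, "memory_gb": 24})
print(chosen.name if chosen else "queued")  # c
```

Runner b is idle but is filtered out because it declares too few GPUs; of the two remaining matches, c carries less load than a.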
Common pitfalls
Container sees no GPU even though host does
The Docker daemon needs the NVIDIA runtime registered. After nvidia-ctk runtime configure you must restart Docker (sudo systemctl restart docker). Verify with docker info | grep -i runtime.
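The runtime check can also be scripted, for example in a provisioning step. The sketch below assumes docker info prints a "Runtimes:" line listing the registered runtimes; the helper names has_nvidia_runtime and check_host are hypothetical and not part of rbtk-runner.

```python
import subprocess


def has_nvidia_runtime(docker_info_text: str) -> bool:
    """True if the 'Runtimes:' line of `docker info` output lists 'nvidia'."""
    for line in docker_info_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Runtimes":
            return "nvidia" in value.split()
    return False


def check_host() -> bool:
    """Run `docker info` on this host and inspect it (requires Docker)."""
    out = subprocess.run(
        ["docker", "info"], capture_output=True, text=True, check=True
    ).stdout
    return has_nvidia_runtime(out)


# Sample text shaped like the relevant part of `docker info` output.
sample = """\
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: runc
"""
print(has_nvidia_runtime(sample))  # True
```

The parsing function takes text rather than calling Docker directly, so it can be exercised on a machine without a GPU.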
CUDA mismatch errors at runtime
The host driver must support the container’s CUDA major version. Driver 535 covers CUDA 12.x; for CUDA 13.x you need driver 580+.
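A preflight check can catch this before a job starts. The sketch below compares the host driver's major version against a minimum supplied by the caller; the function names are hypothetical, and the minimum is passed in rather than hard-coded so it can follow whatever CUDA version your images use. The nvidia-smi query flags shown are the standard ones for reading the driver version.

```python
import subprocess


def driver_major(version: str) -> int:
    """Extract the major number from a driver string such as '535.129.03'."""
    return int(version.strip().split(".")[0])


def driver_supports(version: str, min_major: int) -> bool:
    """True if the installed driver is at least the required major version."""
    return driver_major(version) >= min_major


def installed_driver() -> str:
    """Read the driver version of the first GPU (requires an NVIDIA host)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0]


# Example: an image built for CUDA 12.x, using 535 as the required driver major.
print(driver_supports("535.129.03", min_major=535))  # True
print(driver_supports("525.60.13", min_major=535))   # False
```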
Gazebo Harmonic black screen / shader errors
Gazebo needs OpenGL via EGL. Add --gpus all -e __GLX_VENDOR_LIBRARY_NAME=nvidia -e __NV_PRIME_RENDER_OFFLOAD=1 to the container's run arguments. The runner does this automatically when sim: gazebo-harmonic is declared, but if you override the image, copy these envs into your Dockerfile.
Multiple processes contending for one GPU
Set resources.max_concurrent_jobs to the runner's GPU count so the runner does not over-subscribe. Per-GPU memory caps via NVIDIA MIG are out of scope for v2.x.
Next steps
Pool management
Per-pool stats, tagging, draining.
Troubleshooting
Capability mismatch, MCAP upload, version skew.