Deploying models
Run any model on a hosted endpoint, an edge device, a ROS 2 node, or your own container.
Deployment surfaces
- Hosted: managed endpoint with autoscaling and a signed invoke URL
- Jetson: a signed `.deb` package with a systemd unit and Triton runtime
- ROS 2: drop-in launch file that wires inputs and outputs to topics
- Self-hosted container: the `nodedata/runtime` image runs anywhere
Hosted inference
Create a deployment from any model you have licensed. Cold start is typically under three seconds for models under 2 GB.
python
deployment = node.deployments.create(
    model="acme/grasp-policy-v2",
    revision="1.4.0",
    accelerator="nvidia-l4",  # nvidia-l4 | nvidia-a100 | nvidia-h100 | cpu
    min_replicas=1,
    max_replicas=8,
    scale_to_zero=True,
)
print(deployment.endpoint)
# → https://infer.nodedata.dev/d/dpl_01HZ.../invoke

Invoke
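The same request can be assembled from Python using only the standard library. The sketch below mirrors the headers and the `inputs` payload of the curl example in this section. The function name `build_invoke_request` is hypothetical, and it assumes each input is the raw file content encoded as base64, which the `<base64>` placeholders suggest but do not confirm.

```python
import base64
import json
import urllib.request


def build_invoke_request(
    endpoint: str, api_key: str, rgb: bytes, depth: bytes
) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like the curl example.

    `endpoint` is the value of `deployment.endpoint`; `api_key` is the value
    of NODE_DATA_API_KEY. The helper itself is illustrative, not part of an SDK.
    """
    payload = {
        "inputs": {
            # Assumption: inputs are raw image bytes, base64-encoded as ASCII text.
            "rgb": base64.b64encode(rgb).decode("ascii"),
            "depth": base64.b64encode(depth).decode("ascii"),
        }
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the call. The response schema is not described on this page, so parsing it is left out.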
bash
curl https://infer.nodedata.dev/d/dpl_01HZ.../invoke \
  -H "Authorization: Bearer $NODE_DATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": { "rgb": "<base64>", "depth": "<base64>" }
  }'

Billing
Hosted deployments are billed per active replica-second and per inference call. See pricing & revenue.
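Because the bill has two components, a rough estimate is the sum of a replica-time term and a per-call term. The sketch below shows that arithmetic only. The function name is hypothetical and both rates are parameters the caller must supply from the pricing page; no actual prices are assumed here.

```python
from decimal import Decimal


def estimate_hosted_cost(
    replica_seconds: int,
    calls: int,
    rate_per_replica_second: Decimal,
    rate_per_call: Decimal,
) -> Decimal:
    """Estimate cost as replica-time plus per-call charges.

    replica_seconds is the total across replicas, e.g. two replicas that are
    each active for one hour contribute 2 * 3600 = 7200 replica-seconds.
    Both rates are placeholders to be taken from the pricing page.
    """
    if replica_seconds < 0 or calls < 0:
        raise ValueError("replica_seconds and calls must be non-negative")
    return (
        Decimal(replica_seconds) * rate_per_replica_second
        + Decimal(calls) * rate_per_call
    )
```

`Decimal` is used instead of `float` so that small per-second rates do not accumulate rounding error over long billing periods.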
Jetson and edge packages
Node Factory generates a Jetson-ready package containing a quantized engine plan, a systemd unit, and the inference server. Packages are signed and verified at install time.
bash
# Generate a Jetson-ready package for an asset
node-data export \
  --model acme/grasp-policy-v2 \
  --revision 1.4.0 \
  --target jetson-orin \
  --precision fp16 \
  --out ./pkg
# On the device
sudo dpkg -i ./pkg/nodedata-grasp-policy_1.4.0_arm64.deb
sudo systemctl enable --now nodedata-grasp-policy

ROS 2
The nodedata_runtime ROS package exposes any model as a node with configurable input and output topics. It supports both rclpy and rclcpp.
bash
# launch file
ros2 launch nodedata_runtime model.launch.py \
  asset:=acme/grasp-policy-v2 \
  revision:=1.4.0 \
  topic_in:=/camera/rgb \
  topic_out:=/policy/action

Self-hosted container
dockerfile
FROM nodedata/runtime:1
ENV NODE_DATA_API_KEY=<provisioned-at-deploy>
RUN nodedata pull acme/grasp-policy-v2 --revision 1.4.0
CMD ["nodedata", "serve", "--port", "8080"]

Precision and quantization
| Target | Default precision | Notes |
|---|---|---|
| H100 / A100 | bf16 | fp8 available for supported architectures |
| L4 | fp16 | int8 calibration if asset ships calibration data |
| Jetson Orin | int8 | fp16 fallback when calibration is unavailable |
| CPU | int8 | onnxruntime with VNNI/AVX-512 |
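The defaults in the table can be read as a small lookup with one fallback rule. The sketch below encodes only what the table states, using the accelerator and target names that appear elsewhere on this page as keys. The function name `resolve_precision` and the `has_calibration_data` flag are hypothetical, and the platform's actual selection logic may differ.

```python
# Default precision per target, taken from the table above.
DEFAULT_PRECISION = {
    "nvidia-h100": "bf16",
    "nvidia-a100": "bf16",
    "nvidia-l4": "fp16",
    "jetson-orin": "int8",
    "cpu": "int8",
}


def resolve_precision(target: str, has_calibration_data: bool) -> str:
    """Return the precision the table implies for a target.

    Only the Jetson Orin row describes a fallback: int8 needs calibration
    data, and fp16 is used when that data is unavailable.
    """
    if target not in DEFAULT_PRECISION:
        raise ValueError(f"unknown target: {target}")
    if target == "jetson-orin" and not has_calibration_data:
        return "fp16"
    return DEFAULT_PRECISION[target]
```

The optional modes in the Notes column (fp8 on H100 / A100, int8 on L4) are opt-in rather than defaults, so they are not modeled here.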
Logs and metrics
- Structured logs streamed via `GET /deployments/{id}/logs`
- p50 / p95 / p99 latency and throughput on the deployment dashboard
- OpenTelemetry exporter for shipping traces to your own backend
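For readers who want to reproduce the latency figures from their own measurements, the sketch below computes p50, p95, and p99 with the nearest-rank definition. This is one common definition of a percentile; the page does not state which method the dashboard uses, so small differences from the dashboard values are possible. The function names are hypothetical.

```python
def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be an integer between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, which avoids floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(latencies_ms: list[float]) -> dict[str, float]:
    """Summarize latency samples (in milliseconds) as p50 / p95 / p99."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

With few samples the upper percentiles collapse onto the maximum, so p99 is only informative once a deployment has served at least a few hundred requests.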