Deploying models

Run any model on a hosted endpoint, an edge device, a ROS 2 node, or your own container.

Deployment surfaces

Hosted — managed endpoint with autoscaling and a signed invoke URL
Jetson — a signed .deb package with systemd unit and Triton runtime
ROS 2 — drop-in launch file that wires inputs and outputs to topics
Self-hosted container — the nodedata/runtime image runs anywhere

Hosted inference

Create a deployment from any model you have licensed. Cold start is typically under three seconds for models under 2 GB.

python

# `node` is assumed to be an authenticated SDK client created earlier.
deployment = node.deployments.create(
    model="acme/grasp-policy-v2",
    revision="1.4.0",
    accelerator="nvidia-l4",        # nvidia-l4 | nvidia-a100 | nvidia-h100 | cpu
    min_replicas=1,
    max_replicas=8,
    scale_to_zero=True,
)

print(deployment.endpoint)
# → https://infer.nodedata.dev/d/dpl_01HZ.../invoke

Invoke

bash

curl https://infer.nodedata.dev/d/dpl_01HZ.../invoke \
  -H "Authorization: Bearer $NODE_DATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": { "rgb": "<base64>", "depth": "<base64>" }
  }'
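The same call can be made from Python with only the standard library. The sketch below assumes the endpoint accepts the JSON body shown in the curl example and that image inputs are base64-encoded bytes. The helper names `build_payload`, `build_request`, and `invoke` are illustrative and not part of any SDK, and because the response schema is not described on this page, the result is simply returned as parsed JSON.

```python
import base64
import json
import os
import urllib.request


def build_payload(rgb: bytes, depth: bytes) -> dict:
    """Base64-encode raw image bytes into the body shape used by the curl example."""
    return {
        "inputs": {
            "rgb": base64.b64encode(rgb).decode("ascii"),
            "depth": base64.b64encode(depth).decode("ascii"),
        }
    }


def build_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble a POST request with the bearer token and JSON content type."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def invoke(endpoint: str, rgb: bytes, depth: bytes, timeout: float = 30.0) -> dict:
    """Send one inference call. Assumes (not confirmed here) a JSON response body."""
    api_key = os.environ["NODE_DATA_API_KEY"]
    request = build_request(endpoint, api_key, build_payload(rgb, depth))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Pass the value printed by `deployment.endpoint` as `endpoint`. Keeping payload and request construction separate from the network call makes both easy to check offline.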

Billing

Hosted deployments are billed per active replica-second and per inference call. See pricing & revenue.
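As a rough illustration of how the two billing components combine, the sketch below adds replica-time charges to per-call charges. Both rates are placeholders invented for the example, and the function name is hypothetical; actual rates are listed on the pricing page.

```python
def estimate_hosted_cost(
    active_replica_seconds: float,
    inference_calls: int,
    rate_per_replica_second: float,
    rate_per_call: float,
) -> float:
    """Total = replica-seconds * per-second rate + calls * per-call rate."""
    return (
        active_replica_seconds * rate_per_replica_second
        + inference_calls * rate_per_call
    )


# Hypothetical rates, for illustration only.
# Two replicas active for one hour each, serving 10,000 calls in total.
cost = estimate_hosted_cost(
    active_replica_seconds=2 * 3600,
    inference_calls=10_000,
    rate_per_replica_second=0.0005,
    rate_per_call=0.0001,
)
print(f"{cost:.2f}")
```

Only active replica time counts toward the first term, so a deployment that is scaled to zero contributes nothing to it while idle.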

Jetson and edge packages

Node Factory generates a Jetson-ready package containing a quantized engine plan, a systemd unit, and the inference server. Packages are signed and verified at install time.

bash

# Generate a Jetson-ready package for an asset
# (fp16 is requested explicitly here; the Jetson Orin default is int8)
node-data export \
  --model acme/grasp-policy-v2 \
  --revision 1.4.0 \
  --target jetson-orin \
  --precision fp16 \
  --out ./pkg

# On the device
sudo dpkg -i ./pkg/nodedata-grasp-policy_1.4.0_arm64.deb
sudo systemctl enable --now nodedata-grasp-policy

ROS 2

The nodedata_runtime ROS package exposes any model as a node with configurable input and output topics. It supports both rclpy and rclcpp.

bash

# Launch the model as a ROS 2 node that reads /camera/rgb and publishes to /policy/action
ros2 launch nodedata_runtime model.launch.py \
  asset:=acme/grasp-policy-v2 \
  revision:=1.4.0 \
  topic_in:=/camera/rgb \
  topic_out:=/policy/action

Self-hosted container

dockerfile

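FROM nodedata/runtime:1
# Placeholder value: inject the real key at deploy time rather than baking it into the image.
ENV NODE_DATA_API_KEY=<provisioned-at-deploy>
# Fetch the pinned model revision at build time.
RUN nodedata pull acme/grasp-policy-v2 --revision 1.4.0
# Start the inference server on port 8080.
CMD ["nodedata", "serve", "--port", "8080"]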

Precision and quantization

Target	Default precision	Notes
H100 / A100	bf16	fp8 available for supported architectures
L4	fp16	int8 available when the asset ships calibration data
Jetson Orin	int8	fp16 fallback when calibration is unavailable
CPU	int8	onnxruntime with VNNI/AVX-512
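The defaults in the table above can be captured as a small lookup. This is a sketch, not platform code: the keys reuse the target identifiers that appear elsewhere on this page (`nvidia-l4`, `jetson-orin`, and so on), the mapping of table rows to those identifiers is an assumption, and `resolve_precision` is a hypothetical helper name.

```python
# Defaults transcribed from the table above. The keys are assumed to match
# the accelerator and export target identifiers used on this page.
DEFAULT_PRECISION = {
    "nvidia-h100": "bf16",
    "nvidia-a100": "bf16",
    "nvidia-l4": "fp16",
    "jetson-orin": "int8",
    "cpu": "int8",
}


def resolve_precision(target: str, has_calibration_data: bool = True) -> str:
    """Return the default precision for a target, applying the Jetson Orin fallback."""
    precision = DEFAULT_PRECISION[target]  # raises KeyError for unknown targets
    # Per the table, Jetson Orin falls back to fp16 when calibration is unavailable.
    if target == "jetson-orin" and not has_calibration_data:
        return "fp16"
    return precision
```

Optional upgrades from the Notes column (fp8 on H100 / A100, int8 on L4) are deliberately left out, since they depend on the model architecture or on calibration data rather than on the target alone.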

Logs and metrics

Structured logs streamed via GET /deployments/{id}/logs
p50 / p95 / p99 latency and throughput on the deployment dashboard
OpenTelemetry exporter for shipping traces to your own backend
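To reproduce p50 / p95 / p99 figures from your own latency samples, for example ones collected client-side, a minimal sketch using the nearest-rank method follows. The dashboard's exact percentile method is not stated on this page, so small differences from the displayed values are possible; the function names are illustrative.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """Compute the three percentiles shown on the deployment dashboard."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

Nearest-rank always returns an observed sample rather than an interpolated value, which keeps tail figures such as p99 tied to a real request.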
