
Infra And Ops

Last updated: 2026-06-28

Infrastructure and operations cover how Cortensor services are deployed, routed, monitored, and recovered. Operators use this reference when running routers, gateways, miners, validators, dashboards, and supporting services.

Operational Surfaces

| Surface | Operational focus |
| --- | --- |
| Fleet management | Host inventory, role assignment, secrets, binary version, and upgrade flow. |
| System services | Service units, restart policy, process ownership, and log paths. |
| Router pools | Pool naming, upstream URLs, session mappings, model capacity, and health checks. |
| Reverse proxy | Hostnames, TLS, allowed routes, upstream timeouts, streaming behavior, and request-size limits. |
| L2/L3 infrastructure | Chain ID, RPC, sequencer, explorer, bridge, gas token, and contract deployments. |
| Monitoring | Health endpoints, logs, metrics, alert thresholds, dashboard views, and incident history. |
| Incident response | First checks for routers, miners, queues, gateway, contracts, storage, and private payload handling. |

Ops Checklist

  • Active domains and endpoints.
  • Network matrix.
  • Secret-management rules.
  • Owner list for each service.
  • Last-tested deployment and rollback procedure.

Service Flow

flowchart TB
  DNS["DNS / TLS"]
  Proxy["Nginx / reverse proxy"]
  Gateway["Portal API gateway"]
  RouterPool["Router pool"]
  Nodes["Miners / validators / oracles"]
  Contracts["Contracts"]
  Dashboard["Dashboard"]
  Monitoring["Logs / metrics / alerts"]

  DNS --> Proxy
  Proxy --> Gateway
  Proxy --> RouterPool
  Gateway --> RouterPool
  RouterPool --> Nodes
  Nodes --> Contracts
  Dashboard --> Contracts
  RouterPool --> Monitoring
  Gateway --> Monitoring
  Nodes --> Monitoring
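
When triaging an outage it can help to ask which components sit behind a failing one, and which sit in front of it. The sketch below encodes the arrows from the flowchart above as a plain adjacency map and walks it in both directions. The `FLOW` dictionary and the `downstream` and `upstream` helpers are illustrative names, not part of any Cortensor tooling; the edges are copied from the diagram as written.

```python
from collections import deque

# Edges copied from the Service Flow diagram (source -> targets).
FLOW = {
    "DNS": ["Proxy"],
    "Proxy": ["Gateway", "RouterPool"],
    "Gateway": ["RouterPool", "Monitoring"],
    "RouterPool": ["Nodes", "Monitoring"],
    "Nodes": ["Contracts", "Monitoring"],
    "Dashboard": ["Contracts"],
}


def downstream(start: str) -> set[str]:
    """Return every component reachable from `start` by following the arrows."""
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in FLOW.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def upstream(target: str) -> set[str]:
    """Return every component that has a path to `target`."""
    return {src for src in FLOW if target in downstream(src)}


if __name__ == "__main__":
    # Components whose traffic passes through the router pool.
    print(sorted(upstream("RouterPool")))
    # Components the reverse proxy ultimately depends on.
    print(sorted(downstream("Proxy")))
```

Under this reading of the diagram, a router pool failure is visible from DNS, the proxy, and the gateway, while the dashboard is unaffected because it talks to the contracts directly.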

Incident Runbook

| Incident | First checks |
| --- | --- |
| Router unavailable | Process status, bind port, reverse proxy health, /status, upstream session IDs. |
| Gateway errors | API-key verification, model mapping, router pool URL, usage database writes. |
| Miner not receiving tasks | Registration, session membership, queue state, model capacity, logs. |
| Validation failing | Validator session IDs, verdict policy, raw replica payloads, queue lifecycle. |
| Private payload failure | Encryption seed, allowlist, off-chain storage credentials, URN reachability. |
| Dashboard mismatch | Selected network, contract addresses, wallet permissions, RPC status. |
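
One way to keep the runbook usable during an incident is to hold it as data that an on-call script or chat bot can query. The following is a minimal sketch under that assumption: `RUNBOOK`, `first_checks`, and `format_checklist` are hypothetical names, and the entries are copied from the runbook above in the order given. It does not run any checks itself; it only returns the list an operator would work through.

```python
# First checks copied from the Incident Runbook, kept in the order listed.
RUNBOOK = {
    "Router unavailable": [
        "Process status", "Bind port", "Reverse proxy health",
        "/status", "Upstream session IDs",
    ],
    "Gateway errors": [
        "API-key verification", "Model mapping",
        "Router pool URL", "Usage database writes",
    ],
    "Miner not receiving tasks": [
        "Registration", "Session membership", "Queue state",
        "Model capacity", "Logs",
    ],
    "Validation failing": [
        "Validator session IDs", "Verdict policy",
        "Raw replica payloads", "Queue lifecycle",
    ],
    "Private payload failure": [
        "Encryption seed", "Allowlist",
        "Off-chain storage credentials", "URN reachability",
    ],
    "Dashboard mismatch": [
        "Selected network", "Contract addresses",
        "Wallet permissions", "RPC status",
    ],
}


def first_checks(incident: str) -> list[str]:
    """Look up the first checks for an incident, ignoring case and padding."""
    key = incident.strip().lower()
    for name, checks in RUNBOOK.items():
        if name.lower() == key:
            return list(checks)  # copy so callers cannot mutate the runbook
    raise KeyError(f"no runbook entry for {incident!r}")


def format_checklist(incident: str) -> str:
    """Render the checks as a numbered plain-text list for pasting into a ticket."""
    lines = [incident.strip()]
    for i, check in enumerate(first_checks(incident), start=1):
        lines.append(f"  {i}. {check}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_checklist("Gateway errors"))
```

Keeping the lookup case-insensitive is a convenience choice for the sketch; an unknown incident name raises `KeyError` rather than returning an empty list, so a typo is not mistaken for "nothing to check".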

Operator Handoff

Each public environment exposes a small operator handoff record:

| Field | Purpose |
| --- | --- |
| Environment | Names the deployment, such as Testnet-0, Testnet-1, Mainnet-Lite, or Mainnet-Full. |
| Public endpoints | Lists dashboard, Portal, gateway, router, RPC, explorer, and bridge URLs where applicable. |
| Binary version | Identifies the supported cortensord release for nodes in that environment. |
| Route families | Lists enabled completion, delegate, validate, factcheck, trial, x402, MCP, and A2A routes. |
| Contract matrix | Lists verified module addresses and explorer links for the environment. |
| Support path | Names the support channel for node registration, API access, billing, or incidents. |
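
The handoff record could be modelled as a small typed structure so that an incomplete record is caught before it is published. This is a sketch only: the `HandoffRecord` class, its field types, and the `missing_fields` helper are assumptions made for illustration, and the endpoint URL and version string in the usage example are placeholders, not real Cortensor values.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HandoffRecord:
    """One operator handoff record; field names mirror the fields listed above."""
    environment: str                                            # e.g. "Testnet-0"
    public_endpoints: dict[str, str] = field(default_factory=dict)  # label -> URL
    binary_version: str = ""                                    # supported cortensord release
    route_families: list[str] = field(default_factory=list)    # enabled route families
    contract_matrix: dict[str, str] = field(default_factory=dict)   # module -> address
    support_path: str = ""                                      # support channel

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


if __name__ == "__main__":
    record = HandoffRecord(
        environment="Testnet-0",
        public_endpoints={"dashboard": "https://dashboard.example.invalid"},  # placeholder
        binary_version="vX.Y.Z",  # placeholder, not a real release
        route_families=["completion", "validate"],
    )
    # Lists the fields that still need to be filled in before handoff.
    print(record.missing_fields())
```

Treating any empty value as missing is a deliberately strict choice; an environment that genuinely has no bridge or explorer would still list the endpoints it does have, in line with the "where applicable" wording above.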