Infra And Ops
Last updated: 2026-06-28
Infrastructure and operations cover how Cortensor services are deployed, routed, monitored, and recovered. Operators use this reference when running routers, gateways, miners, validators, dashboards, and supporting services.
Operational Surfaces
| Surface | Operational focus |
|---|---|
| Fleet management | Host inventory, role assignment, secrets, binary version, and upgrade flow. |
| System services | Service units, restart policy, process ownership, and log paths. |
| Router pools | Pool naming, upstream URLs, session mappings, model capacity, and health checks. |
| Reverse proxy | Hostnames, TLS, allowed routes, upstream timeouts, streaming behavior, and request-size limits. |
| L2/L3 infrastructure | Chain ID, RPC, sequencer, explorer, bridge, gas token, and contract deployments. |
| Monitoring | Health endpoints, logs, metrics, alert thresholds, dashboard views, and incident history. |
| Incident response | First checks for routers, miners, queues, gateway, contracts, storage, and private payload handling. |
Ops Checklist
- Active domains and endpoints.
- Network matrix.
- Secret-management rules.
- Owner list for each service.
- Last-tested deployment and rollback procedure.
Service Flow
flowchart TB
DNS["DNS / TLS"]
Proxy["Nginx / reverse proxy"]
Gateway["Portal API gateway"]
RouterPool["Router pool"]
Nodes["Miners / validators / oracles"]
Contracts["Contracts"]
Dashboard["Dashboard"]
Monitoring["Logs / metrics / alerts"]
DNS --> Proxy
Proxy --> Gateway
Proxy --> RouterPool
Gateway --> RouterPool
RouterPool --> Nodes
Nodes --> Contracts
Dashboard --> Contracts
RouterPool --> Monitoring
Gateway --> Monitoring
Nodes --> Monitoring
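The flowchart can also be read as a dependency graph, which helps when estimating the blast radius of an outage. The sketch below is illustrative only: it copies the edges from the diagram into a plain dictionary, and the helper names `downstream` and `upstream` are hypothetical, not part of any Cortensor tooling.

```python
from collections import deque

# Edges copied from the flowchart; keys and values are the diagram's node IDs.
EDGES = {
    "DNS": ["Proxy"],
    "Proxy": ["Gateway", "RouterPool"],
    "Gateway": ["RouterPool", "Monitoring"],
    "RouterPool": ["Nodes", "Monitoring"],
    "Nodes": ["Contracts", "Monitoring"],
    "Dashboard": ["Contracts"],
}


def downstream(start):
    """Return every component reachable from `start` by following the arrows."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def upstream(target):
    """Return every component whose traffic can reach `target`."""
    return {node for node in EDGES if target in downstream(node)}


# Components with no direct arrow into Monitoring in the diagram.
unmonitored = sorted(n for n, outs in EDGES.items() if "Monitoring" not in outs)

print(sorted(upstream("RouterPool")))
print(unmonitored)
```

Read this way, a router pool outage affects everything in `upstream("RouterPool")`, and `unmonitored` lists the components the diagram does not connect directly to logs, metrics, or alerts.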
Incident Runbook
| Incident | First checks |
|---|---|
| Router unavailable | Process status, bind port, reverse proxy health, /status, upstream session IDs. |
| Gateway errors | API-key verification, model mapping, router pool URL, usage database writes. |
| Miner not receiving tasks | Registration, session membership, queue state, model capacity, logs. |
| Validation failing | Validator session IDs, verdict policy, raw replica payloads, queue lifecycle. |
| Private payload failure | Encryption seed, allowlist, off-chain storage credentials, URN reachability. |
| Dashboard mismatch | Selected network, contract addresses, wallet permissions, RPC status. |
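One way to make the runbook repeatable is to encode the table as data and walk the first checks in order. The following is a minimal sketch under stated assumptions: the `RUNBOOK` entries are copied from the table, while `triage` and the `probes` mapping are hypothetical helpers. Each probe is a callable the operator supplies, and any check without a probe is reported as a manual step.

```python
from typing import Callable, Dict, List, Tuple

# First checks per incident, in the order the table lists them.
RUNBOOK: Dict[str, List[str]] = {
    "Router unavailable": ["Process status", "Bind port", "Reverse proxy health",
                           "/status", "Upstream session IDs"],
    "Gateway errors": ["API-key verification", "Model mapping",
                       "Router pool URL", "Usage database writes"],
    "Miner not receiving tasks": ["Registration", "Session membership",
                                  "Queue state", "Model capacity", "Logs"],
    "Validation failing": ["Validator session IDs", "Verdict policy",
                           "Raw replica payloads", "Queue lifecycle"],
    "Private payload failure": ["Encryption seed", "Allowlist",
                                "Off-chain storage credentials", "URN reachability"],
    "Dashboard mismatch": ["Selected network", "Contract addresses",
                           "Wallet permissions", "RPC status"],
}


def triage(incident: str,
           probes: Dict[str, Callable[[], bool]]) -> List[Tuple[str, str]]:
    """Run checks in table order and stop at the first failing probe.

    Each result is "ok", "fail", or "manual" (no automated probe supplied).
    """
    results = []
    for check in RUNBOOK[incident]:
        probe = probes.get(check)
        if probe is None:
            results.append((check, "manual"))
        elif probe():
            results.append((check, "ok"))
        else:
            results.append((check, "fail"))
            break  # Later checks usually depend on this one passing.
    return results


# Usage with stand-in probes; real probes would inspect the host or service.
report = triage("Router unavailable", {
    "Process status": lambda: True,
    "Bind port": lambda: False,
})
print(report)
```

Stopping at the first failure is a design choice for this sketch: it keeps the report short and points at the earliest broken layer. Operators who prefer a full sweep can remove the `break`.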
Operator Handoff
Each public environment exposes a small operator handoff record:
| Field | Purpose |
|---|---|
| Environment | Names the deployment, such as Testnet-0, Testnet-1, Mainnet-Lite, or Mainnet-Full. |
| Public endpoints | Lists dashboard, Portal, gateway, router, RPC, explorer, and bridge URLs where applicable. |
| Binary version | Identifies the supported cortensord release for nodes in that environment. |
| Route families | Lists enabled completion, delegate, validate, factcheck, trial, x402, MCP, and A2A routes. |
| Contract matrix | Lists verified module addresses and explorer links for the environment. |
| Support path | Names the support channel for node registration, API access, billing, or incidents. |
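The handoff record maps naturally onto a small typed structure. The sketch below is one possible shape, not a published Cortensor schema: the field names mirror the table, while the class name `HandoffRecord`, the field types, and the `missing_fields` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class HandoffRecord:
    """Hypothetical container for the handoff fields listed in the table."""
    environment: str = ""                       # e.g. "Testnet-0"
    public_endpoints: Dict[str, str] = field(default_factory=dict)  # name -> URL
    binary_version: str = ""                    # supported cortensord release
    route_families: List[str] = field(default_factory=list)
    contract_matrix: Dict[str, str] = field(default_factory=dict)   # module -> address
    support_path: str = ""


def missing_fields(record: HandoffRecord) -> List[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Usage: a partially filled record for a test environment.
record = HandoffRecord(
    environment="Testnet-0",
    route_families=["completion", "validate"],
)
print(missing_fields(record))
```

A check like `missing_fields` could run before an environment is announced, so that an incomplete record is caught before operators rely on it.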