Owner: @ash (arch) + @alex (ops). Ratification: binding decisions accepted as ADR-0008 on 2026-04-27. Per-component implementation lives in the relevant ansible roles (configs/ansible/roles/{patroni,redis-sentinel,haproxy,prometheus,loki}/) and the per-region storage strategy is captured in ADR-0016. This plan is the umbrella that binds them.
The HA story is constrained by three non-negotiable service targets:
- p95 ≤ 200 ms, p99 ≤ 500 ms (Performance SLA).
- ≥ 99.9 % responsiveness — we commit to 99.99 %.
- ≤ 30 s data freshness.
Every topology decision below is traced back to which of these numbers it protects.
1. Design principles
- Single-region HA first; multi-region DR second. The initial
build window forces us to ship a strong single-region deployment at launch, with cold DR in cloud. Multi-region active/active is explicitly out of scope for v1.
- Ingest must never block serving. If the ingestion plane slows,
the serving plane serves stale-marked responses — it does not error.
- Decouple hot from cold. Anything in the 30-second serving
hot path lives in Redis; everything older reads from TimescaleDB; archive + replay source is MinIO. Three tiers, three failure domains.
- No single machine's failure takes us below SLA. Redundancy is
N+1 at minimum for every stateful component; the Tier-1 three-validator aspiration (ADR-0004) provides N+2 for the validator / history-archive layer.

  > Update (2026-04-23): stellar-rpc was removed from production ingest and is now diagnostics-only (rpc-probe + fixture capture). The N+2 redundancy goal applies to the validators per ADR-0004; it does not describe stellar-rpc, which is no longer on the ingest path. The §3.2 sizing below predates this removal; see docs/operations/r1-deployment-state.md.
- Every component has a "degraded mode" defined up front. If
Aquarius ingestion dies, what does /v1/price?asset=… return? The answer is in §9, not invented during an incident.
- We own the hardware we need to own. Captive-core + Galexie on
colocated R640s (per ADR). Everything stateless lives in cloud. Cloud is DR target; colo is primary.
2. Physical topology
Cross-region context: this section shows the per-region layout — one full stack, as deployed in each of three regions. The 3-region architecture (primary / sync-replica / async-replica, graceful degradation across regions) lives in infrastructure/multi-region-topology.md. The phased 1 → 3 validator rollout lives in infrastructure/validator-rollout.md. Per-node hardware spec is in infrastructure/archival-node-spec.md. At launch we run exactly one region (R1, Hetzner FSN1 in Falkenstein, DE) with this topology; R2 (AWS us-east-1) and R3 (Vultr Singapore) join post-launch with the same per-region shape, modified per ADR-0016 for each provider's storage economics.
2.1 Primary (colo)

                           ┌──────────────────────────────┐
                           │ Internet (Anycast + CDN/WAF) │
                           └──────────────┬───────────────┘
                                          │
                 ┌────────────────────────┴────────────────────────┐
                 │                                                 │
          ┌──────┴──────┐                                   ┌──────┴──────┐
          │  HAProxy-A  │         (keepalived VIP)          │  HAProxy-B  │
          └──────┬──────┘                                   └──────┬──────┘
                 │                                                 │
                 └────────────────────────┬────────────────────────┘
                                          │
                ┌─────────────────────────┴─────────────────────────┐
                │            stellarindex-api pool (N=3)            │  stateless
                └─────────────────────────┬─────────────────────────┘
                                          │
           ┌──────────────────────────────┼────────────────────────────┐
           │                              │                            │
    ┌──────┴───────┐          ┌───────────┴──────────┐       ┌─────────┴────────┐
    │Redis Sentinel│          │ TimescaleDB          │       │ MinIO (erasure)  │
    │ (1 primary,  │          │ Patroni-managed HA:  │       │ EC(6+3) on 9     │
    │  2 replicas, │          │   1 primary +        │       │ hosts; bucket    │
    │ 3 Sentinels) │          │   2 sync replicas    │       │ versioning on.   │
    └──────┬───────┘          └───────────┬──────────┘       └─────────┬────────┘
           │                              │                            │
           └──────────────────────────────┼────────────────────────────┘
                                          │
                          ┌───────────────┴──────────────┐
                          │  stellarindex-aggregator     │  one active, one
                          │  (leader-elected via Redis)  │  standby
                          └───────────────┬──────────────┘
                                          │
                          ┌───────────────┴──────────────┐
                          │  stellarindex-indexer pool   │  one shard per source
                          │  (per-source orchestration)  │
                          └────┬──────────┬──────────┬───┘
                               │          │          │
                         ┌─────┴───┐ ┌────┴────┐ ┌───┴─────┐
                         │ core A  │ │ core B  │ │ rpc A/B │
                         │+galexie │ │+galexie │ │(captive)│
                         │ (R640)  │ │ (R640)  │ │ (R640)  │
                         └─────────┘ └─────────┘ └─────────┘

Component counts in §3. Each box is ≥ N+1; the stellar-core nodes are N+2 because the Tier-1 aspiration (ADR-0004) requires three independent archives post-launch. The Redis box reflects the Sentinel topology ratified in ADR-0024 (§3.4).
2.2 DR (cloud — AWS primary)
- Stateless services (`stellarindex-api`, `stellarindex-aggregator`) warm-standby in AWS. Scale-to-zero when not failing over; scale-out on DNS flip.
- TimescaleDB async logical replica (not streaming: it crosses a WAN, and pglogical is resilient) at AWS RDS with a 5-minute RPO budget.
- Redis is not replicated cross-region: the DR cache starts cold. Acceptable because Redis is a cache; it re-hydrates from Timescale within minutes of failover.
- MinIO: Veeam-style replication to S3 via `mc mirror --overwrite`; RPO 1 h for the archive bucket. galexie-live/ is replicated every 5 min.
- stellar-core and stellar-rpc not replicated to cloud — they are
rebuilt from our own MinIO archive on DR activation (~4 hours to CATCHUP_RECENT). This is intentional: running captive-cores in AWS violates our cost envelope.
2.3 Why colo primary, cloud DR
Already ratified in ADR-0002 alternatives: captive-cores at 8 vCPU / 32 GB / large NVMe scale are ~3× cheaper on dedicated hardware than cloud IOPS-matched instances. The colocated R640 fleet is already provisioned; cloud is pay-as-you-use for DR.
3. Component-by-component HA
3.1 stellar-core + galexie fleet
- Instances: 3 × R640 (`core-01`, `core-02`, `core-03`). Each runs captive-core in CATCHUP_RECENT mode + galexie in live-export mode writing to the shared MinIO galexie-live/.
- Quorum set: our 3 nodes vote with the public SDF + 3 Tier-1
organisations (SDF, LOBSTR, Satoshipay) per the Tier-1 aspiration (ADR-0004).
- Failure mode: 1 captive-core down → aggregator continues
reading from the surviving 2 nodes' Galexie output (dedup by (ledger, hash) on ingest). 2 captive-cores down → SEV-1; the third + SDF public quorum keeps us writing ledgers but freshness degrades toward 30 s ceiling.
- RPO: 0 for live ledger (captive-core replays on restart from
local state); 5 min for galexie export (configurable).
- RTO: 2 min failover (the ingester stops reading node X and
starts reading node Y). Captive-core restart: 10–20 min from cold to catchup-recent if local state is intact; 4 h if it has to rebuild from our MinIO archive.
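
The dedup-by-(ledger, hash) rule in the failure mode above can be sketched as follows. This is a minimal illustration, not the ingester's actual code: `LedgerExport`, `dedup_exports` and the field names are hypothetical, and a long-running process would need to bound the `seen` set (for example by dropping keys below the persisted cursor).

```python
from typing import Iterable, Iterator, NamedTuple


class LedgerExport(NamedTuple):
    """One exported ledger as seen from one core node (hypothetical shape)."""
    ledger: int
    ledger_hash: str
    node: str


def dedup_exports(exports: Iterable[LedgerExport]) -> Iterator[LedgerExport]:
    """Yield the first export seen for each (ledger, hash) key; drop repeats."""
    seen: set[tuple[int, str]] = set()
    for export in exports:
        key = (export.ledger, export.ledger_hash)
        if key in seen:
            continue  # another node already delivered this ledger
        seen.add(key)
        yield export


if __name__ == "__main__":
    stream = [
        LedgerExport(100, "aa", "core-01"),
        LedgerExport(100, "aa", "core-02"),  # same ledger from a second node
        LedgerExport(101, "bb", "core-02"),
        LedgerExport(101, "bb", "core-03"),
    ]
    for export in dedup_exports(stream):
        print(export.ledger, export.ledger_hash, export.node)
```

Because the key is the pair rather than the ledger number alone, two nodes that disagree on a ledger's hash both pass through, which is a condition worth alerting on rather than silently dropping.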
3.2 stellar-rpc
- Instances: 2 × stellar-rpc on `core-01` and `core-02` (one captive-core each).
- Why 2 not 3: stellar-rpc's SQLite is not a cluster; each
instance is independent. Two is enough for serving live-event subscriptions.
- Failure mode: 1 rpc down → `stellarindex-indexer` switches its getEvents stream to the survivor via the configured [stellar_rpc].endpoints array.
- Historic event retention: we do not rely on stellar-rpc
for historic event queries. Historic goes through Galexie → our own events hypertable. This sidesteps the SQLite ceiling flagged in the adversarial audit §6c.
3.3 TimescaleDB cluster
- Topology: Patroni-managed primary + 2 synchronous replicas on
3 separate R640s (db-01, db-02, db-03). synchronous_commit=remote_apply, synchronous_standby_names='ANY 1 (db-02, db-03)'.
- Etcd quorum: 3 nodes for Patroni leader election.
- PgBouncer: a pair with keepalived VIP in front of the
Patroni cluster. Transaction-mode pooling. Pool size sized against PostgreSQL max_connections and the api-pod count.
- Hypertables:
  - trades: partitioned by ts daily; raw rows kept forever (migration 0031 removed the old 90-day retention; invariant 8: storage is not a constraint, and Postgres is the served tier, not the full archive). Compression still applies for space.
  - oracle_updates: same shape as trades, smaller volume.
  - prices_1m, prices_15m, prices_1h, prices_4h, prices_1d, prices_1w, prices_1mo: continuous aggregates (CAGGs) with add_continuous_aggregate_policy.
  - soroban_events: the ADR-0029 Soroban-event landing zone the projector tails (compressed after a window).
  - asset_supply_history: supply-history hypertable; retention indefinite.
  - asset_metadata: ordinary table (small).
- Retention policy (invariant 8, ADR-0034):
  - trades raw: indefinite (no drop_after; migration 0031 removed the rogue 90-day policy. If you see one on trades, it's drift; remove it).
  - prices_1m, prices_15m: retention also removed; indefinite.
  - prices_1h, prices_4h, prices_1d, prices_1w, prices_1mo: indefinite (daily OHLC spans back to 2015).
- Backup:
  - pgBackRest to MinIO with WAL-stream, --type=full weekly, --type=diff daily, --type=incr hourly.
  - RPO 5 min (WAL archiving lag SLA).
  - Restore test: monthly automated; reported in ops dashboard.
- Failover: Patroni leader election. Target RTO 60 s.
- Connection secret: read via secret manager at startup, never
on disk.
- Cross-region consistency: API endpoints reading from the CAGG
tables serve only closed buckets per ADR-0015; the in-progress window is never exposed. This makes "all 3 regions return the same rate" a property of the design rather than a hopeful side-effect of replication latency. See the ADR for trade-offs and the ≤30 s freshness contract it implies.
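
The closed-bucket rule can be illustrated with a small sketch that computes the newest bucket an API handler would be allowed to serve. It assumes fixed-width buckets aligned to the Unix epoch (reasonable for 1m/15m/1h; weekly and monthly buckets need calendar-aware alignment), and `last_closed_bucket` is a hypothetical helper, not a function from the codebase or from TimescaleDB.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_closed_bucket(now: datetime, width: timedelta) -> tuple[datetime, datetime]:
    """Return [start, end) of the newest bucket that has fully elapsed at `now`."""
    # Start of the bucket that is still filling: floor `now` to a multiple of width.
    in_progress_start = EPOCH + ((now - EPOCH) // width) * width
    # The bucket immediately before it is the newest one that is safe to serve.
    return in_progress_start - width, in_progress_start


if __name__ == "__main__":
    now = datetime(2026, 1, 1, 12, 34, 56, tzinfo=timezone.utc)
    for label, width in [("1m", timedelta(minutes=1)), ("1h", timedelta(hours=1))]:
        start, end = last_closed_bucket(now, width)
        print(label, start.isoformat(), end.isoformat())
```

Since the result depends only on the wall clock and the bucket width, every region computes the same boundary regardless of how far its replica lags, provided the replica has already received that bucket.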
3.4 Redis Sentinel cluster
Amended 2026-05-01 to remove the Cluster-vs-Sentinel contradiction the original draft had. Ratified by ADR-0024. The redis-sentinel ansible role (role docs) deploys this exact topology.

- Topology: 1 primary + 2 replicas, Redis Sentinel mode (no sharding). 3 Sentinels co-located on the same 3 cache hosts; Sentinel quorum = 2.
- Why Sentinel, not Cluster: our hot-set is small
(~few GB across all categories below); sharding adds operational tax without solving capacity. Sentinel is simpler at SEV-1 time and the migration to Cluster, if we ever need it, is a one-time cost rather than an ongoing tax. Full reasoning in ADR-0024.
- Client connection model: clients use `go-redis/v9`'s `NewFailoverClient` and ask any Sentinel for the current primary. There is no VIP or HAProxy in front of Redis; the client SDK does the discovery itself. (This is why the Redis sub-role of Task #72 ships standalone: no companion HAProxy role is required for cache, only for Postgres.)
- Data categories:
  - Hot prices: key price:<asset> → latest aggregated price JSON, TTL 60 s (refreshed by aggregator).
  - VWAP precompute: key vwap:<pair>:<tf> → value + computed-at, TTL matches the window.
  - Rate-limit buckets: key rl:<api_key>:<min>, TTL 120 s.
  - SEP-1 / home-domain cache: key toml:<domain>, TTL 15 min.
  - Asset-metadata cache: key meta:<asset>, TTL 5 min.
  - SSE subscriber registry: key sub:<channel>, no TTL (heartbeat).
- Failure mode: master down → Sentinel failover, 15–30 s
window. During the window stellarindex-api returns stale_flag=true on affected keys (pulls from Timescale as fallback).
- Persistence: AOF every-second. RDB nightly. We do not
rely on Redis persistence for correctness — a wiped Redis re-hydrates from Timescale within 2 min (the stellarindex-aggregator re-warms).
- Caveat: token-bucket rate-limit resets on a wipe (users get
a grace minute). Acceptable.
3.5 MinIO
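- Topology: 9 nodes with EC(6+3) erasure coding (6 data + 3 parity shards). Tolerates 3 node failures with reads and writes intact; with 4 or more nodes down, fewer than the 6 shards needed to reconstruct an object remain, so affected objects are unavailable, and lost if those nodes cannot be recovered.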
- Buckets:
  - galexie-live/: current captive-core exports, versioning on, object-lock mode COMPLIANCE disabled (we may re-export).
  - galexie-archive/: immutable past exports, object-lock COMPLIANCE on, 1-year retention.
  - backups/: Timescale pgBackRest, object-lock off.
  - docs/: docs site build artefacts, public-read.
- Replication: bucket-level replication to AWS S3 via `mc mirror` every 5 min for live, 1 h for archive. Runs from ops-01 with a circuit-breaker (if replication lag > 30 min, page).
- Upgrade strategy: rolling, one node at a time; Galexie
retries transient writes with exponential backoff.
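
The EC(6+3) arithmetic can be checked with a short sketch. It uses generic k-of-n erasure-coding reasoning (any 6 of the 9 shards reconstruct an object) and assumes one shard per node per object; MinIO's real quorum rules have more detail. The 20 TB-per-node figure is an assumption derived from the 180 TB raw total in §12, and both function names are hypothetical.

```python
def ec_profile(data_shards: int, parity_shards: int, tb_per_node: float) -> dict:
    """Summarise capacity and fault tolerance for a k+m erasure-coded pool."""
    nodes = data_shards + parity_shards
    return {
        "nodes": nodes,
        "raw_tb": nodes * tb_per_node,
        "usable_tb": data_shards * tb_per_node,     # parity shards carry no extra data
        "node_failures_tolerated": parity_shards,   # one lost shard per failed node
    }


def objects_readable(data_shards: int, parity_shards: int, failed_nodes: int) -> bool:
    """An object can be rebuilt only while at least `data_shards` shards survive."""
    surviving = data_shards + parity_shards - failed_nodes
    return surviving >= data_shards


if __name__ == "__main__":
    print(ec_profile(6, 3, tb_per_node=20))
    for failed in range(0, 7):
        print(failed, "nodes down ->", "readable" if objects_readable(6, 3, failed) else "unavailable")
```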
3.6 stellarindex-api pool
- Instances: 3 pods on 3 hosts, stateless, behind HAProxy.
- Health checks: HTTP `/healthz` (shallow: process up) and `/readyz` (deep: every registered ReadyChecker polled in parallel). Wave-110 split the checkers into critical (Postgres → 503) and non-critical (Redis → 200 with status="degraded"). HAProxy routes only to readyz=200, so a Redis-only outage no longer drains the pool; cache misses fall through to Timescale per ADR-0007.
- Autoscaling: static 3 at launch; target 50% CPU. Scale-up
requires an operational decision; we do not let the autoscaler paper over a bug.
- Rolling deploy: 1-at-a-time, 60 s drain, 30 s settle.
- Graceful shutdown: 30 s for in-flight requests + SSE
connections (SSE peers re-connect to the new pod automatically).
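
The critical / non-critical readiness split described in the health-check bullet can be sketched as a small fold over probe results. This is an illustration only: `ReadyCheck` and `readyz` are hypothetical names, the probes run sequentially here rather than in parallel, and the `"unavailable"` and `"ok"` status strings are assumptions (only `"degraded"` appears in the plan).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReadyCheck:
    name: str
    critical: bool               # True: a failure takes the pod out of rotation
    probe: Callable[[], bool]    # returns True when the dependency is healthy


def readyz(checks: list[ReadyCheck]) -> tuple[int, dict]:
    """Fold individual probe results into one HTTP status and body."""
    failed_critical, failed_other = [], []
    for check in checks:
        try:
            healthy = check.probe()
        except Exception:
            healthy = False      # a probe that raises counts as a failure
        if not healthy:
            (failed_critical if check.critical else failed_other).append(check.name)

    if failed_critical:
        return 503, {"status": "unavailable", "failed": failed_critical + failed_other}
    if failed_other:
        return 200, {"status": "degraded", "failed": failed_other}
    return 200, {"status": "ok", "failed": []}


if __name__ == "__main__":
    checks = [
        ReadyCheck("postgres", critical=True, probe=lambda: True),
        ReadyCheck("redis", critical=False, probe=lambda: False),
    ]
    print(readyz(checks))
```

Keeping the 200 status for a Redis-only failure is what stops HAProxy from draining all three pods at once during a Sentinel failover.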
3.7 stellarindex-aggregator
- Instances: 2, leader-elected via Redis `SET key NX EX 30` with periodic renewal.
- Role: reads the `trades` hypertable, computes running VWAP/TWAP, writes to Redis hot-key + Timescale precompute tables.
- Why leader-elected instead of sharded: our aggregation compute load is small (under one CPU core at current market volume); a single active instance is simpler and preserves strict ordering.
- Failure mode: leader dies → standby acquires lock within 30 s;
Redis hot keys stale-flag for ≤ 30 s until the new leader writes fresh values.
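
The lock lifecycle behind `SET key NX EX 30` can be simulated without a Redis server. The sketch below uses an in-memory stand-in (`FakeRedis`, hypothetical) that models only set-if-absent with expiry and owner-checked renewal; the key name `aggregator:leader` and the instance ids are assumptions, and time is passed in explicitly so the example is deterministic.

```python
class FakeRedis:
    """In-memory stand-in for the two operations used here (not a real client)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}  # key -> (value, expires_at)

    def _live_value(self, key: str, now: float):
        item = self._data.get(key)
        if item is None or item[1] <= now:
            self._data.pop(key, None)   # expired keys behave as absent
            return None
        return item[0]

    def set_nx_ex(self, key: str, value: str, ttl: float, now: float) -> bool:
        """Models SET key value NX EX ttl: succeeds only if the key is absent."""
        if self._live_value(key, now) is not None:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def renew(self, key: str, value: str, ttl: float, now: float) -> bool:
        """Extend the TTL only if `value` still owns the key."""
        if self._live_value(key, now) != value:
            return False
        self._data[key] = (value, now + ttl)
        return True


LOCK_KEY = "aggregator:leader"   # hypothetical key name
LOCK_TTL = 30.0                  # seconds, matching EX 30


def try_lead(store: FakeRedis, instance_id: str, now: float) -> bool:
    """Return True if `instance_id` is the leader after this attempt."""
    return store.renew(LOCK_KEY, instance_id, LOCK_TTL, now) or store.set_nx_ex(
        LOCK_KEY, instance_id, LOCK_TTL, now
    )


if __name__ == "__main__":
    store = FakeRedis()
    print(try_lead(store, "agg-a", now=0))    # agg-a acquires the lock
    print(try_lead(store, "agg-b", now=10))   # standby is refused
    print(try_lead(store, "agg-a", now=10))   # leader renews; lock now expires at t=40
    # agg-a crashes here and stops renewing.
    print(try_lead(store, "agg-b", now=39))   # still refused
    print(try_lead(store, "agg-b", now=40))   # lock expired; standby takes over
```

Against a real Redis, the owner-checked renewal has to be atomic (typically a small server-side script) so that a slow former leader cannot extend a lock it no longer holds.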
3.8 stellarindex-indexer fleet
- Topology: one `stellarindex-indexer` process per source (SDEX, Soroswap, Aquarius, Phoenix, Comet, Blend, Reflector, Redstone, Band, CEX, FX; roughly 11 processes).
- Cursors: persisted in Timescale per-source
(cursor(<source_id>)). On restart the indexer resumes from the saved cursor.
- Backfill: triggered via the `stellarindex-ops backfill` subcommand; writes into the same hypertable with idempotent upserts keyed on (source, ledger, tx_hash, op_index, ts).
- Failure mode: one source dies → others continue. The dead
source's freshness timer in Prometheus breaches the 60 s alarm; /v1/price for pairs that rely on that source sets reduced_redundancy=true in the envelope.
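
The idempotent-upsert property that makes backfill safe to re-run can be demonstrated with SQLite standing in for TimescaleDB. The table layout and the `price` column are assumptions for illustration; only the conflict key (source, ledger, tx_hash, op_index, ts) comes from the plan, and the real schema may update different columns on conflict.

```python
import sqlite3

SCHEMA = """
CREATE TABLE trades (
    source   TEXT    NOT NULL,
    ledger   INTEGER NOT NULL,
    tx_hash  TEXT    NOT NULL,
    op_index INTEGER NOT NULL,
    ts       INTEGER NOT NULL,
    price    REAL    NOT NULL,
    PRIMARY KEY (source, ledger, tx_hash, op_index, ts)
)
"""

UPSERT = """
INSERT INTO trades (source, ledger, tx_hash, op_index, ts, price)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT (source, ledger, tx_hash, op_index, ts)
DO UPDATE SET price = excluded.price
"""


def ingest(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Write rows in one transaction; replaying the same rows changes nothing."""
    with conn:
        conn.executemany(UPSERT, rows)


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    batch = [
        ("sdex", 100, "aa", 0, 1_700_000_000, 0.1234),
        ("sdex", 100, "aa", 1, 1_700_000_000, 0.1250),
    ]
    ingest(conn, batch)   # live ingest
    ingest(conn, batch)   # backfill replays the same ledger range
    print(row_count(conn))
```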
3.9 stellarindex-migrate
Not a long-running process. Runs before each deploy in a pre-start job. Uses PostgreSQL advisory lock pg_try_advisory_lock(...) to prevent two migrators from racing.
3.10 stellarindex-ops
Admin CLI. Runs from an operator's SSH session on ops-01. Top-level subcommands cover backfill, gap-detection, archive-completeness verify/check/fix, source decoder verification, RPC probe, archive hash-walking, and supply-snapshot generation. The authoritative list is the binary's own help output (stellarindex-ops --help) and the source at `cmd/stellarindex-ops/main.go`; operator runbooks under `docs/operations/runbooks/` cite the specific subcommand each playbook needs (e.g. runbooks/all-ingestion-down.md references stellarindex-ops backfill).
4. Capacity planning — napkin math
These are lower-bound estimates. Week 9 load-test supersedes them.
4.1 Traffic envelope
Assume 50 wallets × 200 active users each = 10 000 daily actives. A typical wallet asset-detail page makes ~5 API calls per render. Assume 10 renders per user per active day.
- Baseline: 10 000 × 10 × 5 = 500 000 requests/day = ~6 rps.
- Peak (everyone checks during a market move): ~60 rps.
- Service requirement: 1 000 req/min per client = ~17 rps per client.
Capacity target: 500 rps sustained, 2 000 rps burst. Sustained capacity is ~85× the ~6 rps baseline and ~8× the estimated peak; burst is ~30× peak. The headroom protects us through a year of growth.
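
The traffic envelope is simple enough to recompute directly. The sketch below uses only the figures stated in this section; the variable names are illustrative, and the 60 rps peak is taken as given rather than derived.

```python
SECONDS_PER_DAY = 86_400

wallets = 50
active_users_per_wallet = 200
renders_per_user_per_day = 10
api_calls_per_render = 5

daily_actives = wallets * active_users_per_wallet
requests_per_day = daily_actives * renders_per_user_per_day * api_calls_per_render
baseline_rps = requests_per_day / SECONDS_PER_DAY

peak_rps = 60.0                  # stated peak estimate
per_client_rps = 1_000 / 60      # 1 000 req/min per client
sustained_target_rps = 500.0
burst_target_rps = 2_000.0


def headroom(target_rps: float, load_rps: float) -> float:
    """How many times the given load fits inside the capacity target."""
    return target_rps / load_rps


if __name__ == "__main__":
    print(f"daily actives:       {daily_actives}")
    print(f"requests per day:    {requests_per_day}")
    print(f"baseline rps:        {baseline_rps:.2f}")
    print(f"per-client limit:    {per_client_rps:.1f} rps")
    print(f"sustained/baseline:  {headroom(sustained_target_rps, baseline_rps):.1f}x")
    print(f"sustained/peak:      {headroom(sustained_target_rps, peak_rps):.1f}x")
    print(f"burst/peak:          {headroom(burst_target_rps, peak_rps):.1f}x")
```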
4.2 Per-component headroom
| Component | Sustained need | Headroom target |
|---|---|---|
| stellarindex-api pods (Go, net/http) | 500 rps | 2 000 rps (4×) |
| PgBouncer | 500 qps most cached, ~100 qps actual Timescale | 1 000 qps (2×) |
| Timescale primary | 100 write-tps (trades) + ~50 read-qps | 500 write-tps (5×) |
| Redis | 5 000 ops/s (pre-+post-cache) | 50 000 ops/s (10×) |
| MinIO | 10 MB/s Galexie write, 50 MB/s backup replication | 400 MB/s (~6.7× the combined 60 MB/s) |
Single-pod Go net/http routinely serves 10 000 rps on a modern host for lightweight handlers. Our handlers are mostly "Redis GET → JSON encode → return." Hitting 2 000 rps across 3 pods (~670 rps per pod, or 1 000 rps per pod with one pod down) is comfortable.
4.3 Storage growth
trades: ~150 trades/sec sustained Stellar network-wide × 1 kB
average row = 150 kB/s = ~13 GB/day uncompressed.
- With TimescaleDB native columnar compression: expect 10×
reduction → ~1.3 GB/day compressed.
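- At that rate: ~500 GB/year post-compression. A single 1 TB NVMe disk lasts about 2 years before we need to add capacity (raw trades are retained indefinitely per invariant 8, so pruning is not the remedy).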
Validated assumption: Stellar's trade volume is roughly stable year-over-year at this order of magnitude. Re-measure post-launch.
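
The storage estimate can be reproduced from the stated inputs. This sketch uses decimal gigabytes and treats the 10× compression ratio as the assumption it is; the constant and function names are illustrative.

```python
TRADES_PER_SECOND = 150
ROW_BYTES = 1_000            # ~1 kB average row
COMPRESSION_RATIO = 10       # expected, to be re-measured post-launch
SECONDS_PER_DAY = 86_400
GB = 1_000_000_000           # decimal gigabyte

raw_gb_per_day = TRADES_PER_SECOND * ROW_BYTES * SECONDS_PER_DAY / GB
compressed_gb_per_day = raw_gb_per_day / COMPRESSION_RATIO
compressed_gb_per_year = compressed_gb_per_day * 365


def years_until_full(disk_gb: float) -> float:
    """Years of compressed trade data that fit on a disk of the given size."""
    return disk_gb / compressed_gb_per_year


if __name__ == "__main__":
    print(f"raw:        {raw_gb_per_day:.2f} GB/day")
    print(f"compressed: {compressed_gb_per_day:.2f} GB/day")
    print(f"per year:   {compressed_gb_per_year:.0f} GB")
    print(f"1 TB lasts: {years_until_full(1_000):.1f} years")
```

The estimate covers the trades hypertable only; continuous aggregates, indexes and WAL add to it.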
5. Failure matrix
| Component dies | Blast radius | Behaviour | Time-to-recover |
|---|---|---|---|
| 1 stellarindex-api pod | 33% reduced serving capacity | HAProxy routes to other 2; auto-restart | < 30 s |
| 2 stellarindex-api pods | 66% reduced | degraded SLA warning alert | 1–5 min manual intervention |
| Redis master | Cache writes unavailable until a replica is promoted (~30 s) | stale_flag on affected keys; /v1/readyz returns 200 with status="degraded" during the window (wave-110 critical/non-critical split: Redis is non-critical, cache misses fall through to Timescale); HAProxy keeps the backend in service | Sentinel failover 15–30 s |
| Timescale primary | Writes fail | Patroni elects replica; api switches read pool via PgBouncer | 30–60 s |
| PgBouncer pair | All DB access fails | Depends on keepalived VIP failover timing | 5–15 s |
| 1 stellar-core | Aggregator loses one ingest source | duplicate stream from others; dedup by hash | instant |
| All 3 stellar-core | No new ledger events | API returns stale_flag=true and 30 s-old data from cache | minutes–hours |
| 1 stellar-rpc | getEvents subscribers fall over to survivor | automatic | < 10 s |
| MinIO 1–3 nodes | EC(6+3) preserves reads/writes | auto-heal on replacement | hours |
| MinIO 4–6 nodes | Below the 6-shard quorum: reads and writes fail for affected objects | alert SEV-1 | hours–days |
| HAProxy active | Keepalived VIP failover to peer | < 2 s drop | < 2 s |
| Aggregator leader | Standby acquires leadership | stale hot-keys for ≤ 30 s | 30 s |
| Colo power | Full primary outage | manual DR activation to cloud | 4 h (per DR runbook) |
| Internet link to colo | API unreachable | DNS failover to cloud DR | 5 min |
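No single-component failure inside the colo breaches 99.9 % monthly (≤ 43 min/month). The site-level rows are the exception: a colo power outage (4 h to recover) exceeds the budget on its own. Two-component failures can also breach; they are catalogued above with response times.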
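
The downtime budget behind the 43-minute figure, and how each recovery time in the matrix compares to it, can be checked with a few lines. The recovery times are the upper bounds quoted in the table; a 30-day month is assumed, and the function names are illustrative.

```python
def downtime_budget_minutes(slo: float, days: int = 30) -> float:
    """Minutes of allowed downtime per period for an availability target."""
    return (1.0 - slo) * days * 24 * 60


def fits_budget(slo: float, recovery_seconds: float, days: int = 30) -> bool:
    """True if a single incident of this length stays within the budget."""
    return recovery_seconds <= downtime_budget_minutes(slo, days) * 60


RECOVERY_SECONDS = {
    "HAProxy VIP failover": 2,
    "Redis Sentinel failover": 30,
    "Patroni failover": 60,
    "Internet link (DNS failover)": 5 * 60,
    "Colo power (manual DR)": 4 * 3600,
}

if __name__ == "__main__":
    for slo in (0.999, 0.9999):
        print(f"SLO {slo:.2%}: budget {downtime_budget_minutes(slo):.1f} min / 30 d")
        for name, seconds in RECOVERY_SECONDS.items():
            verdict = "fits" if fits_budget(slo, seconds) else "breaches"
            print(f"  {name}: {verdict}")
```

Under the stricter 99.99 % commitment the monthly budget shrinks to a few minutes, so the same table reads very differently for the DNS-failover and DR rows.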
6. Security posture
Not the full threat model (that lives in docs/operations/threat-model.md, Week 9), but the HA-relevant items:
- Secrets: Vault (colo) + AWS Secrets Manager (cloud), cross-replicated via periodic sync. Application reads at startup via a sidecar; no secret ever on a disk outside Vault.
- TLS everywhere internal and external. Internal: mTLS between
api↔pgbouncer↔timescale and api↔redis. External: Let's Encrypt + HSTS.
- Network segmentation: Management VLAN, data VLAN, DMZ for
HAProxy. api pool has no egress except to Timescale, Redis, and logging. Indexers have egress only to pinned CEX/FX IP ranges + stellar-rpc.publicnode.com fallback.
- HSM for validator keys (ADR-0004) — YubiHSM-2 on two physical
hosts.
- Audit log: every `stellarindex-ops` command recorded to an append-only bucket. Admin surface requires 2FA via the jump host.
7. Observability
- Metrics: Prometheus pair (primary + replica); federated from
cloud Prometheus for DR. Retention: 30 d local, 1 y downsampled to MinIO via Thanos.
- Dashboards: Grafana — one dashboard per component + one
"Golden Signals" board (latency p50/p95/p99, error rate, saturation, traffic).
- Alerts: AlertManager → PagerDuty. Tiers:
  - P1: 99.9 % SLA-breaking; pages immediately.
  - P2: degraded; pages during business hours + daily summary.
  - P3: informational; ticketed.
- Tracing: OpenTelemetry → Tempo. Sampling 100 % in development,
10 % in production, 100 % on errors.
- Logs: structured JSON via zerolog; shipped to Loki with
14-day retention + 1 y cold.
Alerts already sketched in docs/operations/alerts-catalog.md (Week 9).
8. Backup & restore
| Asset | Tool | Frequency | RPO | Retention | Restore drill |
|---|---|---|---|---|---|
| Timescale | pgBackRest → MinIO | full weekly, diff daily, incr hourly, WAL stream | 5 min | 90 d full, 3 y incr | Monthly to db-drill-01 |
| Redis | AOF | every second | 1 s | AOF last 7 d | Not needed (cache) |
| MinIO galexie-live | versioning | per write | 0 | 30 d versions | Monthly restore of one ledger window |
| MinIO galexie-archive | versioning + object-lock | per write | 0 | indefinite | Annual |
| Configs (in Git) | Git → GitHub | every commit | 0 | indefinite | Every deploy is a restore |
| Secrets (Vault) | Vault snapshot → S3 (encrypted) | 4× daily | 6 h | 30 d | Quarterly |
Restore time objectives:
- Hot (Timescale point-in-time, last hour): 10 min.
- Warm (last week): 2 h.
- Cold (arbitrary historical ledger): 8 h worst case.
9. Degradation modes (what we promise under failure)
We document "what happens when prices become unavailable, sources start to differ, etc." The API envelope (to be specified in api-design.md §Error envelope) carries four boolean flags:
| Flag | Meaning | When we set it |
|---|---|---|
| stale_flag | Price > 30 s old | Redis hot key TTL expired + aggregator hasn't written new value |
| reduced_redundancy | Price derived from fewer sources than normal | Any configured source for this asset is unhealthy (cursor lag > 60 s) |
| triangulated | Price derived via a USD/BTC hop, not direct | Pair has no direct market meeting min-volume threshold |
| divergence_warning | Sources disagree > configured threshold | Cross-check against CoinGecko / CMC / Chainlink-HTTP fails bound |
No flag is a response-level error; they're advisory. Clients decide whether to accept. The price value is always best-available; stale_flag=true means "here's the last known good value; adjust your decision-making accordingly."
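
How the four flags might be derived for a single response is sketched below. The thresholds (30 s staleness, 60 s cursor lag) come from the table; the 5 % divergence default is taken from the Redstone-vs-CEX scenario and is configurable in practice. `PriceEnvelope`, `build_envelope` and all parameter names are hypothetical; the real envelope is specified in api-design.md.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PriceEnvelope:
    price: float
    stale_flag: bool
    reduced_redundancy: bool
    triangulated: bool
    divergence_warning: bool


def build_envelope(
    price: float,
    price_age_s: float,
    source_cursor_lag_s: Mapping[str, float],
    has_direct_market: bool,
    max_relative_divergence: float,
    divergence_threshold: float = 0.05,
) -> PriceEnvelope:
    """Attach advisory flags to a best-available price; never raises on staleness."""
    return PriceEnvelope(
        price=price,
        stale_flag=price_age_s > 30,
        reduced_redundancy=any(lag > 60 for lag in source_cursor_lag_s.values()),
        triangulated=not has_direct_market,
        divergence_warning=max_relative_divergence > divergence_threshold,
    )


if __name__ == "__main__":
    envelope = build_envelope(
        price=0.1234,
        price_age_s=42.0,                                    # older than 30 s
        source_cursor_lag_s={"sdex": 3.0, "reflector": 95.0},  # one source lagging
        has_direct_market=True,
        max_relative_divergence=0.01,
    )
    print(envelope)
```

The function always returns a price; degraded conditions only set flags, which matches the rule that the ingest plane must never turn into a serving error.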
Specific "everything is on fire" scenarios:
| Scenario | Response |
|---|---|
| Full primary-colo outage | DNS flip to cloud DR → API serves from AWS + last-synced Timescale replica (RPO 5 min) with stale_flag=true on every response until ingest is re-established. |
| One critical source (e.g., Reflector) offline | Affected assets get reduced_redundancy=true; others unaffected. |
| Divergence: Redstone vs CEX > 5% | divergence_warning=true on affected assets; internal alert to @ash for market-event sanity check. |
| TimescaleDB read-replica lag > 10 s | API briefly reads from primary (via PgBouncer session-mode pool); alert if sustained. |
10. Launch checklist (HA subset)
- [ ] All 3 stellar-core + galexie instances running stably for 7 days with no crashes.
- [ ] Patroni failover drilled end-to-end in staging (simulate primary OOM).
- [ ] Redis Sentinel failover drilled (kill master during load).
- [ ] Load test hits 2 000 rps with p95 ≤ 200 ms on cached endpoints.
- [ ] Restore drill: point-in-time recovery to 24 h ago, < 2 h wall-clock.
- [ ] DR drill: DNS-flip to cloud, serve for 1 h, flip back.
- [ ] Alerts catalogue reviewed — every alert has a runbook link.
- [ ] SEV-1 + SEV-2 playbooks rehearsed with a tabletop exercise.
None of these are green today (Week 1). Every line becomes a PR checklist at its owning week.
11. Open questions — closed
The Week-1 plan called for these to land as ADRs or design docs by end of Week 2. They have:
- Colo provider + physical locations — Hetzner FSN1 (Falkenstein, DE)
for R1; AWS for R2; Vultr for R3. See r1-deployment-state.md + ADR-0016.
- Patroni vs Stolon vs native TimescaleDB HA — Patroni; landed
as configs/ansible/roles/patroni/.
- MinIO EC(6+3) vs EC(4+2) — EC(6+3); fixed in ADR-0008 §2.
- Cloud DR region — AWS eu-west-1 (matching the colo latency
profile for European users); ADR-0008 §5.
- Secret-manager choice — Ansible Vault for inventory secrets;
configs/ansible/inventory/r1.secrets.yml is the source of truth, per the playbook README.
- Observability stack — self-hosted Prometheus + Grafana +
Loki; ansible roles configs/ansible/roles/{prometheus,loki}/ deploy them.
Anything new that surfaces post-ratification gets a fresh ADR rather than an entry here.
12. Cost envelope
Order-of-magnitude; concrete per-line numbers live in the operator's own cost spreadsheet (not checked into the repo). Below is the shape used to size hardware in ADR-0008.
| Line | Monthly | Notes |
|---|---|---|
| 3 × R640 colo + power + bandwidth | $1.5–2k | existing footprint, already owned; incremental |
| 9 × MinIO nodes (smaller chassis) | $2–3k | 180 TB raw, ~120 TB usable after EC |
| 3 × Timescale hosts | already covered by R640s | |
| Cloud DR (AWS) | $1–2k warm, $5k+ on failover | RDS async + stateless scale-to-zero |
| Observability (Grafana Cloud or self-hosted) | $500 | |
| CDN (Cloudflare) | $200 | |
| Domain + TLS + GitHub | $100 | |
| Total steady state | ~$5–8k / month | |
Revenue model is out of scope (free public API; SDF grant funds). Cost envelope checked against the infrastructure budget.
13. Appendix — tooling
- HAProxy: 2.9 (a short-term stable branch; the adjacent LTS branches are 2.8 and 3.0).
- keepalived — for VRRP VIPs.
- Patroni — 3.x with etcd3 DCS.
- PgBouncer — transaction mode.
- Redis — 7.x with Sentinel.
- MinIO — current RELEASE.* on the docker-compose profile;
baremetal RPMs in production.
- pgBackRest — with MinIO as the repo backend.
- Prometheus + AlertManager + Grafana + Loki + Tempo — "grafana
stack." Possibly replaced with Grafana Cloud depending on cost model.
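Licensing is mixed rather than uniformly permissive: Patroni, PgBouncer, pgBackRest, Prometheus and AlertManager are MIT / ISC / Apache-2.0, and Redis is BSD-3-Clause up to 7.2; HAProxy and keepalived are GPL-2.0, and MinIO, Grafana, Loki and Tempo are AGPL-3.0. These copyleft tools run as standalone services; no copyleft code is linked into our own serving binaries.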