⚠️ Retention revised — see migration 0031 (2026-05-14). The 90-day raw-trade retention and 30-day sub-hourly-aggregate retention described below were REMOVED. Raw trades and all price aggregates are now retained indefinitely. Hourly+ aggregates were always indefinite.
Context
Stellar Index must retain:
- Raw trades from on-chain (SDEX, Soroswap, Aquarius, Phoenix, Comet,
Blend), oracle feeds (Reflector, Redstone, Band), CEXes, and FX providers — ~150 events/sec network-wide as of 2026-04, growing.
- Derived aggregates at multiple grains (1m, 15m, 1h, 4h, 1d, 1w,
1mo VWAP / TWAP / OHLC), with 1-hour-and-above retention indefinitely.
- Since-inception historical pricing backfilled from Galexie.
The storage layer needs to satisfy, simultaneously:
- High-throughput, ordered inserts. Append-only, partitioned by
time, monotonic within a partition.
- Point-in-time window queries. "VWAP for XLM/USD from T-24h
to now" must complete in ≤ 200 ms p95 on cached data.
- Precise amounts. i128 Soroban quantities never truncate
(ADR-0003). Numeric columns with arbitrary-precision storage.
- Retention policies. Raw grains age out after 90 days; hourly+
aggregates retained indefinitely. Must be declarative, not batch-deleted.
- Materialised aggregates. Continuous/incremental materialisation
so every /v1/history read doesn't re-scan raw trades.
- Operational maturity. HA (streaming replication / failover),
backup/restore, PITR, observability, community-known tuning patterns. We can't pioneer our storage layer.
The HA plan (docs/architecture/ha-plan.md) already assumes Patroni-managed Postgres-compatible storage and the coverage matrix + API design (docs/reference/api-design.md) depend on continuous aggregates at the exact grains the API serves. We need an ADR to bind the choice rather than leaving "TimescaleDB (planned)" in a half-dozen downstream docs.
Decision
TimescaleDB on PostgreSQL 15 is the primary store for raw trades, oracle updates, and derived price aggregates.
Structure:
- Raw
tradeshypertable, partitioned bytsdaily, chunk interval
1 day, columnar-compressed after 7 days. ts+pair primary-key + source identity index (source, ledger, tx_hash, op_index, ts).
oracle_updateshypertable, same shape astradeswith an
oracle_source column.
events_rawhypertable for Soroban event blobs, compressed after
7 days.
- Seven continuous aggregates (
prices_1m,prices_15m,
prices_1h, prices_4h, prices_1d, prices_1w, prices_1mo) built via CREATE MATERIALIZED VIEW ... WITH (timescaledb.continuous) and kept fresh by add_continuous_aggregate_policy.
- Retention policies via
add_retention_policy— raw retained 90
days uncompressed, compressed indefinitely; sub-hourly aggregates 30 days; hourly+ aggregates indefinite.
- Numeric amounts as
NUMERICcolumns (arbitrary-precision, matches
canonical.Amount string wire form). Never bigint.
HA via Patroni (see multi-region-topology.md): one synchronous replica in the sibling region, one async in the distant region. Backup via pgBackRest to MinIO with 5-min RPO.
Consequences
Positive
- Continuous aggregates directly deliver the required
timeframes + granularities without us maintaining a separate materialisation pipeline.
- Hypertable compression yields ~10× reduction on raw trades
(documented by Timescale + validated in other pricing systems), so our sizing (~500 GB/year compressed per archival-node-spec.md §3.3.1) stays tractable.
- Everything Postgres-ecosystem works: Patroni, pgBackRest,
pg_stat_statements, pgAudit, PgBouncer. Our operational knowledge transfers directly; existing tooling (pgadmin, psql, etc.) just works.
- The
NUMERICtype is genuinely arbitrary-precision, not a
floating-point workaround. i128 amounts round-trip without truncation — validated in amount_test.go against the KALIEN-incident fixture.
- Monorepo discipline (ADR-0005) isn't threatened — Timescale is
an operational dependency, not a Go module dependency.
Negative
- Licensing asymmetry. TimescaleDB Community Edition is
Apache-2.0 on the core hypertable engine but TSL (Timescale License — source-available, not OSI-approved) on continuous aggregates, compression, and data-tiering. We use TSL features. This is not a blocker per our Apache-2.0 repo license (we ship code, not Timescale binaries), but operators self-hosting must accept the TSL on the Timescale binary. Documented in deploy/docker-compose/README.md.
- Single-primary write bottleneck. Patroni gives us HA, not
horizontal scale. If write throughput exceeds one Postgres primary we have to shard, which Timescale supports (distributed hypertables are deprecated; the recommended path is app-layer sharding). Not a near-term concern at the network's volume.
- Continuous aggregate refresh adds background CPU. A badly-
tuned refresh policy can starve foreground queries. Requires operator discipline.
- **Operating Postgres + Timescale is more work than managed
alternatives** (AWS RDS/Aurora, Timescale Cloud). We pay this cost deliberately — see "Alternatives considered" below.
Operational impact
- Backups: pgBackRest with WAL streaming, ~5-min RPO. Monthly
automated restore drill, documented in docs/architecture/ha-plan.md §8.
- Upgrades: Timescale major-version upgrades require `ALTER
EXTENSION timescaledb UPDATE` + sometimes downtime. Tested in staging before production, never skipped.
- Observability: built-in
timescaledb_informationviews +
pg_stat_statements → Prometheus exporters. No custom metric pipeline required.
Downstream design impact
- Aggregation layer (
internal/aggregate) queries continuous
aggregates by default, falls back to raw trades only when the requested window is inside the uncompressed-raw horizon.
- Retention decisions are declarative SQL in
migrations/— easy
to reason about, reviewable, version-controlled.
- Redis cache layer (ha-plan §3.4)
hydrates from Timescale on miss; cache keys map 1:1 to continuous- aggregate rows for the "last closed N-period" case.
- API
/v1/historyand/v1/ohlc(api-design.md)
queries hit the continuous aggregate matching the requested granularity — no on-the-fly rollup.
Alternatives considered
- Plain PostgreSQL (no TimescaleDB extension). Rejected: we'd
have to hand-roll chunk management + materialised view refresh policies + compression. That's several months of work Timescale gives us for free. PostgreSQL declarative partitioning exists but is not as operator-friendly for time-series workloads at our grain count.
- ClickHouse. Seriously considered. Genuinely faster for
columnar analytical reads, single-server scales further than Timescale. Rejected for three reasons: - Ecosystem mismatch — we're a Postgres-native shop, the rest of our tooling (ORMs, migrations, Patroni, pgBackRest) doesn't apply. Doubling the operational surface costs us more than Timescale's per-query speed saves. - Mutable data semantics are weaker (CollapsingMergeTree, ReplacingMergeTree). Our ingest has real duplicate-handling needs (region de-dup, backfill re-runs); Timescale's upserts on a hypertable primary key are more natural. - Numeric precision: ClickHouse has Decimal256 but its JSON I/O path in several drivers historically truncates. Our i128 invariant (ADR-0003) demands we be certain about precision at every boundary.
- InfluxDB (OSS or Cloud). Rejected. Flux query language is
its own learning curve; the OSS 2.x line had stability and retention-policy usability issues for people at our scale; InfluxDB's pricing story has moved several times in the past three years (3.x architecture pivot). Risky to build a customer- facing pricing SLA on.
- Apache Cassandra / ScyllaDB. Rejected. Write throughput is
great; read pattern for our use case (time-windowed aggregations across many pairs) is an anti-pattern for Cassandra's data model. We'd end up building a secondary OLAP layer anyway.
- Amazon Timestream. Rejected. AWS-locked; the multi-region
story (multi-region-topology.md) requires colo / bare-metal. Open-source self-hosting is a hard project constraint. Timestream violates it.
- Timescale Cloud (managed). Rejected for the same self-host
constraint. Revisit once we have a managed-offering tier of the product.
- Parquet files in MinIO + DuckDB / Trino at query time.
Considered for historical backfill-only storage. Rejected as the primary store because query latency won't hit p95 ≤ 200 ms for arbitrary windows. Retained as a future tiered-storage option — aged-out Timescale chunks can be exported to Parquet in MinIO, queried via DuckDB for analytical research workloads. Not in the v1 scope.
References
- Related ADRs:
- ADR-0003 (i128 no-truncation) — NUMERIC columns satisfy this. - ADR-0002 (MinIO/S3-compat) — MinIO is the backup + tiered-storage target. - ADR-0005 (monorepo) — Timescale is an operational dependency, not a Go-module dep.
- HA + topology:
- ha-plan.md §3.3 — Patroni topology. - multi-region-topology.md §5 — sync + async cross-region replication shape.
- External:
- TimescaleDB docs — <https://docs.timescale.com/> - Continuous aggregates — CREATE MATERIALIZED VIEW ... WITH (timescaledb.continuous) - Compression — Timescale columnar compression design docs. - Timescale License (TSL) — <https://github.com/timescale/timescaledb/blob/main/tsl/LICENSE-TIMESCALE>