Architecture · Last verified 2026-05-03

Aggregation plan — the policy chain from raw trade to served price

The policy chain from raw trade to served price — class filter, outlier filter, VWAP, freeze gate. Every load-bearing decision the aggregator makes.


Every served price flows through one path:

Timescale `trades` hypertable    ← decoders write here (per ingest-pipeline.md)
    │   (TradesInRange per (pair, window))
    ▼
internal/aggregate/orchestrator/ ← tick loop, one (pair, window) per call
    │   1. fetchForTarget(target, window)
    │      ├─ direct TradesInRange(target, …)
    │      └─ optional: stablecoin-backer expansion (XLM/fiat:USD →
    │                  XLM/USDT, XLM/USDC, XLM/DAI, XLM/PYUSD,
    │                  XLM/USDP — each rewritten via ProxyPair onto
    │                  the target)
    │   2. class filter (default: drop non-ClassExchange rows)
    │   3. σ-threshold outlier filter (default 4σ)
    │   4. VWAP via internal/aggregate/vwap.go
    │
    ▼
Redis  ← key `vwap:<base>:<quote>:<window-seconds>`, TTL = window
    │
    ▼
internal/api/v1/  ← /v1/vwap, /v1/twap, /v1/price (cache-first)
    │
    ▼
HTTP consumer

/v1/sources is the read-only sibling: it surfaces the same external.Registry the class filter consults so API consumers can see which venues contribute to VWAP and which are visible-only.


The policy chain

The orchestrator applies three policy steps between TradesInRange and aggregate.VWAP: stablecoin expansion, the class filter, and the outlier filter. Each step is independent and falls back to "input unchanged" when its config disables it.

| Step | Default | Config flag | Purpose |
| --- | --- | --- | --- |
| 1. Stablecoin expansion | OFF | aggregate.enable_stablecoin_fiat_proxy | Expand fiat-quote targets to direct + stablecoin backers; rewrite via aggregate.ProxyPair |
| 2. Class filter | ON | aggregate.disable_class_filter (inverted: zero is filter ON) | Drop non-ClassExchange rows; aggregator / oracle / authority_sanity classes don't contribute to VWAP |
| 3. Outlier filter | ON (σ=4.0) | aggregate.outlier_sigma_threshold | Drop trades whose price differs from the window mean by > σ standard deviations |

Order matters: class filter runs before the outlier filter because the σ arithmetic should run over a pair-homogeneous, exchange-only sample. Stablecoin expansion runs before both — it re-stamps the rewritten trades onto the target pair, and the class filter then treats each row by its venue identity (binance, coinbase, …) not by the original on-chain pair.
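The ordering described above can be sketched as a small pipeline: class filter first, then the σ filter over the exchange-only sample, then VWAP. This is a minimal sketch under stated assumptions, not the orchestrator's code: the `Trade` struct, the `Class` constants, and every function name are hypothetical, and the sketch uses the population standard deviation of raw prices, which may differ from what the real filter computes.

```go
package main

import (
	"fmt"
	"math"
)

// Class mirrors the venue classes named in this document; the Go
// representation here is an assumption.
type Class int

const (
	ClassExchange Class = iota
	ClassAggregator
	ClassOracle
)

// Trade is a hypothetical, simplified row from the trades hypertable.
type Trade struct {
	Venue  string
	Class  Class
	Price  float64
	Volume float64
}

// filterClass keeps only ClassExchange rows.
func filterClass(in []Trade) []Trade {
	out := make([]Trade, 0, len(in))
	for _, t := range in {
		if t.Class == ClassExchange {
			out = append(out, t)
		}
	}
	return out
}

// filterOutliers drops trades whose price is more than sigma population
// standard deviations from the window mean. sigma <= 0 disables it.
func filterOutliers(in []Trade, sigma float64) []Trade {
	if sigma <= 0 || len(in) < 2 {
		return in
	}
	var sum float64
	for _, t := range in {
		sum += t.Price
	}
	mean := sum / float64(len(in))
	var sq float64
	for _, t := range in {
		d := t.Price - mean
		sq += d * d
	}
	std := math.Sqrt(sq / float64(len(in)))
	if std == 0 {
		return in
	}
	out := make([]Trade, 0, len(in))
	for _, t := range in {
		if math.Abs(t.Price-mean) <= sigma*std {
			out = append(out, t)
		}
	}
	return out
}

// vwap returns sum(price*volume)/sum(volume); ok is false without volume.
func vwap(in []Trade) (price float64, ok bool) {
	var pv, v float64
	for _, t := range in {
		pv += t.Price * t.Volume
		v += t.Volume
	}
	if v == 0 {
		return 0, false
	}
	return pv / v, true
}

func main() {
	trades := []Trade{
		{"binance", ClassExchange, 0.10, 100},
		{"coinbase", ClassExchange, 0.12, 300},
		{"coingecko", ClassAggregator, 0.50, 1000},
	}
	// Class filter runs first so the aggregator row never skews the mean.
	price, ok := vwap(filterOutliers(filterClass(trades), 4.0))
	fmt.Println(price, ok)
}
```

Running the class filter first matters in this example: left in the sample, the aggregator row at 0.50 would pull the mean and widen the standard deviation that the σ filter measures against.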

Why each filter is here, not at ingest

  • Decoders never re-stamp pairs. A USDT depeg event is news; rewriting XLM/USDT → XLM/USD at decode time would hide it.
  • Decoders never drop trades by class. A CoinGecko poll is data we want to record + serve via /v1/sources; we just don't want to fold it into our own VWAP. Filtering at decode would strip information we need.
  • Decoders never drop outliers. σ-deviance is a window-relative signal. A row that's 5σ from the per-pair window mean is noise on a single pair but might be perfectly normal across all pairs combined.

In short: ingest preserves truth; aggregation applies policy.
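To make the first point concrete, here is a minimal sketch of re-stamping at aggregation time: the stored rows keep their original pair, and only in-memory copies are rewritten onto the target. The types and the functions `expandTarget` and `restamp` are hypothetical stand-ins; the signature of the real aggregate.ProxyPair is not shown in this document, and the backer list is taken from the diagram at the top.

```go
package main

import "fmt"

// Pair is a hypothetical base/quote pair.
type Pair struct{ Base, Quote string }

// Trade is a hypothetical, simplified trade row.
type Trade struct {
	Pair   Pair
	Venue  string
	Price  float64
	Volume float64
}

// usdBackers lists the stablecoins the diagram names for fiat:USD.
var usdBackers = []string{"USDT", "USDC", "DAI", "PYUSD", "USDP"}

// expandTarget returns the target itself plus one pair per stablecoin
// backer when the quote is fiat USD; other targets are left unchanged.
func expandTarget(target Pair) []Pair {
	out := []Pair{target}
	if target.Quote != "fiat:USD" {
		return out
	}
	for _, s := range usdBackers {
		out = append(out, Pair{target.Base, s})
	}
	return out
}

// restamp returns copies of the trades with Pair set to target.
// The input slice is not modified, so the original pair stays on record.
func restamp(in []Trade, target Pair) []Trade {
	out := make([]Trade, len(in))
	for i, t := range in {
		t.Pair = target
		out[i] = t
	}
	return out
}

func main() {
	target := Pair{"XLM", "fiat:USD"}
	fmt.Println(expandTarget(target))

	stored := []Trade{{Pair{"XLM", "USDT"}, "binance", 0.11, 50}}
	fmt.Println(restamp(stored, target)[0].Pair, stored[0].Pair)
}
```

Because the rewrite happens on copies, a depeg stays visible in the stored XLM/USDT rows even while the proxy feeds the USD target.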


Configuration surface

[aggregate] in TOML drives the orchestrator. Operator overrides win; empty falls back to library defaults.

| TOML key | Library default | Effect |
| --- | --- | --- |
| pairs | [] → built-in (XLM/BTC/ETH × USD/EUR/GBP) | Operator-supplied coverage set as canonical pair strings |
| windows | [] → [5m, 1h, 24h] | Per-window cadences as Go time.Duration strings |
| interval_seconds | 30 | Tick cadence: gap between successive (pair, window) refreshes |
| max_trades_per_window | 10 000 | Per-(pair, window) row cap |
| disable_class_filter | false | Off ⇒ ClassExchange-only VWAP (default) |
| enable_stablecoin_fiat_proxy | false | On ⇒ fiat-target fan-out across stablecoin backers |
| outlier_sigma_threshold | 4.0 | σ-threshold (0 disables) |
| vwap_window_seconds | 300 | Legacy alias retained for backwards-compat |
| twap_window_seconds | 300 | TWAP-specific cadence (used by api/v1/twap.go) |
| min_usd_volume | 10 000 | Eligibility threshold |
| triangulation_enabled | true | Master switch for the post-refresh triangulation pass; false skips the tick regardless of aggregate.triangulations rows. Triangulated rows now serve via /v1/price (PR for F-0014); the switch is the operator-side kill-switch when the feature itself needs to be paused. |

The full reference lives at `docs/reference/config/README.md`; this table is the curated subset that drives aggregator behaviour day-to-day.
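The "empty falls back to library defaults" rule, together with the inverted class-filter flag, could be sketched as follows. The `Config` struct, its field names, and the use of a pointer to tell "unset" apart from an explicit 0 for the σ threshold are all assumptions made for illustration; the real loader may resolve defaults differently.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a hypothetical mirror of part of the [aggregate] table.
// Field names and the pointer-for-unset convention are assumptions.
type Config struct {
	Windows               []time.Duration
	IntervalSeconds       int
	MaxTradesPerWindow    int
	DisableClassFilter    bool     // inverted flag: zero value keeps the filter ON
	OutlierSigmaThreshold *float64 // nil = unset, 0 = filter disabled
}

// withDefaults fills unset fields with the library defaults from the table.
func (c Config) withDefaults() Config {
	if len(c.Windows) == 0 {
		c.Windows = []time.Duration{5 * time.Minute, time.Hour, 24 * time.Hour}
	}
	if c.IntervalSeconds == 0 {
		c.IntervalSeconds = 30
	}
	if c.MaxTradesPerWindow == 0 {
		c.MaxTradesPerWindow = 10000
	}
	if c.OutlierSigmaThreshold == nil {
		sigma := 4.0
		c.OutlierSigmaThreshold = &sigma
	}
	return c
}

// classFilterOn reports whether the class filter should run.
func (c Config) classFilterOn() bool { return !c.DisableClassFilter }

func main() {
	c := Config{}.withDefaults()
	fmt.Println(c.Windows, c.IntervalSeconds, c.classFilterOn(), *c.OutlierSigmaThreshold)
}
```

The inverted flag is what lets a zero-valued config produce the safe behaviour: an operator who writes nothing gets ClassExchange-only VWAP.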


Observability

Three Prometheus rules (deploy/monitoring/rules/aggregator.yml) consume four counters from internal/obs/metrics.go:

| Counter | Labels | Used by |
| --- | --- | --- |
| stellarindex_aggregator_ticks_total | outcome (ok/error) | aggregator_silent alert |
| stellarindex_aggregator_vwap_writes_total | | aggregator_silent alert |
| stellarindex_aggregator_empty_windows_total | | (Operator dashboards; see runbooks) |
| stellarindex_aggregator_dropped_trades_total | reason (class/outlier) | aggregator_outlier_storm + aggregator_class_drop_spike alerts |

Alert runbooks at:

Baseline-comparator alerts use offset 1h to auto-tune to operator traffic. Suppress for the first hour after deploy — the comparator returns zero before there's an hour of history.


API surface

| Endpoint | Backed by | Purpose |
| --- | --- | --- |
| GET /v1/vwap?pair=… | Redis cache (`vwap:<base>:<quote>:<window-seconds>`) | The aggregator's primary product |
| GET /v1/twap?pair=… | Trades hypertable (on-query) | Time-weighted average: internal/aggregate.TWAP runs against raw trades for the request's window. The orchestrator does not pre-compute TWAP today (TWAP-via-orchestrator path stays out of scope; see Deferred). |
| GET /v1/price?pair=… | Redis cache → trades fallback | Last-trade or VWAP depending on freshness |
| GET /v1/sources | external.Registry (static) | Class + IncludeInVWAP metadata for every known venue |
| GET /v1/markets | Timescale DistinctPairs | Trade-table coverage; orthogonal to the registry |

/v1/sources and the orchestrator's class filter agree by construction — they consume the same external.Registry, so a venue listed with include_in_vwap=true *will* contribute to the cached VWAP, and one with false *will not*. Discrepancies between the two surfaces are a bug to surface in PR review, not a runtime concern.
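The "agree by construction" property can be illustrated with a minimal sketch in which both the filter predicate and the /v1/sources listing read the same map. The `Registry`, `Metadata`, and `SourceView` types, the class strings, and the choice to treat unknown venues as non-contributing are assumptions for illustration, not the shape of the real external.Registry.

```go
package main

import (
	"fmt"
	"sort"
)

// Metadata is a hypothetical per-venue registry entry.
type Metadata struct {
	Class         string
	IncludeInVWAP bool
}

// Registry is a hypothetical stand-in for external.Registry.
type Registry map[string]Metadata

// contributes is the predicate an aggregator-side filter would consult.
// Unknown venues are treated as non-contributing (an assumption).
func (r Registry) contributes(venue string) bool {
	m, ok := r[venue]
	return ok && m.IncludeInVWAP
}

// SourceView is what a /v1/sources-style handler would serialise.
type SourceView struct {
	Venue         string
	Class         string
	IncludeInVWAP bool
}

// sources lists every venue from the same map, sorted for stable output.
func (r Registry) sources() []SourceView {
	out := make([]SourceView, 0, len(r))
	for venue, m := range r {
		out = append(out, SourceView{venue, m.Class, m.IncludeInVWAP})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Venue < out[j].Venue })
	return out
}

func main() {
	reg := Registry{
		"binance":   {Class: "exchange", IncludeInVWAP: true},
		"coingecko": {Class: "aggregator", IncludeInVWAP: false},
	}
	for _, s := range reg.sources() {
		fmt.Println(s.Venue, s.Class, s.IncludeInVWAP, reg.contributes(s.Venue))
	}
}
```

With a single source of truth, the two surfaces can only disagree if one of them stops reading the registry, which is the kind of change a PR review can catch.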

Closed-bucket-only serving (cross-region consistency)

Per ADR-0015, the API endpoints above (/v1/price, /v1/vwap, /v1/twap, /v1/ohlc) NEVER expose the in-progress (currently-filling) window — only the most recent closed bucket. The orchestrator writes both the in-progress and closed-window CAGG rows to Timescale; query handlers MUST filter bucket_to_ts <= now() so clients only ever see immutable, content-addressed values.

This is what makes "all 3 regions serve exactly the same rate" a real property rather than a hopeful one: closed-bucket rows are deterministic given the same trade inputs, and (sub-second to seconds-of-replication-lag aside) replicate to all regions byte-identical. See ADR-0015 for the trade-off analysis and the ≤30 s freshness contract this places on the default /v1/price window.


Boundaries — what this layer does NOT do

  • No persistent state. The orchestrator is stateless across ticks; everything it needs comes from Timescale or external.Registry at refresh time. Restart-friendly by design.
  • No cross-binary state coupling. Aggregator → API communication is via Redis keys + the static registry. The API has no read path into the orchestrator's in-memory Stats().
  • No write path back to Timescale. VWAP results live in Redis with TTL; if Redis loses the world, the next tick rebuilds it from raw trades. Continuous-aggregate materialised views (when they ship under migrations/) provide the long-tail historical answer; the orchestrator focuses on the hot, freshness-sensitive cache.
  • No per-pair Prometheus labels. Cardinality stays bounded: pair-level lenses live in the Redis key namespace and on the API contract, not on /metrics.


Shipped since the original draft (2026-04-25)

These were deferred when this doc was first written; they landed during the launch-readiness sweep:

  • Triangulation. Shipped: internal/aggregate/orchestrator/triangulate.go runs after the per-pair refresh, computes implied legs (e.g. XLM/USD × USD/EUR = XLM/EUR), writes to the same VWAP key namespace with a :provenance marker, and the flags.triangulated envelope field is populated by the API. X2.5 forex-snap rule closes the across-region consistency gap (closes F-0014).
  • Divergence detection. Shipped: divergence.Service queries CoinGecko + Chainlink HTTP per-pair on every aggregator Tick (per internal/aggregate/orchestrator/divergence_refresh.go, PR #429), writes div:<asset> to Redis with a 5-min TTL, and the API's flags.divergence_warning reads the cache. Per-Tick outcomes labelled by ok / no_vwap / parse_error / refresh_error via stellarindex_divergence_refresh_total; sustained refresh_error → stellarindex_divergence_refresh_error_dominant alert (P3).
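The implied-leg arithmetic from the triangulation item above (XLM/USD × USD/EUR = XLM/EUR) can be sketched as below. The `Rate` type and `triangulate` function are hypothetical; the real triangulate.go also handles provenance markers and the forex-snap rule, which this sketch leaves out.

```go
package main

import "fmt"

// Rate is a hypothetical priced pair: one unit of Base costs Price units of Quote.
type Rate struct {
	Base, Quote string
	Price       float64
}

// triangulate multiplies two legs that share a bridge asset
// (a.Quote == b.Base) into an implied rate a.Base/b.Quote.
// ok is false when the legs do not chain or a price is not positive.
func triangulate(a, b Rate) (implied Rate, ok bool) {
	if a.Quote != b.Base || a.Price <= 0 || b.Price <= 0 {
		return Rate{}, false
	}
	return Rate{Base: a.Base, Quote: b.Quote, Price: a.Price * b.Price}, true
}

func main() {
	xlmUSD := Rate{"XLM", "USD", 0.10}
	usdEUR := Rate{"USD", "EUR", 0.90}
	implied, ok := triangulate(xlmUSD, usdEUR)
	fmt.Println(implied.Base, implied.Quote, implied.Price, ok)
}
```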

Deferred — natural follow-ups

Listed here so a future contributor can pick one up without re-deriving the design space:

  • TWAP-via-orchestrator pre-compute. /v1/twap reads the trades hypertable on every request today; the orchestrator could pre-compute time-weighted averages alongside VWAP and serve them from Redis. Deferred behind real production traffic data: VWAP is the dominant query shape; pre-computing TWAP too costs Redis without an established demand-side signal.
  • MAD-based outlier filter. σ-mean is brittle on small windows with fat tails. Switch to median-absolute-deviation behind the same outlier_sigma_threshold flag once we have pubnet runtime data backing the change with an ADR.
  • Continuous-aggregate refresh driver. Timescale's background job handles materialised-view refresh today. A custom driver with tighter freshness guarantees lands when API consumers start hitting historical CAGGs at fresh-data SLAs.
  • Per-source weighted VWAP. Currently every contributing source weights at 100. The Metadata.DefaultWeight field is shaped to support per-source overrides via config; the math change to aggregate.VWAP lands when an operator actually needs it.

Each is a drop-in extension — no shape change to the existing orchestrator's Config or to the surrounding contracts.
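As an illustration of the MAD-based follow-up listed above, here is a minimal sketch of a median-absolute-deviation filter over prices. It is not a proposal for the final design: the function names are hypothetical, and scaling the MAD by 1.4826 (the usual constant that makes it comparable to a standard deviation under normality) is an assumption about how the existing threshold would be reused.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of xs without modifying the input.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return math.NaN()
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// filterMAD keeps prices within k scaled MADs of the median.
// k <= 0 disables the filter, mirroring the "0 disables" convention.
func filterMAD(prices []float64, k float64) []float64 {
	if k <= 0 || len(prices) < 3 {
		return prices
	}
	med := median(prices)
	dev := make([]float64, len(prices))
	for i, p := range prices {
		dev[i] = math.Abs(p - med)
	}
	scaled := 1.4826 * median(dev)
	if scaled == 0 {
		return prices // no spread to measure against
	}
	out := make([]float64, 0, len(prices))
	for i, p := range prices {
		if dev[i] <= k*scaled {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	prices := []float64{1.00, 1.01, 0.99, 1.02, 5.00}
	fmt.Println(filterMAD(prices, 4.0))
}
```

The five-trade example shows the small-window problem directly. With the population standard deviation, a single point can sit at most sqrt(n-1) deviations from the mean, so a 4σ mean-based filter cannot drop anything when n = 5, while the median-based filter removes the 5.00 print.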


References