Added
- OpenAPI 200 examples for every GET operation: all 70 JSON GET operations in
openapi/stellar-index.v1.yaml now carry a media-level
example under their 200 response (previously 7 of 96 operations had
one). Examples are trimmed live responses from api.stellarindex.io;
auth-gated surfaces (/account/*, /dashboard/*, /signup/verify,
/auth/sep10/challenge) carry small hand-crafted examples matching
their schemas. Rendered reference, Postman collection, and explorer
types regenerated.
- Contract event rows decode their payload (S-016 tail): the events
table showed fifty bare "transfer" rows — /v1/contracts/{id} rows now
carry
topics (human-readable renderings of topics[1:] — addresses as
strkeys, i128 amounts as integers) and data, and the explorer renders
a Detail column with accounts/contracts linked. New scval.Display
renderer (depth-capped, truncating, display-lossy by design).
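
To make "depth-capped, truncating, display-lossy" concrete, here is a minimal Python sketch of that style of renderer. It is not the project's scval.Display, whose signature and limits are not given in this entry; the function name `display`, the `max_depth` and `max_len` parameters, and the `...` placeholder are all assumptions made for illustration.

```python
def display(value, max_depth=3, max_len=40, _depth=0):
    """Render a nested value as a short, human-readable string.

    Hypothetical sketch: the output is lossy by design and cannot be
    parsed back into the original value.
    """
    if _depth >= max_depth:
        return "..."  # depth cap reached: stop descending
    if isinstance(value, dict):
        inner = ", ".join(
            f"{k}: {display(v, max_depth, max_len, _depth + 1)}"
            for k, v in value.items()
        )
        text = "{" + inner + "}"
    elif isinstance(value, (list, tuple)):
        inner = ", ".join(
            display(v, max_depth, max_len, _depth + 1) for v in value
        )
        text = "[" + inner + "]"
    else:
        # Scalars (for example large integer amounts) render via str().
        text = str(value)
    if len(text) > max_len:
        # Truncate long renderings and mark the cut.
        text = text[: max_len - 3] + "..."
    return text
```

A capped renderer like this keeps one table cell bounded no matter how deeply nested or long the decoded payload is, which is the trade-off the entry describes.
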
Fixed
- Catalogue analytics actually ship (the v0.7.4/v0.7.5 fix landed in
the wrong function): two code paths share a byte-identical price-fill
line and the twin-stats merge anchored on the first — writeCataloguePage
(class-filtered) instead of serveCatalogueUnifiedPage (the page the
explorer serves). Root-caused by running the API locally against the
production database. The unified page now merges twin stats AND honours
include=sparkline7d; native XLM enriches via its dedicated row reader;
merged twins get the same supply-derived market-cap fill as classic rows.
Verified against production data pre-release: XLM $6.9B mcap, real 24h
changes across the catalogue.
Fixed
- Catalogue enrichment, third time honest (v0.7.4 follow-up): the Q=
refinement still merged nothing live — Q substring-matches the
code/slug/issuer COLUMN VALUES, so a full asset id (longer than any
column) can never match. The twin lookup now uses the exact Issuer
filter and picks the matching asset id; the pin test's stub mimics the
real SQL semantics so neither wrong reader shape can pass again.
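
The mismatch described above can be reproduced in a few lines. This is a Python sketch under stated assumptions, not the project's SQL or Go code: the row shape, the helper names `q_matches` and `find_twin`, the placeholder issuer string, and the dash-joined asset id are all hypothetical.

```python
# Hypothetical row; the issuer is a placeholder, not a real strkey.
ROW = {
    "code": "USDC",
    "slug": "usdc",
    "issuer": "GISSUERPLACEHOLDER",
    "asset_id": "USDC-GISSUERPLACEHOLDER",
}


def q_matches(row, q):
    """Substring filter applied to each column value separately."""
    needle = q.lower()
    return any(needle in row[col].lower() for col in ("code", "slug", "issuer"))


def find_twin(rows, issuer, asset_id):
    """Exact issuer filter first, then pick the row with the matching id."""
    for row in rows:
        if row["issuer"] == issuer and row["asset_id"] == asset_id:
            return row
    return None
```

Because the full id is longer than any single column value, `q_matches(ROW, ROW["asset_id"])` is false even though the row is the right one, while the exact filter in `find_twin` finds it.
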
Fixed
- Catalogue rows actually gained their analytics (v0.7.3 follow-up):
post-deploy verification showed the twin merge merging nothing — only
the listing query computes the windowed change columns, and the
per-asset reader returns them nil. The enrichment now reads the
listing reader with an exact-id filter; a stub test pins the
dependency by mimicking production's reader asymmetry.
Added
- Developer-experience batch (dev-surfaces audit): the root README's
"Start here" gains a Go SDK section (install line + the doc.go quick
start + pointers to
examples/curl / examples/postman); four new
runnable curl examples — 11-price-at.sh (point-in-time price),
12-history-since-inception.sh, 13-sac-wrappers.sh,
14-asset-detail.sh — all verified against the live API, with the
examples index updated (and its quote=USD corrected to
quote=fiat:USD); and the openapi CI job now regenerates the
Postman collection and fails on drift, completing the
generated-artifact drift guard (API reference + explorer TS types
were already gated).
Fixed
- Dev-surface examples match the shipped SDK (dev-surfaces audit):
the explorer /sdk page's snippets compile against
pkg/client
again — p.AssetID not p.Asset, t.Timestamp not t.TS,
HistoryRangeQuery for raw-trade history, OHLCQuery without the
nonexistent Interval field, and the SSE pattern now shows the
intended raw net/http consumption instead of a PriceStream
method the SDK never had; the page intro no longer claims every
API endpoint is reachable through the SDK. The homepage "Try the
API" EUR example 404'd (/v1/assets/euro) — now
/v1/external/assets/euro. docs/getting-started.md stops claiming
SDKs prefer X-API-Key (the Go SDK sends Authorization: Bearer)
and its stale "ships as PR queue lands" SDK method list is replaced
with the real ~36-method surface.
- Catalogue rows carry full analytics; assets stop appearing twice
(Pass-B AM-10 + the catalogue-dash residual): the unified listing's
verified rows (XLM, USDC, …) now absorb their Stellar-network twin's
1h/24h/7d changes, 24h volume, supply and market cap, and the classic
twin row is suppressed — one canonical row per asset instead of a
dash-only catalogue row plus a second ranked copy. Contract detail
additionally serves
protocol when attribution is known (CON-3).
- OpenAPI spec quality batch (2026-07-03 docs audit): the spec is
now fully self-describing — descriptions for all 8 shared
components/parameters (Timeframe, Granularity, From, To,
TypeFilter, CodeFilter, IssuerFilter, Limit), real descriptions for
the 11 summary-only operations (the three SEP-40 oracle
passthroughs now say they mirror the on-chain contract calls,
GET/POST /price/batch, /ledgers/{seq} + /transactions, /pairs,
/version, /assets/{id}/metadata, /account/me), and descriptions for
the undocumented inline params (window_days ×3, entity_type,
issuer g_strkey, the /contracts/{id}/transfers filter set, the
per-op limit params on explorer listings). Tag hygiene: auth +
dashboard are declared in the top-level tag list, the Prices /
prices casing strays merged into the single price group, and
/contracts/{id}/transfers moved from meta to explorer. Auth is
now honest: APIKeyAuth documents the Bearer sip_… scheme +
where keys come from, a new SessionCookie scheme is applied to
all /dashboard/* operations, and /account/me carries the 401 it
actually returns (new shared Unauthorized response). The 16
operations that documented no error shape at all now reference the
shared problem+json responses, the X-RateLimit-* headers on the
shared 429 are described (including when they are absent:
rate-limiter disabled or Redis fail-open), and info.version is
promoted from 1.0.0-draft to 1.0.0.
- Postman collection is authable out of the box: the generator
now injects collection-level bearer auth bound to a new empty
bearerToken collection variable (with instructions on where to
get a key), replacing the converter's noauth — previously no
request in the shipped collection could authenticate, while the
README documented a bearerToken variable that didn't exist.
README corrected: baseUrl includes /v1 (so the localhost
override is http://localhost:3000/v1). Generation stays
deterministic (seeded-random preload + id-stripping unchanged).
- pkg/client doc + example truthfulness: the package Coverage
doc claimed "every server endpoint has a typed method (35 as of
2026-05-09)" and listed methods that don't exist (Coins, Coin,
Currencies, Currency); rewritten to the real ~36-method
pricing/read surface with the deliberate exclusions that
spec_contract_test.go registers and enforces. Examples now use the
current
sip_ key prefix instead of the legacy rek_.
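
Since the Go SDK sends Authorization: Bearer and the Postman base URL already ends in /v1, a request against a local server can be assembled roughly as follows. This is a minimal Python sketch, not part of the SDK: `build_request` is a hypothetical helper, `sip_example` is a placeholder rather than a real key, and `/version` is used only because it is one of the operations named above.

```python
import urllib.request

# The base URL already includes /v1, as the corrected Postman README states.
BASE_URL = "http://localhost:3000/v1"


def build_request(path, api_key, base_url=BASE_URL):
    """Build (but do not send) a GET request that carries Bearer auth."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
```

Passing the result to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.
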
Added
- ADR-0044 (Proposed): explorer rendering moves to edge SSR — the
2026-07-03 site audit traced three production failure classes (the
silent 20k-file deploy freeze, bake-time poisoned pages, between-deploy
staleness) to rendering at build time; OpenNext on Cloudflare Workers
with per-route-family edge caching replaces static export, with a
staged migration and the pruned static path supported until cutover.
Fixed
- The "richest account" is no longer the SDF burn address unlabelled
(Pass-B ACC-1): wealth rows now carry
locked: true for provably
unspendable accounts (master weight 0, all thresholds 0, no signers —
decoded from the current account entry) and the explorer badges them,
so $11.3B of burned XLM reads as what it is.
- `include=sparkline7d` on /v1/assets actually works (Pass-B AM-03):
the explorer's directory has requested it since the coins→assets
dissolution and the server silently ignored it — a dead chart column on
every row. The unified listing (catalogue + classic phases) now honours
it with one batch read per page, and the param is documented.
- Transactions, operations, and ledgers pages get insight (site audit
S-004/S-005): the three directory pages were bare paginated lists while
the API already served the aggregates — only /network consumed them. A
shared component pair (daily-throughput chart with a metric selector,
ranked op-type mix bars) now leads each page with the series that answers
that page's question (txs / ops / ledger cadence), and /network consumes
the same shared code instead of its private copy.
- SAC contract pages answer "what is this?" (site audit S-016/S-014):
the contract header names the wrapped asset with a link to its asset page
(fed by the new event-derived SAC identification), the code-history panel
no longer promises a backfill will produce bytecode for a SAC, and pool
tables render colon-form classic assets (USDC:GA5Z…) with the same
code + issuer-org treatment as dash-form rows instead of a raw string.
- Three site-audit P0/P1 fixes in one pass (S-006/S-009/crawl):
(a) contract pages now identify uncaptured-instance SACs from their
CAP-67 event topics with a spoof-proof derivation cross-check — ~55k
contracts (e.g. the upvoteICE SAC) stop rendering an unexplained void
and instead say which asset's SAC they are; (b) the API's
trailing-slash 308 now runs INSIDE the CORS middleware — it previously
carried no Access-Control-Allow-Origin, so browsers killed the
redirect and every trailing-slash API URL was as dead as the 404 the
redirect exists to prevent; (c) market-pair pages no longer declare
double-encoded 404 URLs as canonical (the route param arrives pre-encoded;
~500 pages were telling crawlers their real URL was a dead page).
- The /assets listing serves the real universe (site audit
S-002/S-011): page 1 of the unified listing now fills from the ~191K
classic long tail when the curated catalogue is shorter than the limit
(the pager's own doc always promised this; in practice page 1 was 11
rows presented as the entire asset universe), and the search box's
q=
now actually filters server-side across both phases (the storage layer
supported it all along — the handler never passed it).
- Navigation tells the truth (site audit S-001/S-017/S-019): "DEX /
AMM" now lands on /dexes (the per-protocol venue + pools view that
already existed) instead of the verification index; the rail's
single-protocol "Soroswap Router" entry becomes "Verification" (the whole
15-protocol index); the search palette's "Account" entry points at
/dashboard instead of a route that never existed.
- Impersonated identities no longer render on flagged issuers (site
audit S-010): the "LOBSTR — SCAM" row was actually a stellar.expert-listed
counterfeiter whose on-chain home_domain impersonates lobstr.co — our
pipeline was serving the stolen identity as the row's name, indicting the
victim brand. Flagged, UNVERIFIED issuers now serve no self-declared
org_name/home_domain (the G-key + reason are the honest identity); the
explorer badge is category-aware (SCAM / DEPRECATED / UNSAFE), so a real
org's deprecated legacy issuer no longer wears a red SCAM tag.
- ClickHouse snapshot rows were being merge-destroyed (site audit,
P0-data): checkpoint state-snapshot rows all carried the same
(tx_hash="", op_index=-1, change_index=0) tail of the ReplacingMergeTree
sort key, so every snapshot entry modified in the same ledger collapsed
to ONE arbitrary survivor at merge time — measured >55% of the 48M-entry
Phase-C set already gone (blast radius: account-state, trustline, supply,
wasm readers). ChangeIndex is now crc32(key_xdr) — per-key unique,
re-run idempotent; the wasm/SAC instance reads repoint at
ledger_entries_current (current-state MV, merge-immune, PK-prefix
lookup). Operator: re-run
state-snapshot -write -scope all to re-land
the destroyed entries.
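
The snapshot collapse in the last entry above can be modelled in a few lines. This is a simplified Python sketch, not the project's code: the real sort key has columns not listed here, `survivors_after_merge` only imitates "rows with an identical sort key collapse to one", and using `zlib.crc32` (the IEEE polynomial) as the checksum is an assumption about which CRC-32 variant is meant.

```python
import zlib


def sort_key_tail(key_xdr: bytes, per_key: bool):
    """Tail of a snapshot row's sort key: (tx_hash, op_index, change_index)."""
    # Before the fix every snapshot row used change_index 0; after it the
    # index is derived from the entry's key bytes.
    change_index = zlib.crc32(key_xdr) if per_key else 0
    return ("", -1, change_index)


def survivors_after_merge(ledger: int, keys, per_key: bool) -> int:
    """Count rows left after a replacing merge in this simplified model."""
    merged = {}
    for key_xdr in keys:
        # Later rows with the same sort key overwrite earlier ones.
        merged[(ledger, *sort_key_tail(key_xdr, per_key))] = key_xdr
    return len(merged)
```

With a constant tail, every entry modified in the same ledger maps to one sort key and a single row survives; with the checksum-derived index each entry keeps its own row, and re-running the snapshot yields the same index for the same key.
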
Added
- Asset logos for wallets (board #47):
/v1/assets/verified rows now
carry image (the issuer's SEP-1 logo URL, https-only + sanitized — the
bulk surface for wallet icon loading; per-asset detail already served it).
Root-caused the biggest asset's missing metadata en route: Circle's
on-chain home_domain (circle.com) 404s its stellar.toml — a curated
domain-override map (same hand-vetted pattern as knownIssuers) redirects
the FETCH to the still-serving centre.io while the on-chain value stays
authoritative for identity; the bidirectional org-verification still
holds because centre.io's TOML lists the issuer back.
- Point-in-time price: `GET /v1/price/at?asset=&ts=` (board #46, the
wallet-builder accommodation the RFP audit recommended): the closed
1-minute VWAP bucket at-or-before a historical instant — the
cost-basis/PnL/tax lookup every portfolio tool needs.
observed_at is
the bucket's own close time (never the requested ts), and a nearest
bucket more than 24h before ts is an honest 404 instead of fabricated
continuity across dead markets. SDK gains Client.PriceAt.
- Deep CEX history + queryable per-market inception (board #44):
Kraken's raw-fills endpoint (/Trades, full history, nanosecond-cursor
pagination) is now a backfill path — backfill-external -raw-trades
reaches 2018-era XLM/USD where the OHLC endpoint's 720-candle horizon
returned nothing (golden-tested on a real captured 2018 frame; venue
rate-limit paced). /v1/markets?include=inception serves
first_trade_at per market — the RFP's "since inception = first
recorded trade" as a queryable fact rather than a footnote.
- 1-month OHLC granularity + query-selectable price window (board #43,
the last RFP-text gaps):
/v1/ohlc?interval=1mo serves true
calendar-month bars from the prices_1mo CAGG (which existed since
migration 0002 — only the endpoint's validator rejected it), completing
the RFP's suggested-granularity ladder; /v1/price?window=300|3600|86400
serves the aggregator's continuously-published rolling VWAP for that
window (default 60 = the closed-1m-bucket behavior, unchanged) — an
unpublished window is an honest 404, never a silent substitution.
- Tip freshness hardening (board #42): an empty rolling window on
/v1/price/tip now escalates once to the 30s SLA bound before ANY
closed-bucket fallback. Live samples had shown ~90s staleness on quiet
seconds — the 5s default window missed and fell straight through to the
closed-minute store price; staleness now exceeds 30s only when the pair
genuinely had no trade in the last 30s (window_seconds reports the window
actually used; test pins the quiet-pair path). The /price-vs-/price/tip
freshness contract is now spelled out in the spec.
- Batch price rows carry `change_24h_pct` (board #41): /v1/price/batch
rows with a fiat:USD quote pair current price with the signed trailing-24h
change — the exact bulk shape the Freighter RFP names for portfolio
screens. Silent-nil when no comparison bucket exists (a missing change
never costs the price itself). Audit corrected en route:
fdv_usd,
max_supply, and detail change_24h_pct already existed with correct
null-omission semantics — the first pass mistook null-omission for
absence.
- SAC contract-address lookups now return the wrapped asset's full detail
and price (board #40, the RFP audit's biggest wallet-facing gap): a
/v1/assets/{C...} lookup whose contract is a Stellar Asset Contract
resolves to the classic (or native) identity it wraps — trust-anchored on
the lake instance's core-minted StellarAsset executable AND a
derivation cross-check (the resolved asset must re-derive to the queried
address, so a spoofed metadata name can never redirect pricing;
adversarial test pins it). Classic + native asset detail now carries
contract_id (the deterministically-derived SAC address, valid even
pre-deployment), golden-tested against the on-chain USDC + XLM SACs.
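
The point-in-time rule for GET /v1/price/at in the list above (latest closed bucket at or before ts, and a 404 when that bucket is more than 24h older than ts) can be sketched as below. This is an illustrative Python sketch, not the handler: the function name `bucket_at_or_before` is hypothetical, timestamps are assumed to be unix seconds, and "at or before" is assumed to compare against each bucket's close time.

```python
from bisect import bisect_right

MAX_GAP_SECONDS = 24 * 60 * 60  # the 24h bound from the entry above


def bucket_at_or_before(bucket_closes, ts):
    """Return the close time of the latest bucket closing at or before ts.

    bucket_closes must be a sorted list of close times in unix seconds.
    Returns None (the 404 case) when no bucket qualifies or when the
    nearest one closed more than 24h before ts.
    """
    i = bisect_right(bucket_closes, ts)
    if i == 0:
        return None  # nothing closed at or before ts
    close = bucket_closes[i - 1]
    if ts - close > MAX_GAP_SECONDS:
        return None  # too stale: do not fabricate continuity
    return close
```

The returned value is the bucket's own close time, never the requested ts, which mirrors how observed_at is described.
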
Changed
- Legacy-brand purge (pre-rebrand naming), everywhere it was still live.
The API's diagnostic cache header is now
X-Stellarindex-Cache;
region.home_domain serves the current domain; the legacy-domain Caddy
alias is REMOVED (hard cutover — the old domain no longer serves); the
spec + docs drop the legacy key-prefix examples and the last legacy-prefix
API key was deleted from the store; MinIO's root identity and the
duplicate legacy reader user/policy are renamed/removed on r1 along with
the old hostname, orphaned /etc/default/* env files, and /opt remnants;
the prod-target guard and load-test allowlists drop the old hosts.
Deliberately kept: immutable history (changelog entries, frozen audit
records) and operator instructions that name legacy external artifacts
slated for deletion.
Added
- Ansible is now r1's config deployment path, with drift guardrails.
After the two-way audit: 18
--check --diff rounds reconciled the
archival-node role against live r1 (the dry runs caught an inventory
pointing the partition-carver at a live pool disk, a placeholder
authorized_keys that would have locked out operator + deploy, a
pre-ADR-0034 toml template that would have dropped the ClickHouse config,
and stale/broken vault secrets), then staged application converged the
host: services de-privileged to the stellarindex user (CS-118/119),
galexie moved off MinIO root creds onto the dedicated writer user,
postgres config single-sourced (the hand-tuned 8GB max_wal_size had been
inert behind postgresql.auto.conf all along). One real incident during
apply: the role downgraded (and thereby broke) the upstream OpenZFS userspace and
deleted the dkms module (recovered in minutes from the migration debs;
packages now apt-mark held, install gated, three new assertions).
Guardrails: weekly ansible-drift.yml (fails on divergence), CI ansible
syntax+lint job, hourly config-assertions (now 12 checks), and the
CLAUDE.md rule: every r1 host change lands in configs/ansible in the
same PR.
- r1 ↔ ansible drift audit + config-assertion watchdog. Follow-up to the
rsyslog apply-gap finding: audited BOTH directions between r1's live state
and the ansible roles. Live-only fixes that a playbook render would have
erased are now codified (CS-010 supply reserves → template + r1 inventory
vars; redis maxmemory → archival-node task; ssh root-access var pinned);
repo-ahead apply-gaps closed (three alert groups + rebrand wording synced
to r1's rules). New hourly
config-assertions.sh timer on r1 asserts the
load-bearing guard configs' CONTENT (9 assertions, all green) with
stellarindex_config_assertion_failed/_stale alerts in both trees;
full findings table + standing "check --diff first" rule in
docs/operations/r1-ansible-drift-2026-07-03.md.
- Root-disk fast-fill early warning (stellarindex_node_root_disk_filling_fast,
both rule trees + runbook + catalog): pages when root trends to full within
30 minutes — the 2026-06-11 ClickHouse log-wedge loop filled root at
~3.8 GB/min, going healthy→full in ~5 minutes, faster than the static <10%
page can be acted on. Deployed live to r1.
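
The arithmetic behind a "trends to full within 30 minutes" page is a simple extrapolation. The actual alert rule is not shown in this entry, so the Python sketch below only illustrates the idea: the helper names are hypothetical, the fill rate is assumed to be constant, and the 19 GB of free space is an illustrative figure chosen to match ~3.8 GB/min over ~5 minutes.

```python
def minutes_until_full(free_gb, fill_rate_gb_per_min):
    """Linear extrapolation: minutes left at the current fill rate."""
    if fill_rate_gb_per_min <= 0:
        return float("inf")  # not filling, so never full on this trend
    return free_gb / fill_rate_gb_per_min


def should_page(free_gb, fill_rate_gb_per_min, horizon_min=30):
    """Page when the disk is projected to fill within the horizon."""
    return minutes_until_full(free_gb, fill_rate_gb_per_min) <= horizon_min
```

At the incident's rate the projection is about five minutes, well inside the 30-minute horizon, whereas a static free-space threshold only fires once most of that time is already gone.
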
Fixed
- The 2026-06-11 log-discipline rsyslog rules were never live on r1 —
the loki + clickhouse-server
stop rules (the belt-and-braces that keeps a
journald flood out of /var/log/syslog) existed only in ansible role
15-log-discipline.yml, which does not auto-run against r1; the postmortem
recorded codified-as-applied, leaving the syslog half of the root-fill loop
open. Applied and probe-verified (tagged test line reaches journald, not
syslog); the file names the ansible role as source of truth.
Fixed
- `mint_and_forward` rows were rejected at TWO layers the v0.7.0 change
missed. Rollout verification on r1 (1,684 insert errors, zero rows
persisted) found the event type gated in three places, not one: the
decoder (fixed in v0.7.0), migration 0038's SQL CHECK (new migration
0070 extends it), and the storage layer's
CCTPEventType.IsValid Go
enum (extended here, with the three-layer lesson documented on the
type). Replay after deploy populates the historical rows.
Added
- CCTP `mint_and_forward` is decoded (board #31). The CctpForwarder
contract emits a fifth event our decoder didn't handle — those events
reached the lake but never
cctp_events. Schema reverse-engineered from
real mainnet events (single Symbol topic; body map {amount: i128,
forward_recipient: Address, token: Address}), golden-tested against the
actual lake fixture; no migration needed (the generic cctp_events shape
fits). The recognition blind spot is closed too: cctp's three contracts
are now pinned in the reconciliation catalogue, so a future unhandled
cctp topic caps THIS source's verdict instead of vanishing into the
system-wide bucket. New docs/protocols/cctp.md records the full inventory.
Historical catch-up (operator): projector-replay -source cctp -from
62403000.
- Phoenix is contract-identity gated (ADR-0040 §1 mechanism 2, CS-026).
Matches() now requires the emitting contract to be in the curated mainnet
set (phoenix.MainnetGatedSet: the page-verified 11 pools + 3 stake
contracts; multihop excluded — it emits no events) — phoenix topics are
plain string tuples any pubnet contract can forge, and were previously
attributed on shape alone. The factory's creation events predate the lake,
so the in-code seed is the trust root (wired into gatedSources for the
protocol_contracts warm + future live-upsert anyway). Reject test pins the
injection vector closed; operator rollout steps (deploy → re-derive →
verdict watch) in the register. Defindex deliberately did NOT ship: the
lake now shows 88+22 emitters vs the 57 verified three weeks ago, and its
create events don't carry vault addresses — gating on raw emitter lists
would bake potential look-alikes into the trust root, so it moves to the
ADR-0040 §3 cross-check enumeration (pages + ADR updated with the
evidence).
- Explorer error boundaries —
global-error.tsx (own html/body, inline
styles) + a shared design-system RouteError + 19 per-segment error.tsx
wrappers across the data-heavy routes; previously ONE boundary existed and
a render throw white-screened the route. Verified with a forced throw in a
real browser.
- Commercial funnel (LC-060/061/062/064/065): the pricing API is now in
the primary nav + a homepage product section; the dashboard's first-request
example is a copy-pasteable curl that actually works (the old
/v1/price/XLM-USD 404s — verified live); Bearer is the one taught auth
header (X-API-Key mentioned once as the alternative, matching middleware
precedence); Business tier consistently 60,000 req/min (backend truth);
billing copy no longer promises self-service that doesn't exist.
- The agent skill library (.claude/skills/, indexed in CLAUDE.md): nine
executable skills encoding this repo's procedures AND its incident-corpus
judgment so sonnet-class agents (and humans) work at standard without
tribal knowledge. Construction skills (/add-onchain-source,
/add-cex-connector, /add-endpoint, /add-metric) each end in the
machine checks that catch the historical failure mode (lockstep test,
contract test, guard chain); ops skills (/cut-release, /deploy-r1)
encode the release/deploy discipline + rollback; /review-stellarindex
distills the F-####/CS-### corpus into per-subsystem adversarial
checklists, each check citing its incident; /diagnose-stellarindex turns
the runbook corpus into triage decision trees with the exact r1 commands
and prior wrong turns; /verify-done is the pre-completion gate stack
every other skill terminates in (including the new staged-content check
from this session's own 6161dd50 near-miss).
- Two delegation-ready structural specs (docs/architecture/):
storage-layering-spec.md — eliminates the 13 verified upward imports from
storage/timescale into compute/sources via storage-owned *Row types +
caller-side conversion, in 4 grouped commits, gated by the new
storage-purity import-lint rule that is the actual payoff; and
wiring-decomposition-spec.md — extracts the api binary's inline adapters
(main.go 3,338 → <800 lines target), collapses the Options/Server
triple-touch via embedding (rejecting a DI container as clever), groups the
ops CLI's 55-case switch into a declarative subcommand table (rejecting
cobra), and scopes the optional per-source pipeline registry now that the
lockstep guard exists.
- ADR-0043 (Proposed) + `scripts/ops/restore-drill.sh`: the DR answer to
CS-110/111/112. Design: pgBackRest gains an offsite encrypted
repo2
(templated into the ansible role, gated off until the operator reviews the
rendered diff; refuses to render repo2 without its cipher pass); the CH lake
is protected by drilled RE-DERIVE + daily DDL/tail push instead of multi-TiB
full backups (the lake is derived data — the raw LCM exists in two archives;
the full-backup decision deliberately waits on the drill's measured
throughput). The restore drill is a non-destructive scratch restore on r1
(throwaway postgres on :5499, tip-lag + hash-chain + window row-count
verification, optional CH re-derive RTO measurement) appending to an
append-only evidence log. Runbook footguns fixed: --stanza=main →
stellarindex (CS-114) and dr-activation's false "Drilled" claim replaced
with the honest status (CS-113).
- ADR-0042 (Proposed): the v1 wire shape. The decision package for the
public flip, awaiting @ash sign-off: execute the Unit-D Tier-3 cross-chain
wire collapse pre-flip (rejecting the freeze fallback — pre-v1 with zero
consumers is the only free moment), give the dual-shape
/v1/assets/{slug}
an explicit kind discriminator (catalogue/stellar_asset, oneOf +
typed SDK union, explorer stops shape-sniffing), and define the v1.0 freeze
contract: spec = the contract, SDK-coverage register = honest SDK scope,
explorer surfaces marked x-stability: experimental at v1.0.
- Ansible: non-root services + the missing system user (CS-118/119/122).
The
stellarindex user is now created FIRST in the role (a clean apply
previously FAILED chowning to a user that never existed); the api /
indexer / aggregator daemons and six timer oneshots run
User=stellarindex with the hardened-unit settings ported into the
role's real templates; env files go 0640 root:stellarindex;
archive-completeness deliberately stays root (documented follow-up).
Patroni's REST API now defaults to the private interface and REFUSES to
render without basic-auth credentials (assert + unconditional auth block)
so it can never land unauthenticated on 0.0.0.0. Ordered r1 migration
steps live in the operator register; deploy workflow verified compatible.
- `stellarindex-ops verify-served-values` — the data-truth harness. The
recurring audit theme was "code-correct ≠ data-correct" (CS-010: XLM market
cap read +58% until hand-sampled). The new subcommand reconciles a curated
set of SERVED values against independent ground truth — XLM total/circulating
supply vs the SDF lumen API, USDC-on-Stellar supply vs Stellar Expert — and
emits node_exporter textfile gauges (served_value_{ok,rel_err,last_run_unix})
with two alerts in both rule trees (drift sustained two daily runs; harness
dark 48h) + runbook. Its FIRST live run caught three things: its own unit
bug (F2 supply fields are base-unit strings — fixed), the standing CS-010
config gap (XLM circulating 47% off until sdf_reserve_accounts is set —
the alert now stands as pressure), and a NEW finding: served USDC supply is
85% below Stellar Expert (under investigation). Price cross-checks stay with
the divergence worker; lake↔served counts stay with compute-completeness.
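
The reconciliation idea in the harness entry above comes down to a relative-error check against ground truth. The Python sketch below is a hedged illustration, not the subcommand's code: the function names, the exact definition of the relative error, and the 5% tolerance are assumptions. The sample figures (40M served vs 265.9M at Stellar Expert) are the ones this changelog reports for the USDC finding.

```python
def rel_err(served, truth):
    """Relative error of a served value against independent ground truth."""
    if truth == 0:
        raise ValueError("ground truth must be non-zero")
    return abs(served - truth) / abs(truth)


def is_ok(served, truth, tolerance=0.05):
    """True when the served value is within the (assumed) tolerance."""
    return rel_err(served, truth) <= tolerance
```

For the USDC figures this gives a relative error of roughly 0.85, consistent with the "85% below" finding, so `is_ok` is false under any reasonable tolerance.
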
Changed
- Explorer builds fail hard instead of baking fallback HTML. New
buildFetch.ts (bounded 429-aware retry, per-build memo, incident-history
contract): a build-time fetch failure for a promised entity now FAILS
next build — the class behind baked "Asset not found" pages and the
XLM/WXLM 330× price incident. ~200 lines of per-page scaffolding deleted;
the new layer immediately caught two real pre-existing baking bugs
(mixed-case slug variants; issuer fetches timing out under build
concurrency). Full 3,830-page build green against the live API.
- Four D3 duplication extractions (net −LoC, behavior-preserving,
CAPABILITY-INVENTORY updated):
wsclient.Loop (the ~50-line WS reconnect
loop duplicated across binance/kraken/coinbase/bitstamp — venue behavior
preserved via hooks), internal/httpx WriteJSON/WriteProblem (dashboard
handler copies), ratelimit.FixedWindowCounter (login/signup throttles,
Redis key bytes unchanged), canonical.SafeUnixSeconds/Millis (three
decoder timestamp-clamp copies; bound-checks the raw u64 before the cast —
the router deadline_ts wrap-negative class).
- The explorer now derives every wire type from the generated OpenAPI
contract.
src/api/types.ts (generated, CI-drift-checked) was imported
nowhere; all consumed shapes were hand-typed across hooks.ts,
explorer-shared.tsx, and ~20 pages — an API field rename shipped to prod
undetected. All hand interfaces are now aliases into
components['schemas'] (35 files, −448 net lines; ~90 call sites gained
honest null-narrowing, zero !/as casts), so spec drift is a tsc
failure. Eight // SPEC-GAP intersections remain where the HANDLER serves
fields the spec under-documents — tracked for spec-side fixes.
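
The timestamp helpers in the duplication-extraction entry above guard against a raw u64 wrapping negative when cast to a signed 64-bit value. The Python sketch below imitates that failure and the bound check. It is not the Go implementation of canonical.SafeUnixSeconds: the function names are hypothetical, and clamping to the int64 maximum is an assumption about what the helper returns for out-of-range input.

```python
INT64_MAX = 2**63 - 1
UINT64_MASK = 2**64 - 1


def cast_u64_to_i64(raw):
    """Imitate an unchecked u64 to i64 cast (two's complement wrap)."""
    raw &= UINT64_MASK
    return raw - 2**64 if raw > INT64_MAX else raw


def safe_unix_seconds(raw):
    """Bound-check the raw u64 before the cast, clamping instead of wrapping."""
    if raw > INT64_MAX:
        return INT64_MAX  # assumed clamp value
    return raw
```

An unchecked cast turns the largest u64 into -1, which a deadline comparison would treat as long expired; the checked version keeps the value non-negative.
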
Fixed
- The explorer-surface OpenAPI gaps are closed — every field the handlers
serve that the generated-types migration had to bridge with
SPEC-GAP
intersections is now in the spec: the Asset coin-overlay block (slug,
class, change_1h/7d_pct, first/last_seen_ledger, observation_count,
markets/trade counts, price_history_24h/7d, ath, top_markets,
issuer_scam_reason) + type enum values global/external;
GlobalAssetView.class (required); Source.class enum gains
bridge/lending/router (all live in the registry); issuers list rows gain
org_verified + scam_reason (detail gains scam_reason); ContractEvent gains
contract_id; /account/me documents the session-cookie user/account shape;
the protocols bespoke block and evolving diagnostics fields are
documented as described-loose surfaces per ADR-0042's experimental tier.
SDK types mirror every addition (contract test green); all three artifacts
regenerated; zero SPEC-GAP markers remain — the surviving intersections
are re-labeled for what they now are: required-narrowing over spec-optional
fields.
- Classic-asset supply was silently SAC-only — the trustline/claimable/LP
observers never matched their watched set. Root-caused from the
verify-served-values USDC finding (served 40M vs Stellar Expert 265.9M):
the three observers compare decoded keys in
CODE:ISSUER form, but the
config (correctly, per its own docs) supplies CODE-ISSUER — the raw
strings went straight into the watched sets, so all three observers
observed nothing since they shipped and every classic asset's served
supply degraded to its Soroban-wrapped slice. Cross-checked against the
lake: net SAC supply_flows for USDC ≈ 272.9M vs SE 265.9M — the lake was
right; the served tier was missing the entire classic trustline component.
Fix: supply.CanonicalizeWatchedClassic (one home, loud error on
unparseable entries so a config typo can never silently zero a supply
component again) applied in all three observer constructors + regression
test pinning dash-in/colon-match. Operator follow-ups (deploy, historical
state seed, harness watch) are in the register — served values heal only
after both.
- CS-089: the Chainlink divergence reference now rejects stale rounds. It
read
latestAnswer() — no timestamp at all — so a frozen feed was served as
a fresh reference, able to both mask a real divergence and fabricate a false
one. Now calls latestRoundData(), decodes updatedAt, and rejects rounds
older than the feed's MaxAge as ErrPriceUnavailable (reference
unavailable — feeding the CS-088 no_reference machinery). Defaults: 3h for
crypto feeds (≤1h heartbeat), 76h for the FX feeds (24h heartbeat + they
pause over market closes, so a Friday round is legitimately ~72h old on
Sunday). Operator override via new [divergence.chainlink.feeds]
max_age_hours. A proxy answering the legacy 32-byte shape now fails loudly
instead of decoding garbage.
- CS-084 (High): the `-ch` completeness projection reconcile is now strict
per-ledger. The production path compared window TOTALS (Σ expected vs Σ
served), so a real drop in ledger L netting against a phantom overcount
elsewhere reported
complete=true — the per-ledger maps were already
computed on both sides; only the comparison collapsed them. All three
reconcile branches (event re-derive, SDEX census, ContractCall census) now
compare per-ledger via completeness.ReconcileCounts. The four oracle
sources (reflector-dex/cex/fx, redstone) opt out via a documented
aggregateReconcile reason (legacy backfill vintages keyed
oracle_updates.ledger by the oracle-timestamp ledger — strict compare
would false-flag the vintage boundary) and keep the totals compare.
Verified empirically on r1: per-ledger lake-vs-served counts for cctp match
exactly across 200k ledgers once the decoder's topic set is applied. The
same spot-check surfaced a NEW finding tracked separately: CCTP contracts
emit mint_and_forward, which the decoder does not handle.
- The contract test's first run caught real three-way drift, all fixed:
the spec's Price schema documented ~19 asset-enrichment fields
(market_cap_usd, top_markets, ath, supplies, sparklines…) that the
/v1/price handler has never served — trimmed to the honest 8-field
PriceSnapshot shape; /v1/pools items were documented as a bare untyped
object — now a real PoolRow schema; the healthz schema omitted the
always-served uptime/status_root; the Asset schema omitted the served
change_24h_pct; Source/MarketRow omitted the served stats + sparkline
fields. SDK (`pkg/client`): gained the served-but-missing fields —
PriceSnapshot.confidence(+factors), HistorySeries.price_type,
Source.on_chain+stats, Market.last_price+sparkline, LendingPool
30d net-flow fields, Issuer.org_verified (CS-100 was never mirrored),
Account.key_prefix, KeyCreated.key_prefix, Health.checks/status_root —
and lost `AssetDetail.is_experimental`, which no handler and no spec ever
served (SDK invention; it always decoded to false). Stale rek_ prefix
examples in auth comments updated to sip_.
- Load-test production guard had a collapsed host list. The two rebrand
sweeps mapped both legacy hosts (api.ratesengine.net, api.ratesengine.io)
onto api.stellarindex.io, leaving the guard in the Makefile and
test/load/scenarios/lib/env.js with a duplicate entry — the legacy domains
(which may still route to production) were unguarded against accidental k6
targeting. Restored the distinct legacy hosts; verified prod + legacy are
refused and the documented staging target still passes.
- Per-row handler fan-out is now concurrency-bounded. The catalogue
market-cap/price fills (/v1/assets listings, /v1/assets/verified) and the
slug-expansion markets merge spawned one goroutine + one DB round-trip per
row with no cap — safe only because the verified catalogue is small today,
but a latent connection-pool-exhaustion vector as it grows. All five sites
now go through a shared forEachBounded helper (cap 16, the same bound as
the price batch, which was already correct). Race-tested that the bound is
actually respected.
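
The bounding described in the fan-out entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's helper: only the name forEachBounded and the cap of 16 come from the entry; the signature, the semaphore-channel approach, and the maxInFlight probe are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// forEachBounded runs fn once per item with at most limit calls in flight.
// Hypothetical signature; the real helper may differ.
func forEachBounded[T any](items []T, limit int, fn func(i int, item T)) {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, item := range items {
		sem <- struct{}{} // blocks while limit workers are already running
		wg.Add(1)
		go func(i int, item T) {
			defer wg.Done()
			defer func() { <-sem }()
			fn(i, item)
		}(i, item)
	}
	wg.Wait()
}

// maxInFlight fills n rows through forEachBounded and reports the highest
// concurrency observed plus how many rows were filled.
func maxInFlight(n, limit int) (peak int64, filled int) {
	rows := make([]int, n)
	out := make([]int, n)
	var cur int64
	forEachBounded(rows, limit, func(i int, _ int) {
		c := atomic.AddInt64(&cur, 1)
		for {
			p := atomic.LoadInt64(&peak)
			if c <= p || atomic.CompareAndSwapInt64(&peak, p, c) {
				break
			}
		}
		out[i] = i + 1 // each worker writes only its own slot
		atomic.AddInt64(&cur, -1)
	})
	for _, v := range out {
		if v != 0 {
			filled++
		}
	}
	return peak, filled
}

func main() {
	peak, filled := maxInFlight(200, 16)
	fmt.Println(peak <= 16, filled)
}
```

Acquiring the semaphore before spawning keeps the number of live goroutines, not just the number of running callbacks, at or below the cap, which is the property that protects a connection pool.
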
Security
- `main` is now protected (CS-097) and lint baselines are growth-guarded
(CS-098). Two repo rulesets:
main-integrity blocks force-pushes and
branch deletion for everyone (no bypass); main-required-checks makes the 12
core CI jobs required status checks, with a repository-admin bypass so the
operator's direct-push workflow keeps working (the push-triggered CI run on
main stays as the tripwire for that path — its ci.yml comment now says so).
New scripts/ci/lint-baseline-growth.sh (wired into the import-checks job)
fails any change that GROWS scripts/ci/*.baseline or the KNOWN_INERT
metric allowlist unless the commit carries an explicit Baseline-Growth:
trailer — closing the "edit the gate's own allowlist in the same commit"
bypass. Probe-tested all three paths (clean / undeclared growth / declared).
- Middleware rejections (401/403/429) are no longer shared-cacheable.
Four problem+json writers — auth 401s (writeAuthProblem), per-key policy
403s (writeKeyPolicyDenied), signup email-verification 403s, and
monthly-quota 429s — never overrode the route directive the CacheControl middleware
pre-sets, so on publicly-cacheable routes (e.g. /v1/price) a per-key/per-IP
denial carried public, max-age, s-maxage and a shared cache keyed on the
URL could store one caller's rejection and replay it to everyone. All four
now set Cache-Control: no-store (matching every other problem writer), the
cachecontrol.go invariant doc now enumerates them, and a regression test
drives all four rejection paths through the real CacheControl composition.
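
The failure mode and the fix in the middleware-rejection entry above can be reproduced in a few lines. This is a sketch under assumptions: cacheControl, requireKey, writeProblem, the X-API-Key header, and the directive values are all hypothetical stand-ins, not the project's middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cacheControl stands in for a route-level middleware that pre-sets a
// public directive before the handler chain runs (illustrative values).
func cacheControl(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "public, max-age=5, s-maxage=5")
		next.ServeHTTP(w, r)
	})
}

// writeProblem emits a problem+json rejection and overrides whatever
// cache directive the route pre-set, so a shared cache cannot store it.
func writeProblem(w http.ResponseWriter, status int, title string) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.Header().Set("Cache-Control", "no-store")
	w.WriteHeader(status)
	fmt.Fprintf(w, `{"status":%d,"title":%q}`, status, title)
}

// requireKey rejects requests without an API key (hypothetical check).
func requireKey(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-API-Key") == "" {
			writeProblem(w, http.StatusUnauthorized, "missing API key")
			return
		}
		next.ServeHTTP(w, r)
	})
}

// rejectionCacheControl drives an unauthenticated request through the
// composed chain and returns the Cache-Control header of the rejection.
func rejectionCacheControl() string {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := cacheControl(requireKey(ok))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/v1/price", nil))
	return rec.Header().Get("Cache-Control")
}

func main() {
	fmt.Println(rejectionCacheControl())
}
```

Testing through the composed chain rather than the writer in isolation matters here, because the bug only appears when an outer middleware has already set a public directive.
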
Removed
- The dead `consumer.Orchestrator` seam (−896 LoC). The
per-source-goroutine runner + Source/CursorStore/Cursor types had zero production
callers and were exactly the RPC-era topology
docs/architecture/ingest-pipeline.md forbids — yet doc.go presented them as a reference template.
consumer.Event (the load-bearing contract) moved to event.go untouched;
doc.go now states the retirement + points at the dispatcher path. Stale
comments advertising the deleted <source>-backfill subcommands fixed in
sorobanevents/timescale. Follow-up: stellarindex_source_lag_ledgers lost
its only (never-production) setter — retirement folded into the
docs-integrity sweep.
- ADR-0041: ingest durability semantics. Settles CS-028's cursor question:
the ledgerstream cursor is a RESUME HINT, not a durability claim — the
ADR-0033 completeness verdict (strict per-ledger since CS-084) is the
durability claim, with the lake as the heal source. Consequences shipped
with the ADR:
clickhouse_live_sink + clickhouse_projector_source now
default to `true` (r1 already ran both; the certified-lake substrate
must not be opt-in for the coverage claim to mean anything — explicit
opt-out documented for CH-less deployments), and the previously-unalerted
ch_live_sink_ledgers_total{outcome="dropped"} counter gains a two-tier
alert (ticket at 10m of drops, page at 1h sustained) in BOTH rule trees +
a new runbook (ch-live-sink-drops.md).
- ADR-0040: completing contract-identity gating (CS-026). Design for the
four still-ungated decoders: phoenix + defindex ship as curated-set /
factory-descended childgate registries (both already enumerated in
docs/protocols/ — the "waiting on team data" framing was stale), aquarius
gets a lake-derived enumeration procedure, and comet — the no-factory hard
case — gets a WASM-code-hash gate design (audited hash set + off-hot-path
registry sweep). Includes the rollout preconditions (seed before gate
binary, lake re-derive, verdict green) that prevent a fail-closed gate from
dropping live trades.
- Prometheus rule-tree semantic differ (scripts/ci/lint-rule-equivalence,
wired into make monitoring-check). The multi-host and r1-overlay rule
trees are hand-maintained near-copies; file pairing was checked but nothing
enforced that paired rules stay semantically equivalent — a threshold fixed
in one tree silently diverged the other (the api.yml header has warned about
this since F-1222). The differ compares every paired rule's expr (job labels
normalized), for, and labels; the two genuine host-shape divergences
(redis replica expectation, scrape-job list) live in a shrink-only
rule-equivalence.baseline covered by the CS-098 growth guard.
Probe-verified: a one-line for: change in one tree fails with a precise
diagnosis.
- Pipeline lockstep guard (internal/pipeline/lockstep_ast_test.go). The
five hand-synced wiring sites (HandleEvent / IsProjectedEvent /
tradeFromEvent / projector buildSource / dispatcher registration) had no
machine check — the IsProjectedEvent comment cited an "ADR-0030 lint guard"
that never existed, and drift is silent data loss (F-1316). The new test
AST-walks the switches and every projected source package's consumer.Event
implementations: a projected event without a persist arm, a source package
event missing from IsProjectedEvent, a stale entry after a rename, or an
IsProjectedEvent package with no registry case now fails CI. Probe-verified
(removing rozo.Event from IsProjectedEvent fails with the exact F-1316
diagnosis).
- SDK↔OpenAPI contract test (pkg/client/spec_contract_test.go). Three
gates: every SDK method's route must exist in the spec; every spec operation
must be either SDK-covered or explicitly allowlisted with a reason (new
endpoints now fail CI until consciously triaged); and for covered endpoints
the spec's data schema properties must exactly match the SDK payload
struct's JSON tags in both directions. Closes the third edge of the
route↔spec↔SDK triangle (lint-docs.sh already reconciles routes↔spec).
Documentation
- Docs-integrity sweep: the institutional-knowledge layer agrees with
itself again.
docs/architecture/overview.md now EXISTS (CLAUDE.md and
engineering-standards.md cited it for months; it routes to the real docs);
the CS-129 kubectl-on-a-systemd-fleet commands in insert-errors +
all-ingestion-down are systemd/psql; the CS-008 finding-ID collision is
re-IDed with a register note; the remediation STATUS deferred list and
launch-todo carry staleness banners naming what shipped since they were
written; the coverage tracker's header/table contradiction is annotated;
and the never-emitted stellarindex_source_lag_ledgers gauge (its only
setter was the deleted Orchestrator) is removed from obs + docs, with the
two archived runbooks that cited it scrubbed to historical prose.
- CAGG price-math verified vs the exact engine; `twap` column marked dead.
The prices_* continuous aggregates compute vwap with the per-row form
sum((quote/base)*base)/sum(base) instead of the exact sum(quote)/sum(base);
measured on r1 the divergence is ≤ 1.0e-16 relative (40,565 1h-bucket
comparisons) — below the 12-decimal wire truncation, so no rematerialization.
New aggregates must use the exact single-division form (migrations/README.md
rule 8). The CAGGs' twap column is an equal-weight mean, not time-weighted,
and is read by nothing — documented as do-not-use in the TWAP/OHLC methodology
doc (/v1/twap computes real TWAP on demand from raw trades).
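
The two vwap forms compared above are algebraically identical and differ only by rounding. A small sketch makes that concrete, assuming float64 arithmetic (the aggregates themselves may use a different numeric type) and made-up trade values:

```go
package main

import (
	"fmt"
	"math"
)

// trade holds one fill; the amounts used below are invented for illustration.
type trade struct{ base, quote float64 }

// vwapExact is the single-division form: sum(quote) / sum(base).
func vwapExact(ts []trade) float64 {
	var q, b float64
	for _, t := range ts {
		q += t.quote
		b += t.base
	}
	return q / b
}

// vwapPerRow is the per-row form: sum((quote/base)*base) / sum(base).
// Each row divides and re-multiplies, which can add a rounding step.
func vwapPerRow(ts []trade) float64 {
	var num, b float64
	for _, t := range ts {
		num += (t.quote / t.base) * t.base
		b += t.base
	}
	return num / b
}

// relDiff is the relative difference of a against the reference b.
func relDiff(a, b float64) float64 { return math.Abs(a-b) / math.Abs(b) }

func main() {
	ts := []trade{{3, 0.37}, {7, 0.91}, {0.1, 0.0123}, {1234.5, 151.7}}
	exact, perRow := vwapExact(ts), vwapPerRow(ts)
	fmt.Printf("exact=%.12f perRow=%.12f rel=%.3g\n", exact, perRow, relDiff(perRow, exact))
}
```

Both functions assume a non-empty slice with a non-zero base sum; a production version would need to guard that case.
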
Changed
- `/v1/operations` directory first page is now short-TTL cached (3s). The
network-wide directory page is identical for every caller between ledgers but
costs a ~300ms multi-column DESC-LIMIT read over the lake + a 24h op-type
aggregation; a small per-`limit` in-process cache makes it effectively free
once traffic is concurrent, at most ~one ledger stale. Cursor (paged) requests
are unaffected. Follows the v0.6.1 summary-shape change.
Changed
- `/v1/operations` network-wide directory is now a summary (~10× faster).
The no-`?ledger` directory listing previously read the large body_xdr
column and XDR-decoded every op, which dominated latency over the
multi-billion-row lake (~430ms regardless of limit). It now returns each
op's identity + type only — fields/raw_xdr are omitted (they were
already omitempty). Fetch the fully-decoded op from the per-ledger form
(?ledger=<seq>) or /v1/tx/{hash}. The explorer already degrades cleanly
(its per-op summary was best-effort). No change to the per-ledger op view.
Fixed
- Latency-SLO burn alerts no longer false-fire at near-zero traffic. Added a
min-traffic guard (
and rate > 5 req/s) to the pricing-latency burn alerts —
at synthetic-only traffic (~2.4 req/s smoke+prewarm) a few cold-cache outliers
tripped the 0.1% budget. (Applied live to r1 monitoring.)
- `ecb` FX reference no longer false-fires `data_source_stale`. ECB is a
daily source (publishes on business days only) but sat under the oracle
domain's 3h freshness threshold; gave it a 4-day threshold. It was healthy all
along (writes to
oracle_updates on its daily cadence).
Added
- `GET /v1/external/assets` + `GET /v1/external/assets/{slug}`. The non-Stellar
side of the assets split (LC-001) — fiat currencies + reference-only coins. See
the BREAKING note under Changed.
- `flags.divergence_checked` on the price envelope (CS-087). Signals whether
the cross-reference divergence check actually ran (≥1 responding reference). When
false, divergence_warning is blind and must not be read as "prices agree".
- Regression guardrails. (1)
scripts/ci/lint-i128.sh (ADR-0003) — rejects
int64(x.Lo) i128 truncation + BIGINT/float monetary migration columns; (2) the
exhaustive linter (ADR-0010), scoped to our domain enums, so a new
AssetType/Class/etc. variant added to a switch without handling fails CI;
(3) foundation-purity import rules (internal/canonical, nettools,
sources/external/scale, version pinned to their dependency floor); (4) the
testcontainers integration suite now RUNS in CI as a blocking gate (CS-070 —
it was previously only compiled). All in make verify / CI.
- Contributor docs for agents. /CAPABILITY-INVENTORY.md (intent→symbol index,
to stop rebuilding existing helpers) + docs/contributing/ checklists (add a
source / CEX / endpoint / metric / migration / observer).
- Stablecoin self-peg pricing for the crypto-ticker form.
/v1/price?asset=crypto:USDC&quote=fiat:USD (and the EUR/MXN pegs,
/v1/price/tip, /v1/observations, /v1/oracle) now returns the ~$1 peg
(price_type: "peg") instead of 404. The classic-issued form
(USDC-GA5Z…) already resolved via the operator's
usd_pegged_classic_assets list (F-1232), but the abstract global-ticker
form the catalogue + explorer use (crypto:USDC, crypto:EURC, …) fell
through — no on-chain trade quotes crypto:USDC in fiat:USD.
tryStablecoinFiatProxy now consults aggregate.FiatProxy first: when the
asset is a crypto:<STABLE> ticker whose peg fiat equals the requested
quote, it synthesises 1.0 (consistent with the aggregator's
stablecoins-as-fiat policy; a depeg still surfaces via the divergence
subsystem). A cross-peg quote (crypto:USDC/fiat:EUR) deliberately does NOT
fire — that's a real FX cross-rate. (launch-todo P2-4(b).)
- SEP-41 token supply now served on `/v1/assets/{id}`. A Soroban (SEP-41)
token has no LCM observer supply snapshot unless its contract is on an
operator watch-list — impractical at 10k+ tokens, so Algorithm 3 produced
nothing and
total_supply was null for every SEP-41 token. /v1/assets/{id}
now falls back to the lake-derived per-token supply (ch-supply's
token_supply, Σmint−Σburn−Σclawback over the certified ClickHouse lake — the
same source /v1/assets/{id}/supply already uses) for Soroban-contract assets
with no observer snapshot, with a new sep41_lake_flows supply basis. Every
SEP-41 token's total_supply / circulating_supply / market_cap_usd is now
complete + served from the full archive. The CH read fires only for Soroban
tokens (classic assets keep their Algorithm-2 snapshot).
- Data-freshness watchdog — the "never get behind" alert. New
data-freshness.sh (every 15 min via data-freshness.timer) emits per-domain
ingest-freshness gauges (stellarindex_data_freshness_{age_seconds,stale}) AND
the per-source ADR-0033 completeness verdict
(stellarindex_completeness_incomplete) to the node_exporter textfile
collector, with three alerts (stellarindex_data_source_stale,
stellarindex_completeness_incomplete,
stellarindex_data_freshness_watchdog_silent) + runbooks. Closes the gap the
audit found: the ingest gap-detector covered on-chain source gaps, but
reference oracles, FX, supply, the issuer-metadata cron, and the verdict itself
had no freshness alert — so coingecko rotted 11 days and sep1 metadata never
populated, unnoticed. Now any source past its cadence, or a real served≠lake
gap, pages. (launch-todo steady-state / "never behind".)
- `massive` FX feed registered as an external source. The active fiat-FX
feed (internal/sources/forex worker, massive.com = Polygon's backend,
fx_quotes path) was missing from external.Registry, so it never appeared
in /v1/sources and Lookup("massive") fail-closed. Now registered as an
external FX source (ClassExchange/SubclassFX, off-chain). Fixes a latent
classification bug: IsOnChain("massive") previously fell through to true
(which would have placed the off-chain FX feed on the explorer's Stellar
/network surface); it is now correctly false. (launch-todo P0-7.)
- Two missing cron timers — `sep1-refresh` + `compute-completeness`. Neither
had ever existed as a systemd timer, so both data sets silently froze: issuer
org_name/org_verified only updated on a manual sep1-refresh, and the
ADR-0033 completeness verdict (completeness_snapshots) had drifted 17–21
days stale (watermarks at ~63.0M while the network was at 63.27M). New
Ansible templates under the archival-node role install both (daily, 05:12 /
05:30 UTC). The completeness timer runs run-compute-completeness.sh — a
self-chunking per-source driver: it walks each source's
[watermark, tip] in 25k-ledger windows because (a) the watermark write
overwrites rather than max()s, so a global run would regress sources already
ahead, and (b) the high-volume SDEX projection reconcile blows ClickHouse's
12 GiB per-query limit above ~30k ledgers. Self-healing: any backlog (initial
catch-up or post-outage) is chunked automatically. Phase-0 of the launch
to-do (docs/operations/launch-todo.md).
- Bidirectional SEP-1 org verification (`org_verified`). /v1/issuers
now carries org_verified — true only when the issuer's home_domain
stellar.toml lists THIS issuer's account back in its [[CURRENCIES]]
(i.e. the domain owner attests to the account). A one-directional
home_domain → ORG_NAME match is spoofable: anyone can point their
account's home_domain at circle.com and inherit "Circle". The
explorer's issuer table renders a ✓ Verified badge only on the
bidirectional match, so org grouping/merging is trustworthy. The
sep1-refresh cron computes the flag (tomlListsIssuer) and persists it
in the sep1_payload JSONB; /v1/issuers reads
sep1_payload->>'OrgVerified'. Each OK line now prints verified=….
New sep1-refresh -issuer <g_strkey> force-refreshes one specific account
on demand (bypassing the staleness queue) — for onboarding a newly-verified
org without waiting for it to surface through ~43k pubnet issuers.
- `stellarindex-ops state-snapshot` — reads a history-archive checkpoint's full
current ledger-entry state (the bucket list) via the SDK's
CheckpointChangeReader and tallies it by entry type. The read-only
foundation of the data-truth backfill (DATA-TRUTH-PLAN G1–G3): the served
ledger_entries_current projection only holds entries changed since ledger
~62M, so dormant-pre-62M accounts / trustlines / contract code+instances are
missing (the contract-WASM user-contract tail, incomplete account state +
issuer flags, possible trustline-supply undercount). A checkpoint snapshot is
the source of truth for that tail, read in one pass (no genesis replay). Its
-write mode backfills the contract_code + contract_instance entries (the
bounded G1 scope) into ledger_entry_changes via a direct insert that writes
NO commit-marker ledgers row (so it never advances the completeness
watermark) — closing the contract-WASM gap for user contracts whose code was
deployed before the entry-capture window. Default mode is read-only tally.
- Staff customer look-up (/account/admin, audit 2026-06-19 item 16):
the cockpit's first tool is now live instead of a placeholder. New
staff-gated GET /v1/account/admin/lookup?email=|slug= resolves an
account (tier, status, overrides) plus the users on it; the explorer's
admin page searches by email or account slug. Double-gated — RequireSession
+ an explicit is_staff check (a non-staff customer gets 403, never another
customer's data) — and the access is audit-logged. Tier overrides + incident
tooling remain honestly marked "Coming in Phase 1.5" (they need
write/impersonation endpoints).
- `source_volume_1h` continuous aggregate (migration 0068) — per-source
hourly trade-count + pre-aggregated USD-volume inputs. The source
page's activity chart now reads this CAGG instead of scanning raw
trades, making the 7d window cheap (the live derivation was ~18s for
the heaviest source, past the 8s API ceiling). The explorer's 24h/7d
toggle on /dexes/{source} + /exchanges/{source} is now live, and
the 24h sparkline is faster too. /v1/sources?include=sparkline7d is
now surfaced by the frontend.
- `on_chain` boolean on /v1/sources (and external.IsOnChain) — true
for sources that observe the Stellar network directly (DEX, on-chain
oracles, lending, routers, bridges), false for off-chain reference
feeds (CEX / FX / aggregators / Chainlink).
Changed
- BREAKING — assets split into Stellar (`/v1/assets`) and external
(`/v1/external/assets`) (LC-001).
/v1/assets now lists Stellar assets
only (native XLM, classic credits, Soroban tokens, and verified-catalogue
currencies with a Stellar issuance — USDC, EURC, AQUA). Fiat currencies and
reference-only coins (BTC, ETH, …) — which have no Stellar issuance — moved to
the new `GET /v1/external/assets` listing and `GET /v1/external/assets/{slug}`
detail. A non-Stellar slug now returns 404 on /v1/assets/{slug} (with a
cross-pointer; no redirect), and vice-versa — each asset resolves on exactly
one path. asset_class=fiat returns an empty page on /v1/assets. The
explorer gains an /external/assets directory + detail page and drops the
fiat chip from /assets. Root cause of the old mixing: the browse listing fed
off catalogue.Browseable(), which drops reference-only coins but still
included fiat.
- XLM circulating-supply basis is now honest when no SDF reserves are
configured. Previously stamped xlm_sdf_reserve_exclusion even with an empty
reserve set (circulating == total), silently overstating circulating supply +
market cap; now emits xlm_total_only so the misconfiguration is self-evident.
(The correct circulating still needs sdf_reserve_accounts set in inventory.)
- Dependencies brought to latest. go-stellar-sdk v0.5→v0.6 (adapts the new
datastore.GetFile size return; VERSIONS.md compat pass); the explorer + status
apps to React 19.2 / Next 16 / TypeScript 6 / Tailwind CSS 4 / ESLint 10 (flat
config; ESLint 10 via a one-line eslint-plugin-react pnpm patch), and the
React Compiler (babel-plugin-react-compiler 1.0) is now enabled.
- Re-enabled the `min_usd_volume` VWAP gate at $10k (r1 template). Pinned to 0
during the on-chain-only bootstrap; the CEX connectors now flow live volume
(binance/coinbase/kraken/bitstamp), so fiat:USD pairs clear the floor easily
while thin/manipulable pairs are gated. The CS-040 fix (per-source
Decimals in the USD-volume sum) makes the gate FX-safe.
- Prometheus TSDB relocated off the 49G OS root onto a ZFS dataset. The
~13G TSDB kept the root chronically >90% full (stellarindex_node_root_disk_full
alert). Moved to data/prometheus (zstd, ~12× → 1.31G on disk); root dropped
94%→60%. Added the dataset to the archival-node ZFS-role defaults so a rebuild
doesn't reland it on root. (launch-todo P0-5.)
- The /network page is now Stellar-only: "Top markets" reads
/v1/pools (on-chain DEX pools, not the CEX-dominated /v1/markets),
"Most active sources" + the venue-composition donut + the hero
Markets/Sources tiles all filter to on-chain sources. Off-chain
reference feeds stay on /exchanges + /aggregators.
- The /sources directory is now the Stellar on-chain source registry
(DEX / oracle / lending / router / bridge) — previously it listed
every venue *and* silently dropped the lending/router/bridge classes
(blend, cctp, rozo, defindex, soroswap-router now appear).
- Source activity chart defaults to the 7d window when available.
- Market/asset OHLC chart now picks the finest granularity each window
allows under the API's 1000-bar cap (24h→5m, 7d→15m, 30d→1h, 90d→4h)
— far more detail per window.
Fixed
- Security & correctness audit remediation (2026-07-01). Highlights:
- SSE crash + DoS.
streaming.Hub.Publish could send on a closed
subscriber channel (process-crashing panic) when a client disconnected
mid-publish — now guarded by a per-subscription mutex. The SSE handler
cleared its write deadline entirely, so a non-reading client leaked its
goroutine/conn/FD forever — now a rolling per-write deadline + a
concurrent-stream cap (CS-012 / CS-013).
- Dashboard CSRF. The session cookie was SameSite=None though the
dashboard and API are same-site — now SameSite=Lax, blocking cross-site
credentialed POSTs to the /v1/dashboard/* mutation handlers (CS-124).
- SSRF. The OG-image edge function double-decoded + interpolated the URL
path unescaped into satori markup (blind SSRF) — now escaped/single-decoded
(CS-009). The three copies of the outbound-URL SSRF blocklist (SEP-1 +
webhook registration/delivery) diverged — two missed Oracle Cloud's metadata
IP 192.0.0.192; unified into one internal/nettools guard (CS-008).
- Issuer impersonation. /v1/issuers/{id} dropped org_verified, so the
explorer rendered an unverified self-declared org_name as authoritative —
now surfaced + shown with a Verified/Unverified chip (CS-100).
- Webhook replay. Delivery HMAC signed only the body — now timestamp-bound
(X-StellarIndex-Timestamp) so a captured delivery can't be replayed (CS-055).
- Data-truth signals. Completeness watermark could regress to a stale tip
(now GREATEST-guarded, CS-083); a total divergence-reference outage counted
as success (now a distinct no_reference outcome + alert, CS-088); the
ingest cursor gauge advanced even on a failed persist (CS-029); dormant-pair
VWAP served stale=false forever (CS-017); the USD-volume gate assumed 1e8
for FX sources that stamp 1e6 (CS-040); negative circulating supply clamp
(CS-038).
- Accessibility. The API-request dialog + mobile nav drawer gained a real
focus-trap/escape/restore; form errors/success now announce to screen
readers (LC-050 / LC-051 / LC-052).
- Ops config. Alertmanager rendered webhook secrets world-readable (now
0640, CS-121); the sshd password-auth Ansible gate inverted on a string
override (now | bool, CS-120); the User-Agent was injected unescaped into
the plaintext magic-link email (CS-071).
- Completeness verdict false-negative on factory-gated sources (blend).
compute-completeness (the daily verdict, ADR-0033) never seeded the
factory-child gate registry — only verify-reconciliation did. So its
childgates were the static protocol_contracts seed and went stale as new
pools deployed: blend reported complete=false (expected=0) on windows
whose activity was on pools missing from the seed, while the live decoder
(self-seeding from deploy events) captured them — i.e. a checker bug, not a
served-data gap. Now compute-completeness preseeds factory children from the
creation events [genesis, lo) before each re-derive (matching
verify-reconciliation), making the watchdog self-maintaining as pools deploy.
- CoinGecko Pro key would have 404'd — the poller now auto-switches to
`pro-api.coingecko.com`. A Pro key (COINGECKO_API_KEY) only authenticates
against the paid host; the poller hard-coded the public host
(api.coingecko.com), so an operator upgrading to the paid tier (to fix the
dead oracle feed — it had hit the 10k free-tier limit) would have silently
kept failing. The poller now selects pro-api.coingecko.com whenever a Pro
key is set and the endpoint wasn't explicitly overridden. (launch-todo P0-3.)
- `sep1-refresh` could never reach good issuers — failed fetches now bump
`sep1_resolved_at`. A resolve failure (dead home_domain, TLS error,
SSRF-blocked) used to continue without writing anything, leaving the
issuer's sep1_resolved_at NULL. Since IssuersNeedingSep1Refresh orders
sep1_resolved_at ASC NULLS FIRST, the 43,156 pubnet issuers with dead
domains permanently occupied the front of the queue — the refresh re-tried
the same dead domains every run and never made forward progress to the
~100 good issuers behind them (Circle, Aquarius, …). org_name /
org_verified could therefore never populate at scale. New
MarkIssuerSep1Attempted bumps sep1_resolved_at on failure (without
writing a payload), so a dead domain moves to the back of the queue and is
retried only on the next -older-than cadence; a later success overwrites
the payload as before.
- `/v1/price` latency-burn incident (page severity) — root-caused + fixed.
LatestClosedVWAP1mForPair's "latest closed bucket" predicate was
bucket + INTERVAL '1 minute' <= now() — a function on the indexed bucket
column, so it's not sargable: TimescaleDB couldn't do chunk exclusion or
an ordered index scan, and max(bucket) ran a full per-chunk partial
aggregate over the pair's ENTIRE prices_1m history (~13.7k rows/chunk × every
chunk back to 2015). Harmless while a pair was sparse; once
crypto:XLM/fiat:USD accrued dense history (CEX coinbase/kraken trades, from
~20:00 UTC 2026-06-19) it ballooned to ~446ms execution + 55k planner
buffers, driving the price p95 from ~50ms to ~400ms and the SLO burn /
sla-probe alerts. Two-part fix: (1) rewrote the non-sargable predicate to
the arithmetically-identical bucket <= now() - INTERVAL '1 minute'
(execution 446ms → 26ms); (2) that still left ~280ms of *planning* time —
prices_1m has ~374 chunks and now() only enables runtime chunk exclusion,
so the planner still enumerated every chunk. Added a LITERAL recent lower
bound (bucket >= <cutoff>, computed in Go) so the planner prunes old chunks
at PLAN time, collapsing planning to ~2ms. Net: ~390ms → ~8ms end-to-end.
Idle pairs (no closed bucket in the 14-day fast window) fall back to the
unbounded scan so the latest-closed-bucket contract is preserved. (The rc.133
fix to this function only bounded the sparse case; the sibling
ORDER BY bucket DESC LIMIT readers were verified unaffected.)
- `/v1/contracts/{id}/wasm` now distinguishes a Stellar Asset Contract
(the built-in SAC behind native, USDC, and every classic asset — among
the busiest contracts on the network) from a genuinely-uncaptured WASM
module (audit 2026-06-19 item 13). The reader found the SAC instance but,
since its executable isn't a WASM module, returned the generic
"unresolved" 404 — so the explorer wrongly said "resolves once a backfill
lands" for contracts that will never have WASM. SACs now return a distinct
contract-is-sac 404 and the explorer shows "this is a Stellar Asset
Contract — no WASM." Because the busiest SACs (native XLM, USDC) were
deployed long ago and their instance entries also predate capture, SACs are
detected deterministically too — a contract id is matched against the
operator sac_wrappers registry AND the computed SAC derivations of the
native asset + every verified-catalogue classic asset — so
native/USDC/AQUA/… report "SAC, no WASM" without needing a captured
instance. (Real user contracts whose code was uploaded before the
entry-capture window still show the honest "not captured yet" state pending
the Phase-C backfill.) apiGet now also surfaces the RFC-9457 problem
title/detail in thrown errors so clients can tell apart same-status
failure modes.
- Class-filtered + unified /v1/assets listings now carry price_usd.
?asset_class=crypto|stablecoin|fiat (and the explorer's
?asset_class=all first page) projected catalogue rows from the
price-less catalogue projection, so every row — even XLM — listed
price_usd: null (audit 2026-06-19 item 4). The sliced page now fills
the headline price through the same three-tier chain as the single-asset
/v1/assets/{slug} view, bounded to the page (not the whole catalogue)
so the unified first page doesn't fan a price computation over every
catalogue entry. Stellar-only tokens (AQUA, yXLM, SHX, …) that have no
global CEX/aggregator price fall back to their Stellar trades-derived
price (the same one the classic listing shows), so a class-filtered row
matches the classic asset row instead of listing null.
- `/v1/assets` market_cap_usd + circulating_supply now cover every
classic asset, not just the ~9 with a precise supply-pipeline figure
(audit 2026-06-19 item 4: market_cap was null for all 500). The precise
three-domain supply_1d figure is still preferred where it exists; the
long tail falls back to a broad circulating supply derived from the sum
of all (non-removed, positive) trustline balances per asset — the exact
definition of classic-asset circulating supply — read from the ClickHouse
lake via one cached GROUP BY (~0.5s, 10-min TTL + single-flight, kept off
the API hot path). market_cap = (circulating / 10^decimals) × price.
Assets without a price stay honestly null (no fabrication).
- `/v1/protocols/{name}` cold-path latency cut ~3× (audit 2026-06-19 item 8):
the three independent lake reads (daily series, event breakdown,
per-contract activity — ~5s each via the contract_id bloom index) ran
serially (~15s total); they now run concurrently and write disjoint view
fields, so the cold path is ~5s — comfortably under the 25s ceiling. The
"untyped" reconciling bucket is appended after the barrier (it needs the
series total). Repeat hits stay instant via the cache below.
- /v1/protocols/{name} event breakdown now NAMES AMM swap/sync events
instead of lumping them into "untyped" (data-truth G4). Soroswap's events
are [String("SoroswapPair"), Symbol(name)], so the lake's topic_0_sym
(which only captures a Symbol topic[0]) is empty and the real event name
lives in topic[1]. The breakdown now recovers it: when topic[0] isn't a
Symbol, it decodes topic[1]'s Symbol from the raw topics_xdr — so
soroswap shows swap/sync/deposit/withdraw/skim (≈190k events that
were "untyped") and the untyped remainder collapses to ~0.
- /v1/protocols/{name} is now served from a 60s per-server single-flight
cache, so concurrent requests no longer each re-run the ~15s lake scans
and peg CPU (compounding the 25s ceiling below).
- /v1/protocols/{name} can no longer peg CPU for minutes: the
lake-analytics + bespoke scans (~15s warm) had no request ceiling and
were observed running away to several minutes under concurrent load
(2026-06-19 incident). Added a 25s timeout; the enrichment helpers
degrade gracefully on cancellation. (The proper fix — a CAGG so these
are fast — is tracked in docs/archive/page-audit-2026-06-19/.)
- /v1/protocols/{name} event counts now reconcile (audit 2026-06-19
item 8). events_total was the typed-breakdown sum, which counts only
events whose topic[0] is a denormalized Symbol in the lake — so for
Soroswap it read 236 while the activity chart summed to ~200k (the
swap/sync events carry a non-Symbol topic[0]), and it could even fall
*below* events_24h. events_total is now the unfiltered window total
(= the activity-series sum), and event_breakdown carries a synthetic
untyped bucket for the non-Symbol-topic'd remainder, so
sum(event_breakdown) == events_total == sum(activity_series). This also
fixes protocols (e.g. phoenix) showing an empty breakdown while the chart
had data.
- MEV feed notionals no longer read ~$0 on real cycles: the arb scanner
read raw
usd_volume (NULL for SDEX XLM/token + token/token legs), so
cycle notionals summed to ~$0. It now estimates each leg's USD value
from the XLM leg × current XLM/USD VWAP (the same fallback the markets
queries use), matching how the rest of the API values on-chain volume.
Applies to newly-detected cycles.
- Chainlink divergence cross-check now actually runs — it had produced
zero
divergence_observations rows ever (audit 2026-06-19). The
divergence Chainlink reference carried its own env-less rpc_url that
fell back to a public RPC (eth.llamarpc.com) which now answers
eth_call with a Cloudflare JS-challenge HTML page, so every
LookupPrice failed its JSON decode silently. CHAINLINK_RPC_URL now
also overrides the divergence reference's endpoint (it already drove the
ingest poller), so one operator-provided RPC serves both. BTC/USD +
ETH/USD now cross-check against Chainlink mainnet feeds (~0.1% delta,
verified live on r1).
- /v1/issuers/{g} now populates auth_required / auth_revocable /
auth_immutable / auth_clawback from the on-chain AccountEntry flags
bitmask we already index (via the explorer's AccountState), instead of
leaving them null when the dedicated SEP-1 flag resolver hasn't run. The
explorer's Auth-flags panel shows real values instead of "Not yet
resolved" for any issuer whose account we've observed.
- Explorer per-page audit fixes (2026-06-19, frontend):
- `/assets/XLM` showed wrapped-XLM data (~330× wrong price) — a
scam "XLM" classic asset shared the
XLM listing slug. fetchCoin
now resolves XLM/native directly to the native asset.
- Case-sensitive asset/embed routes — lowercase slugs (/assets/btc,
/embed/asset/xlm) 404'd or rendered a half-empty GlobalAssetView.
fetchCoin now does a case-insensitive cache lookup and
generateStaticParams emits both cases for every slug.
- `/convert/{from}/{to}` inverted rates ("1 USD = 1.15 EUR" was the
EUR→USD rate mislabeled) — now inverted correctly.
- `/sources` "Last ingest" always "—" — the cursor venue lives in
sub_source (the source field is the cursor type); both the index
and per-source panels now key on the venue.
- /oracles dropped the always-zero "24h updates" column (oracles
don't trade).
- /divergences now marks each reference Active / Configured / Planned
— only CoinGecko + Chainlink-HTTP are actual cross-checks; Reflector/
Redstone/Band are ingested as oracle feeds, not yet compared here, so
the page no longer implies all five are live.
- /v1/price latency regression (caused a latency-burn incident
2026-06-19): the rc.131 cross-direction VWAP combine scanned a pair's
ENTIRE prices_1m history (back to 2015) before LIMIT 1 — ~1s warm,
~9s under load. Now it finds the latest closed bucket via an index
max() per direction (UNIONed), then point-reads + combines just that
bucket — bounded, ~250ms, same result.
- CEX dust no longer pollutes OHLC high/low on the API. Sub-$0.001
streamed CEX fills — tiny integer amounts whose
quote/base is a
meaningless round fraction (1/8, 1/10, …) — are dropped at ingest
(new stellarindex_external_dust_dropped_total). Ingested, a single
such $0.00000001 fill set the unweighted /v1/ohlc high/low
(e.g. an XLM/USD low wick of $0.125) while carrying ~zero real volume;
the candle body (volume-weighted VWAP) was always correct. Existing
dust was purged and the price CAGGs re-derived.
- Flipped markets are no longer double-counted: XLM/USDC and USDC/XLM
(the SDEX decoder records both on-chain trade directions) now collapse
to a SINGLE market wherever pairs are read —
/v1/markets, /v1/pools,
and the per-pair /v1/price VWAP. Volume + trade count sum across both
directions; the price uses one canonical orientation (quote-rank: fiat
> stablecoin > XLM > token, so XLM/USDC quotes in USDC), and the VWAP
combines both directions over the latest closed bucket (so it uses full
liquidity, and returns a price even when the latest minute traded only
the flipped way). Query-time via canonical.Orient — no data migration.
- Charts now label their time axis in the viewer's local timezone
(intraday) instead of UTC, so the current bar lines up with the
viewer's wall clock instead of reading an hour "behind". Date labels
on daily/weekly views stay UTC (a daily bar is a UTC calendar bucket).
- Source activity chart no longer shows a gap for the current hour:
source_volume_1h now uses real-time aggregation (migration 0069),
so the in-progress (not-yet-materialized) hour is computed live
instead of reading as zero until the hour closed.
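
The quote-rank orientation described in the flipped-markets entry above can be sketched roughly as follows. This is a minimal illustration, not the actual `canonical.Orient` implementation: the `Asset` type, the class names, and the tie-break on asset code are assumptions made for the example.

```go
package main

import "fmt"

// Asset is a hypothetical minimal descriptor; Class is one of
// "fiat", "stablecoin", "xlm", or "token".
type Asset struct {
	Code  string
	Class string
}

// quoteRank encodes fiat > stablecoin > XLM > token: the higher-ranked
// asset of a pair becomes the quote side. Unknown classes rank as token.
var quoteRank = map[string]int{"fiat": 3, "stablecoin": 2, "xlm": 1, "token": 0}

// Orient returns the pair in canonical (base, quote) order and reports
// whether the stored direction had to be flipped.
func Orient(base, quote Asset) (Asset, Asset, bool) {
	rb, rq := quoteRank[base.Class], quoteRank[quote.Class]
	// The tie-break on code is an assumption so equal-rank pairs stay deterministic.
	if rb > rq || (rb == rq && base.Code > quote.Code) {
		return quote, base, true
	}
	return base, quote, false
}

func main() {
	xlm := Asset{Code: "XLM", Class: "xlm"}
	usdc := Asset{Code: "USDC", Class: "stablecoin"}
	// Both stored directions collapse to the same canonical market.
	for _, p := range [][2]Asset{{xlm, usdc}, {usdc, xlm}} {
		b, q, flipped := Orient(p[0], p[1])
		fmt.Printf("%s/%s flipped=%v\n", b.Code, q.Code, flipped)
	}
}
```

Because orientation is computed per read, volume and trade counts from a flipped row can be summed into the canonical market without rewriting stored data.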
Changed
- Alerting fanout migrated from Slack to Discord. Both the R1
standalone Alertmanager config (
configs/alertmanager/) and the
multi-host Ansible template now use Alertmanager-native
discord_configs. Discord incoming webhooks are locked to one
channel each, so the page tier and ticket tier take separate webhook
URLs (DISCORD_WEBHOOK_URL_PAGES / DISCORD_WEBHOOK_URL_ALERTS in
/etc/default/alertmanager-secrets; alertmanager_discord_webhook_url_pages
/ _alerts vault vars for the Ansible role) — point both at the same
webhook for a single channel. apply.sh now drops each receiver's
config block independently when its URL is unset (marker-specific
strip, so one empty URL never collateral-removes the other Discord
receiver). Operator runbooks, the SEV playbook comms channels, and
pre-launch-check.sh updated accordingly. The old SLACK_WEBHOOK_URL
/ alertmanager_slack_* knobs are removed.
Added
- Source activity chart: hoverable values. The per-source activity chart
(trade-count line over USD-volume bars) gains a crosshair legend so both the
trade count AND the USD volume are hoverable. The shared
LineChart gains an
opt-in crosshair legend; non-legend usages are unchanged.
- `/v1/sources?include=sparkline7d` (backend, not yet surfaced). Adds a 7d
volume_history_7d series + window-parameterizes the source-volume-history
query ($1::interval) with a 60s-TTL cache slot, and the explorer chart has a
ready 24h/7d toggle. NOT wired to the frontend yet: the live 7d derivation is a
~18s raw-trades scan (past the 8s API ceiling), so it stays off until a
per-source hourly volume continuous-aggregate backs it.
Added
- Per-hour `trade_count` on the source sparkline.
/v1/sources?include=sparkline
buckets now carry trade_count alongside volume_usd (the count was already
computed in GetSourceVolumeHistory24h, just dropped on serialization).
Powers the source page's new activity chart — trade-count line over USD-volume
bars — replacing the flat volume-by-hour sparkline. Additive + backward-compatible.
Added
- `GET /v1/markets/sources` — per-source 24h volume breakdown. Trailing-24h
USD volume + trade count grouped by source for a single pair (
?base=&quote=)
or an asset across every pair it appears in (?asset=), with each source's
share_pct of the total. Backs the volume-by-source pie on the market-pair +
asset pages (the /v1/history feed only samples recent trades, so an accurate
24h share needs this server-side aggregate). Same XLM/USD volume derivation as
/v1/sources?include=stats.
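
As a rough sketch of how a per-source `share_pct` can be derived from the 24h aggregate: the struct, field names, and source names below are hypothetical, and the share is assumed to be taken over USD volume (the quantity the pie chart shows).

```go
package main

import "fmt"

// SourceVolume is a hypothetical row of the per-source 24h aggregate.
type SourceVolume struct {
	Source     string
	VolumeUSD  float64
	TradeCount int64
	SharePct   float64
}

// WithShares returns a copy of rows with SharePct set to each source's
// percentage of the total USD volume. Shares stay 0 when the total is 0.
func WithShares(rows []SourceVolume) []SourceVolume {
	var total float64
	for _, r := range rows {
		total += r.VolumeUSD
	}
	out := make([]SourceVolume, len(rows))
	copy(out, rows)
	if total == 0 {
		return out
	}
	for i := range out {
		out[i].SharePct = 100 * out[i].VolumeUSD / total
	}
	return out
}

func main() {
	rows := WithShares([]SourceVolume{
		{Source: "sdex", VolumeUSD: 750, TradeCount: 120},
		{Source: "soroswap", VolumeUSD: 250, TradeCount: 40},
	})
	for _, r := range rows {
		fmt.Printf("%s %.1f%%\n", r.Source, r.SharePct)
	}
}
```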
Changed
- Operator-minted API keys now use the `sip_` prefix (Stellar Index
Pricing), matching the dashboard minter —
internal/auth was still
emitting the pre-rebrand rek_ prefix. Validation is SHA-256 of the full
plaintext, so existing rek_ keys keep authenticating unchanged; the prefix
is a human-facing namespace label. SDK + OpenAPI key-prefix examples updated.
- Status page moved onto the main site at `/status`. The standalone status
app (its own Cloudflare Pages project at
status.stellarindex.io) is now a
/status route inside the explorer, so it inherits the site's nav/footer —
one site, one navigation. Per-incident postmortems live at
/status/incident/[slug]. The old subdomain 301-redirects every path to
https://stellarindex.io/status (web/status/public/_redirects, deep-links
preserved) — no DNS change; the existing stellarindex-status CF Pages
project just serves the redirect now. Every in-app "Status" link (footer,
sidebar, search, degraded banner, error states, contact) + the sitemap now
point at /status.
- Explorer visual consistency. The
/issuers page now uses the standard
Container/PageHeader (it was the one page on an ad-hoc fixed-width
container with a hand-rolled header). Asset sparklines (assets table + home
top-assets) now stroke via the up/down semantic tokens (currentColor)
instead of frozen — and partly off-palette — hex, so they track the theme.
- Explorer navigation cohesion. A market pair's two asset badges (e.g. XLM /
USDC) are now click-throughs to each asset's page; the tx, contract, and
market-pair detail pages use the shared
Breadcrumbs (consistent Home / …
trail, fixing the tx page's wrong "Ledgers" parent) instead of ad-hoc nav.
- Market/asset/exchange charts now show real OHLC candles + volume bars.
The candle charts on
/markets/[pair], /assets/[slug], and
/exchanges/[name] were rendering FLAT fake candles (open=high=low=close=VWAP)
with no volume, because they pulled /v1/chart (VWAP-only). They now use a new
shared MarketChart over /v1/ohlc?interval= — true open/high/low/close plus a
volume histogram underneath (CandleChart gained an optional volume series),
with a single exchange-style timeframe control that auto-picks candle
granularity. One reusable component replaces three near-duplicate chart impls.
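
On the key-prefix change earlier in this list: because validation hashes the full plaintext, the prefix takes no part in the check, which is why `rek_` keys keep working. A minimal sketch under that assumption (the `KeyStore` type and sample keys are hypothetical, not the `internal/auth` code):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the hex SHA-256 of the full key plaintext, prefix included.
func digest(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

// KeyStore is a hypothetical set of stored key digests.
type KeyStore map[string]struct{}

// Add records the digest of a newly minted key.
func (s KeyStore) Add(plaintext string) { s[digest(plaintext)] = struct{}{} }

// Valid reports whether the presented plaintext hashes to a stored digest.
// The prefix is never parsed, so it only serves as a human-facing label.
func (s KeyStore) Valid(plaintext string) bool {
	_, ok := s[digest(plaintext)]
	return ok
}

func main() {
	store := KeyStore{}
	store.Add("rek_legacy-example") // minted before the rebrand
	store.Add("sip_new-example")    // minted with the new prefix

	fmt.Println(store.Valid("rek_legacy-example")) // true
	fmt.Println(store.Valid("sip_new-example"))    // true
	fmt.Println(store.Valid("sip_legacy-example")) // false
}
```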
Added
- Blend lending: USD TVL + real APY now populate (ADR-0039 follow-ups).
- SAC→USD:
/v1/lending/pools/{pool}/reserves now prices reserves —
it maps each reserve's Stellar-Asset-Contract id back to the classic/native
asset it wraps (computed from the verified-currency catalogue via
xdrjson.SACContractID, validated against the known XLM + USDC SACs) and
prices that, so supplied_usd / borrowed_usd / pool tvl_usd fill in.
- APY: the rate-model config (util / r_* / reactivity / decimals) is read
from blend_admin queue_set_reserve events (metadata) — the on-chain
ResConfig storage entry is uncaptured (set pre-capture, never rewritten),
but the event carries the same config, so borrow_apr / supply_apr
compute without a storage backfill.
Changed
- `key_xdr` bloom skip-index on `ledger_entry_changes` (schema + r1
MATERIALIZE). Point lookups of a specific contract_data key now prune
granules instead of full-scanning ~1.7B rows: the wasm-hash + code-history
readers' instance-key lookups dropped ~21s → ~0.7s on r1. (Blend ResData
is rewritten constantly so its key spans many granules — the bloom prunes
little there, so the reserve reader keeps its recent-ledger window bound.)
Fixed
- CCTP + Rozo bridges now have exact genesis ledgers in the coverage
diagnostics (#40 / #41). Both were absent from
sourceGenesisLedger, so
/v1/diagnostics/ingestion reported them as "no genesis → no density".
Added the exact deploy ledgers derived from the completed WASM-history audits
(docs/operations/wasm-audits/{cctp,rozo}.md): cctp = L62,147,265
(2026-04-16), rozo = L60,829,370 (2026-01-18). Rozo's predates the
contract-storage capture window, so a density gap below ~62M is the honest
"pre-capture history not backfilled" signal.
Fixed
- Blend reserves return real data even without ResConfig. A reserve's
ResConfig (rate-model params) was set before the contract-storage capture
window began (~ledger 62M) and never changed, so it's not in the lake
(count=0) — requiring it produced empty reserves. ResData (the volatile
state) IS captured. Now ResData is mandatory → exact supplied / borrowed /
utilization (config-free), and ResConfig is optional → borrow_apr /
supply_apr are nullable, decimals defaults to 7. So the reserve shows real
current-state TVL + utilization even when the APY can't be computed.
Fixed
- Blend reserves lookup no longer full-scans the lake. rc.121's
/v1/lending/pools/{pool}/reserves fanned out one key_xdr= lookup per
reserve, each a ~20s full scan of the 270M-row contract_data set (no
skip-index on key_xdr) → handler timeout / 500. Rewritten as a SINGLE
batched key_xdr IN (…) query with argMax for the latest version, bounded
to a recent ledger window so it's partition-pruned (~6s on r1). An active
Blend pool rewrites its reserves continuously so the latest state is always
in-window. (A key_xdr bloom index would remove the bound + speed the
wasm/code-history readers too — deferred; heavy MATERIALIZE on a shared host.)
Added
- Real Blend per-pool TVL / utilization / APY (ADR-0039, #84 complete). New
GET /v1/lending/pools/{pool}/reserves reads each reserve's CURRENT on-chain
state from the lake (point-lookup of the ResData/ResConfig contract_data
entries by exact key, decoded with the new storage decoder), and reports
supplied/borrowed amounts, utilization, and supply/borrow APR computed with
the pool's own interest-rate model — verified against real r1 data (Pool #1
USDC: ~$55M supplied, ~$37M borrowed, 67.8% util). USD TVL is best-effort
(priced reserves only); token-unit amounts + util + APR are always exact.
The Lending pool detail page replaces its "#84 pending" placeholder with the
live reserve table. This is real current-state, distinct from the
/v1/lending/pools window net-flow proxy.
- Soroban contract current-state reader (ADR-0039) — Blend reserve decoder +
interest model. First half of #84: read on-chain contract state from the
lake instead of only events. New
internal/sources/blend/storage.go decodes
Blend ReserveData / ReserveConfig / PoolConfig from Soroban storage (by
field name, mirroring the pool contract's storage.rs), and interest.go
ports the pool's interest-rate model (interest.rs / reserve.rs) —
utilization, borrow APR, supply APR — with fixed-point rounding that matches
the chain bit-for-bit (validated against the contract's own unit-test
vectors). ADR-0039 records the read-time-decode architecture. The lake reader
+ real /v1/lending/pools TVL/util/APY wiring follow next.
Added
- Network throughput time-series + `/network` page, and the asset market-cap
timeline chart. New
GET /v1/network/throughput?window_days= returns daily
ledger / tx / op / Soroban-event counts from stellar.ledgers (the
time-series companion to the /v1/network/stats snapshot); a new explorer
/network page (+ nav entry) charts it with metric + window toggles and the
live snapshot tiles. The asset Supply tab's "market-cap timeline" placeholder
is now a real chart, served by the Item-1 /v1/chart?price_type=market_cap
endpoint (off-chain crypto:* reference assets keep a concise note).
- Operations directory + per-type stats.
GET /v1/operations without a
?ledger= now returns the network-wide recent-operations directory (newest
first, keyset-paged via ?cursor=) plus op_type_stats — the per-op-type
counts over the trailing ~24h (?ledger=<seq> still returns that ledger's
ops). New explorer /operations page (+ nav entry) renders the decoded feed
and the type breakdown. r1-timed: reverse-scan 0.16s, stats group-by 0.02s.
- Anomalies + divergence read endpoints (ADR-0019) + live explorer pages.
New
GET /v1/anomalies (the freeze-event timeline from freeze_events:
firing-now count + per-reason tally + clear→firing transitions with duration
and the value served while frozen) and GET /v1/divergence (the current
cross-reference board — latest our-VWAP-vs-reference delta per (pair,
reference) from divergence_observations, widest gap first). Both tables
were already populated by the aggregator's sinks; only the read path was
missing. The explorer /anomalies and /divergences pages now render these
live instead of "coming next" placeholders. No migration (read-only over
existing hypertables).
Fixed
- Account tx/ops participant query no longer full-scans. The rc.118
Phase-B union used
source_account = ? OR (…) IN (subquery), which defeats
the source_account skip-index and full-scans the 23 B-row operations table
(every account, not just whales). Rewritten as a UNION ALL of two
index-friendly arms — sourced (via the source_account index) and participant
(matched on the operations PRIMARY KEY (ledger_seq, tx_index, op_index) via
the account-prefixed operation_participants). r1-timed: the participant arm
is 0.018s; remaining latency for very-high-activity accounts is the
pre-existing source_account-ordering cost, unchanged by this feature.
Changed
- ADR-0027 (LCM cache tiering) accepted. The dual-source read path +
trim/rehydrate operators are implemented and gated behind the safe-by-default
ColdTieringEnabled() flag, so the design is the architecture of record. The
status note records that production activation (enable flag + bulk trim) is a
single operator-gated step — enabling §3 without §4 only adds the cold-path
failure mode with no headroom benefit. ADR-0012 (quorum-set composition)
remains Proposed by design: it's a deliberate placeholder gated on the
post-launch Tier-1 validator rollout (ADR-0004 Phase 3), not an open decision
we can make pre-launch without fabricating a validator trust set.
Added
- Account incoming/participant history (ADR-0038 Phase B completion). A new
stellar.operation_participants index (one row per non-source account an op
touches — payment destination, trustor, merge target, clawback victim, …,
derived in the Go extract via xdrjson.ParticipantAccounts). The
/v1/accounts/{g}/transactions + /operations endpoints now UNION sourced
activity with participant activity and stamp scope: "all" (was "sourced").
Live capture fills the index going forward (the extract is shared by the live
sink + ch-backfill); historical incoming coverage requires re-running
ch-backfill over the range (operator-gated). The explorer account view copy
is updated to reflect full (sourced + incoming) history.
- MEV feed: atomic-arbitrage detection. New
/v1/mev endpoint + a
detection worker (internal/aggregate/mev, runs in the aggregator every
5 min) that flags atomic arbitrage — a single transaction where one taker
trades a closed asset cycle (≥2 legs returning to a starting asset) across
pools/venues. This is the one MEV pattern the served trade data supports
unambiguously (rows lack intra-ledger tx ordering, so cross-tx sandwich
detection would be guesswork). Each event stores its evidence (assets /
venues / cyclic legs / USD notional); idempotent via a dedup key
(migration 0067 adds the arbitrage kind + dedup_key). The explorer
/mev page now renders the live feed. Paired Prometheus metrics
(mev_detect_runs_total / _duration_seconds + mev_events_inserted_total).
profit_usd stays null — v1 detects the cycle structurally, it does not
estimate profit (leg direction is ambiguous in the served rows).
- Blend per-pool lending stats.
/v1/lending/pools now lists pools seen in
EITHER the auction OR position stream and adds a 30-day net-flow proxy per
pool (net_supplied_30d / net_borrowed_30d in token base-units +
utilization_30d_pct); the Lending protocol page gains a Net position by
pool table. These are event-derived WINDOW deltas, not all-time TVL or
on-chain utilisation — real current-state TVL + supply/borrow APYs
(reserve b_rate/d_rate) still need the Soroban pool-storage reader, which
these fields explicitly stand in for.
- Crypto market-cap-over-time.
/v1/chart?price_type=market_cap now
serves on-chain (native / classic / Soroban) base assets instead of
returning 501: each day's market cap = the existing daily USD price series
(with the stablecoin-USD proxy fallback) × that day's circulating supply
from a new `supply_1d` continuous aggregate (migration 0066, last-known
supply per asset/day, forward-filled). Off-chain crypto:* reference
assets (no on-chain supply) return an empty series rather than a fabricated
cap. Supply is scaled at 7 decimals (matches the spot market_cap_usd).
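
The market-cap-over-time derivation above amounts to "daily price × last-known supply, scaled at 7 decimals". A minimal sketch, assuming a price series sorted by ascending day and a sparse map of supply observations in base units (all type and function names are hypothetical; in the service the forward-fill lives in the `supply_1d` aggregate):

```go
package main

import "fmt"

// DayPrice is one point of the daily USD price series.
type DayPrice struct {
	Day string // e.g. "2026-06-01"
	USD float64
}

// CapPoint is one point of the resulting market-cap series.
type CapPoint struct {
	Day          string
	MarketCapUSD float64
}

// supplyScale converts base units to whole tokens (7 decimals).
const supplyScale = 1e7

// MarketCapSeries forward-fills the last known supply across the price days
// and multiplies it by that day's price. Days before the first supply
// observation are skipped rather than given a made-up cap.
func MarketCapSeries(prices []DayPrice, supply map[string]int64) []CapPoint {
	var out []CapPoint
	var last int64
	have := false
	for _, p := range prices {
		if s, ok := supply[p.Day]; ok {
			last, have = s, true
		}
		if !have {
			continue
		}
		out = append(out, CapPoint{Day: p.Day, MarketCapUSD: float64(last) / supplyScale * p.USD})
	}
	return out
}

func main() {
	prices := []DayPrice{{"2026-06-01", 0.5}, {"2026-06-02", 0.5}, {"2026-06-03", 1}}
	supply := map[string]int64{"2026-06-01": 20_000_000, "2026-06-03": 40_000_000}
	for _, c := range MarketCapSeries(prices, supply) {
		fmt.Printf("%s %.2f\n", c.Day, c.MarketCapUSD)
	}
}
```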
Added
- Accounts directory ranked by USD wealth. New
GET /v1/accounts ranks
accounts by the total USD value of their holdings — native XLM plus every
verified-currency Stellar asset we hold a live price for (stablecoins resolve
through the fiat proxy), summed in one pass over the
ledger_entries_current projection. The /accounts explorer page (with no
?id=) now renders this leaderboard; /accounts?id=G… keeps the
single-account detail view. Coverage tracks the entry-change capture +
Phase-C backfill.
Changed
- Account-state + asset-holder reads use a current-state projection. New
stellar.ledger_entries_current (ReplacingMergeTree fed by a materialized
view on ledger_entry_changes) holds the latest entry per key; the readers
query it FINAL via account_id/asset skip-indexes (~1 row per live entry)
instead of a GROUP BY over all history. Keeps holders/account-state fast as
the backfill scales the lake — unblocks the full genesis re-derive.
Added
- Contract code/upgrade history. New
GET /v1/contracts/{id}/code-history
returns a contract's WASM-hash timeline — each distinct executable its
instance has pointed at, chronologically, so an in-place update_contract
upgrade shows as a new version. Backs a Code history panel on the
contract page (alongside events / decoded WASM / interaction map). Coverage =
the captured entry-change window; fills with the Phase-C backfill.
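
One way to turn captured instance changes into a WASM-hash timeline is sketched below. The types are hypothetical, and the sketch assumes a new version is recorded whenever the hash differs from the previous one (so a rollback to an earlier hash would appear as a further version).

```go
package main

import (
	"fmt"
	"sort"
)

// InstanceChange is a hypothetical captured change of a contract instance.
type InstanceChange struct {
	LedgerSeq uint32
	WasmHash  string
}

// CodeVersion is one entry of the resulting timeline.
type CodeVersion struct {
	WasmHash    string
	FirstLedger uint32
}

// CodeHistory orders changes by ledger and keeps only the points where
// the executable hash actually changed.
func CodeHistory(changes []InstanceChange) []CodeVersion {
	sorted := make([]InstanceChange, len(changes))
	copy(sorted, changes)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].LedgerSeq < sorted[j].LedgerSeq })

	var out []CodeVersion
	for _, c := range sorted {
		if len(out) > 0 && out[len(out)-1].WasmHash == c.WasmHash {
			continue // instance rewritten without a code change
		}
		out = append(out, CodeVersion{WasmHash: c.WasmHash, FirstLedger: c.LedgerSeq})
	}
	return out
}

func main() {
	history := CodeHistory([]InstanceChange{
		{LedgerSeq: 30, WasmHash: "bbbb"},
		{LedgerSeq: 10, WasmHash: "aaaa"},
		{LedgerSeq: 20, WasmHash: "aaaa"},
	})
	for _, v := range history {
		fmt.Printf("%s since ledger %d\n", v.WasmHash, v.FirstLedger)
	}
}
```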
Fixed
- Dashboard API keys: wrong prefix + a fresh key looked revoked / "last used
~2025 years ago" / first-request-already-done. The key DTO used
omitempty
on time.Time fields, which does NOT omit a zero time (it's a non-empty
struct → "0001-01-01T00:00:00Z"), so the UI saw a revoked_at +
last_used_at on every new key and rendered it revoked, ancient-last-used,
and "traffic seen". Switched revoked_at/last_used_at/expires_at to
pointer times omitted when zero. Also minted keys now use the sip_ prefix
(Stellar Index) instead of the rebrand-leftover rek_ (existing rek_ keys
keep authenticating — the prefix is display-only).
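
The `omitempty` pitfall is easy to reproduce in isolation. The sketch below uses hypothetical DTO types (not the dashboard's actual structs) to show a value `time.Time` being serialized as the zero timestamp, while a nil pointer is dropped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// keyBefore mirrors the buggy shape: omitempty has no effect on a struct value.
type keyBefore struct {
	ID        string    `json:"id"`
	RevokedAt time.Time `json:"revoked_at,omitempty"`
}

// keyAfter uses a pointer, which omitempty drops when nil.
type keyAfter struct {
	ID        string     `json:"id"`
	RevokedAt *time.Time `json:"revoked_at,omitempty"`
}

// encode marshals v to a JSON string, panicking on error (demo only).
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(keyBefore{ID: "k1"})) // {"id":"k1","revoked_at":"0001-01-01T00:00:00Z"}
	fmt.Println(encode(keyAfter{ID: "k1"}))  // {"id":"k1"}
}
```

On Go 1.24 and later, the `omitzero` tag option is another way to drop a zero `time.Time` without switching to a pointer.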
Fixed
- `/v1/assets/{id}/holders` 500'd — the holder-count subquery scanned
ClickHouse
count() (UInt64) into int64 (same driver type-mismatch as the
rc.112 directory fix, missed in the holders count path). Cast toInt64(count()).
Added
- Account state + asset holders (explorer Wave 3 — deep entity state). New
GET /v1/accounts/{g} returns an account's current on-chain state
reconstructed from the lake — native balance, sequence, sub-entries, flags,
home domain, signer set + thresholds, live trustlines (per-asset balance +
limit) and open offers. New GET /v1/assets/{id}/holders returns the top
holders of an asset by trustline balance + the total holder count. Both back
new explorer UI: a State panel on the account page (balances/signers/
trustlines/offers) and a Holders tab on the asset page.
- Owner/asset/balance indexing on the entry-change lake (Wave 3 substrate).
The
ledger_entry_changes extractor now populates queryable, bloom-indexed
account_id + asset columns and a balance column from each entry's
ledger key — so account-state / asset-holder reads prune by owner/asset and
sort/aggregate balances in SQL instead of full-scanning + decoding. Additive
+ idempotent (live-capture populates going forward; a ch re-derive backfills
history). Coverage grows with the capture window; exists:false / empty
until the Phase-C genesis backfill completes.
Changed
- Account nav section moved to the bottom of the rail (below the explorer /
protocol / analytics groups).
- Contract interaction query bounded to the subject's most-recent 50k
transactions so mega-contracts (tens of millions of events) no longer time
out.
Fixed
- `/v1/contracts` + `/v1/contracts/{id}/interactions` 500'd against real
ClickHouse —
count() returns UInt64 but the directory/interaction
readers scanned into int64 (the stub-backed unit tests didn't catch the
driver-level type mismatch). Cast with toInt64(count()).
Changed
- Explorer nav restructured toward an entity-centric IA. The left rail is
now Explorer entities (Home, Ledgers, Accounts, Issuers, Assets, AMM Pools) →
Protocols (DEX/AMM, Lending, Aggregators, Bridges, Oracles, Soroswap Router) →
External Markets → Analytics (Anomalies, Divergence, MEV) → Developers (API
docs, SDK, Status). Secondary/marketing pages (Pricing, Methodology,
Diagnostics, the CEX board) moved to footer + search. First slice of the
deep-explorer build; Transactions, Contracts, and a dedicated SDEX order-book
view land with their pages next.
Added
- Contracts as a first-class entity — directory + interaction map. New
GET /v1/contracts ranks the most active Soroban contracts over a recent
window, each tagged with its owning protocol from the factory-anchored
registry (the attribution hinge). New
GET /v1/contracts/{id}/interactions returns the contract's cross-contract
interaction map — other contracts that co-occur in its transactions (a proxy
for cross-contract calls, since Soroban sub-invocations nest within one tx),
ranked by shared-tx count and protocol-tagged. Both back the explorer's new
/contracts directory page (top-level nav) and an interaction-map panel on
the contract detail page (alongside the existing events + decoded-code
panels). Lake-backed (stellar.contract_events), window-scoped to stay on
the primary-key range.
- Transactions page + SDEX Markets nav split —
/transactions (recent
network activity, ledger-paged) and a distinct "SDEX Markets" nav entry
(native order book) separate from "AMM Pools".
- `/bridges` page — cross-chain settlement directory (Circle CCTP v2 + Rozo),
a category landing over
/v1/protocols (reuses the protocol-directory grid).
Fixed
- Deploys silently skipped every migration — the F-1220 auto-apply was a
no-op.
deploy.yml passes -e "migrations_skip=false", which Ansible
receives as the STRING "false" (truthy in Jinja), so the playbook's
when: ... or not migrations_skip always evaluated false and SKIPPED the
"Sync migrations" + "Apply outstanding migrations" tasks — printing
"migrations_skip=true — skipping" even when the operator passed false. Every
binary deploy since left schema changes unapplied, the exact "healthz 200 but
partial outage" shape F-1220 was meant to prevent (and why the canonical
/usr/local/share/stellarindex/migrations was empty on r1). Fixed by coercing
with | bool in the playbook conditions. Also corrected the
fx-history-missing runbook, which pointed operators at the stale,
unmanaged /var/lib/stellarindex/migrations instead of the deploy-managed
/usr/local/share/stellarindex/migrations.
- Doc/config drift: the dashboard SPA at `app.stellarindex.io` was retired
(2026-06-17) into the in-site `stellarindex.io/account`, but config doc-strings,
handler comments, the OpenAPI `/auth/login` description, `package.json`, and
three architecture docs still pointed at the dead host (one with a dead
`web/dashboard/README.md` link). Updated to reality and regenerated the
spec-derived artifacts.
- Hardened against the `nil-Now`-class latent panic found in dashboard auth.
A codebase sweep for the same pattern (a dependency dereferenced on a path
that didn't guard/default it) surfaced three latent cases, all now closed:
lookupUSDPrice derefed the optional Prices reader unguarded on the
populateChange24h path (the sibling populatePriceUSD guarded it) — and
since asset-detail fields populate on child goroutines that
middleware.Recoverer can't cover, an unguarded panic there would crash the
whole API process; the run() helper now recovers per-goroutine. The
statsflush and supply refresher constructors now default a nil logger like
their siblings (both deref it on a background tick).
Fixed
- Dashboard sign-in was completely broken — every authenticated request
500'd.
main.go built the dashboard auth Config without a Now clock
and passed it raw to the session-resolver middleware; NewHandlers defaulted
Now only on its own copy, so resolveSession nil-deref-panicked on every
request carrying a session cookie. The magic link set the cookie correctly,
then /v1/account/me 500'd and the dashboard never loaded — so login looked
dead. Now is now set explicitly and the middleware defaults Now/Logger
defensively.
- The emailed "6-digit code" was really 5 digits + a NUL byte. It was
derived from 3 hash bytes → 5 base32 chars, leaving the 6th position unset.
Now derived from 4 bytes so it is a clean 6 ASCII digits — required, since
the code is now a typed credential.
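
One way to obtain a clean six-digit code from four hash bytes is sketched below. The derivation is an assumption for illustration (SHA-256 of a token string, first four bytes reduced modulo 10^6); the changelog only states that four bytes are used.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// codeFromToken derives a zero-padded 6-digit decimal code from the first
// four bytes of the token's SHA-256 digest. The modulo introduces a very
// small bias, since 2^32 is not a multiple of 1,000,000.
func codeFromToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	n := binary.BigEndian.Uint32(sum[:4])
	return fmt.Sprintf("%06d", n%1_000_000)
}

func main() {
	fmt.Println(codeFromToken("example-magic-link-token"))
}
```

The `%06d` padding is what guarantees every position holds an ASCII digit, including for small values.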
Added
- Sign in with a 6-digit code, not just the magic link. New
POST /v1/auth/verify-code {email, code} consumes the emailed code and mints
the same session cookie the magic-link callback does (returns JSON, no
redirect — the SPA calls it via credentialed fetch). The sign-in page is now
a two-step flow (email → code) and the email leads with the code; the
one-click link still works. Brute-force is bounded by a per-token attempt cap
(migration 0065 adds magic_link_tokens.attempts); all code-failure modes
return one generic 400.
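
The attempt cap and the single generic failure can be sketched as follows. The `token` type, the cap of 5, and the error value are hypothetical; the real handler persists `attempts` on `magic_link_tokens`.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// errInvalidCode is the one generic failure; callers map it to HTTP 400.
var errInvalidCode = errors.New("invalid or expired code")

// maxAttempts is a hypothetical per-token cap.
const maxAttempts = 5

// token is a hypothetical in-memory stand-in for a magic_link_tokens row.
type token struct {
	Code     string
	Attempts int
	Used     bool
}

// verify counts every attempt against the token and returns the same
// error for unknown, used, exhausted, or mismatching codes.
func verify(t *token, submitted string) error {
	if t == nil || t.Used || t.Attempts >= maxAttempts {
		return errInvalidCode
	}
	t.Attempts++
	if subtle.ConstantTimeCompare([]byte(t.Code), []byte(submitted)) != 1 {
		return errInvalidCode
	}
	t.Used = true
	return nil
}

func main() {
	t := &token{Code: "042137"}
	fmt.Println(verify(t, "000000")) // generic failure
	fmt.Println(verify(t, "042137")) // <nil>
	fmt.Println(verify(t, "042137")) // generic failure: already consumed
}
```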
Changed
- Cross-origin session auth (F-03) — the in-site `/account` section now
authenticates on the apex. The session cookie is issued
SameSite=None
when Secure (prod), so the explorer at stellarindex.io can send it on
credentialed requests to the API at api.stellarindex.io; useMe + the
account API client send credentials: include. Requires allow_credentials
= true + cookie_domain = ".stellarindex.io" in the API config (set on r1).
Replaces the prior "explorer always renders signed-out" limitation.
Changed
- Full web redesign — a unified light-mode design system across all three
surfaces (explorer, status, dashboard). Modern, minimal, tech-forward:
Inter + JetBrains Mono (now actually loaded via next/font — they were
referenced but silently falling back to system-ui), a semantic token system
(brand / surface / line / ink / up / down / warn / bad / ok), hairline
borders over heavy shadows, generous whitespace, and one confident blue
accent. Dark mode removed (light only for now). New shared component
library (
web/explorer/src/components/ui) + style guide
(docs/architecture/design-system.md + /dev/styleguide). The status page
is unified with the site UX, and the customer dashboard was fleshed out
into a real product surface (sidebar shell + Overview/Keys/Usage/Settings on
live API data). Fixed latent bugs found en route: -DEFAULT-suffixed colour
classes (generated no CSS) and off-palette chart colours.
Fixed
- SEP-41 supply: mint + clawback silently dropped post-P23 (data loss).
sep41_supply.decodeCounterparty read the counterparty from a FIXED topic
index (mint/clawback → topic[2]) matching the legacy admin-prefixed SAC
shape. CAP-67 / Whisk (mainnet 2025-09-03) replaced that with
["mint", to, sep0011_asset] — counterparty at topic[1], a String at
topic[2] — so AsAddressStrkey errored on the String and the whole row was
dropped. r1-lake-verified: 99.96% of recent mints + 100% of clawbacks are the
CAP-67 shape, all lost; total_supply under-counted for every watched SEP-41
token. Now shape-aware (topic[2] is an Address ⇒ legacy/topic[2], else
CAP-67/bare-spec/topic[1]); burn was already correct. The old back-compat
test passed on a fabricated shape mainnet never emits; replaced with a
lake-faithful shape matrix. Historical recovery (re-derive from the lake) is
a deferred operator job. (audit-2026-06-14)
- Explorer pagination dropped rows at page boundaries. Contract-event and
account tx/op listings cursored on ledger_seq only, but many rows can share
one ledger (a busy AMM emits >limit events/ledger), so a page boundary inside
a ledger silently skipped the remainder. Now a composite keyset cursor
(opaque next_cursor/cursor, ClickHouse tuple comparison). Ledger listing
keeps its correct integer before. (audit-2026-06-14, A11)
- Explorer UI: `result_code` rendered every op red. The API emits
result_code as a JSON number (0 = success) but the TS typed it as string
and regex-tested it; success now derives from === 0. Also: account
source_account links 404'd (pointed at /issuers/{g}, which static-exports
only ~100 issuers) — added a /accounts?id= query-param page; and
total_coins (~1e18 stroops) lost precision through Number() — now
BigInt-divided (ADR-0003). (audit-2026-06-14, A17)
- SDK `Envelope.Pagination` round-trip drift (A14-01). The Go client typed
Pagination as a value with omitempty — a no-op on a struct — while the
server uses *Pagination, so re-encoding a non-list response emitted
"pagination":{} where the server omits it. Changed to *Pagination (matches
the wire; nil ⇒ absent). Pre-v1 SDK; consumers nil-check before .Next.
- S3 credential env field corrupted by its own override (A16-01). [storage]
s3_access_key_env/s3_secret_key_env hold the NAME of the env var carrying
the credential (buildS3Client does os.Getenv(name)), but ApplyEnvOverrides
+ an env: tag overwrote the name with the env var's VALUE, so
os.Getenv("AKIA…")→"" silently dropped S3 static creds for the
trim/rehydrate-galexie-archive ops commands. Removed the override + tag (the
fields are names with defaults; export STELLARINDEX_S3_ACCESS_KEY=<key> and
it resolves through the name). Latent (the indexer hot path uses the AWS
default chain).
- Generated API reference could silently drift on `main` (A19-02). The
spec→rendered-reference sync check was PR-only (path-filtered CI), so a
direct-to-main push that edited openapi/ without make docs-api slipped a
stale reference onto main (66 vs 73 paths). Added the diff as a lint-docs.sh
section so verify.sh catches it pre-push on every commit, and regenerated
the reference.
- Projector decode panic could crash-loop the live indexer (X9). The
projector's per-source goroutine ran decoders on raw lake rows (incl.
historical/upgraded-WASM shapes) with no recover — the dispatcher path has
one via pipeline.ProcessLedger, but the projector didn't inherit it. A
panic on one poison row crashed the whole stellarindex-indexer, and since
the cursor doesn't advance past the bad row, restart re-read it into a
crash-loop. Per-row recover now demotes a panic to a counted soft-fail
(extracted to a unit-tested processEventSafely). (audit-2026-06-14, X9)
- API-key revocation could silently no-op under the Postgres backend (X6).
/v1/account/keys (mint/list/revoke) was wired unconditionally to the Redis
store, but under auth_backend=postgres the runtime validator authenticates
from Postgres — disjoint stores, so a DELETE here removed the Redis record
while the live Postgres row kept authenticating (a "revoked" key stays live).
Latent on r1 (default redis backend, where writer+validator agree). The Redis
account-keys surface is now disabled under the Postgres backend with a loud
log; the Postgres-backed /v1/dashboard/keys (invalidates the cache on
revoke) is the source of truth there. (audit-2026-06-14, X6)
- Magic-link login could email-bomb an inbox. POST /v1/auth/login sent
an email per accepted request, bounded only by the global anon per-IP
rate-limit (60/min) — enough to flood a victim inbox / burn the email-send
quota. Added an optional LoginThrottle (per-IP + per-target-email Redis
sliding window, default 10/h IP + 5/h email); over quota the send is skipped
but the generic 200 is still returned (no enumeration/throttle signal), and a
Redis blip fails open. (audit-2026-06-14, A12)
- Migration `down` of 0031/0040 re-armed retention (data-loss footgun). The
down migrations re-added add_retention_policy('trades'/'oracle_updates', 90
days) — the exact mechanism of the "rogue retention" drift ADR-0034 forbids;
one migrate down crossing 31/40 would schedule deletion of >90d raw rows.
Both downs are now documented no-ops (forward-only). (audit-2026-06-14, A15)
- Hot hypertables encoded a 1-day chunk interval. trades (and
soroban_events / blend_auctions / phoenix_*) were created with
chunk_time_interval => 1 day; trades reached 3445 chunks → per-INSERT
ON CONFLICT walked all chunks → ~6 inserts/s + lock-table pressure. The r1
fix was operational (merge_chunks), so a fresh bring-up re-accrued it. New
migration 0062 widens them to 7 days (affects future chunks only).
(audit-2026-06-14, A15)
- k6 99-spike alert silence was a no-op.
test/load/scenarios/lib/alertmanager.js defaulted to matcher names
(APIHighLatencyP95/
APIHighErrorRate) that match NO deployed alert, so the planned-burst
silence never applied and on-call would page. Fixed to the real
stellarindex_api_* alert names. (audit-2026-06-14, A20)
- projector-replay silently no-oped — the rewind called UpsertCursor,
whose monotonic-forward guard (F-0020) matched zero rows on a backward
write; the command printed success while the cursor stayed at tip. New
dedicated RewindCursor store method (backward-only UPDATE; errors on
missing row) wired into the subcommand. Found when the blend
TRUNCATE+replay re-derive wrote nothing.
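
The `Envelope.Pagination` entry in this list (A14-01) rests on a standard encoding/json behaviour: omitempty never omits a struct value, but it does omit a nil pointer. A minimal sketch, using simplified stand-in types rather than the SDK's real definitions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Pagination is a simplified stand-in for the SDK type.
type Pagination struct {
	Next string `json:"next,omitempty"`
}

// valueEnvelope mirrors the old client shape: omitempty on a struct value
// is a no-op, because encoding/json never treats a struct as empty.
type valueEnvelope struct {
	Data       string     `json:"data"`
	Pagination Pagination `json:"pagination,omitempty"`
}

// pointerEnvelope mirrors the server shape: a nil pointer is omitted.
type pointerEnvelope struct {
	Data       string      `json:"data"`
	Pagination *Pagination `json:"pagination,omitempty"`
}

// encode marshals v and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(valueEnvelope{Data: "x"}))   // {"data":"x","pagination":{}}
	fmt.Println(encode(pointerEnvelope{Data: "x"})) // {"data":"x"}
}
```

With the pointer form, callers check for nil before reading the next cursor.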
Added
- Network explorer (ADR-0038) — a read API + UI over the certified
ClickHouse Tier-1 lake: GET /v1/ledgers, /ledgers/{seq}/transactions,
/tx/{hash}, /operations, /contracts/{c}, /accounts/{g}/transactions
+ /operations, and /search. Classic XDR is decoded to clean JSON
(internal/xdrjson, amounts as strings per ADR-0003) and the served reads
use the lake's bloom skip-indexes (tx_hash, source_account, contract_id).
Next.js static-export UI: ledger / tx / contract / account pages + ⌘K
search. Account activity is sourced/submitted scope only (participant index
is Phase B/C). The two /accounts/{g}/* paths ship with OpenAPI
AccountTransactions/AccountOperations schemas (scope, next_cursor).
- GET /v1/coverage — public per-source completeness verdicts
(ADR-0033): the three claims (substrate/recognition/projection), the
verified-to watermark, and the headline complete boolean, served from
completeness_snapshots. The trust story as an API: consumers can audit
the "every protocol, verified complete" claim themselves. Feeds the
explorer Coverage center.
Changed
- Repositioned as a protocol explorer for the Stellar network (the pricing
API remains a flagship product) evolving toward a comprehensive blockchain
explorer.
Removed
- BREAKING (API/SDK, SemVer-major): cross-chain / multi-network asset wire
shapes removed — the public API + Go SDK are now Stellar-only. Part of the
Stellar-focus refactor (docs/architecture/stellar-focus-refactor-plan.md,
Unit D / Tier 3). Removed: the GlobalAssetView.networks[] array,
VerifiedCurrencyListItem.networks[] + network_count, the NetworkView
and PerNetworkAssetView schemas/types, the GET /v1/assets/{asset_id}/{network}
per-network drill-down route, and the ?network= query param on /v1/assets.
The verified-currency catalogue (internal/currency/data/seed.yaml) is now a
pure Stellar-asset trust registry: every non-Stellar networks: entry was
stripped, so each browseable entry carries at most one (stellar) network
entry. Reference-only coins (BTC/ETH/…/USDT) keep their coingecko_id /
coinmarketcap_id mappings — the divergence/aggregator
reference-price pipeline is unaffected. Pre-v1, no production consumers.
- Cross-chain market-cap cache (`internal/currency/marketcap`) removed. The
CoinGecko-backed presentation-only cache (and its refresher goroutine + the
MarketCaps server option + the /v1/diagnostics/ingestion market_cap
state section) populated a CMC-style market_cap_usd for non-Stellar coins.
It was never read by divergence/aggregate. Catalogue crypto/stablecoin
rows no longer carry a catalogue-level market cap (their per-Stellar-asset F2
fields on /v1/assets/{asset_id} remain the canonical source). The legit
Stellar-native market cap (AssetDetail.market_cap_usd, circulating supply ×
price) and the fiat M2 × FX market cap are unchanged.
Fixed
- ledgerstream: a bounded range of exactly one ledger is valid. The
tiered-path range validation rejected To() == From(), but the SDK models
a single-ledger bounded range as a first-class concept
(ledgerbackend.SingleLedgerRange) and the walk loop handles it as one
iteration. Practical impact: ch-live-catchup's tip-extend failed every
time its 10-minute timer fired exactly one ledger behind the galexie tip
(ch-backfill: invalid end value for bounded range — ~half of r1 runs
flapped red on 2026-06-11). Inverted ranges (To < From) are still
rejected.
- loki (r1): chunk storage moved off the root filesystem to the ZFS pool
(`/tmp/loki` → `data/loki` @ `/var/lib/loki`) + 30-day retention. The
quickstart-scaffold config stored Loki chunks on the 49 GB root via
/tmp/loki — the same failure class as the 2026-06-11
ClickHouse-logs-on-root fill and the 2026-05-10 root-full SEV-2 — grew
without bound (no compactor/retention configured), and lost all log
history on every reboot (/tmp is wiped). Storage now lives on the
data/loki ZFS dataset with retention_period: 720h enforced by the
compactor; log_level codified at warn (matching what r1 actually ran)
instead of the scaffold's debug. Applied live on r1 2026-06-11 with the
existing 21 days of chunks migrated intact.
- sla-probe: measure the ≤30 s spec freshness target on `/v1/price/tip`,
not `/v1/price`. The probe held /v1/price to the spec's 30 s
price-freshness target, but that surface serves the most recent CLOSED
bucket (ADR-0015 cross-region byte-identical contract): 60 s prices_1m
buckets + the CAGG refresh policy's 30 s end_offset + a 30 s schedule
interval make its observed_at structurally 30–150 s old. Result: the
probe failed every run since metrics began (≥14 days of Prometheus
history), drowning real regressions. The probe now also hits
/v1/price/tip — the rolling-window surface built to deliver the spec's
promise (sub-second observed_at) — and applies the 30 s target there,
while /v1/price is held to a structural 150 s bound
(-closed-bucket-freshness-target) that still catches the closed-bucket
pipeline falling behind (the 2026-06-02/03 chunk-perf regression read
166–186 s and would fail it). Per-endpoint freshness targets are recorded
in the JSON evidence as freshness_target_sec.
- soroswap-router: distinct swaps in one op were collapsed by a coarse PK
(migration 0056). A single InvokeContract op can carry multiple genuinely
distinct router swaps (an aggregator splitting a trade, or a batch to several
recipients); the PK (ledger_close_time, ledger, tx_hash, op_index) dropped
all but one via ON CONFLICT. The completeness honesty guard confirmed 106
real swaps lost across pubnet history (not auth-tree dup-noise). Added a
per-call discriminator call_sig — RouterSwap.CallSig(), a 128-bit content
hash of function|recipient|path|amount_in|amount_out — to the PK: distinct
swaps get distinct keys (all stored); auth-tree duplicates of the same call
hash equal and still dedup. Operator runbook: stop indexer → migrate → deploy
the call_sig sink → TRUNCATE → ch-rebuild -contract-calls -sources
soroswap-router -write. Last of the coarse-PK class (lint allowlist now OK:).
- Completeness census for the event-less ContractCall sources (band,
soroswap-router) now counts distinct served-PK identities, not raw events.
The auth tree surfaces the same authorized call at multiple CallPaths for
multi-entry (co-signed) / nested-auth txs; the served tier dedups them via
ON CONFLICT, so a raw-event census over-counted and reported a phantom
projection Δ (soroswap-router: 107 of 157.3k). The census dedups on the same
(tx_hash, op_index[, ts]) grain. An honesty guard logs any collision whose
row *content* differs — that would be the coarse PK collapsing genuinely
distinct rows (a schema-grain defect), surfaced loudly rather than buried.
- soroswap-router swaps with an unrepresentable `deadline` were silently
dropped. The router deadline arg is a user-supplied u64; some calls pass a
sentinel/garbage value (≈3e18 s → year ~99 billion, or one that overflows
int64 to a BC year) that lands outside Postgres's timestamptz range, so the
whole INSERT was rejected (SQLSTATE 22008). The swap itself is a real,
successful token movement, so InsertSoroswapRouterSwap now NULLs an
out-of-range deadline_ts instead of dropping the row. This affected both the
live indexer and every backfill — ≈24% of historical router calls (30.7k of
157.3k) were unstorable. Forward-fixes live ingest on the next indexer deploy.
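
The deadline handling in the last entry can be sketched as a range check that yields a nullable timestamp. The function name `deadlineTS` and the year-9999 cap are assumptions chosen for illustration; the production bound inside InsertSoroswapRouterSwap is not stated here and may differ.

```go
package main

import (
	"fmt"
	"time"
)

// maxDeadlineUnix is 9999-12-31T23:59:59Z in Unix seconds. It is an assumed,
// conservative cap that sits well inside Postgres's timestamptz range.
const maxDeadlineUnix = 253402300799

// deadlineTS converts the router's u64 deadline (Unix seconds) into a
// nullable timestamp. Out-of-range values, including those that would
// overflow int64, return nil so the row can be stored with a NULL deadline
// instead of being rejected.
func deadlineTS(deadline uint64) *time.Time {
	if deadline > maxDeadlineUnix {
		return nil
	}
	t := time.Unix(int64(deadline), 0).UTC()
	return &t
}

func main() {
	for _, d := range []uint64{1_700_000_000, 3_000_000_000_000_000_000, ^uint64(0)} {
		if ts := deadlineTS(d); ts != nil {
			fmt.Println(d, "->", ts.Format(time.RFC3339))
		} else {
			fmt.Println(d, "-> NULL")
		}
	}
}
```

Comparing the raw u64 before any conversion avoids the int64 wrap-around that produced the BC-year values.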
Added
- `ch-rebuild -contract-calls` — lake-replay write path for the event-less
ContractCall sources (band, soroswap-router). These emit no Soroban events,
so neither the event pass nor the ADR-0032 projector can rebuild them. The new
pass streams the lake's InvokeContract ops (filtered on the contract's bytes in
body_xdr — stellar.operations has no contract_id column), runs each
source's ContractCallDecoder, and writes the decoded events through the
production sink (idempotent ON CONFLICT). It shares the exact decode path
(forEachContractCallEvent) with the completeness projection census, so a
written-row re-verify reconciles to Δ=0. This is the ADR-0034 successor to the
retired backfill-router MinIO walk (which under-produced — it pre-dated the
auth-tree-roots extraction and missed router calls nested inside aggregator
contracts).
Added
- `GET /v1/assets/{asset_id}/supply` + explorer supply panel (ADR-0034).
Exposes the live decode-at-ingest supply: Σmint − Σburn − Σclawback from the
supply_flows lake, current to the latest ledger with no rollup refresh.
Resolves a Soroban contract id (C…) directly, a classic asset via the
operator's SAC wrappers (404 if unmapped), and native/XLM from the ledger
header total_coins (source=ledger_total_coins). Amounts are decimal
strings (ADR-0003). The API server gains a pooled clickhouse.SupplyReader
(nil when ClickHouse isn't configured → endpoint 503s; non-fatal at boot).
The explorer's Supply tab now leads with a live "On-chain supply" section
(total + mint/burn/clawback breakdown) for every token — not just the
handful with an ADR-0011 asset_supply_history snapshot — degrading
gracefully (section omitted) when the endpoint 404s/503s.
- Real-time per-token supply via decode-at-ingest (ADR-0034). Token supply
is now a pure SQL sum over a new stellar.supply_flows table instead of a
periodically-refreshed rollup. The blocker for real-time supply was that the
amount lives in the event body as a raw i128 XDR scval that ClickHouse can't
decode — so supply required a 16-min Go batch recompute (ch-supply), stale
by up to the refresh interval. Now the indexer decodes the i128 amount at
ingest (DecodeSupplyAmount) for every mint/burn/clawback event and writes
a decoded row to supply_flows (ReplacingMergeTree, ORDER BY contract_id
first for fast per-token reads; event-identity suffix → idempotent under the
lake's drop→heal / re-backfill). The real-time dual-sink feeds it inline, so
a token's supply (Σmint − Σburn − Σclawback, SupplyForContract) is always
current with no refresh job and no read-time XDR decode. History is
seeded once from the existing lake via scripts/ops/ch-supply-flows-seed.sh
(windowed + resumable wrapper over ch-supply -seed-flows — a single-shot
all-history seed exceeds the 1h CH read timeout and, lacking an ORDER BY,
leaves scattered holes; windowing bounds each read); thereafter the dual-sink
keeps it live. The decode logic is shared between ingest and the seed so both
produce identical amounts.
- ClickHouse Tier-1 raw lake (ADR-0034, migration in progress). New
columnar storage tier for the OLAP-scale firehose (every ledger/tx/op/
event), moving it off Postgres where billion-row bulk reprocessing was
infeasible. Ships the Tier-1 schema (deploy/clickhouse/tier1_schema.sql),
the internal/storage/clickhouse structural sink + LCM extractor (reuses
the proven ingest/CensusLedger/sorobanevents.Capture walk; stores raw
XDR, no SCVal decoding), and the stellarindex-ops ch-backfill command
(-parallel N for concurrent range-walkers — the historic-backfill
throughput unlock). The stellarindex-ops ch-gate command runs the §6 gates
over a backfilled range: it census-walks galexie, asserts the extractor
matches the decoder-independent census oracle, then reads the range back out
of ClickHouse and asserts the stored + actual row counts both equal the
census; it also reports compressed bytes/ledger + a full-history footprint
projection. Gated: a 100k-ledger sample must pass throughput +
completeness-vs-census before any full historic walk. See
docs/architecture/clickhouse-migration-plan.md +
docs/architecture/clickhouse-tier1-decoder.md +
docs/architecture/clickhouse-phase4-decoder-adapter.md.
- Fixed an extractor bug before any full walk: claimAtomCount decoded
CreatePassiveSellOffer via the wrong OperationResultTr union arm
(GetManageSellOfferResult, always ok=false for that op type) and
silently undercounted classic_trade_effect_count vs the census on every
crossing passive offer. Now uses GetCreatePassiveSellOfferResult,
matching sdex.decode + dispatcher.census; covered by a new
per-op-variant test.
- ADR-0033 — completeness verification model. Three independently
provable claims (substrate continuity, recognition, projection
reconciliation) replace threshold-based coverage as the
100%-confidence signal. See
docs/adr/0033-completeness-verification-model.md.
- `ledger_ingest_log` substrate-continuity record (ADR-0033 Phase 2).
Migration 0051. One row per fully-processed ledger, written
post-persist by the live indexer, carrying the LCM-derived census
(soroban_event_count, classic_trade_effect_count — counted
decoder-independently from the LedgerCloseMeta) plus the header
hash-chain anchors. New stellarindex-ops census-backfill -from -to
populates history. Storage queries FindLedgerIngestGaps (contiguity)
and VerifyLedgerHashChain (cryptographic linkage) are Claim 1 of the
completeness model — both run over the narrow record, never a trades
scan. Once a ledger is recorded with its census, "zero events for
contract C here" is a *proven* quiet period, which is what lets the
confidence signal stop guessing sparsity thresholds.
- Recognition check (ADR-0033 Phase 3 / Claim 2a). New
stellarindex-ops verify-recognition -from -to pulls every distinct
(contract_id, topic_0_sym) shape from soroban_events and runs each
through the production decoder chain's real Matches() (no
hand-maintained topic list to drift). Any shape no decoder handles —
e.g. a topic a WASM upgrade added that we'd silently drop — is listed
and the command exits non-zero (cron/CI-gateable). Backed by
dispatcher.Recognize (side-effect-free), Store.DistinctSorobanTopicSamples,
and internal/completeness.AuditRecognition.
- Projection reconciliation (ADR-0033 Phase 4 / Claim 2b). New
stellarindex-ops verify-reconciliation -from -to [-source S]
re-derives, per ledger, how many trades rows the real decoder would
emit from soroban_events (deterministic recomputation) and diffs
that against the rows actually present — localizing any projector drop
(or phantom row) to an exact ledger. Covers soroswap/aquarius/phoenix/
comet (seeds soroswap pairs via RPC). Backed by
completeness.ReDeriveOutputCounts / ReconcileCounts and
Store.CountRowsByLedger. Correlation sources reconcile correctly
because each logical record's events share one (ledger, tx, op).
- SDEX / classic reconciliation (ADR-0033 Phase 5 / Claim 2b classic).
verify-reconciliation now also covers SDEX, which predates Soroban
and has no soroban_events: its expected count comes from the
LCM-derived classic_trade_effect_count census in ledger_ingest_log
(one ClaimAtom = one trade), gated on the substrate record being
continuous over the range (else it tells you to run census-backfill
first). The existing hubble-check (per-ledger SDEX-vs-Hubble counts
+ amount cross-check) remains the external defense-in-depth anchor.
- Completeness watermark verdict (ADR-0033 Phase 6 / headline).
stellarindex-ops compute-completeness derives the per-source
completeness WATERMARK — the highest ledger where substrate continuity
+ hash chain (Claim 1) AND projection reconciliation (Claim 2b) both
hold from genesis — plus a system recognition verdict (Claim 2a), and
writes them to the new completeness_snapshots table (migration 0052).
/v1/diagnostics/ingestion overlays completeness_pct /
completeness_watermark / completeness_complete onto each source
row, and the status page renders completeness_pct as the headline
(falling back to gap-free coverage when not yet computed). Unlike
density/gap_free this uses NO sparsity threshold — a single proven gap
pins it — so it is the honest 100%-confidence signal. MinGapSizeOverride
is now documented as alerting-cadence only, off the confidence path.
- Projection reconciliation extended to all per-ledger sources +
multi-output fix (ADR-0033 future work). verify-reconciliation and
compute-completeness now drive off a shared catalogue covering every
source that writes a per-ledger table — trades (soroswap/aquarius/
phoenix/comet), oracles (reflector ×3 / redstone), cctp/rozo/defindex,
and blend's four tables — plus sdex via the LCM census. The re-derive
now buckets outputs by EventKind() (ReDeriveOutputCountsByKind +
SumKinds) and reconciles each table against only the kinds that
route to it — fixing a latent overcount where multi-output sources
(soroswap/phoenix/comet also emit skim/liquidity/stake events to other
tables) were compared whole against trades alone. Recognition gaps
are now attributed per-source for contract-pinned sources (oracles),
with a system recognition snapshot for gaps on unowned contracts.
(sep41/band/soroswap-router remain out of scope — documented in the
catalogue.) Also chunk-prunes those queries via SorobanEventsTimeBound.
- Incremental completeness verify + hourly timer (ADR-0033 standing guard).
compute-completeness gains -from <ledger>: verify only [from, tip],
trusting [genesis, from] as previously verified (substrate hash-chain,
recognition shape scan, and projection reconcile all scoped to the window);
the watermark still extends to tip when the window is clean. scripts/ops/
completeness-incremental.sh computes from = min(watermark) from the prior
snapshots, so each run re-checks only new ledgers — minutes, not the hours a
full genesis→tip sweep takes. It is READ-ONLY on served data (recomputes
completeness_snapshots only) and exits non-zero with the failing source +
range if a source regresses; repair (ch-rebuild over the range) stays a
deliberate action. Wired as stellarindex-completeness.{service,timer} (hourly,
niced). This is the runtime data-driven guard that keeps "verified 100%" true
as the tip advances; it complements the PR-time lint-pk-discriminators.
- `lint-pk-discriminators` CI guard. A new scripts/ci lint that parses
per-source table PKs and fails the build if a table that can receive multiple
same-key events per operation lacks a per-event discriminator (the coarse-PK
data-loss class) — wired into verify.sh + ci.yml. Guards against
reintroducing the silent-drop bug fixed below for trades/blend/defindex.
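
The supply arithmetic from the ADR-0034 entries in this list (Σmint − Σburn − Σclawback, amounts as decimal strings per ADR-0003) can be sketched with math/big. The `flow` type and `totalSupply` function are hypothetical; in production the sum is described as a SQL aggregate over stellar.supply_flows, which this sketch only imitates in memory.

```go
package main

import (
	"fmt"
	"math/big"
)

// flow is a hypothetical decoded supply_flows row: the event kind plus the
// i128 amount carried as a decimal string.
type flow struct {
	kind   string // "mint", "burn" or "clawback"
	amount string
}

// totalSupply returns Σmint − Σburn − Σclawback as a decimal string.
// big.Int is used because i128 amounts do not fit in int64 or float64.
func totalSupply(flows []flow) (string, error) {
	total := new(big.Int)
	for _, f := range flows {
		amt, ok := new(big.Int).SetString(f.amount, 10)
		if !ok {
			return "", fmt.Errorf("bad amount %q", f.amount)
		}
		switch f.kind {
		case "mint":
			total.Add(total, amt)
		case "burn", "clawback":
			total.Sub(total, amt)
		default:
			return "", fmt.Errorf("unknown flow kind %q", f.kind)
		}
	}
	return total.String(), nil
}

func main() {
	supply, err := totalSupply([]flow{
		{"mint", "170141183460469231731687303715884105727"}, // i128 max
		{"burn", "27"},
		{"clawback", "700"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(supply)
}
```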
Changed
- Sources panel shows "Entries 24h" instead of "Trades 24h". The
old column came from a GROUP BY source scan over the trades
hypertable whose error was swallowed — so any timeout under load
silently rendered every source 0, and it was structurally 0 for the
many registered sources that don't write trades (oracles, bridges,
FX). It's replaced by a universal per-source trailing-24h event count
sourced from increase(stellarindex_source_events_total[24h]) (the
same counter that backs active_sources) via a new
StatusBackend.SourceEntries24h — cheap, reliable, and non-zero for
every active source whether on-chain or external. New entries_24h
field on /v1/diagnostics/ingestion sources[]; the silent-VWAP
highlight now keys off it too.
- Status-page on-chain coverage is now honest about what it's
measuring (ADR-0033). A source's coverage figure is only shown as a
trustworthy bar once its completeness watermark (completeness_pct)
has been computed — the substrate+projection-verified signal. Until
then the page falls back to gap_free_pct, a *liveness* proxy ("no
large interior gap detected") that reads ~100% for sources that are
merely sparse or only partially indexed (e.g. phoenix-liquidity at
18 of 11.3M ledgers). Those unverified figures are now rendered muted
and tagged "unverified · N% gap-free" with an explanatory tooltip,
instead of a green ~100% bar that overstated completeness. Because we
cannot distinguish "sparse-but-complete" from "incomplete" without the
watermark, we never dress an unverified figure up as verified coverage.
Fixed
- Real-time projector CH feed-switch no longer risks silent loss
(ADR-0034 #10). The dual-sink (clickhouse.LiveSink) is best-effort:
it drops whole ledgers under buffer pressure and a flush can partially
fail, so the CH lake can have holes near the tip — and the prior
ch-live-catchup only extended [CH_max+1, tip], which can never re-fill
a hole the sink already wrote past (verified: 48 orphaned ledgers,
[62939016,62939063]). Reading the projector forward from CH with the raw
ledgerstream tip as its bound would skip such holes and lose their protocol
events (the cursor advances unconditionally). Three changes make the
feed-switch safe by construction: (1) Sink.Flush now writes
stellar.ledgers last, making a ledgers row a per-ledger commit marker
(present ⟹ all of that ledger's tables are already durable); (2) the
projector clamps its CH-mode upper bound to ContiguousWatermark — the
highest ledger with no hole below it — so an unhealed drop stalls the
source at the hole instead of skipping it; (3) ch-live-catchup.sh
gap-scans stellar.ledgers and back-fills holes below CH_max, not just
the tip. Net: the lake self-heals and the projector never reads ahead of
provably-complete CH.
- Also: the no-contract-prefilter DEX/lending projector sources
(soroswap/aquarius/phoenix/comet/blend/cctp/rozo/defindex) now exclude the
CAP-67 classic-token firehose (transfer/mint/burn/clawback/
approve/set_authorized — ~99.8% of all events under V4 meta) at the SQL
layer on both read paths. A caught-up source reads a tiny window so it never
mattered, but a far-behind source's 10k-ledger catch-up window was streaming
~5M firehose rows it only discarded via Decoder.Matches, blowing the 60s
cycle budget and wedging the source (aquarius was stuck ~92k ledgers behind,
deadlock-storming the trades table). Exclude-only and audited lossless —
every one of the eight decoders was checked against the six symbols;
set_admin is deliberately retained because blend dispatches on it.
- `trades` no longer silently drops multi-trade-per-op trades
(aquarius, comet). The ADR-0033 projection reconciliation found
aquarius emitting 5 trade events in one operation (a multi-pool swap)
but only 2 rows landing — the decoders keyed the row on the raw
op_index, so every trade after the first in an op collided on the
trades PK (source, ledger, tx_hash, op_index, ts) and was dropped
by ON CONFLICT. They now fan out via canonical.FanoutOpIndex(op,
event_index) (op in the high 16 bits, the Phase-1 event_index in the
low 16), matching the stride pattern SDEX already used. Forward fix;
historical collided ops need re-backfill (delete-then-replay) to
recover. All four event-based trade sources are now fanned out:
aquarius/comet by the event's own index, soroswap by the swap
event's index (RawPair.Swap), phoenix by the swap's first-field
event index (RawSwap.EventIndex). Phoenix's 8-field buffer
emits-and-clears on completion, so router multi-hop segments into
separate swaps correctly — it was the same op_index collision, not a
merge (the old "multihops split on op_index naturally" assumption was
wrong).
- `soroban_events` no longer silently drops events from multi-event
operations. event_index was hardcoded to 0 at capture, so every
contract event in one operation collided on the
(ledger_close_time, ledger, tx_hash, op_index, event_index) PK and
the writer's ON CONFLICT DO NOTHING kept only the first — Phoenix
(8 events per swap in one op) was archiving 1 of 8. A real
event_index is now threaded from the dispatcher's per-op event walk
through events.Event into Capture/Reconstruct, and
StreamSorobanEvents orders by it for deterministic replay. This is
the precondition for using soroban_events as a completeness oracle
(ADR-0033 Phase 1). Note: rows captured before this fix are missing
the collided events; affected ranges need re-backfilling — the
ADR-0033 reconciliation will surface exactly which.
- /v1/markets no longer returns 500 on unparseable trades rows.
A single stray row with base_asset='test' 500ed every markets
request on 2026-06-01, tripping page-tier api_error_rate_critical
+ slo_availability_burn_fast until the row was hand-deleted.
The scanner now skips rows whose base/quote fail
canonical.ParseAsset, logs a WARN, and bumps the new
stellarindex_markets_skipped_rows_total counter so operators
can find and remove the offending row without serving 500s to
every consumer.
- SDEX census counts real trades, not both-zero no-op crosses. The
projection census (claimAtomCount) counted EVERY claim atom — including the
both-zero no-op crosses stellar-core emits when an offer is touched in matching
but both legs round to 0 (dust offers / integer-rounding artifacts; ~1–2% of
SDEX claims). The decoder correctly drops those (one-side-zero KEPT), so the
census over-counted vs COUNT(trades) — violating its own invariant and
showing a spurious SDEX projection Δ. realTradeCount now mirrors the decoder
exactly (skip both-zero), in both mirrored copies (dispatcher/census.go +
clickhouse/extract.go). Going forward the live census equals the served trade
count; the historical retention window re-records once to match.
- SDEX projection reconcile floors at the actual retained boundary. trades
is drop_chunks-managed, and retentionStart = tip-1.5M is ~100d at the
current ledger rate — ~10d / 150k ledgers below the oldest retained chunk. The
reconcile compared census>0 vs served=0 over that strip, manufacturing a
100%/20% "gap" in the lowest windows for rows retention deliberately dropped.
New store.MinLedger + retentionFloor scope the reconcile to where served
data actually begins; full-history coverage rests on the substrate (ADR-0033).
- `blend_positions` / `blend_emissions` / `blend_admin` / `defindex_flows` no
longer silently drop multi-event-per-op rows. Same coarse-PK class as the
trades fanout above, on the per-source entity tables: their PKs lacked a
per-event discriminator, so a second same-kind event in one operation collided
on ON CONFLICT and was dropped. Migrations 0053–0055 add event_index (and,
for blend_positions, (asset, user_address)) to the PKs; the decoders +
sinks thread the in-tx event_index through. Forward fix; collided historical
rows recover via re-derive from the lake.
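
The op_index fanout used by the `trades` fix in this list (op in the high 16 bits, event_index in the low 16) can be sketched as a simple bit-pack. The lowercase `fanoutOpIndex` and its uint16/uint32 types are assumptions; the real canonical.FanoutOpIndex signature is not shown in this changelog.

```go
package main

import "fmt"

// fanoutOpIndex packs op into the high 16 bits and eventIndex into the low
// 16 bits, so several events in one operation get distinct key values.
// Taking uint16 inputs guarantees the two halves cannot overlap.
func fanoutOpIndex(op, eventIndex uint16) uint32 {
	return uint32(op)<<16 | uint32(eventIndex)
}

func main() {
	// Five trade events in operation 3 (a multi-pool swap) no longer collide.
	for ev := uint16(0); ev < 5; ev++ {
		fmt.Println(fanoutOpIndex(3, ev))
	}
}
```

Because the operation occupies the high bits, sorting by the packed value still orders rows by operation first and by event within the operation second.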
Fixed
- Oracle sources (band, redstone, reflector-dex/cex/fx) now have
gap-detector targets sliced from the unified oracle_updates
hypertable. Pre-rc.107 these sources showed n/a on the
backfill_coverage listing because no per-source target existed.
Same shape as the rc.104 Soroban-DEX trade targets: shared
hypertable + per-source WhereFilter. Result: customer-facing
coverage_pct now populates for ALL Soroban sources with a
per-source hypertable. defindex + soroswap-router remain n/a
because they're log-only sinks (no per-ledger hypertable rows
to scan).
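
As a rough illustration of the "shared hypertable + per-source WhereFilter" shape: the struct below borrows the `GapDetectorTarget` and `WhereFilter` names from this changelog, but its other fields, the `DistinctLedgerQuery` method, and the exact source strings are assumptions made for the sketch.

```go
package main

import "fmt"

// GapDetectorTarget is a hypothetical, trimmed-down target definition.
type GapDetectorTarget struct {
	Source      string // label shown in the coverage listing
	Table       string // shared hypertable to scan
	WhereFilter string // slices the shared table down to one source
}

// DistinctLedgerQuery builds the per-source scan over the shared table.
func (t GapDetectorTarget) DistinctLedgerQuery() string {
	q := "SELECT DISTINCT ledger FROM " + t.Table
	if t.WhereFilter != "" {
		q += " WHERE " + t.WhereFilter
	}
	return q
}

// oracleTargets returns one target per oracle source, all reading from
// the unified oracle_updates table. Source names are fixed constants,
// never user input, so plain string formatting is acceptable here.
func oracleTargets() []GapDetectorTarget {
	sources := []string{"band", "redstone", "reflector-dex", "reflector-cex", "reflector-fx"}
	targets := make([]GapDetectorTarget, 0, len(sources))
	for _, s := range sources {
		targets = append(targets, GapDetectorTarget{
			Source:      s,
			Table:       "oracle_updates",
			WhereFilter: fmt.Sprintf("source = '%s'", s),
		})
	}
	return targets
}

func main() {
	for _, t := range oracleTargets() {
		fmt.Println(t.Source, "=>", t.DistinctLedgerQuery())
	}
}
```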
Fixed
- `coverage_pct` now reflects gap-free-ness, not event-density.
ADR-0031 Phase 2 deprecated the legacy cursor-derived
coverage_pct and the status page fell back to rendering
density_pct. density_pct = distinct_ledgers / expected_ledgers
over [genesis, tip] — for sparse sources (Soroban oracles
pushing once per hour, low-volume DEXes), density is naturally
<1% and the UI was reading that as "1% covered". User feedback
on r1 2026-06-01: that's a misleading metric.
Fix: coverage_pct = gap_free_pct = 1 - max_gap_ledgers /
expected_ledgers. 1.0 means the indexer hasn't skipped any
ledger in this source's window — what "coverage" intuitively
means. Sparse sources hit 100% as long as ingest is healthy.
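
The two formulas above differ sharply for sparse sources. A minimal sketch, assuming both values are expressed as fractions in [0, 1]; the function names and the zero-window handling are illustrative, not the project's code:

```go
package main

import "fmt"

// DensityPct is distinct_ledgers / expected_ledgers.
func DensityPct(distinctLedgers, expectedLedgers uint64) float64 {
	if expectedLedgers == 0 {
		return 0 // assumption: an empty window reports zero
	}
	return float64(distinctLedgers) / float64(expectedLedgers)
}

// GapFreePct is 1 - max_gap_ledgers / expected_ledgers.
func GapFreePct(maxGapLedgers, expectedLedgers uint64) float64 {
	if expectedLedgers == 0 {
		return 0 // assumption: an empty window reports zero
	}
	return 1 - float64(maxGapLedgers)/float64(expectedLedgers)
}

func main() {
	const expected = 1_000_000

	// A sparse but healthy source: few ledgers carry events, no ingest gap.
	fmt.Println(DensityPct(2_000, expected)) // prints 0.002
	fmt.Println(GapFreePct(0, expected))     // prints 1

	// The same source with a 250,000-ledger ingest gap.
	fmt.Println(GapFreePct(250_000, expected)) // prints 0.75
}
```

The sparse source reads as 0.2% under density but 100% under the gap-free definition, which matches the intent described in the entry.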
Fixed
- `stellarindex_external_poller_stale` falsely firing on
chainlink. Live-r1 incident 2026-06-01: chainlink poller
reports ~36 min stale shortly after every indexer restart,
even though it's polling correctly every 30s. Root cause:
the runner's "skipped" branch (when the poller returns
nil, nil, nil — by convention meaning "polled successfully
but no new feed data") did NOT update
stellarindex_external_poller_last_success_unix. Chainlink's
Ethereum feeds update at most once per hour, so the vast
majority of its 30-second polls naturally take the skip path.
The alert read this as "the poller hasn't successfully
reached upstream in 30+ min" — wrong: the poller IS
reaching upstream, just finding nothing new.
Fix: bump LastSuccessUnix on the skipped path too — the
outcome="skipped" counter still distinguishes skip from
success, but the timestamp tracks "last time we polled at all"
not "last time we got an event."
Fixed
- Coverage snapshot rows for Soroban-DEX sources.
ADR-0031 Phase 2 removed the cursor-derived density and
routed /v1/diagnostics/ingestion's coverage listing through
source_coverage_snapshots. The gap detector targets covered
SDEX (via source = 'sdex' WhereFilter on trades) but not the
Soroban-DEX sources (aquarius, soroswap, phoenix, comet) that
also land in the unified trades hypertable. Result on r1
2026-06-01: API reported 0% coverage for all four. Added the
matching per-source targets with appropriate genesis ledgers
and 100K-ledger sparsity overrides — matches the SDEX shape.
Fixed
- PersistWorkers bumped 4 → 8. rc.102 with 4 workers gave
~5 ledgers/min on r1 vs the ~10 ledgers/min network rate;
doubling the concurrent drain lifts processing throughput above
the network rate so the live cursor catches up and stays close
to the SLA-freshness threshold.
Fixed
- PersistEvents parallel drain (4 workers). Live-r1 incident
2026-06-01: even after rc.101's batch-INSERT fix, the indexer
cursor advanced at ~1 ledger/min vs ~10/min network rate.
Root cause: the single-goroutine drain meant only one PG
roundtrip in flight at a time; the indexer's ProcessLedger
goroutine was blocked on events <- ev waiting for that one
worker to drain. With 4 worker goroutines sharing the same
channel (Go's channel semantics handle concurrent receive
safely), the events channel drains 4× faster; the existing
PG pool of 25 conns carries the concurrent INSERTs. Each worker
maintains its own 200-row trade batch + 200ms flush ticker.
Per-event ordering within a source is not preserved across
workers; the trades hypertable's PK (source, ledger, tx_hash,
op_index, ts) makes that irrelevant for correctness.
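
The fan-out itself is a standard Go pattern: several goroutines ranging over one channel. The sketch below shows only that pattern; `drainParallel` and `processAll` are illustrative names, and the handler is a counter rather than a database write.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drainParallel starts `workers` goroutines that all receive from the same
// channel and returns once the channel is closed and fully drained.
func drainParallel(events <-chan int, workers int, handle func(int)) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ev := range events { // concurrent receive is safe
				handle(ev)
			}
		}()
	}
	wg.Wait()
}

// processAll pushes n events through the pool and reports how many were
// handled. Each event is delivered to exactly one worker.
func processAll(n, workers int) int64 {
	events := make(chan int)
	var handled int64

	go func() {
		for i := 0; i < n; i++ {
			events <- i
		}
		close(events)
	}()

	drainParallel(events, workers, func(int) {
		atomic.AddInt64(&handled, 1) // workers run concurrently
	})
	return handled
}

func main() {
	fmt.Println(processAll(1000, 4)) // prints 1000
}
```

Delivery order across workers is not deterministic, which is why the entry leans on the primary key rather than arrival order for correctness.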
Fixed
- Trade-insert throughput lifted ~40× via batch INSERT.
Live-r1 incident 2026-06-01: per-INSERT roundtrip cost capped
sustained trade throughput at ~5 trades/sec on the live indexer,
despite PostgreSQL handling 9000+ single-row INSERTs/sec in a raw
loop (verified). The bottleneck was the serial drain loop in
pipeline.PersistEvents: one event dequeue → one HandleEvent →
one InsertTrade roundtrip, no overlap. With ~300 events per
mainnet ledger, the cap meant ~1.8 ledgers/min processed vs the
~10 ledgers/min network rate, accumulating multi-hour lag.
New Store.BatchInsertTrades writes N rows in one statement
(INSERT … VALUES (…), (…), … ON CONFLICT DO NOTHING); same
idempotency, same per-source source_entry_counts UPSERT semantic,
same TradeInsertOutcomeTotal metrics. PersistEvents now
buffers trade events up to 200 rows OR 200 ms (whichever first),
flushes via the batch path, falls back per-row on a batch DB
error. Non-trade events (oracle updates, supply observations,
log-only events) stay on the single-row HandleEvent path.
Fixed
- Gap-detector no longer pile-drives postgres on huge tables.
Live r1 incident 2026-05-29: three concurrent SELECT DISTINCT
ledger FROM trades WHERE source='sdex' scans accumulated over
successive gap-detector cycles because the Go-side ctx timeout
didn't propagate to PostgreSQL — the queries kept running and
starved trade-insert latency, lighting the slo_latency_burn
page. Two complementary fixes:
1. Per-target ScanCadence override. New
GapDetectorTarget.ScanCadence lets huge-table targets opt
into a longer scan cadence than the global 30-min interval.
SDEX trades and soroban_events now scan every 6 hours; light
targets keep the 30-min cadence for fast signal.
2. SQL `SET LOCAL statement_timeout` backstop.
CountDistinctLedgers and FindPerSourceLedgerGaps now wrap
their query in a transaction with a 5-min PG-side timeout.
If Go-side cancellation fails (the F-0020-cousin failure mode
we just observed), PostgreSQL itself aborts the query —
in-flight scans can no longer leak across cycles.
Changed
- `/v1/assets/{id}` SEP-1 overlay reads from DB instead of live
HTTPS. Pre-rc.99 the asset-detail handler called
metadata.Cache.Resolve(home_domain) on every uncached request,
which dominated p95 (~4s long tail on cold issuers — drove the
slo_latency_burn_medium page 2026-05-29 11:30). The handler now
reads the issuers.sep1_payload JSONB column populated by the
stellarindex-ops sep1-refresh cron, which is what /v1/issuers
already did. The sep1-refresh cron is extended to persist
Currencies (per-asset metadata) so the overlay's Name /
Description / Image / AnchorAsset fields stay populated on the
next cron run.
- ADR-0029, ADR-0031, ADR-0032 promoted to Accepted. Phase 6
of the projection-architecture rollout completes the
documentation contract — three ADRs now describe the single
writer per data domain (projector for Soroban-derived, direct
for trades), the single data-derived coverage signal, and the
raw soroban_events landing zone they share. CLAUDE.md gains
Invariant 7 ("One writer per data domain") summarising the
contract for future agents.
Added
- ADR-0032 Phase 5 — `projector-replay` operator subcommand.
Single SQL cursor-rewind:
stellarindex-ops projector-replay -source <name> -from <ledger>.
The projector goroutine catches up on its next cycle (≤ 5 s)
and re-projects forward to the live tip. Replaces the family of
*-backfill subcommands deleted in this release. A new
projector-replay runbook captures the operator flow.
Removed
- ADR-0032 Phase 5 — dead-code deletion. Removed eight
redundant stellarindex-ops subcommands (~1500 LoC):
cctp-backfill, rozo-backfill, soroswap-skim-backfill,
comet-liquidity-backfill, phoenix-backfill, blend-backfill,
sep41-transfers-backfill, drain-cascade-window. All replaced
by projector-replay + the projector goroutine. Also removed
the cascade-window-drain runbook (superseded by
projector-replay). Runbook + alert references updated.
Changed
- ADR-0032 Phase 4 — projector becomes sole writer for Soroban-
derived events. New [ingestion.projector] persist_per_source
knob (default true = Phase 3 parallel mode); flipping to
false switches the dispatcher's events-goroutine to
pipeline.SinkModeSkipProjected so it stops writing the
Soroban-derived event subset. The projector becomes single
writer-of-record for trades, blend_*, phoenix_*,
comet_*, soroswap_skim, cctp_events, rozo_events,
sep41_*, oracle_updates (reflector + redstone). Non-projected
events (sdex, external CEX/FX, band, supply-observer
LedgerEntry observations) continue through the events-goroutine
unchanged. New pipeline.IsProjectedEvent is the dispatch
contract — table-driven test pins it.
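
One way to picture a table-driven dispatch contract is a lookup keyed by source name. The real `pipeline.IsProjectedEvent` signature is not shown in this changelog, so the function below, its string key, and the exact source names are assumptions chosen to mirror the lists above.

```go
package main

import "fmt"

// projectedSources lists sources whose events the projector owns in this
// sketch. Names are illustrative and follow the table families named above.
var projectedSources = map[string]bool{
	"blend":     true,
	"phoenix":   true,
	"comet":     true,
	"soroswap":  true,
	"cctp":      true,
	"rozo":      true,
	"sep41":     true,
	"reflector": true,
	"redstone":  true,
}

// IsProjectedEvent reports whether the projector, rather than the
// events-goroutine, should write events from this source. Unknown sources
// return false and so stay on the events-goroutine path.
func IsProjectedEvent(source string) bool {
	return projectedSources[source]
}

func main() {
	// Table-driven check in the spirit of the pin test.
	cases := []struct {
		source string
		want   bool
	}{
		{"blend", true},
		{"reflector", true},
		{"sdex", false},
		{"band", false},
	}
	for _, c := range cases {
		if got := IsProjectedEvent(c.source); got != c.want {
			panic(fmt.Sprintf("%s: got %v, want %v", c.source, got, c.want))
		}
	}
	fmt.Println("ok") // prints "ok"
}
```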
Added
- ADR-0032 Phase 3 — projector scaffold in parallel mode. New
internal/projector component tails soroban_events (the
ADR-0029 raw-event landing zone) and invokes each protocol's
existing Go decoder, then routes decoded consumer.Events
through pipeline.HandleEvent (newly exported) to the same
per-source persisters the dispatcher uses. Phase 3 runs in
parallel with the dispatcher's existing per-source sinks — both
writers race for the same per-source PKs and ON CONFLICT DO
NOTHING absorbs duplicates, so projector lag versus the live
tip can be measured before Phase 4 flips the writer primary.
New [ingestion.projector] enabled config knob defaults to off;
cmd/stellarindex-indexer/main.go wires + drains the goroutine
on shutdown.
- Projector observability. Four new metrics
(stellarindex_projector_lag_ledgers, _runs_total,
_events_decoded_total, _cycle_duration_seconds) plus a
pair of alerts (stellarindex_projector_lag_high +
stellarindex_projector_error_rate_high, both P3) and the
projector-lag runbook.