feat(website): add Analytics page with real-time gateway scoring, decisions, and routing stats #238

@tinu-hareesswar

Context

The React dashboard in website/ currently exposes Overview, Decision Explorer, Routing Hub, Auth-Rate (SR) Based Routing, Rule-Based (Euclid), Volume Split, and Debit Routing pages (see website/src/App.tsx and website/src/components/layout/Sidebar.tsx).

There is no dedicated Analytics surface. Operators who want to understand how the engine is actually behaving have to query Postgres or Redis directly, or scrape Prometheus metrics from src/metrics.rs. This is a gap for anyone running decision-engine as a standalone product.

Goals

  1. Give operators a single place to see whether routing is actually working: which gateways are being picked, at what SR, and how many decisions/sec the engine is serving.
  2. Surface the live scoring state that drives SR-based routing, so a stuck/low-scored gateway is visible without Redis CLI access.
  3. Show feedback ingestion health (the loop that updates SR scores) so a broken feedback pipeline is immediately obvious.
  4. Complement the per-decision Decision Explorer with aggregated time-series views.

Reference Layout

A rough wireframe of the page. The page should feel like a single-pane "mission control" view with filters at the top, KPI tiles, then time-series charts, then drill-down tables.

flowchart TB
    subgraph Filters[" Top Filter Bar "]
        F1["Merchant"] --- F2["Payment Method Type"] --- F3["Gateway / PSP"] --- F4["Time Range: 15m / 1h / 6h / 24h / 7d / custom"] --- F5["Granularity: 10s / 1m / 5m / 1h"]
    end

    subgraph KPI[" KPI Tiles (auto-refresh every 5s) "]
        K1["Decisions / sec"]
        K2["Decisions last 5m / 1h / 24h"]
        K3["Feedbacks / sec"]
        K4["Avg SR (selected scope)"]
        K5["Error rate %"]
    end

    subgraph Charts[" Time-Series Charts "]
        C1["SR score per PSP (multi-line, moving window)"]
        C2["Decision throughput (stacked area by routing approach)"]
        C3["Gateway share of decisions (stacked area %)"]
        C4["Feedback throughput vs decision throughput (dual-axis line)"]
    end

    subgraph Tables[" Drill-down Tables "]
        T1["Live PSP Scoreboard"]
        T2["Top Priority-Logic Rules by hit count"]
        T3["Recent feedback errors / dead-letter"]
    end

    Filters --> KPI --> Charts --> Tables

ASCII wireframe (what an operator should see on first paint):

+--------------------------------------------------------------------------+
| Analytics                            [Merchant v] [PMT v] [PSP v]        |
|                                      [Range: 1h v]  [Granularity: 1m v]  |
+--------------------------------------------------------------------------+
| [ 142 dec/s ] [ 8.4k / 5m ] [ 138 fb/s ] [ SR 92.1% ] [ err 0.4% ]       |
+--------------------------------------------------------------------------+
| Realtime SR per PSP                                                      |
|  100% |       _____             ____           ___                       |
|   95% |  ____/     \____   ____/    \____  ___/   \___                   |
|   90% |                                                                  |
|   85% |                                                                  |
|        ----------------- time (moving window) ------------------>        |
|        [stripe] [adyen] [razorpay] [payu] [worldpay]   <- legend         |
+--------------------------------------------------------------------------+
| Decision throughput by approach (stacked area)                           |
|  [SR_SELECTION_V3] [PRIORITY_LOGIC] [NTW_BASED] [DEFAULT] ...            |
+--------------------------------------------------------------------------+
| Live PSP Scoreboard                                                      |
|  Gateway   | SR    | Elim  | Latency | Decisions(1h) | Last update       |
|  stripe    | 0.94  | 0.02  | 142ms   | 12,401        | 2s ago            |
|  adyen     | 0.91  | 0.04  | 188ms   |  9,322        | 3s ago            |
|  razorpay  | 0.87  | 0.07  | 211ms   |  6,118        | 1s ago            |
|  ...       |       |       |         |               |                   |
+--------------------------------------------------------------------------+

Detailed Requirements

1. Top Filter Bar (drives every widget below)

| Filter | Type | Source |
|---|---|---|
| Merchant | dropdown (multi-select) | Postgres merchant_account |
| Payment Method Type | dropdown | enum (CARD, UPI, WALLET, NB, ...) |
| Gateway / PSP | dropdown (multi-select) | distinct gateways from scoring keys |
| Time Range | preset + custom | 15m / 1h / 6h / 24h / 7d / custom |
| Granularity | dropdown | 10s / 1m / 5m / 1h (auto-clamped to range) |

All charts and tables on the page MUST react to the filter bar. Selected filters should be reflected in the URL query string so views are shareable.
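The filter bar and its URL round-trip could be modeled roughly as follows. This is a minimal sketch, not the intended implementation: the type and function names (`AnalyticsFilters`, `clampGranularity`, `filtersToQuery`, `filtersFromQuery`), the comma-joined encoding for multi-select values, the `1h` / `1m` defaults, and the `MAX_BUCKETS` budget are all assumptions, since the issue only says granularity is "auto-clamped to range". Custom ranges are left out for brevity.

```typescript
// Hypothetical filter model for the Analytics page. Type names, the
// comma-joined multi-select encoding, and MAX_BUCKETS are assumptions.
type RangePreset = "15m" | "1h" | "6h" | "24h" | "7d";
type Granularity = "10s" | "1m" | "5m" | "1h";

interface AnalyticsFilters {
  merchants: string[];
  pmt: string | null;
  gateways: string[];
  range: RangePreset;
  granularity: Granularity;
}

const RANGE_SECONDS: Record<RangePreset, number> = {
  "15m": 900, "1h": 3600, "6h": 21600, "24h": 86400, "7d": 604800,
};
const GRANULARITY_SECONDS: Record<Granularity, number> = {
  "10s": 10, "1m": 60, "5m": 300, "1h": 3600,
};
const MAX_BUCKETS = 1000; // assumed per-series point budget

// Keep the requested granularity if it yields between 1 and MAX_BUCKETS
// buckets for the range; otherwise fall back to the nearest allowed one.
// Every preset range above has at least one allowed granularity.
function clampGranularity(range: RangePreset, requested: Granularity): Granularity {
  const all = Object.keys(GRANULARITY_SECONDS) as Granularity[];
  const allowed = all.filter((g) => {
    const buckets = RANGE_SECONDS[range] / GRANULARITY_SECONDS[g];
    return buckets >= 1 && buckets <= MAX_BUCKETS;
  });
  if (allowed.indexOf(requested) >= 0) return requested;
  const want = GRANULARITY_SECONDS[requested];
  return allowed.reduce((best, g) =>
    Math.abs(GRANULARITY_SECONDS[g] - want) < Math.abs(GRANULARITY_SECONDS[best] - want) ? g : best
  );
}

// Serialize filters into a shareable query string.
function filtersToQuery(f: AnalyticsFilters): string {
  const p = new URLSearchParams();
  if (f.merchants.length > 0) p.set("merchant", f.merchants.join(","));
  if (f.pmt !== null) p.set("pmt", f.pmt);
  if (f.gateways.length > 0) p.set("gateway", f.gateways.join(","));
  p.set("range", f.range);
  p.set("granularity", clampGranularity(f.range, f.granularity));
  return p.toString();
}

// Parse a query string back into filters, falling back to assumed defaults
// when a value is missing or not one of the known presets.
function filtersFromQuery(query: string): AnalyticsFilters {
  const p = new URLSearchParams(query);
  const list = (key: string): string[] =>
    (p.get(key) ?? "").split(",").filter((s) => s !== "");
  const pick = <T extends string>(raw: string | null, table: Record<T, number>, fallback: T): T =>
    raw !== null && Object.keys(table).indexOf(raw) >= 0 ? (raw as T) : fallback;
  const range = pick(p.get("range"), RANGE_SECONDS, "1h");
  const granularity = pick(p.get("granularity"), GRANULARITY_SECONDS, "1m");
  return {
    merchants: list("merchant"),
    pmt: p.get("pmt"),
    gateways: list("gateway"),
    range,
    granularity: clampGranularity(range, granularity),
  };
}
```

With this budget, a 7d range with 10s requested would be clamped to 1h, and a 15m range with 1h requested would be clamped to 5m.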

2. KPI Tiles (auto-refresh every 5s)

  • Decisions / sec — instantaneous, derived from API_REQUEST_COUNTER rate.
  • Decisions in last 5m / 1h / 24h — three small numbers in one tile.
  • Feedbacks / sec — rate of feedback ingestion (the signal that drives SR updates).
  • Avg SR for selected scope — weighted avg across selected PSPs.
  • Error rate % — failed decisions / total decisions.

Each tile shows a tiny sparkline of the last 30 datapoints underneath the number.
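The arithmetic behind three of the tiles can be sketched as below. The helper names and input shapes are hypothetical, and weighting the average SR by sample size is an assumption; the issue only says "weighted avg across selected PSPs" without naming the weight.

```typescript
// Hypothetical helpers for the KPI tiles; names and the weighting choice
// are assumptions, not part of the issue.
interface CounterSample {
  value: number; // cumulative counter value
  atMs: number;  // time the sample was taken, in milliseconds
}

// Per-second rate between two samples of a monotonically increasing counter.
function ratePerSecond(prev: CounterSample, curr: CounterSample): number {
  const dtSec = (curr.atMs - prev.atMs) / 1000;
  if (dtSec <= 0) return 0;
  // A lower value than before means the counter was reset (e.g. a restart);
  // count from zero in that case instead of reporting a negative rate.
  const delta = curr.value >= prev.value ? curr.value - prev.value : curr.value;
  return delta / dtSec;
}

interface PspSr {
  gateway: string;
  sr: number;      // success rate in [0, 1]
  samples: number; // assumed weight: number of observations behind the score
}

// Average SR weighted by sample size; null when there is nothing to average.
function weightedAvgSr(rows: PspSr[]): number | null {
  const total = rows.reduce((sum, r) => sum + r.samples, 0);
  if (total === 0) return null;
  return rows.reduce((sum, r) => sum + r.sr * r.samples, 0) / total;
}

// Failed decisions as a percentage of all decisions; 0 when there were none.
function errorRatePct(failed: number, total: number): number {
  return total === 0 ? 0 : (failed / total) * 100;
}
```

Returning `null` rather than `0` for an empty scope lets the tile show an empty state instead of a misleading "SR 0%".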

3. Charts

3a. Realtime SR per PSP (PRIMARY chart)

  • Type: multi-line time series, one line per PSP.
  • X-axis: time, moving window honoring the selected range + granularity.
  • Y-axis: SR score (0-100%).
  • Behavior: auto-scrolls/updates as new datapoints arrive. Hover shows exact value + timestamp + sample size.
  • Data: Redis snapshot for the latest point + Postgres historical rollup for the rest of the window. Keys must match what src/feedback/gateway_scoring_service.rs writes so the UI reads from the same source of truth as routing.
  • Granularity filter: when the user picks 10s we want to see the high-frequency wiggle; 1h should show the smoothed longer-term trend.
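One way to combine the two data sources for a single PSP line is sketched below. It assumes the Postgres rollup and the Redis snapshot have already been fetched and mapped into a common `SrPoint` shape; that shape, the function names, and the rule that the live snapshot overrides the rollup for the current bucket are all assumptions.

```typescript
// Hypothetical point shape shared by the historical rollup and the live snapshot.
interface SrPoint {
  tsMs: number;    // timestamp in milliseconds
  sr: number;      // success rate in [0, 1]
  samples: number; // sample size shown on hover
}

// Align a timestamp to the start of its granularity bucket.
function bucketStart(tsMs: number, granularityMs: number): number {
  return Math.floor(tsMs / granularityMs) * granularityMs;
}

// Merge historical points with the latest live point into one series that
// covers the moving window [nowMs - windowMs, nowMs], sorted by time.
function buildSrSeries(
  history: SrPoint[],
  live: SrPoint | null,
  granularityMs: number,
  windowMs: number,
  nowMs: number
): SrPoint[] {
  const byBucket = new Map<number, SrPoint>();
  for (const p of history) {
    const ts = bucketStart(p.tsMs, granularityMs);
    byBucket.set(ts, { ...p, tsMs: ts });
  }
  if (live !== null) {
    // Assumption: the live snapshot wins over the rollup for its bucket.
    const ts = bucketStart(live.tsMs, granularityMs);
    byBucket.set(ts, { ...live, tsMs: ts });
  }
  const cutoff = nowMs - windowMs;
  return Array.from(byBucket.values())
    .filter((p) => p.tsMs >= cutoff)
    .sort((a, b) => a.tsMs - b.tsMs);
}
```

Calling this on every 5s poll with a fresh `nowMs` drops points that have scrolled out of the window and updates the right-most point in place.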

3b. Decision throughput by routing approach

  • Type: stacked area chart.
  • Series: each value of GatewayDeciderApproach from src/decider/gatewaydecider/types.rs (SR_SELECTION_V3_ROUTING, PRIORITY_LOGIC, NTW_BASED_ROUTING, DEFAULT, ...).
  • Y-axis: decisions per second.
  • Why: lets an operator see "did SR routing actually take over after I enabled it" at a glance.
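A stacked area chart usually wants one row per time bucket with one numeric field per series. The sketch below pivots bucketed counts into that shape and converts them to decisions per second. The input shape `DecisionBucket` and the function name are hypothetical, and it assumes no approach is literally named `tsMs`.

```typescript
// Hypothetical row as it might come back from a bucketed decisions query.
interface DecisionBucket {
  tsMs: number;     // bucket start in milliseconds
  approach: string; // e.g. "SR_SELECTION_V3_ROUTING", "PRIORITY_LOGIC"
  count: number;    // decisions in this bucket for this approach
}

// One chart row per bucket: { tsMs, <approach>: decisions per second, ... }.
type ChartRow = Record<string, number>;

function toStackedRows(buckets: DecisionBucket[], granularitySec: number): ChartRow[] {
  const approaches = Array.from(new Set(buckets.map((b) => b.approach))).sort();
  const rows = new Map<number, ChartRow>();
  for (const b of buckets) {
    let row = rows.get(b.tsMs);
    if (row === undefined) {
      row = { tsMs: b.tsMs };
      // Zero-fill every series so stacking does not break on missing keys.
      for (const a of approaches) row[a] = 0;
      rows.set(b.tsMs, row);
    }
    row[b.approach] += b.count / granularitySec;
  }
  return Array.from(rows.values()).sort((x, y) => x.tsMs - y.tsMs);
}
```

Zero-filling makes the "SR routing took over" moment visible as one band growing while another shrinks to zero, rather than as a gap in the data.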

3c. Gateway share of decisions

  • Type: stacked area, normalized to 100%.
  • Series: one per PSP.
  • Why: shows whether traffic is concentrating on one gateway or spread.
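Normalizing to 100% is a per-bucket division by the bucket total. A minimal sketch, with a hypothetical function name and input shape:

```typescript
// Convert per-gateway decision counts for one time bucket into percentage
// shares. An empty bucket (total of zero) yields 0 for every gateway
// instead of NaN.
function toSharePct(counts: Record<string, number>): Record<string, number> {
  const gateways = Object.keys(counts);
  const total = gateways.reduce((sum, g) => sum + counts[g], 0);
  const shares: Record<string, number> = {};
  for (const g of gateways) {
    shares[g] = total === 0 ? 0 : (counts[g] / total) * 100;
  }
  return shares;
}
```

Some chart libraries can do this normalization themselves; doing it in a helper keeps the numbers available for tooltips and tables as well.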

3d. Feedback vs Decision throughput

  • Type: dual-axis line chart.
  • Series: decisions/sec (left axis), feedbacks/sec (right axis).
  • Why: if feedbacks drop to zero while decisions continue, SR scores will go stale — this surfaces it.
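The stale-score condition described above could also be highlighted on the chart with a small check like the one below. This is only a sketch: the point shape, the function name, and the choice of three consecutive buckets are assumptions, and it is meant as a visual hint on the page, not as alerting (which is out of scope).

```typescript
// Hypothetical merged point for the dual-axis chart.
interface ThroughputPoint {
  tsMs: number;
  decisionsPerSec: number;
  feedbacksPerSec: number;
}

// True when each of the last `lastN` buckets has decisions but no feedback,
// i.e. routing is still running while SR scores are no longer being updated.
function feedbackStalled(points: ThroughputPoint[], lastN: number = 3): boolean {
  if (lastN <= 0 || points.length < lastN) return false;
  return points
    .slice(-lastN)
    .every((p) => p.decisionsPerSec > 0 && p.feedbacksPerSec === 0);
}
```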

4. Tables

4a. Live PSP Scoreboard (the most important table)

| Column | Notes |
|---|---|
| Gateway | name |
| Merchant | when "all merchants" filter is active, otherwise hidden |
| Payment Method Type | same |
| SR score | from gateway_scoring_service.rs |
| Elimination score | same |
| Latency score | same |
| Decisions (selected window) | count |
| Feedbacks (selected window) | count |
| Last updated | "Xs ago" |
| Sparkline | last 30 SR datapoints |

Sortable by every column. Row click → opens a side panel with that PSP's full history charts.
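Column sorting and the "Xs ago" label can be sketched as two small pure functions. The `ScoreRow` field names are hypothetical, and switching to minutes and hours for older updates is an assumption beyond the "Xs ago" format the table asks for.

```typescript
// Hypothetical row shape for the scoreboard; field names are assumptions.
interface ScoreRow {
  gateway: string;
  srScore: number;
  eliminationScore: number;
  latencyScore: number;
  decisions: number;
  feedbacks: number;
  lastUpdatedMs: number;
}

type SortDir = "asc" | "desc";

// Return a sorted copy so the caller's array (e.g. React state) is not mutated.
function sortRows(rows: ScoreRow[], key: keyof ScoreRow, dir: SortDir): ScoreRow[] {
  const sign = dir === "asc" ? 1 : -1;
  return rows.slice().sort((a, b) => {
    const x = a[key];
    const y = b[key];
    const cmp =
      typeof x === "string" && typeof y === "string"
        ? x.localeCompare(y)
        : (x as number) - (y as number);
    return sign * cmp;
  });
}

// Render a relative age such as "2s ago"; clock skew never yields a negative age.
function formatAgo(lastUpdatedMs: number, nowMs: number): string {
  const sec = Math.max(0, Math.floor((nowMs - lastUpdatedMs) / 1000));
  if (sec < 60) return `${sec}s ago`;
  if (sec < 3600) return `${Math.floor(sec / 60)}m ago`;
  return `${Math.floor(sec / 3600)}h ago`;
}
```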

4b. Top Priority-Logic Rules by hit count

  • Pulled from Postgres (priority logic execution log).
  • Columns: rule name, hits in window, last hit, gateway it routes to.

4c. Recent feedback errors

  • Last N feedback ingestion failures (dead-letter / parse errors).
  • Surfaces a broken feedback pipeline before it silently corrupts SR.

Backend

Add a small read-only analytics API surface that the React app calls. Suggested endpoints:

  • GET /analytics/gateway-scores?merchant=&pmt=&gateway=&range=&granularity= — current snapshot + time series for the SR / elim / latency scores.
  • GET /analytics/decisions?range=&granularity=&group_by=approach|gateway — decision counts bucketed by time.
  • GET /analytics/feedbacks?range=&granularity= — feedback ingestion stats.
  • GET /analytics/routing-stats?range= — top rules, gateway share, error rate.

Sources:

  • Redis reads for the current scoring snapshot (same keys src/feedback/gateway_scoring_service.rs writes).
  • Postgres queries for historical decision / feedback / rule-hit data.
  • Optionally direct Prometheus registry reads for in-process counters (API_REQUEST_COUNTER, API_REQUEST_TOTAL_COUNTER in src/metrics.rs).

All endpoints are read-only, paginated where appropriate, and gated behind the same auth as other routes. Add them as src/routes/analytics.rs and wire into src/routes.rs.
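On the client side, the suggested endpoints could be addressed through one small URL helper so that every widget builds its requests the same way. The sketch below uses the endpoint paths and query parameter names proposed above; the helper name, the `AnalyticsParams` type, and the optional `base` prefix are hypothetical, and response shapes are deliberately not modeled because the issue does not define them.

```typescript
// Endpoint paths as suggested in this issue.
type AnalyticsEndpoint =
  | "/analytics/gateway-scores"
  | "/analytics/decisions"
  | "/analytics/feedbacks"
  | "/analytics/routing-stats";

// Query parameters named in the suggested endpoints; all are optional here.
interface AnalyticsParams {
  merchant?: string;
  pmt?: string;
  gateway?: string;
  range?: string;
  granularity?: string;
  group_by?: "approach" | "gateway";
}

// Build a request URL, skipping parameters that are unset or empty.
// `base` is a hypothetical API prefix (empty means same origin).
function analyticsUrl(endpoint: AnalyticsEndpoint, params: AnalyticsParams, base: string = ""): string {
  const query = new URLSearchParams();
  const keys = Object.keys(params) as Array<keyof AnalyticsParams>;
  for (const key of keys) {
    const value = params[key];
    if (value !== undefined && value !== "") query.set(key, value);
  }
  const qs = query.toString();
  return qs === "" ? `${base}${endpoint}` : `${base}${endpoint}?${qs}`;
}
```

For example, `analyticsUrl("/analytics/decisions", { range: "1h", granularity: "1m", group_by: "approach" })` yields `/analytics/decisions?range=1h&granularity=1m&group_by=approach`.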

Frontend Plumbing

  • New page at website/src/components/pages/AnalyticsPage.tsx.
  • New sidebar entry in website/src/components/layout/Sidebar.tsx.
  • New route /analytics in website/src/App.tsx.
  • Reuse the data-fetching pattern from OverviewPage / DecisionExplorerPage.
  • Chart library: prefer recharts (small, built for React, fits the existing stack); confirm during implementation. If another chart library is already in package.json, use that instead of adding a new dependency.
  • Polling: KPI tiles and the Realtime SR chart poll every 5s. Other charts re-fetch on filter change + a manual refresh button. No --watch-style infinite streams.
  • Filter state lives in the URL query string (?merchant=...&range=1h&granularity=1m).
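The 5s polling for the KPI tiles and the Realtime SR chart could be wrapped in a helper like the one below; in a React component it would typically be started in an effect and the returned function used as the cleanup. This is a sketch under assumptions: the names `startPolling` and `Timers` are hypothetical, and skipping a tick while the previous request is still in flight is a design choice the issue does not mention.

```typescript
// Timer functions are injectable so the helper can be tested without waiting.
type Timers = {
  set: (callback: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
};

const realTimers: Timers = {
  set: (callback, ms) => setInterval(callback, ms),
  clear: (handle) => clearInterval(handle as ReturnType<typeof setInterval>),
};

// Run `task` immediately and then every `intervalMs`. Returns a stop function.
function startPolling(
  task: () => Promise<void>,
  intervalMs: number = 5000,
  timers: Timers = realTimers
): () => void {
  let inFlight = false;
  const tick = (): void => {
    // Skip this tick if the previous request has not finished, so slow
    // responses do not pile up overlapping requests.
    if (inFlight) return;
    inFlight = true;
    task()
      .catch(() => undefined) // the task is expected to report its own errors
      .then(() => {
        inFlight = false;
      });
  };
  tick(); // fetch once right away for first paint
  const handle = timers.set(tick, intervalMs);
  return () => timers.clear(handle);
}
```

Because the helper stops cleanly and never queues requests, it stays within the "no infinite streams" constraint: at most one request per widget is outstanding at any time.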

Acceptance Criteria

  • New Analytics sidebar entry and /analytics route render an AnalyticsPage in website/src/components/pages/.
  • Top filter bar with Merchant, Payment Method Type, Gateway, Time Range, Granularity. Filter state is reflected in the URL.
  • All five KPI tiles render and auto-refresh every 5s.
  • Realtime SR per PSP chart is implemented as a multi-line time series with the moving-window behavior described above and respects the granularity filter.
  • Decision throughput stacked area by routing approach is implemented.
  • Gateway share stacked-area chart is implemented.
  • Feedback vs decision dual-axis chart is implemented.
  • Live PSP Scoreboard table is implemented, sortable, and reads the same Redis keys that gateway_scoring_service.rs writes (UI matches routing source of truth).
  • Top Priority-Logic rules table and recent feedback errors table are implemented.
  • Backend endpoints exist under src/routes/analytics.rs, wired into src/routes.rs, and pull from Postgres + Redis as described.
  • Empty-state handling: when no decisions / feedbacks have been recorded yet, the page renders a friendly empty state rather than erroring.
  • Docs updated: a short "Analytics" page in docs/ explaining the new surface, the data sources behind each widget, and how granularity buckets map to storage.

Out of Scope

  • Writing/mutating scores from the UI (read-only page).
  • Alerting / notifications — that's a follow-up issue.
  • Per-decision drill-down — already covered by Decision Explorer.
