Title: Query Cost Estimator in SQL Lab Before Execution #39650
This proposal addresses a real production pain point. The per-engine abstraction layer is the right approach — each database has fundamentally different cost semantics (BigQuery bills by bytes scanned, Snowflake by credit-seconds, Postgres by arbitrary planner cost units), so a unified cost model would paper over real differences. A few design considerations:

Threshold-based blocking vs. advisory warnings. Teams will eventually want configurable thresholds — warn above 10GB scanned, block above 100GB without admin override. The advisory-only mode is the right v1, but the architecture should anticipate enforcement; see the sketch at the end of this comment.

EXPLAIN plan caching for repeated queries. If analysts iterate on a query, caching the EXPLAIN result for the previous version lets you show a delta — "this change reduced estimated rows from 500M to 2M." That feedback loop is more valuable than a single absolute number.

Cost estimation for generated SQL. This becomes especially important as more SQL is generated by AI tools rather than written by hand. Tools in the NL-to-SQL space — ai2sql.io and others — produce correct queries that may not be cost-optimal. An EXPLAIN pre-check catches expensive patterns regardless of whether the query was hand-written or machine-generated.

Fallback strategy. The SELECT COUNT(*) fallback can be as expensive as running the query itself, so engines without a cheap estimator may be better served by reporting no estimate at all. The feature flag approach (ENABLE_QUERY_COST_ESTIMATE) keeps the feature opt-in per deployment while engine coverage matures.
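On the threshold point, here is a minimal sketch of what a configurable warn/block policy could look like, under assumed names (`CostThresholds`, `evaluate`) that are not part of Superset:

```python
# Hypothetical threshold policy layered on top of an advisory estimate;
# none of these names exist in Superset today.
from dataclasses import dataclass

GB = 1024 ** 3

@dataclass
class CostThresholds:
    warn_bytes: int = 10 * GB     # show an advisory banner above this
    block_bytes: int = 100 * GB   # refuse to run above this without admin override

def evaluate(estimated_bytes: int, thresholds: CostThresholds) -> str:
    """Map a byte-scan estimate to an advisory action."""
    if estimated_bytes >= thresholds.block_bytes:
        return "block"
    if estimated_bytes >= thresholds.warn_bytes:
        return "warn"
    return "ok"

assert evaluate(50 * GB, CostThresholds()) == "warn"
```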
Problem:
Right now, when a user types a query in SQL Lab and clicks Run, Superset fires the full query against the database immediately. There is zero warning if that query will scan 500 million rows, join four unindexed tables, or cost $40 on a cloud warehouse like BigQuery or Snowflake. Junior analysts frequently cause accidental full table scans that slow down the entire database for other users. There is no way to know the damage before it happens.
Proposed Solution:
Add a second button next to Run called "Estimate Cost". When clicked, Superset sends the query to the database using its native dry-run or EXPLAIN mechanism instead of actually executing it. The result is displayed as a banner directly below the SQL editor showing three values: estimated rows scanned, estimated data volume in MB or GB, and estimated cost in credits or dollars where the database supports it. The user then decides whether to run the query or refine it first.
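To make the banner contract concrete, here is a minimal sketch of the three-value result described above; the field names are illustrative, not an agreed schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CostEstimate:
    # The three values the banner would display; every field is optional
    # because engines differ in what they can report.
    estimated_rows: Optional[int] = None
    estimated_bytes: Optional[int] = None   # data volume scanned
    estimated_cost: Optional[float] = None  # credits or dollars, where supported
    cost_unit: Optional[str] = None         # e.g. "USD" or "credits"
```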
How It Works Per Database:
BigQuery → uses jobs.query with the dryRun: true flag, which returns the bytes that would be processed
Snowflake → uses EXPLAIN, which reports partitions scanned and bytes assigned (a proxy for credit cost; EXPLAIN itself does not return credits)
Trino → uses EXPLAIN (TYPE DISTRIBUTED); EXPLAIN ANALYZE would actually execute the query, so it is not suitable here
Presto → uses EXPLAIN with TYPE DISTRIBUTED
PostgreSQL → uses EXPLAIN (FORMAT JSON) for cost nodes (see the sketch below this list)
Other → falls back to a row-count estimate, e.g. wrapping the query in SELECT COUNT(*), which can itself be expensive
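As a concrete example of one of these mechanisms, here is a minimal sketch of the PostgreSQL path. The EXPLAIN (FORMAT JSON) keys ("Plan", "Plan Rows", "Total Cost") are real Postgres plan output; the estimator function itself is hypothetical:

```python
# Illustrative per-engine estimator; not an existing Superset hook.
import json

def postgres_estimate(cursor, sql: str) -> dict:
    """Extract row and cost estimates from EXPLAIN (FORMAT JSON)."""
    cursor.execute(f"EXPLAIN (FORMAT JSON) {sql}")
    plan = cursor.fetchone()[0]
    if isinstance(plan, str):  # some drivers return the JSON plan as text
        plan = json.loads(plan)
    top_node = plan[0]["Plan"]
    return {
        "estimated_rows": top_node["Plan Rows"],  # planner row estimate
        "planner_cost": top_node["Total Cost"],   # arbitrary planner units
    }
```

Each engine spec would register a function with this shape, and the estimate endpoint would dispatch on the connected database's engine.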
What Changes:
Backend → new /api/v1/sqllab/estimate endpoint that dispatches to each engine's estimator
Frontend → new Estimate Cost button in SQL Lab toolbar
Frontend → cost banner component below the editor
Config → feature flag ENABLE_QUERY_COST_ESTIMATE to toggle per deployment
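Superset feature flags are toggled in superset_config.py, so enabling the proposed flag per deployment could look like this (the flag does not exist yet; only the FEATURE_FLAGS mechanism is current Superset behavior):

```python
# superset_config.py
FEATURE_FLAGS = {
    # Proposed flag from this discussion; not yet implemented.
    "ENABLE_QUERY_COST_ESTIMATE": True,
}
```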
Why It Is Not Implemented Yet:
Each database exposes cost estimation differently. A unified abstraction layer needs to be built in the DB engine spec so each connector can register its own estimate method. This is the main missing piece.
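A minimal sketch of what that abstraction could look like, assuming hypothetical engine-spec classes and hook name (the google-cloud-bigquery dry-run calls are real API; the surrounding structure is illustrative):

```python
from typing import Optional
from google.cloud import bigquery

class BaseEngineSpec:
    @classmethod
    def estimate_query_cost(cls, client, sql: str) -> Optional[dict]:
        # Default: no estimator registered; the frontend can hide the
        # Estimate Cost button when no estimate is available.
        return None

class BigQueryEngineSpec(BaseEngineSpec):
    @classmethod
    def estimate_query_cost(cls, client: bigquery.Client, sql: str) -> Optional[dict]:
        # A dry-run job validates the query and reports the bytes it would
        # scan, without executing anything or incurring cost.
        job = client.query(
            sql,
            job_config=bigquery.QueryJobConfig(dry_run=True, use_query_cache=False),
        )
        return {"estimated_bytes": job.total_bytes_processed}
```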