Skip to content

Investigate rawsql-ts GROUP BY decomposition functionality #19

@mk3008

Description

@mk3008

Overview

Investigate and implement GROUP BY clause decomposition functionality using the rawsql-ts library, enabling GROUP BY processing to be decomposed into CTE (Common Table Expression) stages.

Current Status

Confirmed rawsql-ts v0.11.28-beta Features

  • GroupByClauseParser - GROUP BY clause parsing functionality
  • GroupByClause - GROUP BY clause data model
  • CTEQueryDecomposer - Existing CTE decomposition functionality

Features Requiring Investigation

  • ❓ Specific GROUP BY decomposition functionality
  • ❓ METHOD to decompose GROUP BY + aggregate functions into CTEs
  • ❓ Staged decomposition of nested GROUP BY processing

Technical Requirements

Example SQL for Decomposition

```sql
SELECT
category,
COUNT() as count,
AVG(price) as avg_price,
MAX(created_at) as latest_date
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.status = 'active'
GROUP BY category
HAVING COUNT(
) > 10
ORDER BY count DESC
```

Expected Decomposition Result

```sql
-- Step 1: Base data preparation
WITH base_data AS (
SELECT
c.name as category,
p.price,
p.created_at
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.status = 'active'
),

-- Step 2: Grouping aggregation
aggregated_data AS (
SELECT
category,
COUNT(*) as count,
AVG(price) as avg_price,
MAX(created_at) as latest_date
FROM base_data
GROUP BY category
)

-- Step 3: Final filtering and ordering
SELECT *
FROM aggregated_data
WHERE count > 10
ORDER BY count DESC
```

Implementation Plan

Phase 1: Investigation

  • Investigate rawsql-ts GroupByClauseParser detailed specifications
  • Research aggregate function (COUNT, AVG, MAX, etc.) identification methods
  • Explore integration possibilities with existing CTEQueryDecomposer

Phase 2: Basic Decomposition Implementation

  • Staged decomposition of queries containing GROUP BY clauses
  • Separation and CTE conversion of aggregate functions
  • Migration of HAVING clauses to post-processing stages

Phase 3: Advanced Decomposition Features

  • Multi-level GROUP BY processing
  • Combined processing with window functions
  • Performance optimization

Expected Benefits

  1. Improved Readability: Complex aggregation processes become easier to understand step-by-step
  2. Enhanced Debugging: Intermediate results can be verified at each CTE step
  3. Better Reusability: Separation of base data and aggregation processing
  4. Improved Testability: Unit testing implementation becomes easier for each stage

Related Technologies

  • rawsql-ts v0.11.28-beta
  • CTE (Common Table Expression)
  • SQL AST parsing
  • Query optimization

Priority

High - GROUP BY processing decomposition is crucial for understanding complex queries

Implementation Team

Development team (SQL processing expert preferred)


This investigation is a new feature proposal based on user request

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions