    Repositories list

    • Bespoke OLAP: Synthesizing Workload-Specific One-size-fits-one Database Engines
      Jupyter Notebook
      Apache License 2.0
      33200 Updated Mar 25, 2026
    • Generated CPP Artifacts of Bespoke OLAP: Bespoke-TPCH & Bespoke-CEB
      C++
      Apache License 2.0
      0400 Updated Mar 1, 2026
    • Rust
      0300 Updated Feb 18, 2026
    • Samsara

      Public
      Repository for Samsara: Towards a Multimodal Stream Processing System
      Python
      Apache License 2.0
      2100 Updated Feb 17, 2026
    • stretto

      Public
      The Stretto Execution Engine for LLM-Augmented Data Systems
      Python
      MIT License
      1200 Updated Feb 16, 2026
    • Unveiling Challenges for LLMs in Enterprise Data Engineering
      Python
      MIT License
      1500 Updated Feb 9, 2026
    • Towards Complex Table Question Answering Over Tabular Data Lakes (Extended Version)
      Python
      MIT License
      0000 Updated Feb 9, 2026
    • PIPE-X

      Public
      This repository contains code for PIPE-X, a system providing impact measures of preprocessing steps.
      Jupyter Notebook
      0100 Updated Feb 1, 2026
    • lcm-eval

      Public
      This is the source code of the SIGMOD paper: "How Good are Learned Cost Models, Really? Insights From Query Optimization Tasks"
      Python
      Apache License 2.0
      12800 Updated Jan 21, 2026
    • Tutorial for using GDB and Valgrind
      C
      0200 Updated Jan 3, 2026
    • The code for the interactive application building on PIPE-X (https://github.com/DataManagementLab/PIPE-X).
      0000 Updated Dec 10, 2025
    • Examples on how to use perf
      C
      0200 Updated Dec 3, 2025
    • Redbench

      Public
      Source code of the paper "Redbench: Workload Synthesis From Cloud Traces"
      Python
      Apache License 2.0
      5400 Updated Nov 19, 2025
    • wannadb

      Public
      WannaDB: Ad-hoc SQL Queries over Text Collections
      Python
      Other
      4903 Updated Nov 18, 2025
    • Evaluation scripts for Redbench
      Jupyter Notebook
      Apache License 2.0
      0200 Updated Nov 3, 2025
    • Graphical User Interface for LLM-based Data Engineering
      Python
      MIT License
      4000 Updated Oct 20, 2025
    • Moderately-optimized hill climbing algorithm implementation to solve the scanset matching optimization problem relevant for Redbench.
      C++
      MIT License
      1000 Updated Oct 20, 2025
    • Improving the Performance of Secure Data Analytics via Controlled Intermediate Result Size Disclosure
      Python
      0100 Updated Oct 9, 2025
    • Source code of our AIDB '25 paper "JOB-Complex: A Challenging Benchmark for Traditional & Learned Query Optimization"
      Jupyter Notebook
      2800 Updated Sep 9, 2025
    • Demo for JUSTINE, our system for self-organizing schemas
      Python
      0200 Updated Sep 2, 2025
    • This repository will host what we publish for our vision of moving from Query-by-Integration to Query-by-Collaboration.
      0000 Updated Aug 5, 2025
    • SPFlow

      Public
      Sum Product Flow: An Easy and Extensible Library for Sum-Product Networks
      Python
      Other
      82000 Updated Jul 11, 2025
    • PBench

      Public
      Python
      4000 Updated Jul 8, 2025
    • Graceful

      Public
      Source code of our ICDE '25 paper "GRACEFUL: A Learned Cost Estimator For UDFs" (Johannes Wehrstein, Tiemo Bang, Roman Heinrich, Carsten Binnig)
      Python
      Apache License 2.0
      1800 Updated May 14, 2025
    • Code repository for the Lopster paper on data cleaning.
      Python
      3310 Updated Apr 30, 2025
    • Source Code for VLDB Demo Submission
      Python
      0310 Updated Mar 25, 2025
    • Shell
      0200 Updated Mar 24, 2025
    • eleet

      Public
      Repository for ELEET: Efficient Learned Query Execution over Text and Tables
      Python
      MIT License
      0900 Updated Mar 17, 2025
    • Train transformer language models with reinforcement learning.
      Python
      Apache License 2.0
      2.6k000 Updated Feb 17, 2025
    • C
      MIT License
      1100 Updated Jan 24, 2025