Methodology & code
How the benchmark works
Everything behind the benchmark numbers is here: how to run the full suite yourself, and the exact code each engine runs for every query. The code is extracted directly from the harness source.
Reproduce it
Run it yourself
Every engine is a stock install: polars, duckdb, datafusion, chdb (ClickHouse) and pandas from PyPI; data.table and dplyr from CRAN. The runner is one script; it generates the synthetic data, runs all engines and writes a single CSV.
# clone, build Ibex in release, then run the whole suite locally:
benchmarking/run_scale_suite.sh --warmup 1 --iters 3
# -> benchmarking/results/scales.csv

# render these pages from that CSV:
python3 benchmarking/gen_website.py benchmarking/results/scales.csv
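As a rough illustration of what `--warmup 1 --iters 3` implies, the sketch below runs a query function once untimed, then three times timed, and writes one CSV row for the measurement. It is not the harness source: `time_query`, `write_results`, the column names, the stand-in workload, and the choice to report the minimum timing are all assumptions made for illustration.

```python
import csv
import io
import time
from typing import Callable, Dict, List


def time_query(fn: Callable[[], object], warmup: int = 1, iters: int = 3) -> List[float]:
    """Run fn `warmup` times untimed, then `iters` times timed.

    Returns the wall-clock seconds of each timed run.
    """
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def write_results(rows: List[Dict[str, object]], out) -> None:
    """Write one CSV row per measurement. Column names are hypothetical."""
    writer = csv.DictWriter(out, fieldnames=["engine", "query", "rows", "min_s"])
    writer.writeheader()
    writer.writerows(rows)


def run_demo() -> str:
    # Stand-in workload (an assumption): a pure-Python group-by sum
    # over synthetic data, in place of a real engine query.
    data = [(i % 10, float(i)) for i in range(10_000)]

    def group_sum():
        totals = {}
        for key, value in data:
            totals[key] = totals.get(key, 0.0) + value
        return totals

    timings = time_query(group_sum, warmup=1, iters=3)
    buf = io.StringIO()
    write_results(
        [{"engine": "python-dict", "query": "group_sum",
          "rows": len(data), "min_s": min(timings)}],
        buf,
    )
    return buf.getvalue()


if __name__ == "__main__":
    print(run_demo())
```

The warmup run is kept out of the timings so that one-off costs such as lazy imports or cold caches do not distort the reported figure; whether the real harness reports a minimum, median or mean is not stated here.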
The published numbers come from a clean cloud box, for isolation: an AWS r7i.2xlarge (8 vCPU Sapphire Rapids, 64 GB), provisioned and run end to end with one command:
./benchmarking/aws/run.sh --on-demand # provisions, runs 1M–50M, uploads, shuts down
Transparency
Exactly what each engine runs
Pick a query. Each engine's code is verbatim from the file linked beside it; rolling-window frames are shown fully resolved (e.g. the RANGE BETWEEN INTERVAL vs ROWS clause) so the time-window comparison is auditable. polars-st runs code identical to Polars, with POLARS_MAX_THREADS=1; ibex+parse is the same Ibex query timed with parsing included.
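The ROWS versus RANGE distinction matters whenever timestamps are irregular: a ROWS frame covers a fixed number of preceding rows, while a RANGE frame covers a fixed span of time. The sketch below shows the difference on a tiny series in plain Python. It is an illustration only, not code from the harness; the function names and sample data are hypothetical, timestamps are assumed unique and sorted, and both frame bounds are assumed inclusive.

```python
from typing import List, Tuple

# (timestamp in seconds, value), sorted by timestamp, timestamps unique
Series = List[Tuple[int, float]]


def rolling_sum_rows(series: Series, preceding: int) -> List[float]:
    """Row-count frame, in the spirit of
    ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - preceding)
        out.append(sum(v for _, v in series[lo:i + 1]))
    return out


def rolling_sum_range(series: Series, seconds: int) -> List[float]:
    """Time-span frame, in the spirit of
    RANGE BETWEEN INTERVAL <seconds> PRECEDING AND CURRENT ROW."""
    out = []
    lo = 0          # index of the oldest row still inside the window
    running = 0.0   # sum of values currently inside the window
    for ts, value in series:
        running += value
        # Drop rows that are older than ts - seconds.
        while series[lo][0] < ts - seconds:
            running -= series[lo][1]
            lo += 1
        out.append(running)
    return out


if __name__ == "__main__":
    # Irregular timestamps: a gap of 9 seconds between the 2nd and 3rd rows.
    sample = [(0, 1.0), (1, 2.0), (10, 4.0), (11, 8.0)]
    print(rolling_sum_rows(sample, preceding=1))   # [1.0, 3.0, 6.0, 12.0]
    print(rolling_sum_range(sample, seconds=2))    # [1.0, 3.0, 4.0, 12.0]
```

The two results differ at the third row: the row-count frame reaches back across the 9-second gap and includes the value at t=1, while the 2-second time frame does not. Showing the frame clause fully resolved makes it possible to check which of the two each engine was asked to compute.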