Single-core speed that still shows up at scale
Ibex is currently single-threaded, yet stays competitive with engines using all cores on many common columnar queries.
Ibex
Ibex gives DataFrame pipelines their own compact, statically typed language. Explore in a REPL, embed the same code in Python or R notebooks, and compile it to C++23 when the pipeline needs to ship.
Clauses compose left-to-right, columns are real names, and static types catch mistakes before a pipeline runs.
Use the REPL for exploration, notebooks for analysis, plugins for I/O, and C++23 codegen for native binaries.
Performance
Ibex is built for the expensive part of table work: grouped aggregation, rolling time windows, joins, filters, null handling, and reshaping. The benchmark suite compares each query against the same operation in Polars, DuckDB, ClickHouse, DataFusion, pandas, data.table, and dplyr.
Polars: 60.5 ms. Polars single-threaded: 216 ms.
Polars: 220 ms. DuckDB: 77.6 ms. DataFusion: 58.1 ms.
Polars: 152 ms. Polars single-threaded: 283 ms.
Ibex currently runs on one thread. The benchmark page shows both default engine settings and single-threaded Polars for a same-core comparison.
How a program is shaped
An Ibex program is a handful of let bindings. Each one names a
table and the steps applied to it. There are no loops or mutable
variables — you describe the transformation you want and Ibex runs it.
Write a table name, then square brackets containing a comma-separated list of clauses — one operation each. The clauses run in the order you read them, each taking the whole table and producing a new one.
Tables are values: a pipeline returns a new table and leaves its
input alone. You can name the result with let, or feed
it straight into another set of brackets.
// Keep the busy rows, then three columns
prices[
    filter volume > 1000,
    select { symbol, price, volume }
];
// Name a result and reuse it
let active = prices[filter volume > 1000];
active[select { symbol, price }];
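For readers coming from Python, the two ideas above (clauses run left to right, and a pipeline leaves its input alone) can be modelled in a few lines. This is only a rough sketch of the semantics, not how Ibex is implemented: the sample rows and the helper names filter_rows and select_columns are made up for illustration.

```python
# A plain-Python model of a clause pipeline. The rows and helper names are
# hypothetical; Ibex itself works on typed columnar tables.
prices = [
    {"symbol": "AAA", "price": 10.0, "volume": 500},
    {"symbol": "BBB", "price": 20.0, "volume": 1500},
    {"symbol": "CCC", "price": 30.0, "volume": 2500},
]


def filter_rows(table, predicate):
    """Model of `filter`: a new table with the rows where predicate is true."""
    return [row for row in table if predicate(row)]


def select_columns(table, names):
    """Model of `select { a, b }`: a new table with only the named columns."""
    return [{name: row[name] for name in names} for row in table]


# prices[filter volume > 1000, select { symbol, price, volume }]
busy = select_columns(
    filter_rows(prices, lambda row: row["volume"] > 1000),
    ["symbol", "price", "volume"],
)

# let active = prices[filter volume > 1000]; active[select { symbol, price }];
active = filter_rows(prices, lambda row: row["volume"] > 1000)
quotes = select_columns(active, ["symbol", "price"])

print(quotes)
print(len(prices))  # the input table still has all of its rows
```

Each helper returns a fresh list, which mirrors the rule that a pipeline produces a new table instead of changing the one it was applied to.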
Reading the syntax
Ibex uses square brackets, braces, and parentheses for three distinct things. Knowing which is which is most of what it takes to read any snippet.
[ ] — a pipeline
Square brackets attach to a table and hold a list of clauses to apply:
prices[filter …, select …]. Chaining
[…][…] just feeds one result into the next.
{ } — a list of fields
Braces hold the named members a clause works on — the output
columns of select, the sort keys of order,
the columns of a schema. Think struct fields, not a code block:
{ avg = mean(px), n = count() }.
( ) — calls and grouping
Parentheses are the familiar kind: calling a function and grouping
arithmetic. mean(price),
(close - open) / open. The expressions inside clauses are
ordinary too — comparisons, math, function calls.
A worked example
A common task — collapse tick data into daily bars per symbol — shows how the pieces fit together.
select chooses the output
columns. Each entry is name = expression; a bare name
passes a column through unchanged.
by symbol groups the rows, so
the aggregates in select — first,
max, min, last — run once
per symbol. Drop the by and they would collapse the whole
table to a single row instead.
order then sorts the result. The
whole thing is one expression, bound to bars.
let bars = ticks[
    select {
        open = first(price),
        high = max(price),
        low = min(price),
        close = last(price),
        vol = sum(size)
    },
    by symbol,
    order symbol
];
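To make the grouping step concrete, here is roughly the same computation written out by hand in plain Python. It is a sketch under assumptions: the sample ticks are invented, they are taken to be in time order already, and the loop only models what the pipeline computes, not how Ibex executes it.

```python
# Hypothetical sample ticks, assumed to be in time order already.
ticks = [
    {"symbol": "BBB", "price": 20.0, "size": 10},
    {"symbol": "AAA", "price": 10.0, "size": 5},
    {"symbol": "BBB", "price": 22.0, "size": 20},
    {"symbol": "AAA", "price": 9.0, "size": 15},
    {"symbol": "AAA", "price": 11.0, "size": 10},
]

# by symbol: collect each symbol's rows, keeping their original order.
groups = {}
for row in ticks:
    groups.setdefault(row["symbol"], []).append(row)

# select { ... } produces one output row per group;
# order symbol is modelled by iterating over the sorted keys.
bars = []
for symbol in sorted(groups):
    group_prices = [row["price"] for row in groups[symbol]]
    bars.append({
        "symbol": symbol,
        "open": group_prices[0],    # first(price)
        "high": max(group_prices),  # max(price)
        "low": min(group_prices),   # min(price)
        "close": group_prices[-1],  # last(price)
        "vol": sum(row["size"] for row in groups[symbol]),  # sum(size)
    })

for bar in bars:
    print(bar)
```

Removing the grouping loop and aggregating over all of ticks at once would give the single-row result described above for a select without by.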
The vocabulary
These clauses go inside [ ] and compose in any sensible order. The
function reference and
cheat sheet have the rest.
| Clause | What it does |
| --- | --- |
| filter predicate | Keep rows where the predicate is true |
| select { fields } | Choose or compute output columns; aggregates when paired with by |
| update { fields } | Add or replace columns, keeping all existing ones |
| where predicate update { fields } | Replace columns in selected rows |
| by key | Group for select / update (like SQL GROUP BY / PARTITION BY) |
| order { keys } | Sort, with per-key asc / desc |
| rename { map } | Relabel columns without touching data |
| distinct { keys } | Deduplicate on one or more columns |
| head n / tail n | Keep the first / last n rows (per group with by) |
| a join b on key | Inner / left / right / outer / semi / anti / cross / as-of joins |
| window duration | Lookback window for rolling aggregates on a TimeFrame (e.g. window 5m) |
| resample duration | Bucket a TimeFrame into fixed time intervals, then aggregate per bucket (e.g. 1m OHLC) |
Outside the pipelines are a few top-level forms: let bindings,
import to load a plugin, fn /
extern fn for reusable functions and data sources, and
Table { … } to build a table from literals.
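Of the clauses listed, resample is probably the least familiar, so here is a plain-Python sketch of the idea of bucketing rows into fixed intervals and aggregating each bucket into an OHLC bar. The sample ticks are invented, and the boundary rule (a timestamp is floored to a multiple of the bucket width) is an assumption made for illustration; Ibex's actual alignment and labelling rules may differ.

```python
# Hypothetical ticks as (seconds since midnight, price), in time order.
ticks = [(0, 10.0), (20, 12.0), (59, 11.0), (60, 11.5), (130, 9.0), (170, 9.5)]

BUCKET_SECONDS = 60  # stands in for a 1m duration

# Assumption: a tick belongs to the bucket that starts at its timestamp
# floored to a multiple of the bucket width.
buckets = {}
for ts, price in ticks:
    start = (ts // BUCKET_SECONDS) * BUCKET_SECONDS
    buckets.setdefault(start, []).append(price)

# Aggregate each bucket into one open/high/low/close bar.
ohlc = {
    start: {
        "open": bucket_prices[0],
        "high": max(bucket_prices),
        "low": min(bucket_prices),
        "close": bucket_prices[-1],
    }
    for start, bucket_prices in sorted(buckets.items())
}

for start, bar in ohlc.items():
    print(start, bar)
```

A rolling window differs in that every row gets its own lookback range, so ranges overlap, whereas the buckets here are disjoint.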
Install & run
A prebuilt release is the quickest start. Build from source if you want the latest commits or a binary for your own platform.
Option A — download a release
Grab the prebuilt ibex REPL and bundled plugins, unpack,
and run — no toolchain required.
github.com/bobjansen/Ibex/releases
# Unpack the archive for your platform, then:
./ibex --plugin-path ./plugins
Option B — build from source
Requirements: CMake 3.26+ and a C++23 compiler such as Clang 17+, GCC 13+, AppleClang, or MSVC 2022. Ninja is recommended on Linux and macOS; CMake's Visual Studio generator works on Windows.
# Linux/macOS with Clang or GCC
cmake -B build-release -G Ninja \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_BUILD_TYPE=Release
cmake --build build-release
# Windows, from a Developer PowerShell
cmake -B build-release -DCMAKE_BUILD_TYPE=Release
cmake --build build-release --config Release
Run your first pipeline
./ibex --plugin-path ./plugins # from source: ./build-release/tools/ibex --plugin-path ./build-release/tools
import "csv";
let prices = read_csv("prices.csv");
// Five most-traded symbols by total volume
prices[
    select { traded = sum(volume) }, by symbol,
    order { traded desc },
    head 5
];
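As a cross-check on what this pipeline should return, the same query can be written in a few lines of plain Python. This is a sketch only: the rows are invented stand-ins for prices.csv, and how Ibex breaks ties between equal totals is not covered by it.

```python
from collections import defaultdict

# Hypothetical rows standing in for the contents of prices.csv.
prices = [
    {"symbol": "AAA", "volume": 100},
    {"symbol": "BBB", "volume": 400},
    {"symbol": "AAA", "volume": 250},
    {"symbol": "CCC", "volume": 50},
]

# select { traded = sum(volume) }, by symbol
totals = defaultdict(int)
for row in prices:
    totals[row["symbol"]] += row["volume"]

# order { traded desc }, head 5
top = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:5]

for symbol, traded in top:
    print(symbol, traded)
```

With fewer than five symbols in the input, the slice simply returns all of them, so the result can be shorter than five rows.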
Handy REPL commands: :load <file.ibex>,
:schema <table>, :head <table> [n],
:doc <name>, :help.
Where to go next
Interactive timings and memory use against Polars, DuckDB, ClickHouse, DataFusion, pandas, and R.
The same query in Ibex, pandas, Polars, and SQL, side by side.
A guided walk through every clause, with runnable snippets for deeper evaluation.
CSV, Parquet, SQLite via ADBC, and Kafka streaming into live dashboards.
Every built-in function with signatures and behaviour notes.
One-page syntax and function reference once you know what you are looking for.