Concise bracket syntax
Pipeline operations chain left-to-right inside [ ], in the
order they execute. No SQL keywords, no macro magic, no implicit coercions.
Backtick-quoted names handle columns with dots or spaces.
Ibex
Ibex is a statically typed language for columnar DataFrame and
TimeFrame manipulation. Write concise bracket-pipeline queries —
filter, aggregate, roll, resample, join — and run them in the fast interpreter
or compile them to standalone C++ binaries.
as_timeframe promotes a DataFrame to a time-indexed structure
with O(n) sort detection. Rolling windows use a two-pointer O(n) scan with
a single result-column allocation per call — no copies, no heap churn.
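To illustrate what O(n) sort detection amounts to, here is a minimal C++ sketch. The helper name `is_time_sorted` and the choice of 64-bit integer timestamps are assumptions made for illustration; this is not Ibex's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Ibex): reports whether a timestamp
// column is already in non-decreasing order. std::is_sorted makes a
// single linear pass and stops at the first out-of-order pair.
bool is_time_sorted(const std::vector<std::int64_t>& ts) {
    return std::is_sorted(ts.begin(), ts.end());
}
```

A check like this lets a caller skip the O(n log n) sort entirely when the input already arrives in time order, which is the common case for tick data.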
ibex_compile transpiles any .ibex script to
idiomatic C++ using the ibex::ops::* library. Compiled and
interpreted outputs are behaviour-equivalent; both run near peak throughput.
Language tour
A complete analytical pipeline in a handful of composable clauses.
extern fn declares a C++ data-source function as a
first-class Ibex binding. The compiler resolves it at link time;
the REPL loads the corresponding plugin .so at runtime.
Filter expressions support arithmetic, comparisons, and boolean
logic (and, or, not).
Multiple clauses chain in reading order.
extern fn read_csv(path: String) -> DataFrame
from "csv.hpp";
let prices = read_csv("prices.csv");
// Keep rows where price exceeds 1.0
let active = prices[filter price > 1.0];
// Chain filter with a column projection
prices[
filter price > 1.0,
select { symbol, price, volume }
];
select doubles as a projection and an aggregation clause.
Add by to group; omit it for a global aggregate.
Available aggregate functions: first, last, sum, mean, min, max, count.
// Mean sepal length and row count, by species
iris[select {
mean_sl = mean(`Sepal.Length`),
n = count()
}, by Species];
// OHLC per symbol — all in one pass
prices[select {
open = first(price),
high = max(price),
low = min(price),
close = last(price),
traded = sum(volume)
}, by symbol];
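The one-pass grouped OHLC aggregation above can be pictured with the following C++ sketch. It is only an illustration under stated assumptions: the `Ohlc` struct, the `ohlc_by_symbol` function, and the hash-map grouping are hypothetical and say nothing about how Ibex groups rows internally. The three input columns are assumed to have equal length.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-group accumulator for the five aggregates.
struct Ohlc {
    double open;
    double high;
    double low;
    double close;
    double traded;
};

// Single pass over the rows: each row touches exactly one group.
// Assumes symbol, price, and volume all have the same length.
std::unordered_map<std::string, Ohlc> ohlc_by_symbol(
    const std::vector<std::string>& symbol,
    const std::vector<double>& price,
    const std::vector<double>& volume) {
    std::unordered_map<std::string, Ohlc> groups;
    for (std::size_t i = 0; i < symbol.size(); ++i) {
        // First row of a group seeds open/high/low/close with its price.
        auto it = groups.try_emplace(
            symbol[i], Ohlc{price[i], price[i], price[i], price[i], 0.0}).first;
        Ohlc& g = it->second;
        g.high = std::max(g.high, price[i]);
        g.low = std::min(g.low, price[i]);
        g.close = price[i];       // last row seen so far
        g.traded += volume[i];
    }
    return groups;
}
```

All five aggregates update from the same row visit, so the input columns are read only once regardless of how many aggregates the clause requests.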
update appends named computed columns to every row.
All existing columns pass through untouched, so the result schema is
a strict superset of the input.
// Enrich every row with return, range, and notional
ohlcv[update {
ret = (close - open) / open,
range = (high - low) / open,
notional = close * volume
}];
// Chain an update with an aggregation
let daily = ohlcv[update { ret = (close - open) / open }];
daily[select {
avg_ret = mean(ret),
n_sessions = count()
}, by sector]
[order { avg_ret desc }];
order accepts a single column, a multi-key block with
asc / desc annotations, or no
argument at all to sort by every column in schema order.
distinct deduplicates on a single column or a set.
// Sort by a single key (ascending by default)
iris[order `Sepal.Length`];
// Multi-key sort with explicit directions
results[order { avg_ret desc, symbol asc }];
// Sort by all columns in schema order
iris[order];
// Unique species names
iris[distinct Species];
// Unique (species, length) pairs
iris[distinct { Species, `Sepal.Length` }];
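For readers who think in C++, a multi-key ordering such as `order { avg_ret desc, symbol asc }` corresponds roughly to a comparator that checks keys in sequence. The sketch below is row-oriented for brevity, whereas Ibex is columnar; the `Row` struct and `order_rows` function are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical row type standing in for a two-column result table.
struct Row {
    std::string symbol;
    double avg_ret;
};

// Sort by avg_ret descending, breaking ties by symbol ascending.
void order_rows(std::vector<Row>& rows) {
    std::stable_sort(rows.begin(), rows.end(),
                     [](const Row& a, const Row& b) {
                         if (a.avg_ret != b.avg_ret) {
                             return a.avg_ret > b.avg_ret;  // desc
                         }
                         return a.symbol < b.symbol;        // asc
                     });
}
```

The second key is consulted only when the first compares equal, which is the usual lexicographic reading of a multi-key block.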
The as-of join attaches the latest right row at-or-before each left timestamp. It is the standard pattern for enriching tick data with bar data, without look-ahead bias.
Join keys are named with on. Both tables must be
TimeFrames for an asof join.
// Inner join — drop non-matching rows
let enriched = daily join fund on symbol;
// Left join — preserve all left rows
let with_meta = prices left join metadata on symbol;
// As-of join — each tick gets the latest bar at or before ts
let tf = as_timeframe(ticks, "ts");
let bars = as_timeframe(bars_1m, "ts");
tf asof join bars on ts;
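The at-or-before matching rule of an as-of join can be sketched in C++ as a merge over two sorted timestamp columns. This is an illustrative sketch, not Ibex's implementation: the function name `asof_match`, the integer timestamp type, and the use of `std::optional` for unmatched rows are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// For each left timestamp, find the index of the latest right row whose
// timestamp is <= the left timestamp, or nullopt if there is none.
// Both inputs must be sorted ascending.
std::vector<std::optional<std::size_t>> asof_match(
    const std::vector<std::int64_t>& left_ts,
    const std::vector<std::int64_t>& right_ts) {
    std::vector<std::optional<std::size_t>> match(left_ts.size());
    std::size_t r = 0;  // number of right rows at or before the current left row
    for (std::size_t l = 0; l < left_ts.size(); ++l) {
        // The cursor only moves forward, so the join is O(n + m) overall.
        while (r < right_ts.size() && right_ts[r] <= left_ts[l]) {
            ++r;
        }
        if (r > 0) {
            match[l] = r - 1;
        }
    }
    return match;
}
```

Because a right row is matched only when its timestamp is less than or equal to the left timestamp, no left row ever sees data from its own future, which is what rules out look-ahead bias.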
as_timeframe validates sort order in O(n) and records the
time-index column. window <dur> sets the
lookback; rolling functions (rolling_sum,
rolling_mean, rolling_count, lag)
use a two-pointer scan with no per-row heap allocation.
Duration literals: 1s, 30s, 1m,
5m, 1h, …
let tf = as_timeframe(ticks, "ts");
// Previous tick's price
tf[update { prev_price = lag(price, 1) }];
// Tick count in the last 60 seconds
tf[window 1m, update { ticks_1m = rolling_count() }];
// Multiple rolling aggregates in one pass
tf[window 5m, update {
sum_5m = rolling_sum(price),
mean_5m = rolling_mean(price)
}];
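The two-pointer scan behind a time-based rolling sum can be sketched in C++ as follows. This is a simplified illustration under assumptions, not Ibex's code: the name `rolling_sum`, integer timestamps, and the half-open window (rows strictly less than `window` time units behind the current row are included) are choices made here, since the source does not state how the boundary is treated. The window is assumed to be positive and the timestamps sorted ascending.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Trailing time-window sum over a sorted timestamp column.
// out[i] sums value[j] for all j <= i with ts[i] - ts[j] < window.
// Assumes ts is sorted ascending, window > 0, and equal column lengths.
std::vector<double> rolling_sum(const std::vector<std::int64_t>& ts,
                                const std::vector<double>& value,
                                std::int64_t window) {
    std::vector<double> out(ts.size());  // the only allocation
    double acc = 0.0;
    std::size_t lo = 0;  // oldest row still inside the window
    for (std::size_t hi = 0; hi < ts.size(); ++hi) {
        acc += value[hi];
        // Evict rows that have fallen out of the window. lo never moves
        // backwards, so each row is added once and removed at most once.
        while (lo < hi && ts[hi] - ts[lo] >= window) {
            acc -= value[lo];
            ++lo;
        }
        out[hi] = acc;
    }
    return out;
}
```

Both pointers advance monotonically, so the total work is linear in the number of rows no matter how wide the window is. One caveat of this simple form is that repeated add and subtract on floating-point values can accumulate rounding error over very long inputs.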
resample <dur> floors timestamps into
fixed-width intervals and reduces each bucket to one output row.
Combine with by for per-symbol bars.
The output TimeFrame carries the bucket start time as its
time index, ready for downstream joins or further resampling.
let tf = as_timeframe(ticks, "ts");
// 1-minute OHLC bars
let bars = tf[resample 1m, select {
open = first(price),
high = max(price),
low = min(price),
close = last(price)
}];
// Per-symbol 1-minute bars
tf[resample 1m, select {
open = first(price),
close = last(price)
}, by symbol];
// Enrich ticks with the latest bar's close
tf asof join bars on ts;
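Flooring a timestamp into a fixed-width interval is simple arithmetic, sketched below in C++. The function name `bucket_start` and the use of plain integer timestamps (in whatever unit the duration is expressed) are assumptions for illustration; Ibex's internal time representation is not described here.

```cpp
#include <cstdint>

// Floor a timestamp to the start of its fixed-width bucket.
// The remainder is normalised so that negative timestamps also round
// down; C++ integer division alone would round them toward zero.
// Assumes width > 0.
std::int64_t bucket_start(std::int64_t ts, std::int64_t width) {
    std::int64_t rem = ts % width;
    if (rem < 0) {
        rem += width;
    }
    return ts - rem;
}
```

With second-resolution timestamps and a width of 60, a tick at 125 lands in the bucket starting at 120. Every row sharing a bucket start is then reduced to a single output row by the aggregates in the select clause.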
scalar pulls one typed value out of a single-cell result
table. It is available as a binding in subsequent expressions.
// Pull a single value from an aggregate
let total = scalar(
prices[select { total = sum(price) }],
total
);
// Use it in subsequent expressions
prices[update { weight = price / total }];
ibex_compile transpiles a .ibex file to a
self-contained C++ source file. The helper script compiles and links it
in one step.
# Compile and run in one step
scripts/ibex-run.sh examples/quant.ibex
Performance
Release build, clang++, WSL2. Polars and data.table run multi-threaded on all cores; Ibex is single-threaded throughout.
Aggregation — 4 M rows, 252 symbols
| Query | Ibex | Polars | Pandas |
|---|---|---|---|
| mean by symbol | 28.4 ms | 40.1 ms | 181 ms |
| OHLC by symbol | 34.9 ms | 48.0 ms | 249 ms |
| count by sym×day | 12.6 ms | 66.2 ms | 328 ms |
| mean by sym×day | 14.0 ms | 76.8 ms | 367 ms |
| OHLC by sym×day | 20.6 ms | 73.9 ms | 400 ms |
| filter simple | 19.5 ms | 8.40 ms | 30.7 ms |
Geometric mean across 10 queries: 1.3× faster than Polars, 5× faster than Pandas, 2.1× faster than data.table, 3.5× faster than dplyr. Filter queries favour Polars, which uses parallel SIMD scans.
TimeFrame — 1 M rows, 1 s spacing
| Operation | Ibex | Polars | data.table |
|---|---|---|---|
| as_timeframe (sort) | 0.28 ms | 4.78 ms | 6.2 ms |
| lag(price, 1) | 0.97 ms | 4.84 ms | 11.0 ms |
| rolling count 1m | 1.12 ms | 16.9 ms | 12.2 ms |
| rolling sum 1m | 1.43 ms | 19.0 ms | 10.9 ms |
| rolling mean 5m | 1.65 ms | 19.7 ms | 9.6 ms |
| resample 1m OHLC | 24.7 ms | 14.6 ms | 20.0 ms |
Rolling operations use a two-pointer O(n) scan with a single result-column
allocation. Resample delegates to the aggregation path and is slower than
Polars’ parallel group_by_dynamic on this query.
Editor support
The VS Code extension provides a TextMate grammar covering keywords, types, built-in functions, duration literals, backtick-quoted column names, and comments.
Install — WSL
cp -r editors/vscode \
/mnt/c/Users/<username>/.vscode/extensions/ibex-language-0.1.0
Install — macOS / native Linux
cp -r editors/vscode \
~/.vscode/extensions/ibex-language-0.1.0
Fully restart VS Code after copying. .ibex files are highlighted automatically.
Get started
Requirements: Clang 17+, CMake 3.26+, Ninja.
1 — Clone and build
# Debug build (ASan + UBSan)
cmake -B build -G Ninja \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Debug \
-DIBEX_ENABLE_SANITIZERS=ON
cmake --build build
# Release build
cmake -B build-release -G Ninja \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release
cmake --build build-release
2 — Run the test suite
ctest --test-dir build --output-on-failure
3 — Start the REPL
./build-release/tools/ibex --plugin-path ./build-release/libraries
| Command | Description |
|---|---|
| :load examples/quant.ibex | Load and execute an .ibex script |
| :tables | List all bound table names |
| :schema <table> | Column names and types |
| :head <table> [n] | First n rows (default 10) |
| :describe <table> | Schema + first n rows |
4 — Compile a script to C++
# Transpile, compile, and run in one step
scripts/ibex-run.sh examples/quant.ibex
# Or transpile only
scripts/ibex-build.sh examples/quant.ibex -o quant