Benchmark

This page summarizes a local benchmark of temporal partition generation across the currently supported tabular backends.

What was measured

The benchmark measures the total time needed to materialize all folds from:

list(splitter.iter_splits(data))

Configuration used

The same splitter configuration was used for every backend:

  • strategy: rolling

  • layout: train_test

  • train_size="3D"

  • test_size="12h"

  • gap_before_test="30min"

  • step="6h"

  • dataset frequency: one row per minute

  • metric: wall-clock runtime over the full split iteration

  • repetitions: 3 per backend and dataset size
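Assuming folds advance by one step for as long as a full train + gap + test window still fits inside the data, the fold counts reported in the results table follow directly from this configuration. A small sketch (the `expected_folds` helper is hypothetical, not part of the library):

```python
import pandas as pd

def expected_folds(n_rows, train="3D", gap="30min", test="12h",
                   step="6h", freq="1min"):
    """Fold count for a rolling layout, assuming one fold per step
    while the full train + gap + test window fits in the data."""
    freq_td = pd.Timedelta(freq)
    # Rows covered by a single fold: train window + gap + test window.
    window_rows = (pd.Timedelta(train) + pd.Timedelta(gap)
                   + pd.Timedelta(test)) // freq_td
    step_rows = pd.Timedelta(step) // freq_td
    return (n_rows - window_rows) // step_rows + 1

# Matches the fold counts in the results table below.
for n_rows, folds in [(10_000, 14), (100_000, 264),
                      (500_000, 1_375), (1_000_000, 2_764)]:
    assert expected_folds(n_rows) == folds
```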

The benchmark was run locally on the current implementation, where pandas is still the internal execution engine. That means the numpy and polars timings include the cost of normalizing those inputs to pandas before splitting.

These numbers are therefore an end-to-end measurement of the public API as it behaves today, not a native backend-versus-backend comparison.
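A minimal harness in the spirit of the measurement described above (the `bench` helper is hypothetical; in the real benchmark the timed callable would be `lambda: list(splitter.iter_splits(data))`):

```python
import statistics
import time

def bench(fn, reps=3):
    """Run fn() reps times and return (mean_ms, min_ms, max_ms)."""
    times = []
    for _ in range(reps):
        start = time.perf_counter()
        fn()  # e.g. lambda: list(splitter.iter_splits(data))
        times.append((time.perf_counter() - start) * 1_000)
    return statistics.mean(times), min(times), max(times)

# Stand-in workload; the real benchmark times the split iteration itself.
mean_ms, min_ms, max_ms = bench(lambda: sum(range(100_000)))
```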

Results

Backend    Rows        Folds    Mean (ms)   Min (ms)   Max (ms)
pandas     10,000         14        7.581      5.478     11.634
numpy      10,000         14        4.536      4.208      5.181
polars     10,000         14       10.767     10.657     10.825
pandas     100,000       264       10.544      8.264     14.778
numpy      100,000       264       19.789     18.930     20.940
polars     100,000       264       65.366     62.823     69.806
pandas     500,000     1,375       24.592     22.083     29.403
numpy      500,000     1,375       94.801     91.859     97.771
polars     500,000     1,375      294.843    289.612    299.288
pandas     1,000,000   2,764       44.719     38.910     47.781
numpy      1,000,000   2,764      183.353    177.574    190.390
polars     1,000,000   2,764      587.886    583.358    592.276

Visual summary

Mean runtime per backend at each dataset size (lower is better):

10k rows:   pandas 7.6 ms   · numpy 4.5 ms    · polars 10.8 ms
100k rows:  pandas 10.5 ms  · numpy 19.8 ms   · polars 65.4 ms
500k rows:  pandas 24.6 ms  · numpy 94.8 ms   · polars 294.8 ms
1M rows:    pandas 44.7 ms  · numpy 183.4 ms  · polars 587.9 ms

How to read these results

The current benchmark should be read as:

  • the partition engine itself is fast on large datasets,

  • pandas is currently the fastest path end to end,

  • numpy and polars are compatible public inputs, but not yet native optimized execution backends,

  • their extra cost mostly comes from boundary normalization into pandas before fold generation.

More explicitly:

  • pandas measures the direct execution path,

  • numpy measures conversion to pandas plus partition generation,

  • polars measures conversion to pandas plus partition generation.
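The boundary-normalization step those last two bullets refer to can be pictured as follows. This is a sketch under assumptions, not the library's internal code: it builds a hypothetical minute-frequency numpy input like the benchmark dataset and pays the copy into pandas that numpy (and, via an analogous hop, polars) callers incur before any folds are generated.

```python
import numpy as np
import pandas as pd

# Hypothetical numpy input: one value per minute, as in the benchmark dataset.
index = pd.date_range("2024-01-01", periods=10_000, freq="min")
values = np.random.default_rng(0).random((10_000, 1))

# Boundary normalization into pandas; pandas inputs skip this copy entirely.
df = pd.DataFrame(values, index=index, columns=["value"])
```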

So, if raw split speed matters most today, the best-performing input remains pandas.DataFrame. The current numbers are useful for understanding actual user-facing cost today, but they should not be interpreted as proof that pandas is intrinsically faster than a hypothetical native Polars or NumPy execution path that Jano does not implement yet.
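The gap between the pandas path and the other inputs can be made concrete as slowdown ratios. These are derived here from the mean values in the results table, not separately measured:

```python
# Mean runtimes (ms) from the results table above.
mean_ms = {
    "pandas": {10_000: 7.581,  100_000: 10.544, 500_000: 24.592,  1_000_000: 44.719},
    "numpy":  {10_000: 4.536,  100_000: 19.789, 500_000: 94.801,  1_000_000: 183.353},
    "polars": {10_000: 10.767, 100_000: 65.366, 500_000: 294.843, 1_000_000: 587.886},
}

# Slowdown versus the pandas path at each dataset size.
slowdown = {
    backend: {n: round(ms / mean_ms["pandas"][n], 1) for n, ms in sizes.items()}
    for backend, sizes in mean_ms.items() if backend != "pandas"
}
# At 1M rows: numpy ~4.1x and polars ~13.1x slower than the pandas path;
# at 10k rows numpy is actually faster (~0.6x), so the overhead is size-dependent.
```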