Benchmark¶
This page summarizes a local benchmark of temporal partition generation across the currently supported tabular backends.
What was measured¶
The benchmark measures the total time needed to materialize all folds from:
```python
list(splitter.iter_splits(data))
```
Configuration used¶
The same splitter configuration was used for every backend:
- strategy: `rolling`
- layout: `train_test`
- parameters: `train_size="3D"`, `test_size="12h"`, `gap_before_test="30min"`, `step="6h"`
- dataset frequency: one row per minute
- metric: wall-clock runtime over the full split iteration
- repetitions: 3 per backend and dataset size
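The measurement loop can be sketched as a small helper. Note that only `iter_splits` comes from the public API shown above; the `benchmark` function name and its return shape are ours, for illustration:

```python
import statistics
import time


def benchmark(splitter, data, repetitions=3):
    """Time full materialization of all folds, as in the benchmark above.

    Illustrative harness; not part of the library itself.
    """
    timings_ms = []
    for _ in range(repetitions):
        start = time.perf_counter()
        folds = list(splitter.iter_splits(data))  # materialize every fold
        timings_ms.append((time.perf_counter() - start) * 1000)
    return {
        "folds": len(folds),
        "mean_ms": statistics.mean(timings_ms),
        "min_ms": min(timings_ms),
        "max_ms": max(timings_ms),
    }
```

Using `time.perf_counter` rather than `time.time` gives a monotonic, high-resolution clock, which matters at the millisecond scale reported below.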
The benchmark was run locally on the current implementation, where pandas is still the internal execution engine. That means the numpy and polars timings include the cost of normalizing those inputs to pandas before splitting.
So this benchmark should be read as an end-to-end benchmark of the public API as it behaves today, not as a native backend-versus-backend comparison.
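The boundary normalization mentioned above can be sketched as follows. This is an assumption about the general shape of the conversion, not Jano's actual internal helper; `normalize_to_pandas` is our name, and the last branch relies on the `to_pandas()` method that Polars DataFrames provide:

```python
import numpy as np
import pandas as pd


def normalize_to_pandas(data):
    """Illustrative sketch of boundary normalization: non-pandas inputs
    are converted to a pandas DataFrame before splitting begins.
    Not Jano's real internal code."""
    if isinstance(data, pd.DataFrame):
        return data  # direct path: no conversion cost
    if isinstance(data, np.ndarray):
        return pd.DataFrame(data)  # numpy inputs pay this copy
    # polars (and similar frame-likes) expose a to_pandas() method
    return data.to_pandas()
```

Under this model, the pandas rows in the table below measure splitting alone, while the numpy and polars rows measure conversion plus splitting.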
Results¶
| Backend | Rows | Folds | Mean ms | Min ms | Max ms |
|---|---|---|---|---|---|
| pandas | 10,000 | 14 | 7.581 | 5.478 | 11.634 |
| numpy | 10,000 | 14 | 4.536 | 4.208 | 5.181 |
| polars | 10,000 | 14 | 10.767 | 10.657 | 10.825 |
| pandas | 100,000 | 264 | 10.544 | 8.264 | 14.778 |
| numpy | 100,000 | 264 | 19.789 | 18.930 | 20.940 |
| polars | 100,000 | 264 | 65.366 | 62.823 | 69.806 |
| pandas | 500,000 | 1,375 | 24.592 | 22.083 | 29.403 |
| numpy | 500,000 | 1,375 | 94.801 | 91.859 | 97.771 |
| polars | 500,000 | 1,375 | 294.843 | 289.612 | 299.288 |
| pandas | 1,000,000 | 2,764 | 44.719 | 38.910 | 47.781 |
| numpy | 1,000,000 | 2,764 | 183.353 | 177.574 | 190.390 |
| polars | 1,000,000 | 2,764 | 587.886 | 583.358 | 592.276 |
Visual summary¶
Each bar in the accompanying chart shows the mean runtime for a given backend and dataset size. The scale is relative within each row so the shape remains readable.
How to read these results¶
The current benchmark should be read as follows:

- the partition engine itself is fast on large datasets;
- pandas is the fastest end-to-end path at every size except the smallest (10,000 rows, where numpy's lower fixed overhead wins);
- numpy and polars are supported public inputs, but not yet natively optimized execution backends;
- their extra cost comes mostly from boundary normalization into pandas before fold generation.
More explicitly:
- `pandas` measures the direct execution path;
- `numpy` measures conversion to pandas plus partition generation;
- `polars` measures conversion to pandas plus partition generation.
So if raw split speed matters most today, the best-performing input at scale remains `pandas.DataFrame`. The current numbers are useful for understanding the actual user-facing cost today, but they should not be read as proof that pandas is intrinsically faster than a hypothetical native Polars or NumPy execution path, which Jano does not implement yet.
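To estimate how much of the numpy gap is pure boundary cost, the conversion can be timed in isolation. This helper is hypothetical (not part of Jano) and only approximates the normalization step with a plain `pd.DataFrame(...)` construction:

```python
import time

import numpy as np
import pandas as pd


def conversion_overhead_ms(array: np.ndarray, repetitions: int = 3) -> float:
    """Best-of-N estimate of the numpy -> pandas boundary cost alone,
    independent of fold generation. Illustrative, not Jano internals."""
    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        pd.DataFrame(array)  # the copy/wrap that numpy inputs pay up front
        best = min(best, (time.perf_counter() - start) * 1000)
    return best
```

Subtracting this figure from a numpy row in the table above gives a rough lower bound on what a native NumPy execution path could cost.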