Random Splits vs Temporal Validation
sklearn.model_selection.train_test_split is useful when observations can be
treated as approximately independent and identically distributed. That is not the
setting Jano is designed for.
For time-correlated data, the question is usually operational:
How would the model have behaved if it had only seen the past and then had to predict the future?
A random split can hide that question because it mixes dates across train and test.
The first snippet uses scikit-learn only to illustrate the common baseline, so it assumes scikit-learn is installed. Jano itself does not require scikit-learn.
The scikit-learn Way
Imagine a daily dataset where the target distribution changes near the end of the period:
import pandas as pd
from sklearn.model_selection import train_test_split

frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-01-01", periods=120, freq="D"),
        "feature": range(120),
        "target": [0] * 80 + [1] * 40,
    }
)

train_random, test_random = train_test_split(
    frame,
    test_size=0.2,
    shuffle=True,
    random_state=7,
)

temporal_leakage = (
    train_random["timestamp"].max() > test_random["timestamp"].min()
)
print(temporal_leakage)
# True
The problem is not that scikit-learn is wrong. train_test_split is doing what
it is designed to do: random sampling. The problem is that random sampling is the
wrong abstraction for production-like temporal validation.
In this setup, train can contain observations from dates that are later than some test observations. If the target changes over time, the evaluation can become too optimistic because the model has already seen part of the future regime.
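The overlap described above can be counted directly. The sketch below is pandas-only and makes two assumptions that are not part of the original example: `DataFrame.sample` stands in for `train_test_split`, and `rows_after_first_test_date` is a hypothetical helper written for this illustration. It compares the random split with a plain chronological 80/20 holdout.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-01-01", periods=120, freq="D"),
        "feature": range(120),
        "target": [0] * 80 + [1] * 40,
    }
)


def rows_after_first_test_date(train, test):
    """Hypothetical helper: count training rows dated after the earliest test row."""
    return int((train["timestamp"] > test["timestamp"].min()).sum())


# Random 80/20 split (pandas-only stand-in for train_test_split, an assumption).
test_random = frame.sample(frac=0.2, random_state=7)
train_random = frame.drop(test_random.index)

# Chronological 80/20 holdout: the first 96 days train, the last 24 days test.
cutoff = len(frame) * 4 // 5
train_ordered = frame.iloc[:cutoff]
test_ordered = frame.iloc[cutoff:]

print("random split:", rows_after_first_test_date(train_random, test_random))
print("chronological:", rows_after_first_test_date(train_ordered, test_ordered))
print("train target rate:", train_ordered["target"].mean())
print("test target rate:", test_ordered["target"].mean())
```

In the chronological holdout no training row is dated after the first test row, and the test window falls entirely inside the new regime (target rate 1.0) while only 16 of the 96 training rows do. A random split spreads both regimes across train and test, which is why it can look more optimistic than it should.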
The Jano Version
With Jano, the split is not defined as a random share of rows. It is defined as a temporal policy:
import pandas as pd
from jano import TemporalPartitionSpec, WalkForwardPolicy

frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-01-01", periods=120, freq="D"),
        "feature": range(120),
        "target": [0] * 80 + [1] * 40,
    }
)

policy = WalkForwardPolicy(
    time_col="timestamp",
    partition=TemporalPartitionSpec(
        layout="train_test",
        train_size="60D",
        test_size="14D",
        gap_before_test="1D",
    ),
    step="14D",
    strategy="rolling",
)

plan = policy.plan(frame, title="Production-like temporal validation")

print(
    plan.to_frame()[
        [
            "iteration",
            "train_start",
            "train_end",
            "train_rows",
            "test_start",
            "test_end",
            "test_rows",
        ]
    ].head()
)
The plan makes the temporal contract explicit before any model is trained:
iteration  train_start  train_end   train_rows  test_start  test_end    test_rows
0          2025-01-01   2025-03-02  60          2025-03-03  2025-03-17  14
1          2025-01-15   2025-03-16  60          2025-03-17  2025-03-31  14
2          2025-01-29   2025-03-30  60          2025-03-31  2025-04-14  14
3          2025-02-12   2025-04-13  60          2025-04-14  2025-04-28  14
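To make the window arithmetic concrete, the boundaries in the table can be reproduced with plain pandas. This is a minimal sketch, not Jano's implementation: `rolling_windows` is a hypothetical helper, and it assumes that window ends are exclusive and that a fold is dropped once its test window would run past the last observed timestamp. Both assumptions are inferred from the dates and row counts shown above.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-01-01", periods=120, freq="D"),
        "feature": range(120),
        "target": [0] * 80 + [1] * 40,
    }
)


def rolling_windows(frame, time_col, train_size, test_size, gap, step):
    """Hypothetical helper: fixed-length train windows that slide forward by `step`."""
    train_size, test_size, gap, step = (
        pd.Timedelta(value) for value in (train_size, test_size, gap, step)
    )
    timestamps = frame[time_col]
    last_timestamp = timestamps.max()
    train_start = timestamps.min()
    rows = []
    while True:
        train_end = train_start + train_size  # exclusive end (assumption)
        test_start = train_end + gap          # gap leaves room for label latency
        test_end = test_start + test_size     # exclusive end (assumption)
        if test_end > last_timestamp:
            break  # the test window would run past the available data
        in_train = (timestamps >= train_start) & (timestamps < train_end)
        in_test = (timestamps >= test_start) & (timestamps < test_end)
        rows.append(
            {
                "iteration": len(rows),
                "train_start": train_start,
                "train_end": train_end,
                "train_rows": int(in_train.sum()),
                "test_start": test_start,
                "test_end": test_end,
                "test_rows": int(in_test.sum()),
            }
        )
        train_start = train_start + step
    return pd.DataFrame(rows)


windows = rolling_windows(frame, "timestamp", "60D", "14D", "1D", "14D")
print(windows)
```

Under these assumptions the sketch yields the same four folds as the table. A fifth fold would need test data through 2025-05-12, which the 120-day frame does not contain.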
What Changes
The difference is the evaluation contract:
train_test_split answers: can this model generalize to a random sample from the same mixed period?
Jano answers: how would this model behave as time advances under a specific training and evaluation policy?
That gives you:
ordered train and test windows,
explicit train/test duration,
explicit gaps that account for label or data availability latency,
repeated folds instead of one static estimate,
a plan() object that can be inspected, filtered and audited before slicing the dataset.
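As a rough illustration of why repeated folds matter, the sketch below computes the target rate inside each train and test window of the example plan. It uses plain pandas rather than Jano: the boundaries are copied from the plan output shown earlier, window ends are treated as exclusive (an assumption inferred from the row counts), and `target_rate` is a hypothetical helper.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "timestamp": pd.date_range("2025-01-01", periods=120, freq="D"),
        "feature": range(120),
        "target": [0] * 80 + [1] * 40,
    }
)

# (train_start, train_end, test_start, test_end), copied from the plan output.
folds = [
    ("2025-01-01", "2025-03-02", "2025-03-03", "2025-03-17"),
    ("2025-01-15", "2025-03-16", "2025-03-17", "2025-03-31"),
    ("2025-01-29", "2025-03-30", "2025-03-31", "2025-04-14"),
    ("2025-02-12", "2025-04-13", "2025-04-14", "2025-04-28"),
]


def target_rate(frame, start, end):
    """Hypothetical helper: mean target over the half-open window [start, end)."""
    timestamps = frame["timestamp"]
    mask = (timestamps >= pd.Timestamp(start)) & (timestamps < pd.Timestamp(end))
    return float(frame.loc[mask, "target"].mean())


summary = pd.DataFrame(
    [
        {
            "iteration": i,
            "train_target_rate": target_rate(frame, train_start, train_end),
            "test_target_rate": target_rate(frame, test_start, test_end),
        }
        for i, (train_start, train_end, test_start, test_end) in enumerate(folds)
    ]
)
print(summary)
```

The target flips on 2025-03-22, so the first fold sees none of the new regime, the second fold's test window straddles the change while its training window has not seen it yet, and the last two folds are tested entirely inside the new regime with only a minority of training rows from it. A single static split would collapse that progression into one number.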
This is the point where Jano enters: not as a replacement for scikit-learn, but as the temporal validation layer that sits before model training.