MCP server

Jano ships an optional local MCP server so AI agents can use the library through a small, explicit tool surface.

This is useful when you want an agent to:

  • inspect a local dataset,

  • infer candidate time and target columns,

  • suggest and validate a partition policy,

  • precompute a walk-forward plan,

  • run a temporal simulation,

  • execute a simple baseline model,

  • and run baseline temporal studies without writing Python code manually.

The initial MCP surface is intentionally narrow. It focuses on the most stable, agent-friendly workflow:

  • preview and inspect a dataset,

  • suggest a conservative partition policy,

  • validate a policy with plan() before running models,

  • compare partition strategies,

  • plan a walk-forward simulation,

  • run a walk-forward simulation,

  • run a baseline model over the same folds,

  • compare retraining policies,

  • evaluate train-history windows,

  • monitor fixed-train performance decay.

Why MCP instead of only the Python library?

Installing a Python library is not enough to guarantee that an AI agent will use it correctly.

The MCP layer gives the agent:

  • a small set of explicit tools,

  • structured inputs and outputs,

  • and a recommended workflow that mirrors the high-level public surface of Jano.

Installation

The MCP server depends on the official Python MCP SDK and is intended for Python 3.10+ environments.

Install it with:

python -m pip install "jano[mcp]"

Running the local server

Run the MCP server over stdio:

jano-mcp

Or directly via the module:

python -m jano.mcp_server
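To confirm that the server starts and to see which tools it declares, a client can spawn it over stdio and list its tools. The sketch below uses the client classes from the official Python MCP SDK, which the server already depends on. The helper name list_jano_tools is hypothetical, and the sketch assumes the jano-mcp command is on PATH.

```python
import asyncio


async def list_jano_tools(command: str = "jano-mcp") -> list[str]:
    """Start the server as a subprocess over stdio and return its tool names."""
    # Imported lazily so this module still imports when the optional
    # MCP SDK is not installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=command, args=[])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # The MCP handshake must complete before any other request.
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


# Example usage (requires the server to be installed):
# print(asyncio.run(list_jano_tools()))
```

If the call returns the tool names listed in the next section, the installation and the stdio transport are both working.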

Available MCP tools

preview_local_dataset

Read a local CSV, Parquet file or ZIP-wrapped CSV and return a compact preview.
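As a rough illustration of what a compact preview involves, the sketch below reads the first few rows from a plain CSV or from the first CSV inside a ZIP archive, using only the standard library. It is not Jano's implementation: the function name preview_rows, the UTF-8 assumption and the row limit are all hypothetical, and Parquet is left out because it needs a third-party reader.

```python
import csv
import io
import zipfile
from itertools import islice
from pathlib import Path


def preview_rows(path, n=5):
    """Return the first n rows of a CSV (optionally ZIP-wrapped) as dicts."""
    path = Path(path)
    if path.suffix.lower() == ".zip":
        with zipfile.ZipFile(path) as archive:
            # Assumption: the archive wraps a single CSV; take the first one found.
            member = next(
                (name for name in archive.namelist() if name.lower().endswith(".csv")),
                None,
            )
            if member is None:
                raise ValueError(f"no CSV file found in {path}")
            with archive.open(member) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                return list(islice(csv.DictReader(text), n))
    with path.open(newline="", encoding="utf-8") as handle:
        return list(islice(csv.DictReader(handle), n))
```

Reading only the first n rows keeps the response small, which matters when the result is returned to an agent as JSON.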

inspect_local_dataset

Inspect schema, dtypes, nulls, examples and candidate time_col / target_col values. This is the safest first tool when an agent receives a new local file.

suggest_temporal_partition_policy

Suggest a conservative starting policy based on the shape of the dataset. Depending on the requested objective, it returns either a temporal walk-forward policy or an event-based online policy.

validate_temporal_partition_policy

Run plan() and return diagnostics before any model is trained. The response flags empty folds, partial folds, small train/test folds and suspicious train/test boundary ordering.
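The kinds of checks described above can be pictured with a small per-fold function. This is only a sketch: the warning labels, the min_rows threshold and the function name diagnose_fold are hypothetical and do not reflect the actual response format of the tool.

```python
def diagnose_fold(train_rows, test_rows, train_end, test_start, min_rows=30):
    """Return a list of warning labels for one fold of a plan."""
    warnings = []
    if train_rows == 0 or test_rows == 0:
        warnings.append("empty_fold")
    elif train_rows < min_rows or test_rows < min_rows:
        warnings.append("small_fold")
    # A test window that starts before training ends would leak future data.
    if test_start < train_end:
        warnings.append("test_starts_before_train_ends")
    return warnings
```

Running checks like these at the plan level is cheap, because no rows need to be materialized and no model is fitted.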

compare_temporal_partition_strategies

Compare multiple candidate partition configurations at the plan level. This is useful when an agent wants to choose between daily, weekly, rolling or expanding layouts before running model code.

plan_walk_forward_simulation

Build a walk-forward plan() and return iteration boundaries, row counts and selected partition-engine metadata.

run_walk_forward_simulation

Materialize a walk-forward simulation and return a compact summary, selected partition-engine metadata and rendered HTML.

run_walk_forward_baseline_model

Execute a built-in baseline model over the walk-forward folds and return runner data: aggregate summary, fold preview, metric trajectory, retraining events and an optional bounded prediction preview. Use model="mean" for numeric regression targets and model="majority_class" for classification targets.
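Both baseline models predict a single constant learned from the training fold. A minimal sketch of that idea, assuming "mean" predicts the average training target and "majority_class" predicts the most frequent training label (the function name fit_baseline is hypothetical):

```python
from collections import Counter
from statistics import fmean


def fit_baseline(model, train_targets):
    """Return the constant a baseline would predict for every test row."""
    if model == "mean":
        return fmean(train_targets)
    if model == "majority_class":
        # most_common(1) returns [(label, count)] for the top label.
        return Counter(train_targets).most_common(1)[0][0]
    raise ValueError(f"unknown baseline model: {model!r}")
```

Because the prediction ignores features entirely, any change in the metric across folds reflects drift in the target itself, which is what makes these models useful as a reference point.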

compare_retrain_policy_baselines

Run the same baseline model over the same fold geometry while changing the retraining policy. The response includes one comparison row per policy plus per-policy fold and metric previews.
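One way to picture the difference between policies is as a per-fold flag that says whether the model is refit. The sketch below assumes that "never" still fits once on the first fold and that "periodic" refits on every retrain_interval-th fold starting from the first; the actual scheduling in Jano may differ, and retrain_flags is a hypothetical name.

```python
def retrain_flags(n_folds, retrain, retrain_interval=1):
    """Return one boolean per fold: True when the model is (re)fit on that fold."""
    if retrain == "always":
        return [True] * n_folds
    if retrain == "never":
        # Fit once on the first fold, then reuse that model for all later folds.
        return [index == 0 for index in range(n_folds)]
    if retrain == "periodic":
        return [index % retrain_interval == 0 for index in range(n_folds)]
    raise ValueError(f"unknown retrain policy: {retrain!r}")
```

Since the fold geometry is identical across policies, differences in the comparison rows come only from how stale the model is when each test window is scored.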

find_train_history_window_baseline

Evaluate multiple training-history windows against one fixed test window and return the smallest train window that stays within the requested tolerance of the best score.
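The selection rule can be sketched in a few lines. This version assumes an error metric where lower is better (such as MAE) and treats the tolerance as relative to the best score; both are assumptions about the tool, and the scores in the usage note are made up for illustration.

```python
def smallest_window_within_tolerance(scores, tolerance):
    """Pick the shortest train window whose error is close enough to the best.

    scores: list of (train_days, error) pairs, lower error is better.
    tolerance: allowed relative gap to the best error, e.g. 0.02 for 2%.
    """
    best = min(error for _, error in scores)
    limit = best * (1 + tolerance)
    eligible = [days for days, error in scores if error <= limit]
    return min(eligible)
```

With hypothetical scores of 12.3 for 7 days, 12.1 for 14 days and 12.0 for 30 days, a tolerance of 0.02 selects the 14-day window: it is within 2% of the best error while using less than half the history.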

monitor_decay_baseline

Keep a training window fixed, move the test window forward and return the first window where the chosen metric crosses the configured degradation threshold.
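A sketch of the decay check, assuming an error metric where higher is worse and assuming the first test window serves as the reference score. Whether Jano uses the first window or another reference is not stated here, so treat that choice, and the function name first_decayed_window, as hypothetical.

```python
def first_decayed_window(errors, threshold, relative=True):
    """Return the index of the first test window whose error exceeds the limit.

    errors: one error value per test window, in time order.
    threshold: allowed degradation, relative (0.10 = 10%) or absolute.
    Returns None when no window crosses the limit.
    """
    baseline = errors[0]
    limit = baseline * (1 + threshold) if relative else baseline + threshold
    for index, error in enumerate(errors):
        if error > limit:
            return index
    return None
```

The returned index gives a rough estimate of how long a model trained once stays usable before a retrain is needed.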

The planning and execution tools accept engine with the same values as the Python API: "auto", "pandas", "polars" or "numpy".

Agent-first workflow

For a new dataset, prefer this MCP sequence:

  1. inspect_local_dataset to identify candidate time and target columns.

  2. suggest_temporal_partition_policy to get a conservative starting policy.

  3. validate_temporal_partition_policy to inspect fold geometry and warnings.

  4. compare_temporal_partition_strategies if multiple policies are plausible.

  5. plan_walk_forward_simulation or run_walk_forward_simulation only after the partition geometry is acceptable.

Example input for validate_temporal_partition_policy:

{
  "dataset_path": "data/flights.csv",
  "partition": {
    "layout": "train_test",
    "train_size": "30D",
    "test_size": "7D"
  },
  "step": "7D",
  "time_col": "scheduled_departure_at",
  "max_folds": 10
}
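To see what geometry a request like this implies, the sketch below enumerates rolling folds for a 30-day train window, a 7-day test window and a 7-day step, capped at 10 folds. It is an illustration rather than Jano's planner: it assumes half-open windows that start at the first timestamp and drops any fold whose test window would run past the end of the data. The date range is hypothetical.

```python
from datetime import datetime, timedelta


def rolling_folds(start, end, train_days, test_days, step_days, max_folds):
    """Return (train_start, train_end, test_end) tuples for complete folds."""
    folds = []
    train_start = start
    while len(folds) < max_folds:
        train_end = train_start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=test_days)
        if test_end > end:
            break  # the test window would be partial, so stop here
        folds.append((train_start, train_end, test_end))
        train_start += timedelta(days=step_days)
    return folds


# Hypothetical dataset covering 2024-01-01 up to 2024-03-01.
folds = rolling_folds(
    start=datetime(2024, 1, 1),
    end=datetime(2024, 3, 1),
    train_days=30,
    test_days=7,
    step_days=7,
    max_folds=10,
)
for train_start, train_end, test_end in folds:
    print(train_start.date(), train_end.date(), test_end.date())
```

With two months of data this yields only 4 complete folds even though max_folds is 10, which is the kind of mismatch the validation step is meant to surface before any model runs.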

Baseline runner example

{
  "dataset_path": "data/bts/bts_ontime_2024_01.zip",
  "partition": {
    "layout": "train_test",
    "train_size": "7D",
    "test_size": "1D"
  },
  "step": "1D",
  "time_col": "FL_DATE",
  "target_col": "arrival_state",
  "model": "majority_class",
  "retrain": "periodic",
  "retrain_interval": 2,
  "max_folds": 5
}

This tool is intentionally a baseline runner, not a general executor for arbitrary models. MCP JSON cannot transport Python callables, so production runs that need a custom model or metric should use the Python WalkForwardRunner directly. That keeps model construction, feature engineering and custom metrics in user code.
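On the wire, an MCP client sends a request like the baseline example as a JSON-RPC 2.0 tools/call message, with the tool name and an arguments object. The sketch below builds that envelope with the standard library. It assumes the JSON example is passed unchanged as the arguments object; the helper name tool_call_request is hypothetical.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "run_walk_forward_baseline_model",
    {
        "dataset_path": "data/bts/bts_ontime_2024_01.zip",
        "partition": {"layout": "train_test", "train_size": "7D", "test_size": "1D"},
        "step": "1D",
        "time_col": "FL_DATE",
        "target_col": "arrival_state",
        "model": "majority_class",
        "retrain": "periodic",
        "retrain_interval": 2,
        "max_folds": 5,
    },
)
wire_message = json.dumps(request)
```

Every value in the arguments object is plain JSON, which is why a model or metric defined as a Python callable has no place in it.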

Temporal study examples

Compare retraining policies over the same fold geometry with compare_retrain_policy_baselines:

{
  "dataset_path": "data/flights.csv",
  "partition": {
    "layout": "train_test",
    "train_size": "14D",
    "test_size": "1D"
  },
  "step": "1D",
  "time_col": "scheduled_departure_at",
  "target_col": "arrival_delay",
  "model": "mean",
  "policies": [
    {"name": "always", "retrain": "always"},
    {"name": "never", "retrain": "never"},
    {"name": "weekly", "retrain": "periodic", "retrain_interval": 7}
  ]
}

Find a compact train-history window against a fixed test horizon with find_train_history_window_baseline:

{
  "dataset_path": "data/flights.csv",
  "time_col": "scheduled_departure_at",
  "cutoff": "2024-02-01",
  "train_sizes": ["7D", "14D", "30D"],
  "test_size": "3D",
  "target_col": "arrival_delay",
  "metric": "mae",
  "tolerance": 0.02
}

Monitor decay with a fixed train window using monitor_decay_baseline:

{
  "dataset_path": "data/flights.csv",
  "time_col": "scheduled_departure_at",
  "cutoff": "2024-02-01",
  "train_size": "30D",
  "test_size": "1D",
  "step": "1D",
  "target_col": "arrival_delay",
  "metric": "mae",
  "threshold": 0.10,
  "relative": true
}

Example MCP client configuration

Many MCP clients accept a configuration entry like this:

{
  "mcpServers": {
    "jano": {
      "command": "jano-mcp"
    }
  }
}

If you prefer an explicit Python command:

{
  "mcpServers": {
    "jano": {
      "command": "python",
      "args": ["-m", "jano.mcp_server"]
    }
  }
}
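MCP clients often launch servers with a different PATH than your interactive shell, so a bare "python" may resolve to an interpreter where Jano is not installed. One way around this, sketched below, is to generate the entry from inside the environment that has Jano, so that the command is pinned to that interpreter's absolute path. Where the resulting JSON goes depends on the client.

```python
import json
import sys

# sys.executable is the absolute path of the interpreter running this script,
# so run it from the environment where jano[mcp] is installed.
config = {
    "mcpServers": {
        "jano": {
            "command": sys.executable,
            "args": ["-m", "jano.mcp_server"],
        }
    }
}

print(json.dumps(config, indent=2))
```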

AI coding assistants

The MCP server is intended for MCP-aware coding assistants such as Claude Code, Claude Desktop, Cursor, Codex runtimes with MCP support and other local agent environments.

Jano can always be used directly as a Python library. The MCP server is useful when you want the assistant to see a small set of declared tools instead of inferring imports and composing Python code from scratch.

Any MCP-aware client can use the same local server configuration shown under "Example MCP client configuration".

Privacy model

The server runs locally. It reads local files through the process started by your MCP client. Jano does not upload datasets anywhere by itself.

Access to files is still governed by the client environment and the paths you provide to the tools, so prefer project-local paths and avoid giving agents broad access to unrelated folders.
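If you wrap the tools in your own client code, a simple guard is to reject dataset paths that resolve outside the project directory before passing them on. The sketch below is one way to do that with pathlib; it is not part of Jano, and the function name is_inside_project is hypothetical.

```python
from pathlib import Path


def is_inside_project(dataset_path, project_root):
    """Return True when dataset_path resolves to a location under project_root."""
    root = Path(project_root).resolve()
    # Joining with an absolute dataset_path discards root, and resolve()
    # collapses ".." segments, so both escape routes are caught below.
    target = (root / dataset_path).resolve()
    return target.is_relative_to(root)
```

For example, "data/flights.csv" passes this check for a given project root, while "../secrets.csv" or an absolute path elsewhere on disk does not.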

Current scope

The first MCP release does not try to expose every Jano primitive.

It deliberately starts with:

  • dataset preview,

  • planning,

  • walk-forward simulation,

  • baseline-model execution,

  • baseline temporal studies.

Lower-level composition and model-specific temporal hypothesis policies remain available in the Python library itself.