Saturn Lab

Turn scattered Jupyter notebooks
into reproducible pipelines.

Senior data scientists lose hours every week to notebook chaos: lost metrics, unknown lineage, and full pipeline re-runs for a one-line change. Saturn Lab addresses all three without changing how you work.

Problem

"23 notebooks. Nobody knows which one runs in production."

Solution

A visual swim-lane canvas that makes the pipeline obvious.

Source → Processing → Experiment. Every notebook has a defined role. Execution order is explicit. One button runs everything in the correct sequence.
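One way to derive a "correct sequence" from explicit dependencies is a topological sort. The sketch below is illustrative only and is not Saturn Lab's scheduler: the dictionary layout is hypothetical, and the dependency edges are an assumption based on the Source → Processing → Experiment flow. Node names are borrowed from the example project.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline description: each node maps to the set of nodes it
# depends on. The edges are assumed from the Source -> Processing -> Experiment flow.
PIPELINE = {
    "churn_data.csv": set(),                      # Source
    "feature_engineering": {"churn_data.csv"},    # Processing
    "xgb_classifier": {"feature_engineering"},    # Experiment
}


def execution_order(graph):
    """Return node names so that every node comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(" -> ".join(order))
```

Because the order is computed from the graph rather than from file names or habit, adding a node only requires declaring what it depends on.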

localhost · Customer Analytics · Run pipeline

Data Sources: churn_data.csv (45,231 rows · 28 cols)

Notebooks: feature_engineering.ipynb, preprocessing.ipynb, xgb_classifier.ipynb

Models: 3 models, 1 in production

churn_data.csv · 45,231 rows · 28 cols · data/churn_data.csv

eng_features.parquet · 45,231 rows · 35 cols · output/eng_features.parquet

Processing: feature_engineering (feature_engineering.ipynb). Compute rolling features, encode categoricals.

Experiment: xgb_classifier (xgb_classifier.ipynb). XGBoost with Optuna hyperparameter tuning. accuracy 0.947, auc_roc 0.981.

Problem

"Tried a different aggregation. Results got worse. The previous approach? Gone — you overwrote the notebook."

Solution

Every run saved. Switch approaches freely.

Saturn captures the params and metrics of every run automatically. Try rolling_mean, compare it side-by-side with weighted_mean from run #4. The notebook changes — the history never disappears.

localhost / experiments

Experiments — Customer Analytics

Run	Status	aggregation	accuracy ↑	auc_roc ↑	f1 ↑	Duration
#7	● success	rolling_mean	0.9471 (↑0.030)	0.9813	0.9204	14m 32s
#6	● success	simple_sum	0.9170	0.9510	0.8940	11m 05s
#5	● success	median	0.9380	0.9720	0.9110	12m 44s
#4	● success	weighted_mean	0.9440	0.9791	0.9172	13m 18s
#3	✓ cached	—	—	—	—	0s
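A side-by-side comparison like the one in the runs table can be sketched as a per-metric difference between two stored runs. This is a minimal illustration, not Saturn's storage format: the `Run` class and `compare` function are hypothetical names, and the values are copied from runs #7 and #4 in the table.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical record of one pipeline run: its params and its metrics."""
    run_id: int
    params: dict
    metrics: dict


# Values taken from runs #7 and #4 in the experiments table.
run_7 = Run(7, {"aggregation": "rolling_mean"},
            {"accuracy": 0.9471, "auc_roc": 0.9813, "f1": 0.9204})
run_4 = Run(4, {"aggregation": "weighted_mean"},
            {"accuracy": 0.9440, "auc_roc": 0.9791, "f1": 0.9172})


def compare(a, b):
    """Return metric deltas (a minus b) for the metrics both runs recorded."""
    shared = a.metrics.keys() & b.metrics.keys()
    return {name: round(a.metrics[name] - b.metrics[name], 4)
            for name in sorted(shared)}


deltas = compare(run_7, run_4)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
```

Keeping params and metrics together on each run is what makes the comparison possible after the notebook itself has changed.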

Problem

"Tweaked one hyperparameter. Re-running four hours of training."

Solution

Unchanged nodes skip. Only what changed re-runs.

Saturn hashes each node's input files and notebook code. If nothing changed since the last successful run, the node is marked cached and skipped instantly. A 4-hour pipeline becomes 12 minutes.
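The hashing idea can be sketched with the standard library. This is an assumption-laden illustration, not Saturn's implementation: the function names are hypothetical, SHA-256 is an arbitrary choice, and the notebook is assumed to be a standard `.ipynb` JSON file. Only code cells are hashed, so re-executing a notebook without editing it leaves the fingerprint unchanged.

```python
import hashlib
import json
from pathlib import Path


def notebook_code(notebook_path):
    """Return only the code cells of a .ipynb file, ignoring outputs and counters."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            sources.append("".join(source) if isinstance(source, list) else source)
    return "\n".join(sources)


def node_fingerprint(notebook_path, input_paths):
    """Hash the notebook's code plus the name and bytes of every input file."""
    digest = hashlib.sha256()
    digest.update(notebook_code(notebook_path).encode("utf-8"))
    for path in sorted(str(p) for p in input_paths):
        digest.update(path.encode("utf-8"))
        # Reads the whole file into memory; a real tool would likely stream it.
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def is_cached(fingerprint, last_successful_fingerprint):
    """A node can be skipped when its fingerprint matches the last successful run."""
    return fingerprint == last_successful_fingerprint
```

A node whose fingerprint matches the one stored for its last successful run can be skipped. Any edit to a code cell or an input file produces a new fingerprint and forces a re-run.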

localhost, run #8 in progress · Customer Analytics · Running…

Recent runs: #8 main pipeline (now), #7 main pipeline (2h ago), #6 main pipeline (5h ago)

churn_data.csv · 45,231 rows · 28 cols · data/churn_data.csv

eng_features.parquet · 45,231 rows · 35 cols · output/eng_features.parquet

Processing: feature_engineering (feature_engineering.ipynb). Inputs unchanged, skipped.

Experiment: xgb_classifier (xgb_classifier.ipynb). n_estimators=400, learning_rate=0.05. Running.

Problem

"Production model degraded. Nobody knows what data trained it."

Solution

Model → run → code → data. Always.

One call: saturn.register_model(clf, name="churn_v3"). Every version linked to its exact pipeline run, training node, and dataset snapshot. Diagnose drift in minutes, not days.
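To make the model → run → node → data chain concrete, here is a small stand-in for the kind of record such a registry could keep. Everything in it is hypothetical: `ModelVersion`, `Registry`, and their fields are illustrative names rather than the Saturn API, and the example values are taken from the registry entry shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical lineage record: which run, node, and dataset produced a model."""
    name: str
    run_id: int
    node: str
    dataset: str
    dataset_version: str


class Registry:
    """Toy in-memory registry; a real one would persist records to a database."""

    def __init__(self):
        self._versions = {}

    def register(self, version):
        self._versions[version.name] = version

    def lineage(self, name):
        """Describe the chain from a model back to the data that trained it."""
        v = self._versions[name]
        return (f"{v.name} <- run #{v.run_id} <- {v.node} "
                f"<- {v.dataset} {v.dataset_version}")


registry = Registry()
registry.register(ModelVersion("churn_v3", 7, "xgb_classifier",
                               "churn_data.csv", "v2"))
print(registry.lineage("churn_v3"))
```

With a record like this attached to every version, a degraded production model can be traced back to its training data with a single lookup.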

localhost / models

Model Registry — Customer Analytics

churn_predictor v3 (production)

accuracy 0.9471 ★ · auc_roc 0.9813 ★ · f1 0.9204 ★

Run #7 · xgb_classifier · churn_data.csv v2 · 2.4 MB

One Docker command.
Launching soon.

Self-hosted. Your data never leaves your infrastructure. Works on any server — AWS, Azure, on-prem, your laptop. PostgreSQL included.

Get in touch →

Free for personal use · Team plan $300/month flat · No per-seat pricing
