Splitstream docs

Splitstream is the experimentation half of a feature-rollout pair. Pennant decides whether a feature ships. Splitstream decides which variant a user sees and whether the variant moved the metric. The two ship a joint quickstart via the growth wrapper; the rest of this section explains the load-bearing pieces of Splitstream itself.

The defaults are calibrated, not asserted.

Start with /docs/concepts/bayesian-inference. The plan's draft thresholds (0.95 / 1,000 / 15 min) produced a 66% empirical false-positive rate in our null-effect simulation. The calibrated tuple is 0.995 / 20,000 / 240 min, and every new experiment inherits it as the default. Lowering any threshold without re-running the simulation re-inflates the false-positive rate.
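To make the peeking effect concrete, here is a minimal null-effect (A/A) simulation sketch in Python. It is not the simulation behind the numbers above and will not reproduce the 66% figure. It assumes the first element of each tuple is a posterior-probability threshold, models only that threshold (no sample-size or runtime floor), and uses a normal approximation to the Beta posteriors. The function names, peek count, batch size, and base rate are all hypothetical.

```python
import math
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b):
    """Approximate P(rate_B > rate_A) under Beta(1, 1) priors.

    Uses a normal approximation to the difference of the two Beta
    posteriors, which is adequate for a sketch at these sample sizes.
    """
    def posterior_moments(conversions, n):
        a, b = 1 + conversions, 1 + n - conversions
        mean = a / (a + b)
        var = a * b / ((a + b) ** 2 * (a + b + 1))
        return mean, var

    mean_a, var_a = posterior_moments(conv_a, n_a)
    mean_b, var_b = posterior_moments(conv_b, n_b)
    z = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def false_positive_rate(threshold, peeks=20, users_per_peek=100,
                        rate=0.10, runs=300, seed=7):
    """Fraction of A/A experiments that ever cross the threshold at a peek."""
    false_positives = 0
    for run in range(runs):
        # One generator per run, so every threshold sees identical data.
        rng = random.Random(seed * 100_003 + run)
        conv_a = conv_b = n = 0
        for _ in range(peeks):
            conv_a += sum(rng.random() < rate for _ in range(users_per_peek))
            conv_b += sum(rng.random() < rate for _ in range(users_per_peek))
            n += users_per_peek
            p = prob_b_beats_a(conv_a, n, conv_b, n)
            # Both arms share the same true rate, so any declared winner is false.
            if p > threshold or p < 1.0 - threshold:
                false_positives += 1
                break
    return false_positives / runs


if __name__ == "__main__":
    for threshold in (0.95, 0.995):
        print(threshold, false_positive_rate(threshold))
```

Because each run replays the same data for every threshold, the stricter threshold can never stop more often than the looser one in this sketch. The size of the gap depends on how many peeks are allowed, which is why a threshold should be re-simulated rather than adjusted by feel.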

What the docs cover

Bayesian inference — the calibration table and why your prior intuitions about peeking are wrong.
Sticky assignment — sticky-forever, mid-experiment weight-change semantics, and the force-rebucket tombstone story.
Bucketing — the hash function and the cross-SDK corpus that guards against drift.
Peeking discipline — why every dashboard open is logged and how the audit log reads.
Sample Ratio Mismatch — the chi-squared monitor that catches bucket-pipeline bugs.
Mutex groups — mutually exclusive experiments, forward-only semantics, and the SELECT FOR UPDATE race protection.
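As a rough illustration of how the Bucketing and Sample Ratio Mismatch topics fit together, the sketch below assigns users to variants with a deterministic hash and then runs a chi-squared check on the resulting counts. Everything here is an assumption for illustration: Splitstream's actual hash function and monitor are described on their own pages, the SHA-256 scheme and the `bucket` and `srm_p_value` names are hypothetical, and the p-value formula covers only the two-variant case (one degree of freedom).

```python
import hashlib
import math


def bucket(user_id: str, experiment_key: str, weights: dict[str, float]) -> str:
    """Deterministically map a user to a variant (hypothetical SHA-256 scheme)."""
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale to the unit interval.
    point = int.from_bytes(digest[:8], "big") / 2**64
    total = sum(weights.values())
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return variant
    return variant  # floating-point rounding at the upper edge falls in the last variant


def srm_p_value(count_a: int, count_b: int, expected_share_a: float = 0.5) -> float:
    """Chi-squared goodness-of-fit p-value for two variants (1 degree of freedom)."""
    total = count_a + count_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    weights = {"control": 1.0, "treatment": 1.0}
    counts = {"control": 0, "treatment": 0}
    for i in range(10_000):
        counts[bucket(f"user-{i}", "checkout-copy", weights)] += 1
    print(counts, srm_p_value(counts["control"], counts["treatment"]))
```

The same user and experiment key always produce the same variant, which is the property a cross-SDK test corpus would pin down. A very small p-value from the check suggests that observed traffic does not match the configured split, which usually points at a pipeline bug rather than a real effect. The alert cutoff is a policy choice and is not specified here.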

Where to start writing code

Five-line samples per stack live on the home page. Full SDK references and install instructions are on /sdks. The API surface is rendered inline at /reference via Scalar.