Splitstream docs
Splitstream is the experimentation half of a feature-rollout pair. Pennant decides whether a feature ships. Splitstream decides which variant a user sees and whether the variant moved the metric. The two ship a joint quickstart via the growth wrapper; the rest of this section explains the load-bearing pieces of Splitstream itself.
The defaults are calibrated, not asserted.
Read /docs/concepts/bayesian-inference first. The plan's draft thresholds (0.95 / 1,000 / 15 min) produced a 66% empirical false-positive rate in our null-effect simulation. The calibrated tuple is 0.995 / 20,000 / 240 min, and it is the default every new experiment inherits. Lowering thresholds without re-running the simulation re-inflates the false-positive rate.
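To see why the threshold matters, here is a minimal null-effect simulation sketch. It is not Splitstream's calibration harness. It assumes the first element of the tuple is a posterior-probability threshold, uses Beta(1, 1) priors with a normal approximation to the posterior difference, and stops at the first look where either arm crosses the threshold. The function names, the 10% base rate, and the look schedule are all illustrative assumptions.

```python
import random
from statistics import NormalDist


def prob_b_beats_a(conv_a, n_a, conv_b, n_b):
    """Approximate P(rate_B > rate_A) under Beta(1, 1) priors.

    Uses a normal approximation to the difference of the two Beta
    posteriors (an assumption made to keep the sketch stdlib-only).
    """
    def posterior_moments(conv, n):
        alpha, beta = 1 + conv, 1 + n - conv
        mean = alpha / (alpha + beta)
        var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
        return mean, var

    mean_a, var_a = posterior_moments(conv_a, n_a)
    mean_b, var_b = posterior_moments(conv_b, n_b)
    return NormalDist().cdf((mean_b - mean_a) / (var_a + var_b) ** 0.5)


def false_positive_rate(threshold, looks, users_per_look=100,
                        base_rate=0.1, runs=200, seed=0):
    """Fraction of null-effect experiments that declare a winner.

    Both arms convert at the same base_rate, so every declared winner
    is a false positive. The experiment stops at the first look where
    either arm's posterior probability of being better exceeds threshold.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(runs):
        conv_a = conv_b = n = 0
        for _ in range(looks):
            conv_a += sum(rng.random() < base_rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < base_rate for _ in range(users_per_look))
            n += users_per_look
            p = prob_b_beats_a(conv_a, n, conv_b, n)
            if max(p, 1 - p) > threshold:
                false_positives += 1
                break
    return false_positives / runs


if __name__ == "__main__":
    # Same total sample per arm (4,000), checked once vs. checked 40 times.
    print("single look :", false_positive_rate(0.95, looks=1, users_per_look=4000))
    print("40 looks    :", false_positive_rate(0.95, looks=40, users_per_look=100))
    print("40 looks, stricter threshold:",
          false_positive_rate(0.995, looks=40, users_per_look=100))
```

Running the three calls side by side shows the general pattern the calibration addresses: repeated looks at a loose threshold declare far more spurious winners than a single look, and a stricter threshold pulls that rate back down. The exact figures depend on the assumptions above and will not match the numbers quoted in this section.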
What the docs cover
- Bayesian inference — the calibration table and why your prior intuitions about peeking are wrong.
- Sticky assignment — sticky-forever, mid-experiment weight-change semantics, and the force-rebucket tombstone story.
- Bucketing — the hash function and the cross-SDK corpus that guards drift.
- Peeking discipline — why every dashboard open is logged and how the audit log reads.
- Sample Ratio Mismatch — the chi-squared monitor that catches bucket-pipeline bugs.
- Mutex groups — mutually exclusive experiments, forward-only semantics, and the SELECT FOR UPDATE race protection.
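As a concrete illustration of the Sample Ratio Mismatch item above, here is a minimal sketch of a chi-squared goodness-of-fit check for a two-arm experiment. It is not Splitstream's monitor. The function name, the 50/50 default weight, and the 0.001 alert cutoff are assumptions chosen for the example.

```python
import math


def srm_p_value(count_a, count_b, weight_a=0.5):
    """P-value of a chi-squared goodness-of-fit test with two arms.

    Compares observed assignment counts against the configured split.
    With two arms there is one degree of freedom, so the chi-squared
    survival function reduces to erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    expected_a = total * weight_a
    expected_b = total * (1 - weight_a)
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    ALERT_CUTOFF = 0.001  # hypothetical cutoff, not a documented default
    for counts in [(10_000, 10_000), (10_000, 10_100), (10_000, 10_500)]:
        p = srm_p_value(*counts)
        print(counts, f"p={p:.5f}", "SRM alert" if p < ALERT_CUTOFF else "ok")
```

A small imbalance such as 10,000 vs 10,100 is consistent with chance, while 10,000 vs 10,500 on a configured 50/50 split is very unlikely to be chance and usually points at a bug in assignment or logging rather than a treatment effect.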
Where to start writing code
Five-line samples per stack live on the home page. Full SDK references and install instructions are on /sdks. The API surface is rendered inline at /reference via Scalar.