Mutex groups

A mutex group is a set of experiments where a unit assigned to one experiment cannot be assigned to another. The use case: two checkout-flow experiments running simultaneously, both plausibly affecting conversion. If a user lands in both, the two experiments contaminate each other's results.

How it works

Claims are recorded in a mutex_group_assignments table with primary key (mutex_group_id, unit_id), so a unit can hold at most one claim per group. The first /v1/assign call for a unit into any experiment in the group writes a row claiming the unit for that experiment. Subsequent calls for the same unit into other experiments in the group return a mutex_holdout reason and a null variant.
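The claim-then-holdout flow can be sketched as follows. This is a minimal illustration, not the platform's implementation: in-memory SQLite stands in for the real database, and the experiment_id column and the open_claims_db and resolve_claim helpers are hypothetical names introduced for the example.

```python
import sqlite3

def open_claims_db():
    # In-memory SQLite stands in for the platform's real database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE mutex_group_assignments (
            mutex_group_id TEXT NOT NULL,
            unit_id        TEXT NOT NULL,
            experiment_id  TEXT NOT NULL,  -- assumed column: experiment holding the claim
            PRIMARY KEY (mutex_group_id, unit_id)
        )
        """
    )
    return conn

def resolve_claim(conn, mutex_group_id, experiment_id, unit_id):
    """Return None if experiment_id holds the claim, else a holdout response."""
    with conn:  # one transaction: commit on success, roll back on error
        # The primary key lets only the first claim for (group, unit) land.
        conn.execute(
            "INSERT OR IGNORE INTO mutex_group_assignments VALUES (?, ?, ?)",
            (mutex_group_id, unit_id, experiment_id),
        )
        (owner,) = conn.execute(
            "SELECT experiment_id FROM mutex_group_assignments "
            "WHERE mutex_group_id = ? AND unit_id = ?",
            (mutex_group_id, unit_id),
        ).fetchone()
    if owner == experiment_id:
        return None  # normal bucketing proceeds (outside this sketch)
    return {"reason": "mutex_holdout", "variant": None}

conn = open_claims_db()
first = resolve_claim(conn, "checkout", "exp_a", "user-1")   # exp_a claims the unit
second = resolve_claim(conn, "checkout", "exp_b", "user-1")  # exp_b is held out
```

Here the first call returns None because exp_a wins the claim, and the second returns the mutex_holdout response with a null variant.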

The claim is written inside a transaction that takes a SELECT FOR UPDATE lock, which handles the race where two experiments' first calls for the same unit arrive simultaneously. Without the lock, both calls could conclude they are the first claim and both write rows; the unit would then be bucketed into both experiments, which is exactly what the mutex group exists to prevent.
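The race can be reproduced in miniature. The sketch below is an in-memory simulation, not the platform's transaction code: a threading.Lock plays the role of the SELECT FOR UPDATE lock, and the MutexGroupClaims class and race helper are hypothetical names. The point it illustrates is that the check and the write must happen under the same lock.

```python
import threading

class MutexGroupClaims:
    """In-memory stand-in for mutex_group_assignments (hypothetical)."""

    def __init__(self):
        self._owners = {}  # (mutex_group_id, unit_id) -> experiment_id
        self._lock = threading.Lock()  # stands in for the SELECT FOR UPDATE lock

    def claim(self, mutex_group_id, unit_id, experiment_id):
        """Return True if experiment_id holds the claim after this call."""
        key = (mutex_group_id, unit_id)
        with self._lock:
            # Check and write under one lock, so two first calls cannot
            # both see "no owner yet".
            owner = self._owners.get(key)
            if owner is None:
                self._owners[key] = experiment_id
                owner = experiment_id
        return owner == experiment_id

def race(claims, experiments, mutex_group_id="checkout", unit_id="user-1"):
    """Fire one first call per experiment at the same moment."""
    barrier = threading.Barrier(len(experiments))
    results = {}

    def worker(experiment_id):
        barrier.wait()  # release all threads together
        results[experiment_id] = claims.claim(mutex_group_id, unit_id, experiment_id)

    threads = [threading.Thread(target=worker, args=(e,)) for e in experiments]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

results = race(MutexGroupClaims(), ["exp_a", "exp_b"])
```

Which experiment wins depends on thread scheduling, but exactly one entry in results is True on every run.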

Forward-only semantics

Adding an experiment to an existing mutex group is forward-only. Units already exposed to the experiment before the membership change are not retroactively re-bucketed. The admin warns if the experiment already has exposed units when it is added to the group.

Why forward-only: retroactively re-bucketing would re-introduce the sticky-assignment violation the rest of the platform exists to prevent. The data already collected from pre-membership exposures is legitimate; the new traffic just follows the new rule.
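One way to picture forward-only behavior is an assign path that consults existing assignments before the mutex check. The sketch below assumes that ordering, which is inferred from the description rather than confirmed. The assign function, its dict arguments, and the "treatment" placeholder for real bucketing are all hypothetical.

```python
def assign(unit_id, experiment_id, existing, claims, group_of):
    """
    existing: {(experiment_id, unit_id): variant} assignments already on record
    claims:   {(mutex_group_id, unit_id): experiment_id} mutex group claims
    group_of: {experiment_id: mutex_group_id} current group membership
    """
    key = (experiment_id, unit_id)
    # Sticky assignment comes first: a unit exposed before the experiment
    # joined the group keeps its variant (forward-only).
    if key in existing:
        return existing[key]
    group = group_of.get(experiment_id)
    if group is not None:
        # First caller in the group claims the unit; later callers see the owner.
        owner = claims.setdefault((group, unit_id), experiment_id)
        if owner != experiment_id:
            existing[key] = None  # holdout: null variant, recorded so it stays sticky
            return None
    existing[key] = "treatment"  # placeholder for the real bucketing step
    return existing[key]

# u1 was exposed to exp_b before exp_b was added to the "checkout" group.
existing = {("exp_b", "u1"): "control"}
claims = {}
group_of = {"exp_a": "checkout", "exp_b": "checkout"}

u1_a = assign("u1", "exp_a", existing, claims, group_of)  # u1 claimed by exp_a
u1_b = assign("u1", "exp_b", existing, claims, group_of)  # pre-membership variant kept
u2_a = assign("u2", "exp_a", existing, claims, group_of)  # new unit claimed by exp_a
u2_b = assign("u2", "exp_b", existing, claims, group_of)  # new unit held out of exp_b
```

In this run u1 keeps its earlier exp_b variant even though exp_a now holds its claim, while u2, which arrives after the membership change, is held out of exp_b.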

What gets logged

Holdout assignments are written to the same assignments table as bucketed assignments, with assignment_reason = "mutex_holdout" and variant_id = NULL. Holdouts are sticky too: a held-out unit that calls /v1/assign for the same experiment again gets the same null variant for the experiment's lifetime.
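A holdout row might look like the following. The variant_id and assignment_reason columns come from the description above; the experiment_id and unit_id columns, the primary key, and the record_holdout and lookup helpers are assumptions made for this sketch, with in-memory SQLite standing in for the real store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assignments (
        experiment_id     TEXT NOT NULL,   -- assumed column
        unit_id           TEXT NOT NULL,   -- assumed column
        variant_id        TEXT,            -- NULL for holdouts
        assignment_reason TEXT NOT NULL,
        PRIMARY KEY (experiment_id, unit_id)
    )
    """
)

def record_holdout(conn, experiment_id, unit_id):
    """Write the holdout row once; repeat calls leave it untouched."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO assignments VALUES (?, ?, NULL, 'mutex_holdout')",
            (experiment_id, unit_id),
        )

def lookup(conn, experiment_id, unit_id):
    """Return (variant_id, assignment_reason) for an existing assignment, or None."""
    return conn.execute(
        "SELECT variant_id, assignment_reason FROM assignments "
        "WHERE experiment_id = ? AND unit_id = ?",
        (experiment_id, unit_id),
    ).fetchone()

record_holdout(conn, "exp_b", "user-1")
record_holdout(conn, "exp_b", "user-1")  # repeat call: same row, same answer
row = lookup(conn, "exp_b", "user-1")
```

Because the row persists, a repeat call reads back the same null variant and mutex_holdout reason instead of re-deciding.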

Mutex enforcement is recorded in the audit log when the group is configured; the per-call holdout decisions are not (they would flood the log). The admin shows aggregate holdout counts per experiment-pair so you can verify the group is actually enforcing.
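The aggregate view could be computed along these lines. The sketch assumes an experiment pair means (claiming experiment, held-out experiment) and that the claiming experiment can be looked up from the claim record; both are assumptions, as is the holdout_counts_by_pair helper.

```python
from collections import Counter

def holdout_counts_by_pair(holdouts, claims):
    """
    holdouts: iterable of (mutex_group_id, held_out_experiment_id, unit_id)
    claims:   {(mutex_group_id, unit_id): claiming_experiment_id}
    Returns a Counter keyed by (claiming_experiment_id, held_out_experiment_id).
    """
    counts = Counter()
    for group_id, held_out_exp, unit_id in holdouts:
        claiming_exp = claims[(group_id, unit_id)]
        counts[(claiming_exp, held_out_exp)] += 1
    return counts

claims = {
    ("checkout", "u1"): "exp_a",
    ("checkout", "u2"): "exp_a",
    ("checkout", "u3"): "exp_b",
}
holdouts = [
    ("checkout", "exp_b", "u1"),  # u1 held out of exp_b because exp_a claimed it
    ("checkout", "exp_b", "u2"),
    ("checkout", "exp_a", "u3"),  # u3 held out of exp_a because exp_b claimed it
]
pair_counts = holdout_counts_by_pair(holdouts, claims)
```

A pair that stays at zero while both experiments receive traffic would suggest the group is not enforcing.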