A small bench for things that need calibrating. The first instrument is a sample size calculator — because somebody has to remind product managers that significance does not arrive on a schedule.
01 Sample Size · Two-Proportion Test
Drag the sliders. The chart shows the two distributions you are asking the world to distinguish; the numbers tell you how many observations the world will require before it answers.
Inputs
Baseline rate: 10.0%
Minimum detectable effect: +5.0% rel
α: 0.05
Power (1 − β): 0.80
Test: two-sided
Current traffic: 5,000
N per arm: 15,719
Total N: 31,438
Days to run at current traffic: 3.1
Chart legend: H₀ (null) · H₁ (alternative) · critical value z*
Two distributions, one decision rule. The shaded tails are the errors you have agreed to tolerate: α to the right of z* under H₀, and β to the left of z* under H₁. The sample size above is calibrated to deliver exactly those two error rates.
$$
n_{\text{per arm}} = \frac{\left( z_{\alpha/2}\,\sqrt{2\,\bar{p}\,(1-\bar{p})} \;+\; z_{\beta}\,\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right)^{2}}{(p_1 - p_2)^{2}},
\qquad \bar{p} = \frac{p_1 + p_2}{2}
$$

Standard formula for the unequal-variance two-proportion z-test.
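For anyone who would rather read the formula as code, here is a minimal sketch of it using only the Python standard library. The function name `n_per_arm`, its argument names, and the reading of "+5.0% rel" as a relative lift (p₂ = p₁ · (1 + lift)) are assumptions made for illustration, not the calculator's actual source, so the result need not agree with the figures displayed above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p1, rel_lift, alpha=0.05, power=0.80, two_sided=True):
    """Per-arm sample size for a two-proportion z-test.

    Assumption: rel_lift is relative to the baseline, so p2 = p1 * (1 + rel_lift).
    """
    p2 = p1 * (1 + rel_lift)
    p_bar = (p1 + p2) / 2

    # Critical value under H0: alpha is split across both tails if two-sided.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    # Quantile that delivers the requested power (1 - beta) under H1.
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


if __name__ == "__main__":
    n = n_per_arm(0.10, 0.05)
    print(f"{n:,} per arm, {2 * n:,} total")
```

Halving the lift roughly quadruples the answer, because the difference p₁ − p₂ enters squared in the denominator.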
02 Coming next
Price-elasticity sandbox
Drag a demand curve. Watch revenue and quantity respond. Toggle linear vs. constant-elasticity. The point is to feel the model assumption, not to memorize the equation.
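Until the sandbox exists, the two assumptions can be compared in a few lines. This is a rough sketch, not the planned tool: the function names and the parameter values (`a`, `b`, `k`, `elasticity`) are made up for illustration. Linear demand is q = a − b·p; constant-elasticity demand is q = k·p^ε with ε < 0.

```python
def linear_demand(price, a=100.0, b=2.0):
    """Linear demand q = a - b * price, floored at zero (illustrative parameters)."""
    return max(a - b * price, 0.0)


def ces_demand(price, k=100.0, elasticity=-1.5):
    """Constant-elasticity demand q = k * price ** elasticity (illustrative parameters)."""
    return k * price ** elasticity


def revenue(demand, price):
    """Revenue is price times the quantity demanded at that price."""
    return price * demand(price)


def point_elasticity(demand, price, h=1e-6):
    """Numerical elasticity (dq/dp) * (p/q), via a central difference."""
    dq_dp = (demand(price + h) - demand(price - h)) / (2 * h)
    return dq_dp * price / demand(price)


if __name__ == "__main__":
    for p in (10.0, 25.0, 40.0):
        print(
            p,
            round(point_elasticity(linear_demand, p), 3),
            round(point_elasticity(ces_demand, p), 3),
            revenue(linear_demand, p),
        )
```

Under the linear curve the elasticity changes as you move along it, and revenue peaks where it passes through −1. Under the constant-elasticity curve it is the same number at every price, which is the assumption the toggle is meant to make tangible.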
Market beta · rolling regression
Regress an asset's daily returns on a market index — S&P 500 by default — over a sliding window. β > 1 says it moves harder than the market; β near zero says it's its own story; β below zero says it's pulling the other way. Bitcoin's beta has traveled the full range of those answers in ten years, which is the entire point of running the regression rather than asking the asset what it is.
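The regression itself is small enough to sketch without any libraries. In a one-variable OLS fit, the slope is the covariance of asset and market returns divided by the variance of market returns, and the rolling version repeats that over each window. The function names and the toy return series below are hypothetical, chosen only to show the mechanics.

```python
from statistics import mean


def beta(asset, market):
    """OLS slope of asset returns on market returns: cov(asset, market) / var(market)."""
    mean_a, mean_m = mean(asset), mean(market)
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset, market))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var


def rolling_beta(asset, market, window):
    """Beta over each sliding window of `window` consecutive observations."""
    return [
        beta(asset[i - window:i], market[i - window:i])
        for i in range(window, len(asset) + 1)
    ]


if __name__ == "__main__":
    # Made-up daily returns, purely for illustration.
    market = [0.010, -0.020, 0.015, 0.003, -0.007, 0.012]
    asset = [0.018, -0.025, 0.020, -0.004, -0.015, 0.030]
    print([round(b, 2) for b in rolling_beta(asset, market, window=4)])
```

A short window makes beta jumpy and a long one makes it sluggish, so the window length is itself a modelling choice rather than a neutral default.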
Espresso extraction model
Grind, dose, pressure, time → predicted yield and TDS via a simplified Darcy-flow model. Coffee as a response surface, because everything is a response surface if you stare long enough.
Peeking penalty simulator
Ten thousand simulated A/B tests. Watch the false-positive rate climb every time you stop a test early. Quietly devastating.
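A rough sketch of what such a simulation might look like, assuming A/A tests (both arms share the same true rate, so every "significant" result is a false positive) and a pooled two-proportion z-test checked after each batch of traffic. The function names, the default parameters, and the choice of a pooled z-test are assumptions for illustration. The default runs a thousand tests rather than ten thousand to keep pure-Python runtime short; raise `n_sims` for a smoother estimate.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_stat(x_a, x_b, n):
    """Pooled two-proportion z statistic with n observations per arm."""
    pooled = (x_a + x_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    return 0.0 if se == 0 else (x_b / n - x_a / n) / se


def peeking_fpr(n_sims=1000, looks=10, n_per_look=100, p=0.10, alpha=0.05, seed=0):
    """Return (false-positive rate if you stop at the first significant look,
    false-positive rate if you only test once, at the final look)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    any_look = final_look = 0
    for _sim in range(n_sims):
        x_a = x_b = n = 0
        crossed = significant = False
        for _look in range(looks):
            # Both arms draw from the same rate p: there is no real effect.
            x_a += sum(rng.random() < p for _ in range(n_per_look))
            x_b += sum(rng.random() < p for _ in range(n_per_look))
            n += n_per_look
            significant = abs(z_stat(x_a, x_b, n)) > z_crit
            crossed = crossed or significant
        any_look += crossed
        final_look += significant  # result at the planned sample size only
    return any_look / n_sims, final_look / n_sims


if __name__ == "__main__":
    peeking, fixed = peeking_fpr()
    print(f"stop at first significant look: {peeking:.3f}")
    print(f"test once at the end:           {fixed:.3f}")
```

The peeking rate can never be lower than the fixed-horizon rate, since any test that is significant at the final look has also crossed the threshold at some look. The gap between the two numbers is the penalty.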