Archive for the ‘Statistics’ Category.

February 16, 2016, 3:06 pm

Medians are better than means in most interpretation contexts: they’re robust to skew, outliers, and other departures from normality, so they give a better sense of the “typical” data point. When the mean and median differ, I prefer to use the median.

One problem with using medians is that you can’t calculate a confidence interval for them the way you calculate one for a mean: there’s no simple “standard error of the median” to plug in. It turns out, however, that there is a way to calculate confidence intervals for them. Continue reading ‘Confidence Intervals for Medians and Percentiles’ »
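The post’s own derivation isn’t shown in this excerpt, but one standard approach is the distribution-free interval built from order statistics. The `median_ci` helper below is my own illustrative sketch of that approach, not code from the post:

```python
import math

def median_ci(data, conf=0.95):
    """Distribution-free confidence interval for the median.

    Each observation falls below the true median with probability 1/2,
    so the count of observations below it is Binomial(n, 1/2). Pick the
    symmetric pair of order statistics whose ranks bracket that count
    with the desired coverage.
    """
    xs = sorted(data)
    n = len(xs)
    alpha = 1 - conf
    # P(X <= k) for X ~ Binomial(n, 1/2)
    cdf, acc = [], 0.0
    for k in range(n + 1):
        acc += math.comb(n, k) / 2.0 ** n
        cdf.append(acc)
    # largest rank r with P(X <= r - 1) <= alpha / 2
    ranks = [k for k in range(1, n // 2 + 1) if cdf[k - 1] <= alpha / 2]
    if not ranks:  # n too small for the requested coverage
        return xs[0], xs[-1]
    r = ranks[-1]
    return xs[r - 1], xs[n - r]  # 1-based ranks r and n + 1 - r
```

For ten points 1–10, `median_ci(range(1, 11))` returns `(2, 9)`: the 2nd and 9th order statistics. The same argument extends to any percentile by swapping 1/2 for the percentile of interest.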

February 18, 2015, 10:23 pm

Stopping your A/B test as soon as you reach significance is a great way to find bogus results…if you’re a frequentist. Repeatedly checking before you have the statistical power to detect the phenomenon will often produce false positives if you rely on classical/frequentist methods. A Bayesian with an informative null-result prior can avoid these problems. Let’s think about why. Continue reading ‘How often can Thomas Bayes check the results of his A/B test?’ »
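You can see the frequentist problem in a small simulation. This is my own sketch, not the post’s code: an A/A test (no true difference) where the experimenter peeks at several interim checkpoints and stops at the first “significant” z-test:

```python
import math
import random

def peeking_false_positive_rate(n_sims=1000, n_max=500, checks=10,
                                seed=1):
    """Simulate an A/A test (no true difference) where a two-sample
    z-test is run at `checks` interim checkpoints and the experiment
    stops at the first |z| > 1.96. Returns the fraction of simulations
    that ever reach 'significance' -- well above 5% when checks > 1."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value
    hits = 0
    for _ in range(n_sims):
        a = b = 0.0  # running sums of standard-normal "metric" values
        n = 0
        for chk in range(1, checks + 1):
            target = n_max * chk // checks
            while n < target:
                a += rng.gauss(0, 1)
                b += rng.gauss(0, 1)
                n += 1
            # z statistic for the difference in means, known unit variance
            z = (a / n - b / n) / math.sqrt(2.0 / n)
            if abs(z) > z_crit:
                hits += 1
                break
    return hits / n_sims
```

With ten peeks the realized false-positive rate lands well above the nominal 5%, even though each individual test is a perfectly valid 5%-level test.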

May 15, 2014, 12:44 pm

A good A/B test tool should be able to reach the following conclusions:

- A beat B or B beat A, so you can stop.
- Neither A nor B beat the other, so you can stop.
- We can’t conclude #1 or #2 but you’ll need about m more data points to conclude one of them.

The tools I’ve found for analyzing A/B tests can all answer #1. Some of the better ones can answer #3. **None of the tools I’ve seen will answer #2** and tell you that A and B are not meaningfully different and that you have enough data to be pretty sure about that. Continue reading ‘When Enough is Enough with your A/B Test’ »
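One established frequentist way to support conclusion #2 is an equivalence test (TOST, “two one-sided tests”): declare A and B practically equivalent when their difference is significantly inside a pre-chosen margin on both sides. This sketch is my own illustration of TOST, not code from any of the tools discussed:

```python
from statistics import NormalDist

def tost_equivalence(mean_a, mean_b, se_diff, margin, alpha=0.05):
    """Two One-Sided Tests (TOST): declare A and B practically
    equivalent if (mean_a - mean_b) is significantly above -margin
    AND significantly below +margin.

    se_diff is the standard error of the difference in means;
    margin is the largest difference you'd consider negligible.
    """
    nd = NormalDist()
    diff = mean_a - mean_b
    z_lower = (diff + margin) / se_diff   # tests H0: diff <= -margin
    z_upper = (diff - margin) / se_diff   # tests H0: diff >= +margin
    p_lower = 1 - nd.cdf(z_lower)
    p_upper = nd.cdf(z_upper)
    p = max(p_lower, p_upper)             # TOST p-value
    return p, p < alpha
```

With a tight standard error, `tost_equivalence(0.100, 0.101, se_diff=0.002, margin=0.01)` declares equivalence; with a loose one (`se_diff=0.02`) it correctly refuses, which maps onto conclusion #3: collect more data.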

June 12, 2011, 4:40 am

Next time you see someone “misinterpret” a confidence interval, wait a second. They’re actually probably okay. Continue reading ‘Regain your confidence (intervals)’ »

March 3, 2011, 5:09 pm

What is a methods-careful practitioner to do when the number of observations (*n*) is small? I don’t know how many times I’ve been told by a well-meaning Bayesian some variation of

Bayesian estimation addresses the “small *n* problem”

This is right and wrong. Continue reading ‘Bayes fixes small n, doesn’t it?’ »
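The tension shows up even in a toy conjugate normal-normal update — a hypothetical sketch of my own, not the post’s example:

```python
def posterior_mean(prior_mean, prior_var, data, noise_var=1.0):
    """Conjugate normal-normal update with known noise variance.

    With few observations the posterior stays close to the prior
    (the sense in which Bayes 'fixes' small n); but the answer then
    depends heavily on that prior (the sense in which it doesn't).
    """
    n = len(data)
    xbar = sum(data) / n
    # precision-weighted average of prior mean and sample mean
    w_prior = 1.0 / prior_var
    w_data = n / noise_var
    return (w_prior * prior_mean + w_data * xbar) / (w_prior + w_data)
```

With one observation of 5.0 and a standard-normal prior, the posterior mean is 2.5 — pulled halfway back to the prior; with a hundred such observations it is roughly 4.95, and the prior barely matters.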

March 10, 2010, 1:51 pm

I recently sat through some great grad student presentations. Most of those presenting empirical results made a common mistake: they kept *way* too many digits in their presented results. There are two problems with showing more digits than necessary: false certainty and lack of clarity. The extra certainty is certainly false because we *know* how accurate the estimated coefficients are: that’s exactly what the standard errors tell us! Extra digits reduce clarity by cluttering up an already hard-to-read table with extra, unnecessary information.

Continue reading ‘Figuring significance significant figures’ »
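A simple rule that follows from the argument above: round each coefficient to the decimal place implied by its standard error. The `format_estimate` helper is a hypothetical sketch of that rule, not code from the post:

```python
import math

def format_estimate(coef, se, se_digits=2):
    """Round a coefficient to the precision implied by its standard
    error: keep se_digits significant figures of the SE, then round
    the coefficient to that same decimal place."""
    if se <= 0:
        return f"{coef} ({se})"
    # decimal place of the last kept SE digit
    place = se_digits - 1 - math.floor(math.log10(se))
    fmt = f".{max(place, 0)}f"
    return f"{round(coef, place):{fmt}} ({round(se, place):{fmt}})"
```

For example, an estimated coefficient of 1.234567 with standard error 0.0456 prints as “1.235 (0.046)” — every further digit is noise the standard error has already told us we don’t have.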