Can someone help me out with confidence intervals? I'm trying to wrap my head around them but it’s not clicking. I get p-values fine, but the actual math and the 'why' behind CIs is a total mess for me. How do people actually use them in the real world? Thanks for any tips.
They give an indication of the sampling variance of your estimate if you repeated the experiment multiple times.
Suppose you test 20 ME/CFS patients and find that their mean fatigue score is 10 points. But the 20 ME/CFS patients you've chosen might not be representative of the entire ME/CFS population. So if you take a different 20 patients, you might get a different estimate like 8 points. If you repeated that experiment over and over again, you would get a distribution of estimates. It might show that most of them lie between 6.5 and 13.5 points.
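If it helps to see that "repeat it over and over" idea in action, here is a rough simulation sketch. It assumes a made-up population (normally distributed fatigue scores with a true mean of 10 and a standard deviation of 8); those numbers and all the names in the code are illustrative assumptions, not real ME/CFS data.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Assumed (made-up) population: scores ~ normal, true mean 10, SD 8.
TRUE_MEAN, TRUE_SD = 10, 8
N_PATIENTS = 20          # patients per experiment
N_EXPERIMENTS = 10_000   # how many times we "repeat" the experiment


def run_experiment():
    """Draw 20 hypothetical patients and return their mean score."""
    scores = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N_PATIENTS)]
    return statistics.mean(scores)


# One mean per repeated experiment, sorted so we can read off percentiles.
means = sorted(run_experiment() for _ in range(N_EXPERIMENTS))

# Bounds of the middle 95% of the simulated means.
low = means[int(0.025 * N_EXPERIMENTS)]
high = means[int(0.975 * N_EXPERIMENTS)]

# Spread of the means across experiments.
spread = statistics.stdev(means)

print(f"95% of the means fall between {low:.1f} and {high:.1f}")
print(f"Standard deviation of the means: {spread:.2f}")
```

Under these assumptions the middle 95% of the simulated means should land close to the 6.5 to 13.5 range mentioned above.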
In real life it's not feasible to repeat experiments over and over again, so researchers estimate this sampling variance (what if I'd taken 20 other patients?) mathematically. They estimate how much variance there is in the means by using the variance of scores in the sample of 20 ME/CFS patients they do have.
They'll assume that those (imagined) mean fatigue scores are normally distributed (means often are with large sample sizes, even if the individual scores aren't) and that their standard deviation is equal to the standard deviation of the sample divided by the square root of the sample size. For example, if the standard deviation in your sample was 8, then the standard deviation of the hypothetical means (the standard error) would be 8 / sqrt(20), or about 1.79.
That gives an overview of the hypothetical distribution of means if the experiment were repeated over and over again. Researchers take their estimate as the best guess for the true mean, so in this case 10 points. But they know that if they'd taken 20 other patients they might have had a different result. So the final result of their experiment is not just 10 points, but a confidence interval around it. They go approximately 2 standard errors either side of the estimate to cover 95% of the hypothetical distribution of the means, so the 95% confidence interval would go from 10 - 2*1.79 to 10 + 2*1.79. Those results might look like this: "we found a mean fatigue score of 10 points [95% CI: 6.42 - 13.58] in ME/CFS patients..."
The tricky thing is that this doesn't mean we're 95% confident that the true mean lies between 6.42 and 13.58 (that would require a different approach using a Bayesian credible interval); it just estimates what would happen if the experiment were repeated many times. So the correct interpretation is that if you stick to this method of calculating 95% confidence intervals, then in 95% of the experiments you do, the true mean will lie within the confidence interval. It doesn't mean there's a 95% chance that it lies in the confidence interval of your one experiment. You might be in one of the red confidence intervals, as shown in the graph below.
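To make both the arithmetic and the "95% of experiments" interpretation concrete, here is a small sketch. The first part redoes the worked example (mean 10, SD 8, 20 patients, multiplier of 2). The second part simulates many experiments from an assumed population and counts how often the interval contains the true mean. The population parameters, the seed, and the function name are assumptions made purely for illustration.

```python
import math
import random
import statistics

# Part 1: the worked example (mean 10, SD 8, n = 20).
sample_mean, sample_sd, n = 10, 8, 20
standard_error = sample_sd / math.sqrt(n)  # SD of the hypothetical means
multiplier = 2  # "approximately 2"; 1.96 is the usual normal-based value
ci_low = sample_mean - multiplier * standard_error
ci_high = sample_mean + multiplier * standard_error
print(f"SE = {standard_error:.2f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")

# Part 2: how often does this recipe capture the true mean?
random.seed(2)  # fixed seed so the run is repeatable

# Assumed (made-up) population: scores ~ normal, true mean 10, SD 8.
TRUE_MEAN, TRUE_SD = 10, 8
N_EXPERIMENTS = 5_000


def interval_covers_truth():
    """Run one hypothetical experiment; True if its CI contains TRUE_MEAN."""
    scores = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(n)
    return mean - multiplier * se <= TRUE_MEAN <= mean + multiplier * se


hits = sum(interval_covers_truth() for _ in range(N_EXPERIMENTS))
coverage = hits / N_EXPERIMENTS
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The first part should give an interval of roughly 6.4 to 13.6. In the second part, expect a coverage figure close to 95%, probably slightly under, because with only 20 patients the exact multiplier would come from a t-distribution (about 2.09) rather than 2. Any single interval either contains the true mean or it doesn't; the 95% describes the long-run behaviour of the method.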
I found Daniel Lakens' course on statistics quite useful because it focuses on common misunderstandings. Here's the chapter on confidence intervals:
This open educational resource contains information to improve statistical inferences, design better experiments, and report scientific research more transparently.
lakens.github.io