The Bonferroni correction is only one way to guard against the bias of repeated testing effects, but it is probably the most common method and it is definitely the most fun to say. I've come to consider it as critical to the accuracy of my analyses as selecting the correct type of analysis or entering the data accurately. Unfortunately, adjustments for repeated testing of hypotheses, as a whole, remain something that researchers often overlook, and the consequences can be inaccurate results and misleading inferences. In this Independence Day blog, I'll discuss why the Bonferroni correction should be as important as apple pie on the 4th of July.
The Bonferroni correction is a procedure that adjusts a researcher's test for significant effects, relative to how many repeated analyses are being done and repeated hypotheses are being tested. Here is an example:
Let's say that I am seeking to identify what factors are most predictive of one's 4th of July enthusiasm, as measured by a hypothetical continuous scale. To determine this, I might test several potential predictors in a regression model, such as "love of fireworks," "love of apple pie," "enjoyment of being off of work," or good ol' "pride in being an American", along with 16 other potential predictors, for a total of 20 hypothesized predictors.
Normally we might just toss all of our predictors into a regression and see what we come up with. However, by doing so we'd be overlooking something. When we run a regression we choose an "alpha," and by doing so we choose the rate of false positives we are willing to live with. The most common choice is 5% (as in p < .05). That is to say, for any single predictor that truly has no effect, we accept a 1-in-20 chance that it shows up as significant anyway. However, now that we've tested 20 different potential predictors, the likelihood of finding at least one erroneous significant effect (purely by random chance) has ballooned to approximately 64%, assuming the tests are independent and none of the predictors has a real effect (see Berkeley's stats website for more on the math behind this).
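Here is where that 64% comes from, as a minimal sketch. It assumes the 20 tests are independent and that no predictor has a real effect; the function name `familywise_error_rate` is my own, not something from a stats package.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    # Chance that a single test does NOT give a false positive: 1 - alpha.
    # Chance that none of the tests do: (1 - alpha) ** num_tests.
    # Chance that at least one does: the complement of that.
    return 1.0 - (1.0 - alpha) ** num_tests


for num_tests in (1, 5, 20):
    rate = familywise_error_rate(0.05, num_tests)
    print(f"{num_tests:>2} tests: {rate:.1%} chance of at least one false positive")
```

With 20 tests at an alpha of .05, this works out to roughly 64%, matching the figure above. Real predictors are rarely fully independent, so treat the number as a ballpark rather than an exact rate.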
To avoid this inflated likelihood of error, we must use an adjusted threshold to test for significance. To calculate this using Bonferroni's method, we simply divide our desired alpha by the number of hypotheses being tested. In our example, we divide .05 by 20 (.05/20 = .0025), giving us our new threshold of significance (p < .0025). This keeps the chance of at least one false positive across our set of analyses as a whole (known as the family-wise error rate) at or below 5%.
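The same arithmetic can be sketched in a few lines. The p-values below are made up purely for illustration (they are not results from any real analysis), and `bonferroni_threshold` is a hypothetical helper name.

```python
def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


# Made-up p-values for four of the 20 predictors (illustration only).
hypothetical_p_values = {
    "love of fireworks": 0.0004,
    "love of apple pie": 0.0300,
    "enjoyment of being off of work": 0.0020,
    "pride in being an American": 0.0490,
}

# All 20 tested predictors count toward the correction, not just the four shown.
threshold = bonferroni_threshold(alpha=0.05, num_tests=20)

# Keep only the predictors whose p-value clears the corrected threshold.
significant = [name for name, p in hypothetical_p_values.items() if p < threshold]

print(f"Corrected threshold: p < {threshold:.4f}")
for name, p in hypothetical_p_values.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

Notice that two of these made-up predictors would have passed the usual p < .05 cutoff but fail the corrected one. That is the correction doing its job.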
For more information about the Bonferroni correction and other options for making these adjustments, check out Berkeley's stats site. Despite its simplicity, Bonferroni remains a good option to guard against inflated family-wise error. Additionally, most modern stats packages offer it as an option in their calculations. SPSS, for example, offers the Bonferroni adjustment as an option in its General Linear Model (GLM) dialog.
No matter what your method, guarding against the pitfalls of multiple hypothesis testing may save you a lot of time later that you would otherwise spend trying to explain inexplicable findings that arose from random error.