Clearer Insights: Reinterpreting Statistical Significance for Better Understanding
Statistical results are crucial in scientific research, but they are often misunderstood and miscommunicated. The language used to describe these results can lead to confusion and misinterpretation. This article will help you understand the concept of Null Hypothesis Significance Testing (NHST) in simple terms and show how we can improve the communication of statistical results using real examples from peer-reviewed papers.
Understanding Null Hypothesis Significance Testing (NHST)
In NHST, scientists start with a "null hypothesis," which usually suggests that there is no effect or difference in their study. For example, if a new drug is being tested, the null hypothesis might be that the drug has no effect on patients.
Scientists then collect data and calculate a "p-value," which helps them decide whether to reject the null hypothesis. A small p-value (typically less than 0.05) means that data at least as extreme as those observed would be unlikely if the null hypothesis were true, leading scientists to reject the null hypothesis.
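To make this workflow concrete, here is a minimal sketch in Python. It simulates a hypothetical drug trial (the group sizes, means, and spread are made-up assumptions, not data from any real study) and uses SciPy's two-sample t-test to compute a p-value.

```python
# Minimal NHST sketch with simulated (hypothetical) data: two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated outcomes: a control group and a drug group with a modest true benefit.
control = rng.normal(loc=100.0, scale=15.0, size=30)
drug = rng.normal(loc=108.0, scale=15.0, size=30)

# Null hypothesis: the drug has no effect (equal group means).
t_stat, p_value = stats.ttest_ind(drug, control)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Conventional NHST decision rule: reject the null hypothesis if p < 0.05.
print("Reject the null hypothesis" if p_value < 0.05
      else "The effect is not statistically clear")
```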
Why is NHST Criticized?
Misinterpretation of P-values: Many people wrongly interpret a p-value greater than 0.05 to mean that there is no effect, which is not necessarily true. It might just mean the effect is not clear with the data collected.
Focus on Statistical Significance: NHST often focuses too much on whether results are statistically significant (p < 0.05) rather than on the size and importance of the effect. A result can be statistically significant but practically unimportant.
Neglect of Effect Sizes and Confidence Intervals: Effect size tells us how big an effect is, and confidence intervals show the range in which the true effect likely lies. These are crucial for understanding the real-world significance of the results but are often neglected in favor of p-values.
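To illustrate the last point, the sketch below (reusing the same kind of simulated two-group data as above; all numbers are hypothetical) computes an effect size as Cohen's d and a 95% confidence interval for the mean difference, the two quantities that should accompany any p-value.

```python
# Effect size and confidence interval sketch with simulated (hypothetical) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=100.0, scale=15.0, size=30)
drug = rng.normal(loc=108.0, scale=15.0, size=30)

diff = drug.mean() - control.mean()

# Cohen's d: the mean difference scaled by the pooled standard deviation.
n1, n2 = len(drug), len(control)
pooled_var = ((n1 - 1) * drug.var(ddof=1) + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
pooled_sd = np.sqrt(pooled_var)
cohens_d = diff / pooled_sd

# 95% confidence interval for the mean difference (equal-variance t interval).
se = pooled_sd * np.sqrt(1 / n1 + 1 / n2)
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"Mean difference = {diff:.1f}, Cohen's d = {cohens_d:.2f}, "
      f"95% CI [{ci_low:.1f}, {ci_high:.1f}]")
```

Reporting the difference, its size, and its plausible range gives readers far more to work with than a lone p-value.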
A Better Approach: Statistical Clarity
Instead of talking about "statistical significance," we can talk about "statistical clarity." This means focusing on how clearly we can see an effect in our data rather than just whether the p-value is below a certain threshold.
For example, instead of saying "There was no significant effect of the drug," we could say "The effect of the drug was not statistically clear." This small change in language helps prevent misinterpretation and highlights the need for more data or better analysis to clarify the effect.
Practical Tips for Using NHST
Report Effect Sizes and Confidence Intervals: Always report the size of the effect and the confidence intervals along with p-values. This provides a clearer picture of the results.
Understand P-values Properly: Remember that a p-value greater than 0.05 does not mean there is no effect; it means the effect is not clear with the current data (the simulation sketch after these tips illustrates this).
Use the Language of Clarity: Describe your findings in terms of clarity rather than significance. For example, "The difference between the groups was not statistically clear" instead of "There was no significant difference."
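The second tip can be shown directly with a small simulation. In the hypothetical sketch below, a real effect is built into every simulated study, yet small samples frequently produce p > 0.05 simply because the data are too noisy to show the effect clearly. All parameters are illustrative assumptions.

```python
# Simulation sketch: a true effect can still yield p > 0.05 when samples are small.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_per_group = 1000, 15
true_effect = 5.0  # the effect genuinely exists in every simulated study

unclear = 0
for _ in range(n_sims):
    control = rng.normal(loc=100.0, scale=15.0, size=n_per_group)
    treated = rng.normal(loc=100.0 + true_effect, scale=15.0, size=n_per_group)
    _, p = stats.ttest_ind(treated, control)
    if p > 0.05:
        unclear += 1

# Many studies of a perfectly real effect come out "not statistically clear".
print(f"{unclear / n_sims:.0%} of simulated studies had p > 0.05 despite a real effect")
```

A large p-value here reflects limited data, not the absence of an effect, which is exactly what the language of "statistical clarity" is meant to convey.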
Improving Statistical Communication: Examples from Peer-Reviewed Papers
Let's look at some real examples from peer-reviewed papers and see how we can improve the language for better clarity.
Example 1: No Effect Found
Original Language: "Toxins accumulate after acute exposure but have no effect on behaviour."
Improved Language: "Toxins accumulate after acute exposure but their effects on behaviour are statistically unclear."
Example 2: No Significant Relationship
Original Language: "There was no effect of elevated carbon dioxide on reproductive behaviours."
Improved Language: "The effect of elevated carbon dioxide on reproductive behaviours was statistically unclear."
Example 3: Surprising Non-Significant Results
Original Language: "The finding that species richness showed no significant relationship with the area of available habitat is surprising because richness is usually strongly influenced by landscape context."
Improved Language: "Although species richness is usually strongly influenced by landscape context, we were unable to find a statistically clear relationship in this study."
Example 4: Inferring Weak Effects
Original Language: "… differences between treatment and control groups were nonsignificant, with p-values of at least 0.3, and most in the range 0.7 ≤ p ≤ 0.9."
Improved Language: "… differences between treatment and control groups were not statistically clear (all p > 0.05)."
Example 5: Gender Differences in Statistical Clarity
Original Language: "This correlation was significant in males (ρ = 0.35, p < 0.05) but not females (ρ = 0.35, NS)."
Improved Language: "Although males and females show the same correlation coefficient (ρ = 0.35), the sign of the coefficient is statistically clear only in males."
Example 6: Risk Differences Between Groups
Original Language: "… risk of low BMD [bone mineral density] remained greater in HCV-coinfected women vs. women with HIV alone (adjusted OR 2.99, 95% CI 1.33–6.74), but no association was found between HCV coinfection and low BMD in men (adjusted OR 1.26, 95% CI 0.75–2.10)."
Improved Language: "… risk of low BMD remained greater in HCV-coinfected women vs. women with HIV alone (adjusted OR 2.99, 95% CI 1.33–6.74), but the association between HCV coinfection and low BMD in men was not statistically clear (adjusted OR 1.26, 95% CI 0.75–2.10)."
Conclusion
By changing the language used to describe statistical results from "significant" to "statistically clear," we can avoid common misinterpretations and provide a more accurate understanding of the data. This approach encourages the reporting of effect sizes and confidence intervals, leading to better scientific communication and more reliable conclusions.
Understanding NHST doesn't require advanced statistical knowledge. By using simple language and focusing on clear communication, we can make science more accessible and meaningful to everyone.
Reference:
Dushoff, J., Kain, M. P., & Bolker, B. M. (2019). I can see clearly now: Reinterpreting statistical significance. Methods in Ecology and Evolution, 10(6), 756–759. https://doi.org/10.1111/2041-210X.13159