What Are P-values And Why Are They So Problematic?

Justin Belair

Biostatistician in Science & Tech | Consultant | Causal Inference Specialist

Introduction

P-values are probably the most discussed statistical topic in history.

They are often criticized, distrusted, misused, and misinterpreted, yet they are used every day in almost every empirical study.

So, what are p-values and why are they so problematic?

P-Values, Statistics, and Publication Bias

P-values are a tool used to judge whether a pattern observed in data is “real”: for example, a difference between two groups or a correlation between two metrics.

Here’s the catch: we need to rigorously define what “real” means in the language of mathematics.

Mathematically, there is nothing problematic about the p-value: it is the probability of observing a test statistic at least as extreme as the one observed in the data, assuming the null hypothesis is true. Quite a mouthful.

This description itself is a little convoluted and does not directly lead to practical insights.
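To make the definition concrete, here is a minimal sketch in Python of one way to estimate a p-value: a permutation test for a difference in means between two groups. The function name `permutation_p_value`, the sample numbers, and the choice of a permutation test are illustrative assumptions rather than a method prescribed here. The null hypothesis is that group labels are interchangeable, and the p-value is estimated as the share of random relabelings that produce a difference at least as extreme as the observed one.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Null hypothesis: the group labels are interchangeable, so any
    reshuffling of the pooled data is as likely as the observed split.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a

    # Test statistic: absolute difference between the two group means.
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding in ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1

    # Adding 1 to both counts keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements, made up for illustration only.
treated = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
control = [4.8, 5.0, 4.6, 5.2, 4.9, 4.7]
print(permutation_p_value(treated, control))
```

Note what the returned number is not: it is not the probability that the null hypothesis is true, only how often data this extreme would appear if it were.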

So we’ve invented dozens of shorthand ways of understanding p-values, although most of the time the p-value is not what we think it is. For instance, it is not the probability that the null hypothesis is true.

This is compounded by the fact that, for reasons that have little to do with statistics, we’ve come to declare results with p-values below 5% as significant (worth publishing) and all others as non-significant (not worth publishing).

This creates a meta-scientific problem known as publication bias, whereby non-significant results are underreported, understudied, and undervalued.
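One consequence of this filter can be sketched with a small simulation. The setup below is entirely hypothetical: many identical two-group studies of a small true effect, a z-test with known unit variance, and a rule that only results with p < 0.05 get "published". The function name `simulate_studies` and all parameter values are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_studies(true_effect=0.2, n_per_group=30, n_studies=5000,
                     alpha=0.05, seed=1):
    """Simulate many studies and keep only the 'significant' ones.

    Returns (all_estimates, published_estimates), where each estimate
    is a difference in group means.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means when both groups have
    # variance 1 (assumed known here, to keep the sketch short).
    standard_error = math.sqrt(2 / n_per_group)

    all_estimates = []
    published_estimates = []
    for _ in range(n_studies):
        treated = [rng.gauss(true_effect, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)

        # Two-sided p-value from the standard normal distribution.
        z = estimate / standard_error
        p_value = math.erfc(abs(z) / math.sqrt(2))

        all_estimates.append(estimate)
        if p_value < alpha:
            published_estimates.append(estimate)
    return all_estimates, published_estimates


all_estimates, published_estimates = simulate_studies()
print("true effect:            ", 0.2)
print("mean of all studies:    ", statistics.fmean(all_estimates))
print("mean of published only: ", statistics.fmean(published_estimates))
print("share published:        ", len(published_estimates) / len(all_estimates))
```

Under these assumptions, the average over all studies should sit close to the true effect, while the average over the published subset should be noticeably larger, because only the studies that happened to overshoot cleared the 5% bar.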

But there is not a significant correlation (pun intended), at least not directly, between scientific worth and statistical significance.

Add to this the fact that even if we correctly interpret the p-value, it can be misleading due to the many pitfalls and failed assumptions that come with the statistical approach to data.

Conclusion

I don’t see p-values going away soon, and I don’t advocate for their elimination, but I do strive to demystify them and use them appropriately, as one simple tool among many in my statistical toolkit. I can help in two ways.

Want to Learn Statistics the Right Way? Check out my Introduction to Biostatistics online course.

Need help with statistics? I offer consultations and data analysis services. Please, reach out!
