P-hacking
P-hacking is the practice of trying many analysis choices until a statistically significant result appears.
What p-hacking is
P-hacking is the selective adjustment of data collection, data cleaning, outcome choice, statistical modeling, or reporting until a result becomes statistically significant. It is often discussed around the conventional p < 0.05 threshold, but the deeper problem is undisclosed flexibility: readers see one clean analysis even though many possible analyses were available.
Why p-values are vulnerable
A p-value is the probability, under a specified statistical model in which the null hypothesis is true, of obtaining a result at least as extreme as the one observed. It does not give the probability that a claim is true. When many tests, subgroups, stopping points, or model versions are tried, some low p-values are expected to appear by chance alone unless the analysis accounts for that search.
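The inflation described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real study: it assumes every null hypothesis is true, the tests are independent, and each p-value is therefore uniform on [0, 1]. The function names are hypothetical.

```python
import random


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one p < alpha among independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_any_significant(num_tests, alpha=0.05, trials=20000, seed=0):
    """Estimate the same chance by drawing null p-values directly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: under a true null with a continuous test statistic,
        # the p-value is uniformly distributed on [0, 1].
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 5, 20):
        analytic = familywise_error_rate(k)
        simulated = simulate_any_significant(k)
        print(f"{k:>2} tests: analytic {analytic:.3f}, simulated {simulated:.3f}")
```

With a single test the chance of a false positive stays at the nominal 5 percent, but it grows quickly as more tests are tried. Real analyses are usually correlated rather than independent, so the exact numbers differ, but the direction of the effect is the same.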
Common forms
P-hacking can include checking results during data collection and stopping when significance appears, trying multiple outcomes but reporting only one, adding or removing covariates after seeing results, excluding observations with flexible rules, splitting data into many subgroups, or changing model specifications without making the exploration clear.
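The first form in that list, stopping when significance appears, can be illustrated with a small simulation. The sketch below rests on simplifying assumptions: observations are drawn from a normal distribution with mean 0 and known unit variance, so the null hypothesis is true by construction and a two-sided z-test applies. The function names, batch size, and sample cap are hypothetical choices for illustration.

```python
import math
import random


def two_sided_p(sample_sum, n):
    """Two-sided z-test p-value for mean 0, assuming known unit variance."""
    z = sample_sum / math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(peek_every, max_n=100, alpha=0.05, trials=5000, seed=1):
    """Share of null experiments declared significant at any allowed look."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)  # the null is true: the mean really is 0
            # Test only at the allowed looks, and stop at the first "success".
            if n % peek_every == 0 and two_sided_p(total, n) < alpha:
                false_positives += 1
                break
    return false_positives / trials


fixed = false_positive_rate(peek_every=100)   # one planned test at n = 100
peeking = false_positive_rate(peek_every=10)  # tests at n = 10, 20, ..., 100

if __name__ == "__main__":
    print(f"fixed sample size: {fixed:.3f}")
    print(f"peek every 10:     {peeking:.3f}")
```

The fixed design should land near the nominal 5 percent, while the peeking design should come out well above it, even though no individual test was computed incorrectly. Sequential designs that plan their interim looks in advance adjust the threshold to account for this.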
Intentional and accidental cases
The term can sound like deliberate cheating, but many cases are less clear. Researchers may face ambiguous data, messy measurements, pressure to publish, or a sincere desire to understand unexpected patterns. The risk is greatest when exploratory decisions are later written up as if they were planned confirmatory tests.
Consequences
P-hacking can make weak evidence look strong. It can send other researchers toward effects that do not hold up, waste time in follow-up studies, and contribute to publication bias when journals and careers reward novel positive results. The damage is cumulative: even small analytical freedoms can matter when they are repeated across many studies.
Warning signs
No single clue proves p-hacking. Still, readers have reason for caution when a paper reports many p-values just below the significance threshold, unclear exclusion rules, shifting sample sizes across analyses, many outcomes with little correction for multiple testing, or a strong story built from unplanned subgroups. Clear methods and complete reporting make those judgments easier.
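One way to see what correction for multiple testing does is to apply one to a set of reported p-values. The sketch below uses the Holm step-down adjustment, chosen here only as a common example of such a correction. The p-values are invented for illustration and do not come from any study, and the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for five outcomes measured in one study.
reported = [0.04, 0.03, 0.049, 0.20, 0.012]
adjusted = holm_adjust(reported)

nominal_hits = sum(p < 0.05 for p in reported)
corrected_hits = sum(p < 0.05 for p in adjusted)

if __name__ == "__main__":
    for raw, adj in zip(reported, adjusted):
        print(f"raw {raw:.3f} -> adjusted {adj:.3f}")
    print(f"below 0.05 before: {nominal_hits}, after: {corrected_hits}")
```

In this invented example, four of the five outcomes fall below 0.05 before adjustment and none do afterward. A cluster of results that sits just under the threshold is exactly the kind of evidence that tends not to survive a correction.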
How researchers reduce it
Preregistration records hypotheses, outcomes, sample sizes, and analysis plans before results are known. Registered reports go further by peer-reviewing the question and methods before the results are known, so that the decision to publish does not depend on how the data turn out. Open materials, shared code, robustness checks, replication, and explicit labels for exploratory analysis also reduce hidden flexibility.
Why it matters
Modern research often depends on complex data and many reasonable analysis choices. P-hacking matters because it turns that flexibility into a source of bias when it is hidden. Naming the problem helps researchers, journals, and readers distinguish discovery work from evidence meant to confirm a claim.