What is the p-value in hypothesis testing?

p-value interpretation

Background: Before understanding what the p-value signifies, it is important to understand what hypothesis testing is.

Hypothesis testing uses sample information to test the validity of a claim about the population.

That claim is called the null hypothesis, denoted as H_0.

If the sample provides enough evidence against the null hypothesis, we reject it in favor of the alternative hypothesis, denoted as H_a.

For instance, let's analyze the following problem: We are testing the hypothesis that the average gas consumption per day in Billings, Montana is greater than 7 gallons; we want 95% confidence.

We sample 30 drivers.

The sample mean is 8.4 gallons per day, and the sample standard deviation is 4.29.


The null hypothesis is that the average gas consumption per day in Billings, Montana is <= 7 gallons per day.

In mathematical terms, we write this null hypothesis as:

\mu \le 7

where \mu is the population mean (the average daily gas consumption of drivers in Billings, Montana).

How the null hypothesis was chosen: The null hypothesis is what will happen if the claim made in the problem is not true.

In this case, the claim was made that the average gas consumption per day in Billings, Montana is greater than 7 gallons per day.

Thus, if that claim is not true, it means that the average gas consumption per day in Billings, Montana is <= 7 gallons per day.

The p-value signifies the strength of the evidence against the null hypothesis.

More precisely, it is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one in the sample.

The smaller the p-value, the stronger the evidence that the null hypothesis should be rejected in favor of the alternative hypothesis (the usual threshold, or significance level, is p <= 0.05).

The larger the p-value, the weaker the evidence against the null hypothesis; in that case, we fail to reject the null hypothesis.

Note that we cannot accept the null hypothesis; we can only fail to reject it.

The p-value is based upon the z-score, which measures how many standard deviations a value lies from the mean.

The z-score for sample means is given by the following formula:

z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}

Given that \bar{X} = 8.4, \mu = 7, \sigma = 4.29, and n = 30, we end up getting:

z = 1.79

Our next step is to use the z-score table, which gives the probability that a value falls to the left of a given z-score.

The probability corresponding to 1.79 is 0.9633.
However, we want the probability that the value is to the right of the z-score, because we are testing the hypothesis that the average gas consumption per day is greater than 7 gallons per day.

So, our p-value is 1 - 0.9633 = 0.0367.

Since this is less than the typical significance level of 0.05, we reject the null hypothesis and accept the alternative hypothesis (the average gas consumption per day in Billings, Montana is greater than 7 gallons per day).
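The arithmetic in this worked example can be checked with a few lines of Python. This is only a sketch: the function names `normal_cdf` and `right_tailed_z_test` are made up for illustration, and it follows the answer in treating the sample standard deviation as sigma in a z-test (with 30 observations, a t-test would be a common alternative). Because the code does not round z to 1.79 before the lookup, the p-value it prints should be close to, but not exactly, the table-based 0.0367.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def right_tailed_z_test(sample_mean, null_mean, sd, n, alpha=0.05):
    """Return (z, p_value, reject_null) for H_0: mu <= null_mean."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Right-tailed test: the p-value is the area to the right of z.
    p_value = 1.0 - normal_cdf(z)
    return z, p_value, p_value < alpha


# Numbers taken from the gas consumption example above.
z, p_value, reject = right_tailed_z_test(8.4, 7.0, 4.29, 30)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H_0: {reject}")
```

The last value returned is just the comparison of the p-value with the 0.05 significance level, which is the decision rule used in the text.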

p-value significance chart

Some very interesting answers here, one of the most useful questions I've seen in a while!

I work in analytics, and common ones I've seen so far are:

Assuming strong statistical relationships are causal: Statistical models aren't useful to businesses unless they're actionable, i.e., you are able to make causal inferences from your statistical relationship.

In this endeavour, you need to be really sure you're getting your variables right.

You might find a strong relationship between drinking Jack Daniels and longevity, but the reason might just be that people who buy Jack Daniels regularly are rich and therefore able to spend enough to take care of themselves.

Equating statistical significance to real-world significance: p-values are not the be-all and end-all of an analysis.

With a very large sample, even a tiny, practically irrelevant effect can produce a low p-value.

You always need to look at effect sizes before drawing conclusions.
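To see how sample size alone can drive the p-value, here is a small illustration. It assumes a two-sided one-sample z-test and a hypothetical effect of 0.01 standard deviations; neither comes from the answer above, and the function name `two_sided_p_value` is invented for this sketch.

```python
import math


def two_sided_p_value(effect_size: float, n: int) -> float:
    """Two-sided z-test p-value for a standardized effect and sample size n."""
    # For a one-sample z-test, z = (mean shift / sd) * sqrt(n).
    z = effect_size * math.sqrt(n)
    # P(|Z| >= |z|) for a standard normal Z, written with erfc for accuracy.
    return math.erfc(abs(z) / math.sqrt(2.0))


tiny_effect = 0.01  # hypothetical: the mean shifts by 1% of a standard deviation
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: p-value = {two_sided_p_value(tiny_effect, n):.3g}")
```

The effect size is identical in every row; only n changes. With a million observations the p-value becomes vanishingly small, even though the effect itself is no more important than it was with a hundred.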

Misuse of terms and charts: You can trick, and get tricked, in some pretty straightforward ways.

I've seen this done in presentations: charts whose vertical axis doesn't begin at 0 (done to exaggerate possibly insignificant differences).

You also get to see this one in the news every now and then.

You don't ALWAYS have to begin the axis at 0, but you should be particularly suspicious of bar charts made this way.

Oh, and watch out for "average".

It can be used to refer to any of mean, median or mode.

Please learn what all three are, and ask which one someone's talking about when they use the term "average".
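As a quick illustration of why the choice of "average" matters, the three measures can disagree sharply on the same data. The numbers below are made up for the example (a small data set with one large outlier); the functions are from Python's standard `statistics` module.

```python
import statistics

# Hypothetical data with one large outlier.
values = [1, 1, 2, 3, 4, 5, 40]

mean = statistics.mean(values)      # sum divided by count
median = statistics.median(values)  # middle value after sorting
mode = statistics.mode(values)      # most frequent value

print(f"mean = {mean}, median = {median}, mode = {mode}")
```

Here the mean is pulled up by the outlier, while the median and mode are not, so three people could each truthfully report a different "average" for the same data.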

p-value greater than 0.05 means

A p-value higher than 0.05 (> 0.05) is not statistically significant; it indicates that the data do not provide strong evidence against the null hypothesis.

It is not strong evidence for the null hypothesis, so we fail to reject the null hypothesis rather than reject the alternative hypothesis.

Note that we can never accept the null hypothesis; we can only reject it or fail to reject it.

What is a good p-value

If you want to be thoroughly disabused of using p-values, then you can read The Cult of Statistical Significance by Ziliak and McCloskey.

How to calculate p-value

Different statistical tests have different distributions used to compute the "p-value" for that test.

The "p-value" is basically the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one observed.

(It isn't really that simple, but it's good enough for this discussion.)

For example: a z-test uses a Normal distribution; a t-test used to test the difference between two sample means uses Student's t-distribution; the test in an analysis of variance (ANOVA) uses an F-distribution; other tests use the Chi-squared distribution; and so on...

Each of these distributions has a different formula... typically these formulas do not have simple solutions and must be approximated.

For example, the Normal distribution p-value is typically found by finding the area under the curve defined by

P(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}

from negative infinity to x.

This is usually done using integration or by using one of the many approximations.

Or you can do what most practicing statisticians do and use a computer or calculator to generate the value using a preprogrammed function written by someone else.
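Both routes can be compared in a short sketch. The trapezoid-rule integrator below is only an illustrative assumption (its name, the step count, and the choice to stand in for negative infinity with a point 10 standard deviations below the mean are all mine); the preprogrammed function is `NormalDist.cdf` from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """The Normal density P(x) given in the formula above."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def cdf_by_trapezoid(x: float, mu: float = 0.0, sigma: float = 1.0,
                     steps: int = 100_000) -> float:
    """Approximate the area under the density from 'negative infinity' to x."""
    # Assumption: 10 sigma below the mean is far enough to act as -infinity.
    lower = mu - 10 * sigma
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mu, sigma) + normal_pdf(x, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lower + i * width, mu, sigma)
    return total * width


approx = cdf_by_trapezoid(1.79)     # numerical integration
library = NormalDist().cdf(1.79)    # preprogrammed function
print(f"trapezoid: {approx:.6f}, library: {library:.6f}")
```

The two values should agree to several decimal places, which is why the choice of algorithm rarely matters in everyday work.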

There are good textbooks that will give you approximations to various precisions.

42 years ago, when I was a graduate student studying applied statistics, I did a lot of work writing FORTRAN and assembler programs to generate tables of probability distributions, to compare the accuracy of various algorithms against the tables generated by the "computers" of the previous generations, typically rooms of young women generating these tables using mechanical calculators.

Now the algorithms are of little concern; after all, it is an Excel function...