Hypothesis Testing with the Binomial Distribution

Hypothesis Testing

To carry out a hypothesis test with the binomial distribution, we calculate the probability, $p$, of the observed event or any more extreme event happening, assuming the null hypothesis is true. We compare this $p$-value to the level of significance $\alpha$. If $p>\alpha$ then we do not reject the null hypothesis. If $p\leq\alpha$ we reject the null hypothesis in favour of the alternative hypothesis.

Worked Example

A coin is tossed twenty times, landing on heads six times. Perform a hypothesis test at a $5$% significance level to see if the coin is biased.

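First, we need to write down the null and alternative hypotheses. In this case, writing $\theta$ for the probability that the coin lands on heads, the null hypothesis is $H_0: \theta = 0.5$ (the coin is fair) and the alternative hypothesis is $H_1: \theta < 0.5$ (the coin is biased in favour of tails).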

The important thing to note here is that we only need a one-tailed test as the alternative hypothesis says “in favour of tails”. A two-tailed test would be the result of an alternative hypothesis saying “The coin is biased”.

We need to calculate more than just the probability that it lands on heads $6$ times. If it landed on heads fewer than $6$ times, that would be even more evidence that the coin is biased in favour of tails. Consequently we need to add up the probabilities of it landing on heads $0$ times, $1$ time, $2$ times, $\ldots$ all the way up to $6$ times. Although a calculation is possible, it is much quicker to use the cumulative binomial distribution table. This gives $\mathrm{P}[X\leq 6] = 0.058$.

We are asked to perform the test at a $5$% significance level. This means that if, for a fair coin, there is less than a $5$% chance of getting $6$ or fewer heads, then the result is so unlikely that we have sufficient evidence to claim the coin is biased in favour of tails. Now note that our $p$-value $0.058>0.05$ so we do not reject the null hypothesis. We don't have sufficient evidence to claim the coin is biased.

But what if the coin had landed on heads just $5$ times? Again we read from the cumulative tables for the binomial distribution, which show $\mathrm{P}[X\leq 5] = 0.021$. Since $0.021<0.05$, we would have had to reject the null hypothesis and accept the alternative hypothesis. So the point at which we switch from not rejecting the null hypothesis to rejecting it is when we obtain $5$ heads. This means that $5$ is the critical value.
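The two table look-ups in this worked example can be reproduced with a few lines of Python. This is only a sketch: the helper name `binom_cdf` is made up for illustration, and it assumes a fair coin under the null hypothesis with a lower-tail test at the 5% level, as in the example.

```python
from math import comb

def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ B(n, p) by summing the probability mass function."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p_null, alpha = 20, 0.5, 0.05   # 20 tosses, fair coin under H0, 5% level

p_six = binom_cdf(6, n, p_null)    # p-value for the observed 6 heads
p_five = binom_cdf(5, n, p_null)   # p-value had only 5 heads been observed

print(f"P(X <= 6) = {p_six:.3f}, reject H0: {p_six <= alpha}")
print(f"P(X <= 5) = {p_five:.3f}, reject H0: {p_five <= alpha}")
```

The first probability is above 5% and the second is below it, which is why 5 heads is the critical value.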

Binomial Hypothesis Test

When calculating probabilities using the binomial distribution, we can calculate them for an individual value, \(P(x = a)\), or for a cumulative value: \(P(x<a)\), \(P(x\leq a)\) or \(P(x\geq a)\).

In hypothesis testing, we use these calculated probabilities to decide whether or not to reject a hypothesis.

We will be focusing on regions of the binomial distribution; therefore, we are looking at cumulative values.

Types of hypotheses

There are two main types of hypotheses:

The null hypothesis (\(H_0\)) is the hypothesis we assume to be true. It states that there is no difference between certain characteristics of a population; any observed difference is purely down to chance.

The alternative hypothesis (\(H_1\)) is the hypothesis we are looking for evidence to support, using the data we have been given.

We can either:

Fail to reject the null hypothesis (often loosely described as accepting it), OR

Reject the null hypothesis and accept the alternative hypothesis.

What are the steps to undertake a hypothesis test?

There are some key terms we need to understand before we look at the steps of hypothesis testing:

Critical value – the value at which we go from accepting to rejecting the null hypothesis.

Critical region – the set of values for which we reject the null hypothesis.

Significance level – the probability of rejecting the null hypothesis when it is in fact true, given as a percentage. It is the threshold we compare our calculated probability against. The probability of landing in the critical region should be as close to the significance level as possible without exceeding it.

One-tailed test – the alternative hypothesis states that the probability is greater than the value in the null hypothesis, or states that it is less than that value.

Two-tailed test – the alternative hypothesis states only that the probability is not equal to the value in the null hypothesis.

So when we undertake a hypothesis test, generally speaking, these are the steps we use:

STEP 1 – Establish a null and alternative hypothesis from the claim stated in the question.

STEP 2 – Assign probabilities to our null and alternative hypotheses.

STEP 3 – Write out our binomial distribution, \(X \sim B(n, p)\).

STEP 4 – Calculate probabilities using the binomial distribution. (Hint: we do not need the long-winded formula to calculate our probabilities. On the Casio Classwiz calculator, we can go to Menu -> Distribution -> Binomial CD and enter n as the number in the sample, p as the probability, and X as the value we want the cumulative probability for.)

STEP 5 – Compare the calculated probability with the significance level (is it greater than or less than the significance level?).

STEP 6 – Accept or reject the null hypothesis.

Let's look at a few examples to explain what we are doing.

One-tailed test example

As stated above, in a one-tailed hypothesis test the alternative hypothesis states that the probability is either greater than or less than the value in the null hypothesis.

A researcher is investigating whether people can identify the difference between Diet Coke and full-fat coke. He suspects that people are guessing. 20 people are selected at random, and 14 make a correct identification. He carries out a hypothesis test.

a) Briefly explain why the null hypothesis should be \(H_0: p = 0.5\), where p is the probability that a person makes the correct identification.

b) Complete the test at the 5% significance level.
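Part (b) can be sketched in Python as follows. The setup is an assumption read off the question rather than a quoted solution: under the null hypothesis people are guessing (p = 0.5), the alternative is p > 0.5, and because 14 is above the expected 10 correct identifications we look at the upper tail P(X ≥ 14). The function name `binom_upper_tail` is illustrative.

```python
from math import comb

def binom_upper_tail(x, n, p):
    """Return P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# Assumed setup: H0: p = 0.5 (guessing), H1: p > 0.5, 5% significance level.
n, p_null, alpha, observed = 20, 0.5, 0.05, 14

p_value = binom_upper_tail(observed, n, p_null)
reject_h0 = p_value <= alpha

print(f"P(X >= {observed}) = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the tail probability comes out at about 0.058, just above 5%, so there is insufficient evidence to reject the hypothesis that people are guessing.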

Two-tailed test example

In a two-tailed test, the alternative hypothesis states only that the probability is not equal to the value in the null hypothesis.

A coffee shop provides free espresso refills. The probability that a randomly chosen customer uses these refills is stated to be 0.35. A random sample of 20 customers is chosen, and 9 of them have used the free refills.

Carry out a hypothesis test to a 5% significance level to see if the probability that a randomly chosen customer uses the refills is different to 0.35.

So the key difference with two-tailed tests is that we compare the tail probability to half the significance level rather than the full significance level.
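A minimal Python sketch of the coffee shop test, using the half-significance-level rule just described. It assumes X ~ B(20, 0.35) under the null hypothesis and picks the tail on the same side of the mean as the observed value; `two_tailed_decision` is a hypothetical helper name, not part of any library.

```python
from math import comb

def binom_pmf(k, n, p):
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tailed_decision(observed, n, p, alpha):
    """Compare the tail on the observed side of the mean with alpha / 2."""
    if observed >= n * p:
        tail = sum(binom_pmf(k, n, p) for k in range(observed, n + 1))  # P(X >= observed)
    else:
        tail = sum(binom_pmf(k, n, p) for k in range(observed + 1))     # P(X <= observed)
    return tail, tail <= alpha / 2

tail, reject_h0 = two_tailed_decision(observed=9, n=20, p=0.35, alpha=0.05)
print(f"tail probability = {tail:.4f}, reject H0: {reject_h0}")
```

Here 9 is above the expected value of 7, so the upper tail P(X ≥ 9) is used. It is roughly 0.24, far above 0.025, so the null hypothesis is not rejected.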

Critical values and critical regions

Remember from earlier that critical values are the values at which we move from accepting to rejecting the null hypothesis. A binomial distribution is a discrete distribution; therefore, our critical value has to be an integer.

The formula booklet contains a large number of statistical tables that can help us find these; however, they only cover selected values of n and p.

Therefore the most reliable way to find critical values and critical regions is to use a calculator, with trial and error, until we find an acceptable value:

STEP 1 - Try values until we reach two consecutive values where one cumulative probability is above the significance level and the other is below.

STEP 2 - The one with the probability below the significance level is the critical value.

STEP 3 - The critical region is the critical value together with all values more extreme than it: those less than it for a lower tail, or those greater than it for an upper tail.

Let's look at this through a few examples.

Worked examples for critical values and critical regions

A mechanic is checking to see how many faulty bolts he has. He is told that 30% of the bolts are faulty. He has a sample of 25 bolts. He believes that less than 30% are faulty. Calculate the critical value and the critical region.

Let's use the above steps to help us out.
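The trial-and-error search can be automated. The sketch below assumes X ~ B(25, 0.3) under the null hypothesis and a lower-tail test. The question does not state a significance level, so the 5% used here is an assumption, and `lower_critical_value` is an illustrative name.

```python
from math import comb

def lower_critical_value(n, p, alpha):
    """Return the largest c with P(X <= c) <= alpha, or None if even P(X = 0) exceeds alpha."""
    cumulative, critical = 0.0, None
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        if cumulative > alpha:
            break          # first value whose cumulative probability is above alpha
        critical = k       # last value whose cumulative probability is at or below alpha
    return critical

c = lower_critical_value(n=25, p=0.3, alpha=0.05)  # alpha = 0.05 is an assumption
print(f"critical value: {c}, critical region: X <= {c}")
```

With these assumed inputs the search stops between 3 and 4, giving a critical value of 3 and a critical region of X ≤ 3.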

A teacher believes that 40% of the students watch TV for two hours a day. A student disagrees and believes that students watch either more or less than two hours. In a sample of 30 students, calculate the critical regions.

As this is a two-tailed test, there are two critical regions, one on the lower end and one on the higher end. Also, remember the probability we are comparing with is that of half the significance level.
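For the two-tailed case the same search is run from both ends, each time against half the significance level. This sketch assumes X ~ B(30, 0.4) under the null hypothesis. The question does not give a significance level, so 5% (2.5% per tail) is an assumption, and the function names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tailed_critical_values(n, p, alpha):
    """Return (lower, upper): largest c with P(X <= c) <= alpha/2 and
    smallest c with P(X >= c) <= alpha/2 (None where no such value exists)."""
    half = alpha / 2
    lower, cumulative = None, 0.0
    for k in range(n + 1):            # walk up from 0 for the lower tail
        cumulative += binom_pmf(k, n, p)
        if cumulative > half:
            break
        lower = k
    upper, cumulative = None, 0.0
    for k in range(n, -1, -1):        # walk down from n for the upper tail
        cumulative += binom_pmf(k, n, p)
        if cumulative > half:
            break
        upper = k
    return lower, upper

lower, upper = two_tailed_critical_values(n=30, p=0.4, alpha=0.05)  # alpha assumed
print(f"critical regions: X <= {lower} or X >= {upper}")
```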

Binomial Hypothesis Test - Key takeaways

Frequently Asked Questions about Binomial Hypothesis Test

How many samples do you need for the binomial hypothesis test?

There isn't a fixed number of samples; whatever sample size you are given is used as n in X ~ B(n, p).

What is the null hypothesis for a binomial test?

The null hypothesis is what we assume is true before we conduct our hypothesis test.

What does a binomial test show?

It shows whether the observed number of successes, in a fixed number of trials each with two possible outcomes, is consistent with the probability stated in the null hypothesis.

What is the p value in the binomial test?

The p value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed.

Final Binomial Hypothesis Test Quiz

What is a hypothesis test?

A hypothesis test is a test to see if a claim holds up, using probability calculations.

What is a null hypothesis?

A null hypothesis is what we assume to be true before conducting our hypothesis test.

What is an alternative hypothesis?

An alternative hypothesis is what we accept if we have rejected our null hypothesis.

What is a one-tailed test?

A one-tailed test is a test where the alternative hypothesis states that the probability is greater than the value in the null hypothesis, or states that it is less than that value.

What is a two-tailed test?

A two-tailed test is a hypothesis test where the alternative hypothesis states simply that the probability is not equal to the value in the null hypothesis, so it may be either greater or less.

What is a significance level?

A significance level is the probability threshold we are testing against. The smaller the significance level, the more difficult it is to reject the null hypothesis.

What is a critical value?

A critical value is the value where we start to reject the null hypothesis. 

What is a critical region?

A critical region is the set of values at or beyond the critical value. If we get a value in the critical region we reject the null hypothesis.


Real Statistics Using Excel

Hypothesis Testing for Binomial Distribution

We now give some examples of how to use the binomial distribution to perform one-sided and two-sided hypothesis testing.

One-sided Test

Example 1 : Suppose you have a die and suspect that it is biased towards the number three, and so run an experiment in which you throw the die 10 times and count that the number three comes up 4 times. Determine whether the die is biased.

Define x = the number of times the number three occurs in 10 trials. This random variable has a binomial distribution B (10, π ) where π is the population parameter corresponding to the probability of success on any trial. We use the following null and alternative hypotheses:

H0: π ≤ 1/6; i.e. the die is not biased towards the number three

H1: π > 1/6

Using a significance level of α  = .05, we have

P ( x  ≥ 4) = 1–BINOM.DIST(3, 10, 1/6, TRUE) =  0.069728 > 0.05 = α .

and so we cannot reject the null hypothesis that the die is not biased towards the number 3 with 95% confidence.
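For readers without Excel, the Example 1 calculation can be checked in Python. The function `binom_dist_cumulative` below is a hand-written stand-in for `BINOM.DIST(x, n, p, TRUE)`, not an Excel or library API.

```python
from math import comb

def binom_dist_cumulative(x, n, p):
    """Stand-in for Excel's BINOM.DIST(x, n, p, TRUE): returns P(X <= x)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# P(x >= 4) = 1 - P(x <= 3) for 10 throws with success probability 1/6
p_value = 1 - binom_dist_cumulative(3, 10, 1 / 6)
print(f"P(x >= 4) = {p_value:.6f}")
```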

Example 2 : We suspect that a coin is biased towards heads. When we toss the coin 9 times, how many heads need to come up before we are 95% confident that the coin is biased towards heads?

If we are sure that the coin is not biased towards tails, we can use a one-tailed test with the following null and alternative hypotheses:

H0: π ≤ .5

H1: π > .5

For a 95% confidence level, α = .05, and so

BINOM.INV( n, p , 1– α ) = BINOM.INV(9, .5, .95) = 7

which means that if 8 or more heads come up then we are 95% confident that the coin is biased towards heads, and so can reject the null hypothesis.

We confirm this conclusion by noting that P ( x ≥ 8) = 1–BINOM.DIST(7, 9, .5, TRUE) =  0.01953 < 0.05 = α , while P ( x ≥ 7) = 1–BINOM.DIST(6, 9, .5, TRUE) = .08984 > .05.
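The logic of Example 2 can be sketched in Python. The `binom_inv` function below is an illustrative re-implementation of the criterion Excel documents for `BINOM.INV` (the smallest value whose cumulative probability is at least the criterion), not the Excel function itself.

```python
from math import comb

def binom_cdf(x, n, p):
    """Return P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def binom_inv(n, p, target):
    """Smallest x with P(X <= x) >= target."""
    for x in range(n + 1):
        if binom_cdf(x, n, p) >= target:
            return x
    return n

x_inv = binom_inv(9, 0.5, 0.95)
p_reject = 1 - binom_cdf(x_inv, 9, 0.5)      # P(x >= x_inv + 1)
p_keep = 1 - binom_cdf(x_inv - 1, 9, 0.5)    # P(x >= x_inv)

print(f"BINOM.INV equivalent: {x_inv}")
print(f"P(x >= {x_inv + 1}) = {p_reject:.5f}, P(x >= {x_inv}) = {p_keep:.5f}")
```

The returned value 7 is the largest count that is still consistent with the null hypothesis, which is why rejection starts at 8 heads.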

Example 3 : Historically a factory has been able to produce a very specialized nano-technology component with 35% reliability, i.e. 35% of the components passed its quality assurance requirements. The management of the factory has now changed their manufacturing process and hopes that this has improved the reliability. To test this, they took a sample of 24 components produced using the new process and found that 13 components passed the quality assurance test. Does this show a significant improvement over the old process?

We use a one-tailed test with null and alternative hypotheses:

H0: p ≤ .35

H1: p > .35

p-value = 1–BINOM.DIST(12, 24, .35, TRUE) = .04225 < .05 = α

and so conclude with 95% confidence that the new process shows a significant improvement.

Two-sided Test

Example 4 : Many believe that drivers of flashy-colored cars (red, yellow, pink, orange, or purple) get pulled over more often for a driving violation. It is possible, however, that drivers of these cars are pulled over no more often or even less often. To get a deeper insight into this issue, a researcher conducted a study of the 50 cars that were pulled over in one month and she found that 7 cars had a flashy color. It is also known that about 20% of the cars sold in this area have a flashy color. Determine whether flashy-colored cars are pulled over differently from any other colored car.

This time we conduct a two-tailed test with the following null and alternative hypotheses where p = the percentage of cars pulled over that were flashy (in the entire population).

H0: p = .20

H1: p ≠ .20

Once again, we use the binomial distribution, but since it is a two-tailed test, we need to consider the case where we have an extremely low number of “successes” as well as a high number of “successes”. If we use a significance level of α = .05, then we have tails of size .025. The critical value on the left is BINOM.INV(50,.2,.025) = 5 and the critical value on the right is BINOM.INV(50,.2,.975) = 16. Since 7 is between these values, we cannot reject the null hypothesis and so there is no evidence that the police are pulling over drivers of flashy cars more or less often than drivers of other cars.

We can also use the one-tailed test but with α/2 as the significance level; i.e. BINOM.DIST(7,50,.2,TRUE) = .190 > .025 = α/2. Alternatively, we can calculate the p-value as for the one-tailed test and then double the result: p-value = 2*BINOM.DIST(7,50,.2,TRUE) = .381 > .05 = α, which yields the same conclusion that the null hypothesis shouldn’t be rejected.

If, of the 50 cases, 4 had been for flashy cars, then we would have rejected the null hypothesis since 4 is less than the left-side critical value. Note that BINOM.DIST(4,50,.2,TRUE) = .0185 < .025 = α/2. Similarly, we would have rejected the null hypothesis if 17 had been for flashy cars, since 17 is greater than the right-side critical value: P(x ≥ 17) = 1-BINOM.DIST(16,50,.2,TRUE) = .0144 < .025 = α/2.

Note that there is a lack of symmetry here since .0185 ≠ .0144. There is, however, symmetry when p = .5.

If 5 had been for flashy cars, then we wouldn’t have rejected the null hypothesis since BINOM.DIST(5,50,.2,TRUE) = .048 > .025 = α/2. Similarly, we would not have rejected the null hypothesis if 16 had been for flashy cars: P(x ≥ 16) = 1-BINOM.DIST(15,50,.2,TRUE) = .031 > .025 = α/2.
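The tail probabilities for Example 4 can be tabulated in Python as a cross-check. This sketch assumes the upper tail is computed as P(X ≥ x) = 1 − P(X ≤ x − 1), the same convention used in Examples 1 to 3; `binom_cdf` is an illustrative helper.

```python
from math import comb

def binom_cdf(x, n, p):
    """Return P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p_null, alpha = 50, 0.2, 0.05

lower_tail = {x: binom_cdf(x, n, p_null) for x in (4, 5, 7)}          # P(X <= x)
upper_tail = {x: 1 - binom_cdf(x - 1, n, p_null) for x in (16, 17)}   # P(X >= x)

for x, prob in lower_tail.items():
    print(f"P(X <= {x}) = {prob:.4f}, below alpha/2: {prob < alpha / 2}")
for x, prob in upper_tail.items():
    print(f"P(X >= {x}) = {prob:.4f}, below alpha/2: {prob < alpha / 2}")

p_two_sided = 2 * lower_tail[7]   # doubled one-tailed p-value for the observed 7
print(f"two-sided p-value for 7 flashy cars = {p_two_sided:.3f}")
```

Doubling the one-tailed p-value is only one of several conventions for a two-sided binomial p-value, so other software may report a slightly different figure for the same data.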

88 thoughts on “Hypothesis Testing for Binomial Distribution”

Charles, Took a bit to figure out how binom.test evaluates p-value.

Excel notation below produces same p-value provided by binom.test(x, n, p) in R.

A4 is an array, so need to hit ctrl+shift+enter.


Thanks, Mike

Correction A3: p should be A3: x

Great post Charles. Can I please ask a quick question? If I am looking at falls in a dementia home and they were 55% on average before an intervention but reduced to 30% after an intervention, can I use the binomial test to check whether the 35% is a significant improvement on 55% (although the falls being measured are of the same people)?

Thanks, Muzaffar. Do you know how many people are in the dementia home? Charles

Dear Charles,

I teach statistics in a master’s degree course at our university (VSB-Technical University of Ostrava, Czechia). Some of my students use R Studio for calculations, others use Excel with Real Statistics. Your tools are very useful and I am very glad we can use them. In most of the examples we solve, R and Real Stats give the same results. Unfortunately the one-sample proportion test is one of the exceptions.

There is no difference when a z-test is used. Both the One-sample Proportion Test tool and R’s function prop.test(x, n, p0) give the same results, where x is # of Successes, n is Sample size and p0 is Hyp Proportion. But a difference in results appears when the binomial test is used and the alternative hypothesis is “greater than” (p > p0). R’s function binom.test(x, n, p0) gives the same p-value as Excel’s 1 – BINOM.DIST(x-1, n, p0, TRUE). But in the One-Sample Proportion Test tool the formula 1-BINOM.DIST(x, n, p0, TRUE) is used. When A1 is the output range cell, the complete formula for the p-value (cell B10) is:

=B6*IF(B12<B5, BINOM.DIST(B4, B3, B5, TRUE), 1-BINOM.DIST(B4, B3, B5, TRUE))

To obtain a correct result in the One-Sample Proportion Test tool, the value x-1 must be used as # of Successes. Why? I am afraid it would be a source of misunderstandings and mistakes, and not only among my students. I think the better solution would be the modified formula:

=B6*IF(B12<B5, BINOM.DIST(B4, B3, B5, TRUE), 1-BINOM.DIST(B4-1, B3, B5, TRUE))

If this change were made in the next version of Real Stats, then the results in both programs (R and Real Stats) for the same input parameters would be equal. Please consider it.

Thank you very much
Vaclav

Hello Vaclav, Thanks for bringing this to my attention. I will make the suggested change in the next release. Charles

Vaclav, From your comment, I understand that 1 – BINOM.DIST(x-1, n, p0, TRUE) instead of 1 – BINOM.DIST(x, n, p0, TRUE) should be used in both the one-tailed and two-tailed tests. Please confirm. Charles

I have a question. A process is supposed to be done in a particular way 100% of the time. I have historical data on the process and can thus measure whether or not it was done correctly on a case by case basis. But, it is expensive to perform that measurement. My question is how many times do I have to perform the test, with randomly selected cases, in order to be confident that the process is running correctly. I assume that the first time that I get a negative result I can stop and conclude that the process is not being done correctly.

Thanks . . .

Hi Phil, Can you explain what sort of process you are referring to and how you determine whether or not it was done correctly? Presumably you are using some statistic, which when negative indicates that it is not running correctly. Charles

Hey, these examples are great, but could you take a look at this question: “A plastic bottle company suspects that 10% of all bottles coming from its production line are defective. A random sample of 20 bottles finds that 6 of these sampled bottles are defective. Which test based on the binomial distribution would you use to answer this question? Why?”

I’m a bit confused as to which test we would use… I assumed we use Lower-tailed test

Sammy, What are your null and alternative hypotheses? Charles

I had H0: p=.10 And H1: p<.10 (This one I'm not so sure about)

Sammy, H0: p = .10 implies a two-sided test with H1: p not = .10. This might be correct, but do you think that the question implies a one-sided test? If so, you should make the alternative hypothesis the hypothesis that you are trying to collect evidence for. Charles

Hi Charles! Great post!

Please help me with this question: Assume we randomly tested 10 individuals living in the rural area, and found that only 3 of them were positive for Zika virus infection. Use a binomial test to address this question: Is there sufficient evidence to determine whether the percent of individuals infected with Zika virus in the rural area differs from 86% or not?

This is a two-tailed test. Can you please tell me how to do it?

AK, Glad that you like the post. I have just added a fourth example, which is a two-tailed test. Charles

I have a question about Example 3:

what is the Excel formula to calculate the p-value for the following situation: from the 24 components, only 6 pass the test, instead of 13?

We assume that H1 > 35% but we actually have fewer than 35%. In this case we are not able to reject H0, but what is the p-value?

The same argument can be used as for Example 3, namely, calculate =1-BINOM.DIST(5, 24, 0.35, TRUE) to obtain a p-value = .896 >> .05. This shows that there is no evidence that the process is an improvement. Charles

In example 3, where did the number 12 come from? Is it 13-1?

Yoga, Yes. Charles

What is the theory behind subtracting 1? Why don’t we use 13 instead of 12?

Patrick, The probability of the event of 12 successes or fewer is BINOM.DIST(12,24,.35,TRUE) and so the complement of this event, namely 13 successes or more is 1-BINOM.DIST(12,24,.35,TRUE). Charles
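The complement argument can be checked numerically. This Python sketch uses the Example 3 figures (n = 24, p = .35, 13 successes) and hand-written helpers; it compares summing the upper tail directly with subtracting the cumulative probability at 12, and shows what goes wrong if 13 is used instead.

```python
from math import comb

def binom_pmf(k, n, p):
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(x, n, p):
    """Return P(X <= x), the analogue of BINOM.DIST(x, n, p, TRUE)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p_null, observed = 24, 0.35, 13

direct = sum(binom_pmf(k, n, p_null) for k in range(observed, n + 1))  # P(X >= 13)
via_complement = 1 - binom_cdf(observed - 1, n, p_null)                # 1 - P(X <= 12)
too_small = 1 - binom_cdf(observed, n, p_null)                         # P(X >= 14): drops X = 13

print(f"direct sum      = {direct:.5f}")
print(f"1 - cdf(12)     = {via_complement:.5f}")
print(f"1 - cdf(13)     = {too_small:.5f}  (excludes the observed value)")
```

The first two numbers agree, while the third is smaller because it leaves out the probability of exactly 13 successes.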

Great post. Is it right to say that it is easier to get a sig. effect if the comparison is occuring at the tails of the distribution (i.e p = 0.9 or = p = 0.1) compared to the middle (p=0.5) – granted sample size is held constant?

Hi Bob, I am not sure what you mean by where the comparison is occurring. By definition a significant result can only occur at the tails, but I am not sure that this is what you are asking. Charles

My test of expected pass rate as follows: 1. N = 65 2. Two outcome (failed, pass) 3. Probability of failure 1.5779% 4, If the actual failure is 3, can I say that I can still accept the expected failure rate of 1.577% based on confidence level of 95%

Hello Mok Wai Ming, This problem is very similar to Example 1. Charles

I’m having problems implementing a two-tailed test in Excel. What function do I use to estimate the number of heads or tails required to reject the null of a fair coin (95% level)? I’ve tried BINOM.INV(Tosses,0.5,0.025) compared against min(heads,tails), but if I feed this back into BINOM.DIST I get p values above 0.05.

Hello Bruce, This problem is similar to Example 2 on this webpage. The problem is likely to be that the last argument in your formula is 0.025 instead of a value such as .975. Charles

This is an assignment but I am completely lost. Any help

CASE STUDY The number of credit card holders of a bank in two different cities (city – X and city – Y) settling their excess withdrawal amounts in time without attracting interest follows binomial distribution. The manager (collections) of the bank feels that the proportion of the number of such credit card holders in the city – X is not different from the proportion of the number of such credit card holders in the city – Y. To test his intuition a sample of 200 credit card holders is taken from the city – X and it is found that 160 of them are settling their excess withdrawal amount in – time without attracting interest. Similarly a sample of 180 credit card holders is taken from the city – Y and it is found that 50 of them are settling their excess withdrawal amount in – time without attracting interest. Check the intuition of the sales manager at a significance level of 0.05.

It is a nice description of how to perform a one-sided test for Binomial data. I wonder if there exist a practical recommendation for how to do this in a two-sided case. Because the strong theory says we should look for an unbiased most powerful test, but I could not find any reference to a practical implementation of the respective procedure.

Andrey, You can treat Example 3 as a two-tailed test. Instead of 35% reliability, you can test for 35% having some other property where a lot more or less than 35% is bad. An example of this is winners in a lottery. Too few and people are not motivated to play; too many and the company loses money. Charles

Thank you Charles. But what I meant is whether there are binomial tests for H0: p = 0.35 vs. H1: p not equal to 0.35 (in the context of Example 3).

Andrey, Yes, you can test this null hypothesis. The approach is similar. See http://onlinestatbook.com/2/logic_of_hypothesis_testing/tails.html Charles

Thank you very much, Charles. It is still not clear, though, whether there exist recommendations for two-tailed tests for cases other than H0: p = 0.5.

Andrey, There is no reason that the null hypothesis needs to be p=.5. E.g. to determine whether a die is fair you would use p=1/6. Charles

Dear Charles,

I’m puzzled by a statistical issue. I would like to compare two diagnostic tests applied to the same samples (and the same number of samples), with the following results:

            Test 2 +   Test 2 –
Test 1 +       64         10
Test 1 –        3         36

I wanted to use the McNemar test but apparently it is recommended to use a binomial test (or sign test?) in that case because of the small number of patients (b+c = 13 (<25)). Could you tell me if this is correct and if yes, should I do a two-tailed test? Also, how do I run a binomial test when the answer is yes or no and not a percentage? Thank you in advance for your help,

Charlène, Yes, with a small sample you should use the binomial test. How to do this is described at https://real-statistics.com/non-parametric-tests/mcnemars-test/ Unless you are confident of the direction, you should use a two-tailed test. Charles

For the first and third examples, you use one less than the number of successes mentioned. For example, in the first example, a “3” was rolled 4 times, but in the excel function, you used 3 as the number of successes. Similarly, in the third example, there were 13 successes and you used 12 successes in the =1-BINOM.DIST function. What is your reasoning for doing this?

Caroline, In the first example, you want to find out the probability that three comes up 4 or more times (i.e. 4, 5, 6, 7, 8, 9 or 10 times). The BINOM.DIST(x,n,p,TRUE) function computes the probability that an event occurs at most x times (i.e. 0, 1, 2, …, x times). The probability that three comes up 4 or more times is equal to 1 minus the probability that three comes up at most 3 times, which is P(x ≥ 4) = 1–BINOM.DIST(3, 10, 1/6, TRUE). Another way to look at this is that P(x>=4) + P(x<4) = 1, and so P(x>=4) = 1 – P(x<4) = 1 - P(x<=3) = 1–BINOM.DIST(3, 10, 1/6, TRUE). Charles

What’s the approach using Excel for finding the upper limit of a sample proportion with a given level of confidence (1-alpha) for a one-tailed distribution.

Samuel, I think you are referring to the situation described on the webpage: https://real-statistics.com/binomial-and-related-distributions/proportion-distribution/ See, in particular, Examples 2 and 3. These examples calculate a two-tailed confidence interval. You need to use the one-tailed critical value instead of the two-tailed critical value. The other side of the confidence interval is infinity or negative infinity (depending on whether you are using the right or left critical value). Charles

Try to solve this for me.

The number of credit card holders of a bank in two different cities (city – X and city – Y) settling their excess withdrawal amounts in time without attracting interest follows binomial distribution. The manager (collections) of the bank feels that the proportion of the number of such credit card holders in the city – X is not different from the proportion of the number of such credit card holders in the city – Y. to test his intuition, a sample of 200 credit card holders is taken from the city – X and it is found that 160 of them are settling their excess withdrawal amount in – time without attracting interest. Similarly a sample of 180 credit card holders is taken from the city – Y and it is found that 50 of them are settling their excess withdrawal amount in – time without attracting interest, check the intuition of the sales manager at a significance level of 0.05.

Benson, This sounds like a homework assignment and I have decided that I shouldn’t do other people’s homework for them. In any case, whether or not this is a homework assignment, here is a hint: Look at the two sample hypothesis testing for the Proportion Distribution at Proportion Distribution Charles

As I understand it, the p-value is the probability that, using a given statistical model, the statistical summary (such as the sample mean difference between two compared groups) would be the same as or more extreme than the actual observed results (Wikipedia), given that the null hypothesis is true.

As for Example 1, we try to find the probability of 4 or more occurrences of #3, as the p-value, and compare it with α.

So, dbinom(0,10,1/6), the density of 0 occurrences of #3, is .1615056. Similarly,
dbinom(1,10,1/6) = .3230112
dbinom(2,10,1/6) = .29071
dbinom(3,10,1/6) = .1550454
Therefore, sum(dbinom(0,10,1/6) + … + dbinom(3,10,1/6)) = .9302722. Hence, the p-value of 4 or more #3 is 1 – sum(…) = 1 – .9302722 = .0697278, which is larger than .05; therefore, we fail to reject the null hypothesis.

Similarly, Example 2:
dbinom(6,9,.5) = 0.1640625
dbinom(7,9,.5) = .0703125
dbinom(8,9,.5) = 0.01757812
dbinom(9,9,.5) = 0.001953125
Therefore, if 8 or more heads come up, the null hypothesis should be rejected.

Example 3: Similar to Example 1: p-value = 1- pbinom(12,24, .35)= 1- .9577469 = .04225307, therefore, we reject null hypothesis. Same conclusion, but weaker.

William, Thank you for catching these errors. I have now corrected the referenced webpage. On behalf of all the users of this website, I appreciate your help in improving the accuracy and quality of the website. Charles

What if the test value isn’t given and you have to guess and find the critical region?

Sorry, but I don’t understand your question. Charles

Hello. I am a statistics student with a question. With a hypothesis test for a proportion using the binomial distribution, is it possible to have a left tail? For example, H0: p = .062, H1: p < .062. And if it is possible, how would that be solved?

Hi Allison, For Example 3 on the referenced webpage I calculated p-value = 1 – BINOM.DIST(13, 24, .35, TRUE). For the case of H0: p >=.062 H1: p < .062 I would calculate p-value = BINOM.DIST(x, n, p, TRUE). Charles

i appreciate the good job of this site

1. I am still analyzing the subject and found that, for example, Mathematica and Maple return values equal to those of Excel. MINITAB provides the following type of output: Inverse Cumulative Distribution Function

Binomial with n = 100 and p = 0,03

x   P( X <= x )      x   P( X <= x )
2   0,419775         3   0,647249

Statistical software in general associates the inverse of the distribution function F(x) with quantiles, calculated using the criterion of the BINOM.INV function.

2. In what concerns the code sent, I think there is one situation in which no correction should be made to the Excel value:

LEFT ONE-TAIL TEST
xBI: result of the BINOM.INV function
xLC: left tail critical value

x xLC Non rejection interval xBI >= xLC

If F(xBI) = alpha THEN xBI = xLC Do not subtract 1 from xBI

António, From what you are saying, it seems that there is a role for both a BINOM.INV function and a BINOM.CRIT function, where sometimes the values are different. I didn’t completely understand the situation where the values should be equal (your item #2). Charles

In reference to the messages posted earlier about the BINOM.INV function, we must take into account the following:
1. The function does not follow the rules you presented for an inversion function.
2. In fact, contrary to its name, the purpose of the function is not inversion but to answer the following type of question: what is the minimum number of tosses of a coin for which there is a p% chance of at least x heads.
3. This fact is reflected in calling alpha the criterion value and not the significance level or type I error. The function does not even know if the user is considering the right or left tail.
4. In fact the former CRITBINOM function had a more appropriate name.
5. To answer the type of question presented in 2, it is inevitable that for left tails we have: value returned = critical value + 1.
6. In view of this: (1) the right tail critical value is correct; (2) the left tail critical value is always inflated by one and needs to be corrected.
7. To perform this correction it is necessary to indicate which tail the wanted value refers to.
8. This can be achieved at spreadsheet level using formulas or with a UDF such as the one below:

Option Explicit
Option Base 1

Function xlBinom_CV(n As Integer, p, alpha, nTails, pTail)

    ' n      - sample size
    ' p      - probability
    ' alpha  - probability of type I error
    ' nTails - number of tails (1, 2)
    ' pTail  - position of the tail (1 - Lower, 2 - Upper)

    ' ATTENTION - This function is not fully tested

    With WorksheetFunction

        If nTails = 1 And pTail = 2 Then
            alpha = 1 - alpha
        ElseIf nTails = 2 And pTail = 1 Then
            alpha = alpha / 2
        ElseIf nTails = 2 And pTail = 2 Then
            alpha = 1 - alpha / 2
        End If

        If pTail = 1 Then
            xlBinom_CV = .Binom_Inv(n, p, alpha) - 1
        Else
            xlBinom_CV = .Binom_Inv(n, p, alpha)
        End If

    End With

End Function

An additional note about the VBA routine: a returned value of -1 indicates a non-existent left tail, which happens when P(x=0) > alpha (or alpha/2 for a two-tailed test).
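The left-tail case discussed here can also be explored outside Excel. The following Python sketch is only an illustration, not a port of the routine above: `binom_inv` imitates the documented rule of BINOM.INV (smallest k whose cumulative probability is at least alpha), while `lower_critical_value` is a hypothetical helper that searches directly for the largest k with P(X <= k) <= alpha.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

def binom_inv(n, p, alpha):
    """Smallest k with P(X <= k) >= alpha (the rule BINOM.INV is documented to follow)."""
    for k in range(n + 1):
        if binom_cdf(k, n, p) >= alpha:
            return k
    return n

def lower_critical_value(n, p, alpha):
    """Largest k with P(X <= k) <= alpha, or -1 if even k = 0 is too likely."""
    crit = -1
    for k in range(n + 1):
        if binom_cdf(k, n, p) <= alpha:
            crit = k
        else:
            break
    return crit

print(binom_inv(5, 0.5, 0.05), lower_critical_value(5, 0.5, 0.05))      # 1 0
print(binom_inv(5, 0.5, 0.1875), lower_critical_value(5, 0.5, 0.1875))  # 1 1
print(lower_critical_value(5, 0.5, 0.01))                               # -1
```

For n = 5 and p = 0.5 the two functions differ by one at alpha = 0.05, agree when the cumulative probability equals alpha exactly (0.1875), and the direct search returns -1 when no left-tail rejection region exists. Searching from the definition avoids having to remember when to subtract 1.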

Thanks Antonio for the clear explanation. I will add this function to the Real Statistics Resource Pack to help people with this concept. Charles

I found this site very helpful. Thank you. However, I differ in opinion regarding the “critical value” that Excel returns and at what point one Rejects the null hypothesis at the level of alpha. Focusing on example 2; it seems that the critical value returned by Excel is the value which causes the cumulative probability to pass from the “Fail to Reject” region into the “Reject” region–however, since this is a discrete, rather than continuous distribution, there is no distinct point at which this transition occurs (we “jump” from one cumulative probability to the next). Which means that “enough” (as it cannot specifically be assigned given we are dealing with a discrete distribution) of the probability for the specific occurrence for the “critical value” returned by Excel exists in the “Fail to Reject” region that to be at a minimum level of alpha, one would only reject if one observed a number of events GREATER than the critical value returned by Excel.

For instance, in example 2 the discrete probability of observing EXACTLY 7 heads is BINOMDIST(7, 9, .5, FALSE) = 0.0703, which is greater than alpha = 0.05 by itself! So the way I see it, to be confident at a minimum level of alpha, I could only reject if I observed greater than or equal to 8 heads. For this specific example, I do not see how it is possible to say I am rejecting p = 0.5 at a 95% confidence level if I observe 7 heads when there is a 7.03% likelihood of observing the exact outcome of 7 out of 9 heads given p is actually equal to 0.5. I contend that my confidence level to reject at 7 or more heads (rather than 8 or more) is only BINOMDIST(6, 9, .5, TRUE) = 91.02%. The next confidence level that exists for this specific example is our “jump” to observing 8 or more heads with a corresponding confidence of BINOMDIST(7, 9, .5, TRUE) = 98.05%.

Mike, Thanks for your comment. António Teixeira has just written what I found to be a very clear description of how we should look at this issue. See his comment on this webpage on 2015/10/19. Please let me know whether you agree with his approach. Charles

Pls how do I find P-Value

For the binomial distribution, use the BINOM.DIST or BINOMDIST function. Charles

I’ve got a set of data for occurrences of a health condition in a number of different geographical populations.

To see if the rate in a given country is significantly different from the overall worldwide average rate, is it valid to use BINOMDIST(no of cases, number in sample group, worldwide average rate, TRUE) and look to see if the value is 0.95 ?

Sarah, Do you have any evidence that this type of data has a binomial distribution (which if the number of countries is large enough is equivalent to having a normal distribution)? Charles

I think that this situation also happens with the Real Statistics Resource Pack inverse functions for Poisson and Hypergeometric distributions.

Probably so. I plan to revise all these functions along the lines that you have suggested. Charles

It’s not about what “some might consider”. It is about the definition. And the definition tells us that the critical value is the minimum number of events such that the probability of observing THAT MANY OR MORE events is LESS THAN OR EQUAL TO alpha. But what Excel function returns is the minimum number of events such that the probability of observing STRICTLY MORE events is LESS THAN OR EQUAL TO alpha.

Example. I will also take n = 5 and p = 0.5. The cumulative probability of observing 4 events is BINOM.DIST(4, 5, 0.5, TRUE) = 0.96875. Let’s take alpha = 0.05. Then Excel function returns BINOM.INV(5, 0.5, 1 – 0.05) = 4. However, the probability of observing 4 or more events is 1 – BINOM.DIST(3, 5, 0.5, TRUE) = 1 – 0.8125 = 0.1875 > 0.05. That’s because Excel returns a value for which the probability of observing STRICTLY MORE events is less than or equal to alpha, and strictly more than 4 is 5, and the probability of observing 5 or more events is 1 – BINOM.DIST(4, 5, 0.5, TRUE) = 0.03125 < 0.05. So the correct number actually is 5, not 4.
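Michael's right-tail definition is easy to check by brute force. The Python sketch below is an illustration under the definition he states, k_crit = min{k : P(X >= k) <= alpha}; the function names are invented for this example and do not correspond to any Excel function.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def upper_critical_value(n, p, alpha):
    """min{k : P(X >= k) <= alpha}; returns n + 1 if no k in 0..n qualifies."""
    for k in range(n + 1):
        if upper_tail(k, n, p) <= alpha:
            return k
    return n + 1

print(upper_tail(4, 5, 0.5))                # 0.1875, too large for alpha = 0.05
print(upper_tail(5, 5, 0.5))                # 0.03125
print(upper_critical_value(5, 0.5, 0.05))   # 5
print(upper_critical_value(9, 0.5, 0.05))   # 8
```

With n = 5 the search gives 5 rather than the 4 returned by BINOM.INV(5, 0.5, 0.95), and with n = 9 it gives 8, in line with the argument made earlier in the thread about Example 2.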

Cheers Michael

For any distribution with cumulative distribution function F(x), the inverse distribution function I(alpha) should equal the smallest x such that F(x) .05, the critical value is 0 and not 1. But BINOM.INV(5,.5,.05) = 1 and so Excel doesn’t find the right answer.

If instead we take alpha = .95 (the right tail), in Excel we get BINOM.DIST(3,5,.5,TRUE) = .8125, BINOM.DIST(4,5,.5,TRUE) = .96875 and BINOM.DIST(5,5,.5,TRUE) = 1. But BINOM.INV(5,.5,.95) = 4, which is the smallest value where the null hypothesis is rejected, which I believe is the correct answer.

So it seems that at the very least, Excel is inconsistent, producing the correct answer on one tail and the incorrect answer on the other tail.

Of course, many of the tables of critical values that I have seen published use a different definition of critical value, namely the smallest value not in the critical region on the left tail and the largest value not in the critical region on the right (the critical region is again defined as the region where the null hypothesis is rejected). In this case, Excel is still incorrect on one tail and correct on the other tail for the binomial distribution.

The issue of what is significant is also quite confusing in the literature. What happens when the p-value is exactly equal to .05 (or some other value of alpha). Usually based on the definition given, the null hypothesis is rejected when p <= .05; having said this whenever a significant result occurs, the result is written up as "p < .05" and not as "p <= .05". Charles

Nope, it did not come out again. Let me try for the last time. What CRITBINOM gives is: k_excel = min{k : P(X>k) <= alpha}.

This time it worked. So, we see that k_crit k_excel since for discrete distributions, such as the binomial, P(X>=k) P(X>k). Thanks and sorry for polluting your site with several posts. Feel free to correct formulas in the first one and delete all the others.

What’s going on? There were (should be) “not equal” signs between the k’s and P’s. PS: Maybe you should allow some LaTeX type support in comments. Cheers.

Michael, Sorry, but I no longer understand what your final comment is. Please send one more comment which captures what you are trying to say without referring to any of the previous comments. Charles

Hi Charles, Sorry for the confusion. The point is that Excel function returns k for which P(X >k) is = k) is <= alpha. This two are not the same.

Hi Charles, Sorry for the confusion. The point is that Excel function returns k for which a probability of observing a value strictly greater than k is less than or equal to alpha. But the critical values of k is defined as that for which a probability of observing a value greater than or equal to k is less than or equal to alpha. There, said it in words. This should now come out right.

Let’s use a specific example. Suppose we are looking at a binomial distribution with n = 5 and p = .5. The probability of 0 successes is BINOM.DIST(0,5,.5,FALSE) = .03125 and the probability of 1 success is BINOM.DIST(1,5,.5,FALSE) = .15625. It also follows that the probability of 0 or 1 successes is given by .03125 + .15625 = .1875 or BINOM.DIST(1,5,.5,TRUE) = .1875.

Now let alpha = .1875. The critical value as defined by Excel is BINOM.INV(5,.5,.1875) = 1, whereas BINOM.INV(5,.5,.18749999) = 1 and BINOM.INV(5,.5,.187500001) = 2. Some might consider the critical value for alpha = .1875 to be 2 instead of 1.

Hi, Charles. I think the calculation of a critical value of events you demonstrate using CRITBINOM function is not correct. The critical value is defined as

k_crit = min{k : P(X>=k) <= alpha}

but what the Excel function returns is

k_excel = min{k : P(X= 1- alpha} = min{k : 1- P(X<=k) k) k) P(X>=k), the difference being, of course, P(X=k).

I am not sure, actually, if there is a simple way to get the correct critical value in Excel using CRITBINOM. Any suggestions?

Thanks Michael

Sorry, don’t know what happened with formulas in my previous post above. The second one should read

k_excel = min{k : P(X= 1- alpha} = min{k : 1- P(X<=k) k) <= alpha }

Hope this one comes out OK.

I understand it now. Excel gives the cumulative probability, which for 0 to 7 heads is 0.9805. But 7 heads in itself has an exact probability of 0.0703, see the table below. So when Excel says 7 heads is the critical value, it means that 8 and above is 95% confident. Adding the exact probability of 8 heads (0.01758) and 9 heads (0.00195) gives a probability of 0.01953. 2-tailed that is 0.04, like my statistics program says. So when Excel says 1-BINOM.DIST(8,9,.5,TRUE) = 0.001953, it is one minus the cumulative probability of getting 0 to 8 heads, and that is the same as the probability of getting 9 heads.

Number of Successes   Number of Failures   Exact Probability   Cumulative Probability
0                     9                    0.195%              0.195%
1                     8                    1.758%              1.953%
2                     7                    7.031%              8.984%
3                     6                    16.406%             25.391%
4                     5                    24.609%             50.000%
5                     4                    24.609%             74.609%
6                     3                    16.406%             91.016%
7                     2                    7.031%              98.047%
8                     1                    1.758%              99.805%
9                     0                    0.195%              100.000%

Erik, If you want the pdf instead of the cdf, change the last argument from TRUE to FALSE. Charles

Hello, I have a question about example 2, tossing a coin 9 times and the result of the Critbinom function is 7 heads. When I input that in my statistical program and choose Non-parametric statistics – Binomial test, using a test proportion of 0.5, it gives a p-value of 0.18 (2-tailed)! 8 heads out of 9 tosses gives a p-value of 0.04 (2-tailed). Doesn’t that mean that we need 8 heads to be 95% confident that the coin is biased towards heads?

Erik, I am not sure where you got your p-value from, but 1-BINOM.DIST(8,9,.5,TRUE) = 0.001953. Thus p-value = .003906, which is close to .004 not .04. Charles

Hello Charles, I got the p-value from my statistics program PSPP, it is similar to SPSS. I get the same result at this site: http://graphpad.com/quickcalcs/binomial1/

Here is the result: “Sign and binomial test Number of “successes”: 7 Number of trials (or subjects) per experiment: 9 Sign test. If the probability of “success” in each trial or subject is 0.500, then: The one-tail P value is 0.0898 This is the chance of observing 7 or more successes in 9 trials. The two-tail P value is 0.1797 This is the chance of observing either 7 or more successes, or 2 or fewer successes, in 9 trials. ”
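The one-tail and two-tail values quoted from the calculator can be reproduced with a short Python sketch. The two-tailed rule used here (sum the probabilities of every outcome that is no more likely than the observed one) is an assumption about one common convention; for p = 0.5 it coincides with "7 or more, or 2 or fewer". The function names are illustrative only.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def one_tail_upper(k, n, p):
    """P(X >= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def two_tail(k, n, p):
    """Sum of the probabilities of all outcomes no more likely than the observed k."""
    observed = binom_pmf(k, n, p)
    tol = 1 + 1e-7  # small relative tolerance so floating-point ties are included
    return sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1))
               if q <= observed * tol)

print(round(one_tail_upper(7, 9, 0.5), 4))  # 0.0898
print(round(two_tail(7, 9, 0.5), 4))        # 0.1797
print(round(two_tail(8, 9, 0.5), 4))        # 0.0391
```

The last line shows where the two-tailed value of about 0.04 for 8 heads out of 9 comes from.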

Based on the problem, the question was “how many heads you must observe so that the probability of getting head is not equal to 5/17 on the average?”. How was that?

Rochelle, Remember that the probabilities that an event E occurs or doesn’t occur are interrelated, namely the probability that event E doesn’t occur is equal to 1 – the probability that it does occur. In your problem you need to look at confidence intervals. Inside the interval you are confident that some event occurs, while outside that interval you are confident that the event doesn’t occur. Charles

Hi! This site helps me a lot in answering some of our assignments. But there’s one problem that I, myself, can’t understand. Hope you can help me. Thank you!

Suppose a coin is tossed 170 times and you observed that 50 heads and 120 tails. Now, if you are going to toss this coin 23 times, how many heads you must observe so that the probability of getting head is not equal to 5/17 on the average? (Hint: Hypothesis testing with interval estimation)

Rochelle, I am reluctant to do your homework assignment, but I will give you a possible hint. If you were told that the coin is biased so that the probability of a head occurring is 5/17, then you could use the approach shown in Example 2 of the referenced webpage. Here 50 heads and 120 tails yields 50/170 = 5/17. Since you were told to use confidence intervals, you need to look beyond just the averages and at some interval around 5/17 (see how to calculate confidence intervals). Charles

How will we know how many number of heads?

Hi, this website is so helpful! Really glad I found it. I wonder if you could help me with a problem too. My English isn’t that good so I hope you understand. I did a discrimination test in school with two brands of popcorn. There were two cups of B-brand cheap popcorn and 1 cup of A-brand popcorn. The students that follow the same subject (statistics) were the test persons. There were 16 test persons, 8 of whom pointed out the A-brand popcorn as the different one, which was correct. Now I have to analyse these results with the five steps of hypothesis testing from the book of Agresti & Franklin. Can I use a binomial one-tailed test? And is Ho: p=0.33 and Ha: p>0.33? I’m not sure if I’m doing it right and I’m not allowed to use my Texas TI-83 calculator. And can I use α=0.05? I hope you can help me out or give me some hints. Thanks, Eliza.

Eliza, If I understand the problem correctly, then I believe that the approach you are taking is correct. Assume p = probability of selecting A-brand on any single trial = 1/3, based on the null hypothesis that people pick completely at random. You can then apply the binomial one-tailed test as you have described. Charles

Hello, I like your website. I wonder if you could help me with a problem. I am a custody evaluator and I want to examine 89 reports to assess whether I am biased in my decisions, for either fathers or mothers. Probability of an outcome is of course 50:50. So the null hypothesis is that mothers and fathers are equal. Can the binomial test be used to examine whether my outcomes depart from equality? How many decisions in either direction would be necessary to show a bias in 89 reports? Thanks, Marvin

Hello Marvin, It sounds like your problem is equivalent to Example 2 on the referenced webpage with n = 89 and p = .5. Charles
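For a question of this kind, the two-sided thresholds can be found by direct search. The following Python sketch assumes a two-tailed test at alpha = 0.05 with the error split equally between the tails (neither choice is stated in the question), and the helper names are hypothetical.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def two_sided_thresholds(n, alpha):
    """For p = 0.5: smallest k_hi with P(X >= k_hi) <= alpha / 2, and its mirror image k_lo."""
    # k = n + 1 always qualifies (empty tail), so the search is guaranteed to stop
    k_hi = next(k for k in range(n + 2) if upper_tail(k, n, 0.5) <= alpha / 2)
    return n - k_hi, k_hi

k_lo, k_hi = two_sided_thresholds(89, 0.05)
print(f"evidence of bias if one side is favoured {k_hi} or more times (or {k_lo} or fewer)")
```

Because the distribution is symmetric when p = 0.5, the lower threshold is simply n minus the upper one.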


Cantor’s Paradise

Maike Elisa

Oct 15, 2021


Hypothesis Testing Using the Binomial Distribution

An intuitive introduction and the 5-step process.

Let’s play a game and throw a coin. Heads, I win. Tails, you win.

We start throwing the coin. First up is heads. I win.

We throw again. Heads. I win.

Again. Heads. I win.

This is the point where most people start revolting. “You’re cheating” — they’re accusing me. But why? I challenge you to think about the reasoning why my students start accusing me of cheating right at this moment and I challenge you to formulate it as precisely as you can.

Before you read on, really think about why they don’t believe me. Because if you come up with it yourself, I promise you, you won’t ever forget the basic principle behind hypothesis testing.

Whenever I play this game in a (virtual) classroom, eventually, somebody comes up with the answer I was looking for:

Because, if the coin was fair, you probably wouldn’t have rolled 4 heads in a row.

This is precisely the basic thought behind hypothesis testing. It’s also a true statement, so let’s examine it. If the coin is fair, there is a 50% chance we get heads and a 50% chance we get tails each time we throw it.

The probability of getting four heads in a row is thus given by

P(4 heads in a row) = 0.5 · 0.5 · 0.5 · 0.5 = 0.5⁴ = 0.0625.

We, therefore, get a 6.25 % chance of having four heads in a row (if the coin is fair). That’s unlikely, but not impossible, yet, it is the exact reasoning that is behind hypothesis testing.

In hypothesis testing, somebody usually wants to prove a (new) belief H₁. In this case: “Your professor is cheating on you, the coin is unfair and the probability p for getting heads is larger than 50 %” or mathematically:

H₁: p > 0.5

Now the basic idea is as follows: To prove H₁, we first assume that the opposite is true. The opposite is called the null hypothesis and abbreviated with H₀. In this case: “The coin is fair or even more inclined to show tails (rather than heads)”. The probability p for heads is, therefore, equal to 50 % or smaller.

The problem is: how are we ever going to precisely verify this value p, i.e. the true probability for throwing heads? No matter how often we throw the coin, we can never know for sure. So the true probability p for getting heads will forever be unknown to us and by throwing the coin, we will only get approximations of the true p.

What we can do though, is to assume for now that the probability of getting heads is in fact 50 % (i.e. H₀ is true) and then compute the probability of our observation (given that H₀ is true, we have to assume some underlying probability otherwise we cannot make calculations). In our case, we compute the probability of throwing 4 heads in a row given that the coin is fair, i.e. given that p=0.5 .

We see that this probability is rather small and it is clear, the smaller this probability, the less we’re willing to believe that H₀ is actually true.

Of course, even if the probability we compute is rather low, H₀ could still be true and our coin could still be fair. It’s just hard to believe.

Therefore, if this probability that we computed is too low, we don’t believe H₀ anymore. We say, that we reject it.

The big question is: How low is too low? In many applications, the value 5 % is used. I can’t tell you exactly where these 5 % come from, but I can tell you something else that’s quite interesting:

Whenever we do experiments such as throwing the coin or rolling the dice, it’s once we get close to this 5 % mark that students intuitively don’t believe me anymore and accuse me of cheating.

In the above coin example, once I have 3 heads in a row, they tend to lose trust. With 4 heads in a row, they start to accuse me of cheating. With 5 heads, the accusations get very loud.

This happens without actually calculating any probabilities and seems to be some human intuition. Interestingly, this has also happened with other experiments (rolling dice, etc.) that once we get close to the 5 % mark, the accusations get louder.
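To see how close these streaks are to the 5 % mark, the probabilities can be computed directly. This is a minimal Python sketch assuming a fair coin and independent throws; `streak_probability` is a name chosen for illustration.

```python
def streak_probability(n_heads, p=0.5):
    """Probability of n_heads heads in a row when each throw shows heads with probability p."""
    return p ** n_heads

for n_heads in (3, 4, 5):
    print(n_heads, f"{streak_probability(n_heads):.3%}")
```

Three heads in a row have probability 12.5 %, four have 6.25 % and five have 3.125 %, so the 5 % mark is crossed between the fourth and fifth head.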

This 5 % value is called the error level α. It is most commonly called the significance level, although some texts use “significance level” for the value 1 − α, in this case 95 % (more usually called the confidence level). The value of α should always be decided upon before conducting the experiment, and in many cases a 5 % value is chosen.

How does hypothesis testing work? A brief explanation.

So, therefore, whenever we want to do hypothesis testing, we first define our hypotheses and decide upon an error level (something like 5%, 1%,..). Along with this error level, we should decide the exact experiment we would like to conduct, e.g. throw a coin 4 times.

From here, we compute something called a decision rule based upon this error level and then conduct the experiment.

In the last step, we decide what to do with our hypotheses, i.e. whether to reject H₀ or not reject H₀. It’s important to notice that we just reject H₀ or we don’t reject it. In case of not rejecting it, it does not mean we have proven it.

In the practical coin example, it just means that we didn’t find enough evidence to accuse me of cheating. It’s still possible that the coin is rigged.

The same principle applies every day in courts: Just because you cannot prove that somebody is guilty, it doesn’t mean that person is innocent.

In criminal law, we say that a person is innocent until proven guilty. In hypothesis testing, we say that H₀ is true unless proven impossible.

That means that there is a strong bias towards H₀ and if you don’t have a strong bias towards either statement, hypothesis testing might not be for you.

General Procedure in Hypothesis Testing

Let’s summarize the steps of hypothesis testing again.

Step 1: Figure out the new belief that you would like to prove (H₁) and define its opposite (H₀).

Step 2: Determine your error level α and the exact experiment you would like to conduct, usually this means figuring out the number n of times you would like to repeat a base experiment (such as throwing a coin).

Side Note: This wasn’t perfectly done in the introductory example because I never specified beforehand how many times I would throw the coin.

Step 3: Based upon the hypothesis, the error level, and the experiment, determine the decision rule.

Step 4: Conduct your experiment.

Step 5: Conclude according to the decision rule determined in step 3.

Coin Example — Step by Step

Let’s get back to the coin example and go through the steps.

Step 1: Determining H₀ and H₁

We would like to prove that the coin is rigged.

There are two ways to formulate this hypothesis. We could be interested in the coin being rigged in one direction only. For example, if we believe the other person (who picked heads) is cheating, we would only expect a higher probability for heads. We wouldn’t consider the possibility that getting tails has a probability higher than 50 %. In this case, we would choose

H₀: p ≤ 0.5 and H₁: p > 0.5.

This is called a one-sided hypothesis testing or even more specifically, a right-sided hypothesis test, because we reject H₀ “on the right side”.

If, for example, we are a neutral referee though, we might consider that either of the parties is cheating and that either heads or tails could have a probability significantly larger than 50 %. We would then choose

H₀: p = 0.5 and H₁: p ≠ 0.5,

therefore using something called a two-sided hypothesis. In this case, we would only want to test if the coin is fair or not.

Since we have a strong feeling though that the coin shows more heads than tails, let’s settle for the hypotheses

H₀: p ≤ 0.5 and H₁: p > 0.5.

Step 2: Error Level and Experiment

In step 2, we want to determine the error level and the exact experiment. In the introductory example, this was not done perfectly because we just threw the coin and at some point decided it looked like a fraud. Usually, one should decide beforehand how many times the coin should be thrown.

Therefore, let’s throw the coin 10 times and decide upon α = 0.05.

Step 3: Decision Rule

Now comes the decision rule. This is where it gets interesting. Remember that the basic principle of hypothesis testing is to assume that H₀ is true. Therefore, we assume that p ≤ 0.5 is true. And, for simplicity of calculations, let’s assume that p=0.5, i.e. the coin is perfectly fair. (We’ll come back to why this is okay later).

Now, the question is, given that H₀ is true, what is the probability to observe certain results?

How is the number X of heads distributed given that we throw the coin 10 times? This can be described using a binomial distribution with parameters n=10 and p=0.5.

(If you want a deeper explanation of the binomial distribution, check this story here).

We want to compute the probabilities for getting k or fewer heads when throwing 10 times. Computing the values of the binomial distribution, we get the following table:

From this table, we can now conclude our decision rule. Remember that we get suspicious if the number of heads is too high. Specifically, if the number is so large that the probability of observing this number or something more extreme (given a fair coin) is α or lower. For a fair coin, the probability of 8 or more heads is 56/1024 ≈ 5.47 %, which is still above α = 5 %, whereas the probability of 9 or more heads is 11/1024 ≈ 1.07 %. We should therefore get suspicious if we observe 9 heads or more.

Therefore, the decision rule should be:

If 9 or more heads are observed, reject H₀. At an error level of 5 %, we conclude that the coin is rigged. If 8 or fewer heads are observed, we cannot reject H₀ and we cannot statistically prove that the coin is rigged.
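The table referred to in step 3 can be regenerated with a few lines of Python. This is a sketch under the article's setup (n = 10 throws, p = 0.5, α = 0.05); the variable names are chosen for illustration, and the rejection threshold is taken to be the smallest number of heads whose upper-tail probability does not exceed α.

```python
from math import comb

n, p, alpha = 10, 0.5, 0.05

# Probability of exactly k heads for k = 0..n
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

print(" k   P(X<=k)   P(X>=k)")
for k in range(n + 1):
    lower = sum(pmf[: k + 1])   # k or fewer heads
    upper = sum(pmf[k:])        # k or more heads
    print(f"{k:2d}   {lower:.4f}    {upper:.4f}")

# Smallest number of heads whose upper-tail probability is at most alpha
critical = next(k for k in range(n + 1) if sum(pmf[k:]) <= alpha)
print("reject H0 if the number of heads is at least", critical)
```

Printing both tails side by side makes it easy to read off which observed counts would lead to rejecting H₀.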

Step 4: Conduct the Experiment

Now comes the moment to actually throw the coin 10 times:

Heads, heads, tails, heads, tails, heads, tails, heads, heads, heads.

Overall 7 heads and 3 tails.

Step 5: Experiment Conclusion

Taking the result from step 4 (7 out of 10 heads) we look back at our decision rule. Since 7 heads falls in the “cannot reject” part of the decision rule, we cannot reject H₀. Therefore we cannot reject the claim that the coin is fair and we cannot prove that it’s rigged. For now, we have to live with the assumption that the coin is fair.

Does it mean that the coin is actually fair? No.

It just means that we don’t have enough evidence to prove the opposite. Because if the coin was fair, there would still be a probability of more than 5 % (in this case, 176/1024 ≈ 17.19 %) that the observed result (or something more extreme) would occur.

In hypothesis testing, it’s important to remember that we never (statistically) prove H₀. We can only (statistically) prove H₁.

This is because, from the beginning, there is a strong underlying bias towards H₀. H₀ is assumed to be true unless proven otherwise (i.e. unless rejected). This means that if you have no bias to start with, hypothesis testing might not be for you.

This also explains why, in step 3, we set p=0.5. Why is it ok to assume that p=0.5 when H₀ actually says p ≤ 0.5? If you play around with some values in the binomial distribution and, for example, choose p=0.4, you will notice that the probabilities of observing many heads only become smaller. A decision rule that keeps the error below α for p=0.5 therefore keeps it below α for every p ≤ 0.5, and we are even less likely to wrongly reject H₀ and (statistically) prove H₁. The bias towards H₀ would only become stronger.
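The effect of choosing a smaller p can be checked numerically. The Python sketch below compares the probability of observing k or more heads in 10 throws for p = 0.5 and for p = 0.4 (the alternative value mentioned in the text); `upper_tail` is an illustrative helper name.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(" k   p=0.5    p=0.4")
for k in range(7, 11):
    print(f"{k:2d}   {upper_tail(k, 10, 0.5):.4f}   {upper_tail(k, 10, 0.4):.4f}")
```

In every row the value for p = 0.4 is smaller, which is why p = 0.5 is the least favourable case within H₀ and the one used to build the decision rule.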

From experience, a lot of students (both in high school and in university) fear hypothesis testing. The basic idea behind it is fairly simple though.

The most challenging parts are usually given in step 3 and step 5. In step 3, the underlying distribution (here it was a binomial distribution) has to be determined (which, admittedly, can sometimes be tricky or even unclear). In step 5, we have to draw the right conclusions. This might be a bit tricky at times.

At the end of the day (or the research paper) hypothesis testing always follows the same 5 steps.



The Binomial Test

Elliot McClenaghan


What is the Binomial test?

The Binomial test, sometimes referred to as the Binomial exact test, is a test used in sampling statistics to assess whether a proportion of a binary variable is equal to some hypothesized value. In this article, we explore the key features of this test and walk through an example test. 

What are the hypotheses of the binomial test?

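The hypotheses for the Binomial test are as follows:

Null hypothesis (H0): the population proportion is equal to the hypothesized value.

Alternative hypothesis (H1): the population proportion is not equal to the hypothesized value.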

Sometimes you may instead want to test whether the population proportion is specifically greater than (or less than) the hypothesized value, rather than different in either direction, in which case you would perform a one tailed significance test. More commonly, the two tailed approach is used.

Note that there is no test statistic generated in a Binomial test as is common in other statistical tests such as the Mann-Whitney U Test or the unpaired Student’s t-test , due to the p-value being calculated directly.

When to use the Binomial test

The Binomial test is used when a binary variable of interest (a variable that can take only two possible values e.g. mortality (dead/alive)) is being investigated and you have a hypothesized or expected value with which to compare it. The test can only be used when sample size is small compared to the population about which you are trying to make an inference.


Changes to the shape of a Binomial distribution at varying values of the proportion of successes, p, and number of trials, n.

The Binomial test is derived from the Binomial distribution, which can be thought of as the distribution that is followed by the number of ‘successes’ or ‘failures’ in a certain number, n , of repeated independent experiments or ‘trials’. In more statistical language, we can say that the distribution relies on the values of n and p (the probability of any trial being a success), and that these are the parameters of the Binomial distribution. It is useful to note that as the sample size (the value of n ) increases, the distribution becomes more symmetrical and converges to a Normal distribution.

Binomial test assumptions

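Assumptions for the Binomial test are as follows, and can be easily remembered using the ‘BINS’ acronym:

B: Binary, each trial has only two possible outcomes.
I: Independent, the outcome of one trial does not affect the others.
N: Number, the number of trials is fixed in advance.
S: Same, the probability of success is the same for every trial.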

Binomial test example

Suppose a population health researcher carries out a small random sample survey to estimate the prevalence (the proportion of a population affected) of herpes simplex virus (HSV), a common viral infection that causes genital and oral herpes. A total of 20 people are selected at random (n = 20). The participants are independent of one another, each has the same probability of having the outcome, and the outcome of interest is binary (presence of HSV; yes/no).

We can thus conceptualize this as a series of 20 independent trials, with the number of people with the infection following a Binomial distribution with parameters n = 20 and p. Suppose the survey found that 6 (30%) of the 20 participants had HSV, so the observed proportion is 0.3. Suppose also that a previous survey found the prevalence of HSV to be 20% (this could be from the same population or a comparable population); the researchers use this as the hypothesized value of p against which to test the current survey proportion.

The next step is to run the Binomial test and generate a p-value, which denotes the probability of observing a proportion of people with HSV as extreme as, or more extreme than, the one observed if the true p were equal to the hypothesized value. Statistical packages such as Stata, SPSS or RStudio can be relied upon to generate the Binomial test p-value, but for illustrative purposes the formula is detailed below. If we have n independent trials, each with probability p of having HSV, we can calculate the probability of observing exactly r HSV cases using the following formula (under the hypothesized prevalence the expected number of cases is 4, as 20% of 20 is 4):

$$P(X = r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{n-r}$$

Evaluating the Binomial formula with p = 0.2 for each value of r from 6 to 20 and summing the results gives 0.196, the probability of 6 or more HSV cases out of 20 (one-tailed test). Since our hypothesis of interest is whether the observed and hypothesized values differ in either direction, we want a two-tailed test, and so we multiply by 2 to get a final p-value of 0.392.
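
As a rough check of these figures, the calculation can be sketched in Python using only the standard library. The helper name `binom_pmf` and the variable names are assumptions made for illustration. The doubling step follows the text; statistical packages may use a different two-tailed convention and report a slightly different value.

```python
from math import comb

def binom_pmf(r, n, p):
    """Binomial formula: n! / (r! (n - r)!) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p0, observed = 20, 0.2, 6  # values from the HSV example

# One-tailed p-value: probability of 6 or more cases if the true prevalence is 0.2
p_one_tailed = sum(binom_pmf(r, n, p0) for r in range(observed, n + 1))

# Two-tailed p-value by doubling, as described in the text (capped at 1)
p_two_tailed = min(1.0, 2 * p_one_tailed)

print(round(p_one_tailed, 3), round(p_two_tailed, 3))  # 0.196 0.392
```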

The Binomial formula features factorials, represented by an exclamation point. A factorial is calculated by multiplying the number by every smaller whole number down to 1; for example, 4! = 4 × 3 × 2 × 1 = 24.

Using a significance level of α = 0.05, we fail to reject the null hypothesis because p > 0.05, and conclude that there is no evidence of a statistically significant difference between the prevalence of HSV in the current survey and that in the previous survey, given the sample size.

Elliot McClenaghan is a research fellow in Epidemiology and Medical Statistics at the London School of Hygiene & Tropical Medicine  


Test of Hypothesis with Binomial Distribution

It is known that 40% of a certain species of birds have characteristic B. Twelve birds of this species are captured in an unusual environment and 4 of them are found to have characteristic B.

Is it reasonable to assume that the birds in this environment have a smaller probability of having characteristic B than the species in general?

I assumed that this is a binomial case with $p=0.4$ and $n=12.$ Then to figure out if the assumption of the birds having smaller probability than the species is correct, I tried to do it using hypothesis testing and finding the P-value, but I got confused. Any help will be appreciated!


Null and Alternative Hypotheses. You want to test $H_0: p = .4$ against $H_a: p < .4.$

In $n = 12$ observations you observe $X = 4$ birds of Type B. If the null hypothesis is true $E(X) = np = 12(.4) = 4.8.$

While it is true that you observed fewer than the 'expected' number of birds of Type B, the question is whether 4 is enough smaller than 4.8 to reject $H_0,$ calling this a 'statistically significant' result.

Finding the P-value. The P-value is the probability (assuming $H_0$ to be true) of a result as extreme or more extreme (in the direction of the alternative) than the observed $X = 4.$

If $X \sim \mathsf{Binom}(n = 12,\, p = .4),$ then the P-value is $P(X \le 4) = 0.4382.$

This can be computed with a calculator using the PDF of $\mathsf{Binom}(12, .4),$ and evaluating $P(X \le 4) = P(X=0) + P(X=1) + \cdots + P(X=4),$ or by using software; in R, for example, pbinom(4, 12, .4) gives this probability.
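
As a cross-check outside R, the same sum can be sketched in Python with only the standard library. The helper name `binom_cdf` is an assumption for illustration and is not part of the original answer.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binom(n, p), summing the individual probabilities."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

# P-value for the bird example: H0: p = .4, n = 12, observed X = 4
p_value = binom_cdf(4, 12, 0.4)
print(round(p_value, 4))  # 0.4382
```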

Conclusion. So the P-value of the test is about 0.44, which is not surprisingly small. Testing at the 5% level of significance, one would not reject $H_0$ unless the P-value is less than 0.05. Thus, seeing 4 birds of Type B is consistent with $H_0$ by the usual standards of statistical significance. (This is the same as the conclusion in the Comments of @lulu and @DavidQuinn, even if perhaps not for precisely the same reasons.)

P-value by Normal Approximation. Alternatively, an approximate value of this probability can be found by using the normal approximation to the binomial distribution (with continuity correction): $\mu = E(X) = 4.8,$ as above, and $\sigma = SD(X) = \sqrt{np(1-p)} = 1.6971.$ Then the 'best-fitting' normal distribution is $\mathsf{Norm}(\mu = 4.8, \sigma = 1.6971).$ The approximation is as follows:

$$P(X \le 4.5) = P\left(\frac{X-\mu}{\sigma} \le \frac{4.5-4.8}{1.6971} = -0.1768\right) \approx P(Z \le -0.18) = 0.4286,$$

where $Z$ has a standard normal distribution, so that the approximate probability can be found using printed normal tables. Slightly more accurately (without rounding to use tables), the normal approximation of the P-value can be found using software:
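
One way to do this, sketched here in Python rather than R, uses `statistics.NormalDist` from the standard library. The variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

n, p0 = 12, 0.4
mu = n * p0                      # E(X) = 4.8
sigma = sqrt(n * p0 * (1 - p0))  # SD(X), about 1.6971

# Continuity correction: approximate P(X <= 4) by the normal probability below 4.5
approx_p = NormalDist(mu, sigma).cdf(4.5)
print(round(approx_p, 4))
```

The result is close to, but not identical with, the table-based value 0.4286, because the z-score is not rounded to two decimal places first.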

Sketch of Null Binomial Distribution. Below is a plot of $\mathsf{Binom}(12, .4)$ (black bars) compared with the PDF of the 'best fitting' normal distribution (blue curve). The P-value is the sum of the heights of the bars to the left of the vertical red line.



