Chi-Square Test of Independence | Formula, Guide & Examples
Published on May 30, 2022 by Shaun Turney. Revised on November 10, 2022.
A chi-square (Χ²) test of independence is a nonparametric hypothesis test. You can use it to test whether two categorical variables are related to each other.
Example: A city wants to increase the proportion of households that recycle. The city decides to test two interventions: an educational flyer (pamphlet) or a phone call. They randomly select 300 households and randomly assign them to the flyer, phone call, or control group (no intervention). They’ll use the results of their experiment to decide which intervention to use for the whole city.
Table of contents
- What is the chi-square test of independence?
- Chi-square test of independence hypotheses
- When to use the chi-square test of independence
- How to calculate the test statistic (formula)
- How to perform the chi-square test of independence
- When to use a different test
- Frequently asked questions about the chi-square test of independence
A chi-square (Χ²) test of independence is a type of Pearson’s chi-square test. Pearson’s chi-square tests are nonparametric tests for categorical variables. They’re used to determine whether your data are significantly different from what you expected.
You can use a chi-square test of independence, also known as a chi-square test of association, to determine whether two categorical variables are related. If two variables are related, the probability of one variable having a certain value is dependent on the value of the other variable.
The chi-square test of independence calculations are based on the observed frequencies, which are the numbers of observations in each combined group.
The test compares the observed frequencies to the frequencies you would expect if the two variables are unrelated. When the variables are unrelated, the observed and expected frequencies will be similar.
Contingency tables
When you want to perform a chi-square test of independence, the best way to organize your data is a type of frequency distribution table called a contingency table.
A contingency table, also known as a cross tabulation or crosstab, shows the number of observations in each combination of groups. It also usually includes row and column totals.
They reorganize the data into a contingency table:
They also visualize their data in a bar graph:

The chi-square test of independence is an inferential statistical test, meaning that it allows you to draw conclusions about a population based on a sample. Specifically, it allows you to conclude whether two variables are related in the population.
Like all hypothesis tests, the chi-square test of independence evaluates a null and alternative hypothesis. The hypotheses are two competing answers to the question “Are variable 1 and variable 2 related?”
- Null hypothesis (H0): Variable 1 and variable 2 are not related in the population; the proportions of variable 1 are the same for different values of variable 2.
- Alternative hypothesis (Ha): Variable 1 and variable 2 are related in the population; the proportions of variable 1 are not the same for different values of variable 2.
You can use the above sentences as templates. Replace variable 1 and variable 2 with the names of your variables.
- Null hypothesis (H0): Whether a household recycles and the type of intervention they receive are not related in the population; the proportion of households that recycle is the same for all interventions.
- Alternative hypothesis (Ha): Whether a household recycles and the type of intervention they receive are related in the population; the proportion of households that recycle is not the same for all interventions.
Expected values
A chi-square test of independence works by comparing the observed and the expected frequencies. The expected frequencies are such that the proportions of one variable are the same for all values of the other variable.
You can calculate the expected frequencies using the contingency table. The expected frequency for row r and column c is:
Expected frequency = (row r total × column c total) / sample size
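To make this concrete, here is a minimal R sketch of the formula. The counts are taken from the R example near the end of this article; which row corresponds to which intervention is not stated there, so treat the layout as an assumption for illustration:

observed <- matrix(c(89, 84, 86, 9, 8, 24), nrow = 3, ncol = 2)  # rows: the three groups; columns: recycles yes/no (assumed layout)
# expected count for each cell = (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected  # the smallest expected frequency is 12.57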
The following conditions are necessary if you want to perform a chi-square test of independence:
- Chi-square tests of independence are usually performed on binary or nominal variables. They are sometimes performed on ordinal variables, although generally only on ordinal variables with fewer than five groups.
- The sample was randomly selected from the population.
- There are a minimum of five observations expected in each combined group.
- They want to test a hypothesis about the relationship between two categorical variables: whether a household recycles and the type of intervention.
- They recruited a random sample of 300 households.
- There are a minimum of five observations expected in each combined group. The smallest expected frequency is 12.57.
Pearson’s chi-square (Χ²) is the test statistic for the chi-square test of independence:
Χ² = Σ (O − E)² / E
where:
- Χ² is the chi-square test statistic
- Σ is the summation operator (it means “take the sum of”)
- O is the observed frequency
- E is the expected frequency
The chi-square test statistic measures how much your observed frequencies differ from the frequencies you would expect if the two variables are unrelated. It is large when there’s a big difference between the observed and expected frequencies (O − E in the equation).
Follow these five steps to calculate the test statistic:
Step 1: Create a table
Create a table with the observed and expected frequencies in two columns.
Step 2: Calculate O − E
In a new column called “O − E”, subtract the expected frequencies from the observed frequencies.
Step 3: Calculate (O − E)²
In a new column called “(O − E)²”, square the values in the previous column.
Step 4: Calculate (O − E)² / E
In a final column called “(O − E)² / E”, divide the previous column by the expected frequencies.
Step 5: Calculate Χ²
Finally, add up the values of the previous column to calculate the chi-square test statistic (Χ²).
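If you want to check the arithmetic in software, the five steps collapse into a few lines of R. This is a sketch using the same counts as the R example near the end of this article, not the article’s own worked table:

O <- matrix(c(89, 84, 86, 9, 8, 24), nrow = 3, ncol = 2)  # observed frequencies
E <- outer(rowSums(O), colSums(O)) / sum(O)               # expected frequencies (Step 1)
sum((O - E)^2 / E)                                        # Steps 2-5 in one line: about 9.79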
If the test statistic is big enough then you should conclude that the observed frequencies are not what you’d expect if the variables are unrelated. But what counts as big enough?
We compare the test statistic to a critical value from a chi-square distribution to decide whether it’s big enough to reject the null hypothesis that the two variables are unrelated. This procedure is called the chi-square test of independence.
Follow these steps to perform a chi-square test of independence (the first two steps have already been completed for the recycling example):
Step 1: Calculate the expected frequencies
Use the contingency table to calculate the expected frequencies following the formula:
Expected frequency = (row total × column total) / sample size
Step 2: Calculate chi-square
Use Pearson’s chi-square formula to calculate the test statistic:
Χ² = Σ (O − E)² / E
Step 3: Find the critical chi-square value
You can find the critical value in a chi-square critical value table or using statistical software (a quick R lookup is sketched after this list). You need to know two numbers to find the critical value:
- The degrees of freedom ( df ): For a chi-square test of independence, the df is (number of variable 1 groups − 1) * (number of variable 2 groups − 1).
- Significance level (α): By convention, the significance level is usually .05.
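With those two numbers, the lookup can also be done in software. A minimal R sketch (df = 2 corresponds to three intervention groups and two outcome groups):

df <- (3 - 1) * (2 - 1)                        # (groups of variable 1 − 1) * (groups of variable 2 − 1)
qchisq(p = 0.05, df = df, lower.tail = FALSE)  # 5.99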
Step 4: Compare the chi-square value to the critical value
Is the test statistic big enough to reject the null hypothesis? Compare it to the critical value to find out.
Critical value = 5.99
Step 5: Decide whether to reject the null hypothesis
- If the chi-square value is greater than the critical value, the data allow you to reject the null hypothesis that the variables are unrelated and provide support for the alternative hypothesis that the variables are related.
- If the chi-square value is less than the critical value, the data don’t allow you to reject the null hypothesis that the variables are unrelated and don’t provide support for the alternative hypothesis that the variables are related.
There is a significant difference between the observed frequencies and the frequencies expected if the two variables were unrelated ( p < .05). This suggests that the proportion of households that recycle is not the same for all interventions.
Step 6: Follow up with post hoc tests (optional)
If there are more than two groups in either of the variables and you rejected the null hypothesis, you may want to investigate further with post hoc tests. A post hoc test is a follow-up test that you perform after your initial analysis.
Similar to a one-way ANOVA with more than two groups, a significant difference doesn’t tell you which groups’ proportions are significantly different from each other.
One post hoc approach is to compare each pair of groups using chi-square tests of independence and a Bonferroni correction. A Bonferroni correction is when you divide your original significance level (usually .05) by the number of tests you’re performing.
- Since there are two intervention groups and two outcome groups for each test, there is (2 − 1) * (2 − 1) = 1 degree of freedom.
- There are three tests, so the significance level with a Bonferroni correction applied is α = .05 / 3 = .016.
- For a test of significance at α = .016 and df = 1, the Χ² critical value is 5.803.
- The chi-square value is greater than the critical value for the pamphlet vs. control and phone call vs. control tests.
Based on these findings, the city concludes that a significantly greater proportion of households recycled after receiving a pamphlet or phone call compared to the control.
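As a sketch of how these pairwise comparisons could be run in R (the assignment of rows to interventions is an assumption made for illustration):

O <- matrix(c(89, 84, 86, 9, 8, 24), nrow = 3,
            dimnames = list(c("Pamphlet", "Phone call", "Control"), c("Yes", "No")))
alpha_adj <- 0.05 / 3  # Bonferroni-adjusted significance level
for (pair in combn(rownames(O), 2, simplify = FALSE)) {
  test <- chisq.test(O[pair, ], correct = FALSE)  # plain Pearson test on the 2 x 2 subtable
  cat(pair[1], "vs.", pair[2], ": X-squared =", round(test$statistic, 2),
      ", p =", signif(test$p.value, 3), "\n")
}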
Several tests are similar to the chi-square test of independence, so it may not always be obvious which to use. The best choice will depend on your variables, your sample size, and your hypotheses.
When to use the chi-square goodness of fit test
There are two types of Pearson’s chi-square test. The chi-square test of independence is one of them, and the chi-square goodness of fit test is the other. The math is the same for both tests—the main difference is how you calculate the expected values.
You should use the chi-square goodness of fit test when you have one categorical variable and you want to test a hypothesis about its distribution.
When to use Fisher’s exact test
If you have a small sample size ( N < 100), Fisher’s exact test is a better choice. You should especially opt for Fisher’s exact test when your data doesn’t meet the condition of a minimum of five observations expected in each combined group.
When to use McNemar’s test
You should use McNemar’s test when you have a closely related pair of categorical variables that each have two groups. It allows you to test whether the proportions of the variables are equal. This test is most often used to compare before and after observations of the same individuals.
When to use a G test
A G test and a chi-square test give approximately the same results. G tests can accommodate more complex experimental designs than chi-square tests. However, the tests are usually interchangeable and the choice is mostly a matter of personal preference.
One reason to prefer chi-square tests is that they’re more familiar to researchers in most fields.
You can use the CHISQ.TEST() function to perform a chi-square test of independence in Excel. It takes two arguments, CHISQ.TEST(observed_range, expected_range), and returns the p value.
You can use the chisq.test() function to perform a chi-square test of independence in R. Give the contingency table as a matrix for the “x” argument. For example:
m <- matrix(data = c(89, 84, 86, 9, 8, 24), nrow = 3, ncol = 2)  # contingency table: three groups by two outcomes
chisq.test(x = m)
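The object returned by chisq.test() also stores the quantities discussed in this article; a quick sketch:

res <- chisq.test(x = m)
res$expected   # expected frequencies under independence
res$statistic  # Pearson’s chi-square test statistic
res$p.value    # the p value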
The two main chi-square tests are the chi-square goodness of fit test and the chi-square test of independence.
A chi-square distribution is a continuous probability distribution. The shape of a chi-square distribution depends on its degrees of freedom, k. The mean of a chi-square distribution is equal to its degrees of freedom (k) and the variance is 2k. The range is 0 to ∞.
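If you’d like to see these properties empirically, here is a minimal R simulation sketch (the choice of df = 4 is arbitrary):

set.seed(1)
x <- rchisq(1e6, df = 4)  # one million draws from a chi-square distribution with k = 4
mean(x)                   # close to k = 4
var(x)                    # close to 2k = 8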
Chi-Square Test of Independence
What is the Chi-square test of independence?
The Chi-square test of independence is a statistical hypothesis test used to determine whether two categorical or nominal variables are likely to be related or not.
When can I use the test?
You can use the test when you have counts of values for two categorical variables.
Can I use the test if I have frequency counts in a table?
Yes. If you have only a table of values that shows frequency counts, you can use the test.
Using the Chi-square test of independence
The Chi-square test of independence checks whether two variables are likely to be related or not. We have counts for two categorical or nominal variables. We also have an idea that the two variables are not related. The test gives us a way to decide if our idea is plausible or not.
The sections below discuss what we need for the test, how to do the test, understanding results, statistical details and understanding p-values.
What do we need?
For the Chi-square test of independence, we need two variables. Our idea is that the variables are not related. Here are a couple of examples:
- We have a list of movie genres; this is our first variable. Our second variable is whether or not the patrons of those genres bought snacks at the theater. Our idea (or, in statistical terms, our null hypothesis) is that the type of movie and whether or not people bought snacks are unrelated. The owner of the movie theater wants to estimate how many snacks to buy. If movie type and snack purchases are unrelated, estimating will be simpler than if the movie types impact snack sales.
- A veterinary clinic has a list of dog breeds they see as patients. The second variable is whether owners feed dry food, canned food or a mixture. Our idea is that the dog breed and types of food are unrelated. If this is true, then the clinic can order food based only on the total number of dogs, without consideration for the breeds.
For a valid test, we need:
- Data values that are a simple random sample from the population of interest.
- Two categorical or nominal variables. Don't use the independence test with continuous variables that define the category combinations. However, the counts for the combinations of the two categorical variables will be continuous.
- For each combination of the levels of the two variables, we need at least five expected values. When we have fewer than five for any one combination, the test results are not reliable.
Chi-square test of independence example
Let’s take a closer look at the movie snacks example. Suppose we collect data for 600 people at our theater. For each person, we know the type of movie they saw and whether or not they bought snacks.
Let’s start by answering: Is the Chi-square test of independence an appropriate method to evaluate the relationship between movie type and snack purchases?
- We have a simple random sample of 600 people who saw a movie at our theater. We meet this requirement.
- Our variables are the movie type and whether or not snacks were purchased. Both variables are categorical. We meet this requirement.
- The last requirement is for more than five expected values for each combination of the two variables. To confirm this, we need to know the total counts for each type of movie and the total counts for whether snacks were bought or not. For now, we assume we meet this requirement and will check it later.
It appears we have indeed selected a valid method. (We still need to check that more than five values are expected for each combination.)
Here is our data summarized in a contingency table:
Table 1: Contingency table for movie snacks data
Before we go any further, let’s check the assumption of five expected values in each category. The data has more than five counts in each combination of Movie Type and Snacks. But what are the expected counts if movie type and snack purchases are independent?
Finding expected counts
To find expected counts for each Movie-Snack combination, we first need the row and column totals, which are shown below:
Table 2: Contingency table for movie snacks data with row and column totals
The expected counts for each Movie-Snack combination are based on the row and column totals. We multiply the row total by the column total and then divide by the grand total. This gives us the expected count for each cell in the table. For example, for the Action-Snacks cell, we have:
$ \frac{125\times310}{600} = \frac{38,750}{600} = 65 $
We rounded the answer to the nearest whole number. If there is not a relationship between movie type and snack purchasing we would expect 65 people to have watched an action film with snacks.
Here are the actual and expected counts for each Movie-Snack combination. In each cell of Table 3 below, the expected count appears in bold beneath the actual count. The expected counts are rounded to the nearest whole number.
Table 3: Contingency table for movie snacks data showing actual count vs. expected count
When using software, these calculated values will be labeled as “expected values,” “expected cell counts” or some similar term.
All of the expected counts for our data are larger than five, so we meet the requirement for applying the independence test.
Before calculating the test statistic, let’s look at the contingency table again. The expected counts use the row and column totals. If we look at each of the cells, we can see that some expected counts are close to the actual counts but most are not. If there is no relationship between the movie type and snack purchases, the actual and expected counts will be similar. If there is a relationship, the actual and expected counts will be different.
A common mistake with expected counts is to simply divide the grand total by the number of cells. For our movie data, this is 600 / 8 = 75. This is not correct. We know the row totals and column totals. These are fixed and cannot change for our data. The expected values are based on the row and column totals, not just on the grand total.
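A quick R sketch contrasts the two calculations for the Action-Snacks cell, using only the totals given above:

row_total <- 125; col_total <- 310; grand_total <- 600
(row_total * col_total) / grand_total  # 64.58, which rounds to 65: the correct expected count
grand_total / 8                        # 75: the incorrect shortcut that ignores the margins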

Performing the test
The basic idea in calculating the test statistic is to compare actual and expected values, given the row and column totals that we have in the data. First, we calculate the difference between the actual and expected values for each Movie-Snacks combination. Next, we square that difference. Squaring gives the same importance to combinations with fewer actual values than expected and combinations with more actual values than expected. Next, we divide by the expected value for the combination. We add up these values for each Movie-Snacks combination. This gives us our test statistic.
This is much easier to follow using the data from our example. Table 4 below shows the calculations for each Movie-Snacks combination carried out to two decimal places.
Table 4: Preparing to calculate our test statistic
Lastly, to get our test statistic, we add the numbers in the final row for each cell:
$ 3.29 + 3.52 + 5.81 + 6.21 + 12.65 + 13.52 + 9.68 + 10.35 = 65.03 $
To make our decision, we compare the test statistic to a value from the Chi-square distribution . This activity involves five steps:
- We decide on the risk we are willing to take of concluding that the two variables are not independent when in fact they are. For the movie data, we had decided prior to our data collection that we are willing to take a 5% risk of saying that the two variables – Movie Type and Snack Purchase – are not independent when they really are independent. In statistics-speak, we set the significance level, α, to 0.05.
- We calculate a test statistic. As shown above, our test statistic is 65.03.
- We find the critical value from the Chi-square distribution based on our degrees of freedom and our significance level. This is the value we expect if the two variables are independent.
- The degrees of freedom depend on how many rows and how many columns we have. The degrees of freedom (df) are calculated as: $ \text{df} = (r-1)\times(c-1) $ In the formula, r is the number of rows, and c is the number of columns in our contingency table. From our example, with Movie Type as the rows and Snack Purchase as the columns, we have: $ \text{df} = (4-1)\times(2-1) = 3\times1 = 3 $ The Chi-square value with α = 0.05 and three degrees of freedom is 7.815.
- We compare the value of our test statistic (65.03) to the Chi-square value. Since 65.03 > 7.815, we reject the idea that movie type and snack purchases are independent.
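As a sketch, the lookup and the comparison can be reproduced in R:

qchisq(0.05, df = 3, lower.tail = FALSE)  # 7.815, the critical value
65.03 > 7.815                             # TRUE, so we reject independence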
We conclude that there is some relationship between movie type and snack purchases. The owner of the movie theater cannot estimate how many snacks to buy regardless of the type of movies being shown. Instead, the owner must think about the type of movies being shown when estimating snack purchases.
It's important to note that we cannot conclude that the type of movie causes a snack purchase. The independence test tells us only whether there is a relationship or not; it does not tell us that one variable causes the other.
Understanding results
Let’s use graphs to understand the test and the results.
The side-by-side chart below shows the actual counts in blue, and the expected counts in orange. The counts appear at the top of the bars. The yellow box shows the movie type and snack purchase totals. These totals are needed to find the expected counts.

Compare the expected and actual counts for the Horror movies. You can see that more people than expected bought snacks and fewer people than expected chose not to buy snacks.
If you look across all four of the movie types and whether or not people bought snacks, you can see that there is a fairly large difference between actual and expected counts for most combinations. The independence test checks to see if the actual data is “close enough” to the expected counts that would occur if the two variables are independent. Even without a statistical test, most people would say that the two variables are not independent. The statistical test provides a common way to make the decision, so that everyone makes the same decision on the data.
The chart below shows another possible set of data. This set has the exact same row and column totals for movie type and snack purchase, but the yes/no splits in the snack purchase data are different.

The purple bars show the actual counts in this data. The orange bars show the expected counts, which are the same as in our original data set. The expected counts are the same because the row totals and column totals are the same. Looking at the graph above, most people would think that the type of movie and snack purchases are independent. If you perform the Chi-square test of independence using this new data, the test statistic is 0.903. The Chi-square value is still 7.815 because the degrees of freedom are still three. You would fail to reject the idea of independence because 0.903 < 7.815. The owner of the movie theater can estimate how many snacks to buy regardless of the type of movies being shown.
Statistical details
Let’s look at the movie-snack data and the Chi-square test of independence using statistical terms.
Our null hypothesis is that the type of movie and snack purchases are independent. The null hypothesis is written as:
$ H_0: \text{Movie Type and Snack purchases are independent} $
The alternative hypothesis is the opposite.
$ H_a: \text{Movie Type and Snack purchases are not independent} $
Before we calculate the test statistic, we find the expected counts. This is written as:
$ E_{ij} = \frac{R_i\times{C_j}}{N} $
The formula is for an i x j contingency table. That is a table with i rows and j columns. For example, E11 is the expected count for the cell in the first row and first column. The formula shows Ri as the row total for the ith row, and Cj as the column total for the jth column. The overall sample size is N.
We calculate the test statistic using the formula below:
$ \chi^2 = \sum^n_{i,j=1} \frac{(O_{ij}-E_{ij})^2}{E_{ij}} $
In the formula above, we have n combinations of rows and columns. The Σ symbol means to add up the calculations for each combination. (We performed these same steps in the Movie-Snack example, beginning in Table 4.) The formula shows Oij as the observed count for the ij-th combination and Eij as the expected count for the combination. For the Movie-Snack example, we had four rows and two columns, so we had eight combinations.
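The two formulas translate directly into a short, generic R function for any i x j table; a sketch:

chisq_stat <- function(O) {
  E <- outer(rowSums(O), colSums(O)) / sum(O)  # E_ij = (R_i * C_j) / N
  sum((O - E)^2 / E)                           # sum over all ij combinations
}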
We then compare the test statistic to the critical Chi-square value corresponding to our chosen alpha value and the degrees of freedom for our data. Using the Movie-Snack data as an example, we had set α = 0.05 and had three degrees of freedom. For the Movie-Snack data, the Chi-square value is written as:
$ χ_{0.05,3}^2 $
There are two possible results from our comparison:
- The test statistic is lower than the Chi-square value. You fail to reject the hypothesis of independence. In the movie-snack example, the theater owner can go ahead with the assumption that the type of movie a person sees has no relationship with whether or not they buy snacks.
- The test statistic is higher than the Chi-square value. You reject the hypothesis of independence. In the movie-snack example, the theater owner cannot assume that there is no relationship between the type of movie a person sees and whether or not they buy snacks.
Understanding p-values
Let’s use a graph of the Chi-square distribution to better understand the p-values. You are checking to see if your test statistic is a more extreme value in the distribution than the critical value. The graph below shows a Chi-square distribution with three degrees of freedom. It shows how the value of 7.815 “cuts off” 95% of the data. Only 5% of the data from a Chi-square distribution with three degrees of freedom is greater than 7.815.

The next distribution graph shows our results. You can see how far out “in the tail” our test statistic is. In fact, with this scale, it looks like the distribution curve is at zero at the point at which it intersects with our test statistic. It isn’t, but it is very, very close to zero. We conclude that it is very unlikely for this situation to happen by chance. The results that we collected from our movie goers would be extremely unlikely if there were truly no relationship between types of movies and snack purchases.

Statistical software shows the p-value for a test. This is the likelihood of another sample of the same size resulting in a test statistic more extreme than the test statistic from our current sample, assuming that the null hypothesis is true. It’s difficult to calculate this by hand. For the distributions shown above, if the test statistic is exactly 7.815, then the p-value will be p = 0.05. With the test statistic of 65.03, the p-value is very, very small. In this example, most statistical software will report the p-value as “p < 0.0001.” This means that the likelihood of finding a more extreme value for the test statistic using another random sample (and assuming that the null hypothesis is correct) is less than one chance in 10,000.
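As an illustration, the p-value for our test statistic can be computed directly in R:

pchisq(65.03, df = 3, lower.tail = FALSE)  # about 5e-14, which software reports as p < 0.0001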
Using Chi-Square Statistic in Research
The Chi Square statistic is commonly used for testing relationships between categorical variables. The null hypothesis of the Chi-Square test is that no relationship exists between the categorical variables in the population; they are independent. An example research question that could be answered using a Chi-Square analysis would be:
Is there a significant relationship between voter intent and political party membership?
How does the Chi-Square statistic work?
The Chi-Square statistic is most commonly used to evaluate Tests of Independence when using a crosstabulation (also known as a bivariate table). Crosstabulation presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table. The Test of Independence assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other. Calculating the Chi-Square statistic and comparing it against a critical value from the Chi-Square distribution allows the researcher to assess whether the observed cell counts are significantly different from the expected cell counts.

The calculation of the Chi-Square statistic is quite straightforward and intuitive:
$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} $
where $f_o$ = the observed frequency (the observed counts in the cells) and $f_e$ = the expected frequency if NO relationship existed between the variables
As depicted in the formula, the Chi-Square statistic is based on the difference between what is actually observed in the data and what would be expected if there was truly no relationship between the variables.
How is the Chi-Square statistic run in SPSS and how is the output interpreted?
The Chi-Square statistic appears as an option when requesting a crosstabulation in SPSS. The output is labeled Chi-Square Tests; the Chi-Square statistic used in the Test of Independence is labeled Pearson Chi-Square. This statistic can be evaluated by comparing the actual value against a critical value found in a Chi-Square distribution (where degrees of freedom are calculated as (# of rows − 1) × (# of columns − 1)), but it is easier to simply examine the p-value provided by SPSS. To make a conclusion about the hypothesis with 95% confidence, the value labeled Asymp. Sig. (which is the p-value of the Chi-Square statistic) should be less than .05 (which is the alpha level associated with a 95% confidence level).
Is the p -value (labeled Asymp. Sig.) less than .05? If so, we can conclude that the variables are not independent of each other and that there is a statistical relationship between the categorical variables.

In this example, there is an association between fundamentalism and views on teaching sex education in public schools. While 17.2% of fundamentalists oppose teaching sex education, only 6.5% of liberals are opposed. The p -value indicates that these variables are not independent of each other and that there is a statistically significant relationship between the categorical variables.
What are special concerns with regard to the Chi-Square statistic?
There are a number of important considerations when using the Chi-Square statistic to evaluate a crosstabulation. Because of how the Chi-Square value is calculated, it is extremely sensitive to sample size: when the sample size is large (roughly 500 or more), almost any small difference will appear statistically significant. It is also sensitive to the distribution within the cells, and SPSS gives a warning message if cells have an expected count of fewer than five cases. This can be addressed by using categorical variables with a limited number of categories (e.g., by combining categories if necessary to produce a smaller table).

The Chi-Square Test for Independence
Learning Objectives
- Understand the characteristics of the chi-square distribution
- Carry out the chi-square test and interpret its results
- Understand the limitations of the chi-square test
Chi-Square Distribution: a family of asymmetrical, positively skewed distributions, the exact shape of which is determined by their respective degrees of freedom
Observed Frequencies: the cell frequencies actually observed in a bivariate table
Expected Frequencies: the cell frequencies that one might expect to see in a bivariate table if the two variables were statistically independent
The primary use of the chi-square test is to examine whether two variables are independent or not. What does it mean to be independent, in this sense? It means that the two factors are not related. Typically in social science research, we're interested in finding factors that are dependent upon each other—education and income, occupation and prestige, age and voting behavior. By ruling out independence of the two variables, the chi-square test can be used to assess whether two variables are, in fact, dependent or not. More generally, we say that one variable is "not correlated with" or "independent of" the other if an increase in one variable is not associated with an increase in the other. If two variables are correlated, their values tend to move together, either in the same or in the opposite direction. Chi-square examines a special kind of correlation: that between two nominal variables.
The Chi-Square Distribution
The chi-square distribution, like the t distribution, is actually a series of distributions, the exact shape of which varies according to their degrees of freedom. Unlike the t distribution, however, the chi-square distribution is asymmetrical, positively skewed and never approaches normality. The graph below illustrates how the shape of the chi-square distribution changes as the degrees of freedom (k) increase:

The Chi-Square Test
Earlier in the semester, you familiarized yourself with the five steps of hypothesis testing: (1) making assumptions, (2) stating the null and research hypotheses and choosing an alpha level, (3) selecting a sampling distribution and determining the test statistic that corresponds with the chosen alpha level, (4) calculating the test statistic, and (5) interpreting the results. Like the t tests we discussed previously, the chi-square test begins with a handful of assumptions, a pair of hypotheses, a sampling distribution and an alpha level and ends with a conclusion obtained via comparison of an obtained statistic with a critical statistic. The assumptions associated with the chi-square test are fairly straightforward: the data at hand must have been randomly selected (to minimize potential biases) and the variables in question must be nominal or ordinal (there are other methods to test the statistical independence of interval/ratio variables; these methods will be discussed in subsequent chapters). Regarding the hypotheses to be tested, all chi-square tests have the same general null and research hypotheses. The null hypothesis states that there is no relationship between the two variables, while the research hypothesis states that there is a relationship between the two variables. The test statistic follows a chi-square distribution, and the conclusion depends on whether or not our obtained statistic is greater than the critical statistic at our chosen alpha level.
In the following example, we'll use a chi-square test to determine whether there is a relationship between gender and getting in trouble at school (both nominal variables). Below is the table documenting the raw scores of boys and girls and their respective behavior issues (or lack thereof):
Gender and Getting in Trouble at School
To examine statistically whether boys got in trouble in school more often, we need to frame the question in terms of hypotheses. The null hypothesis is that the two variables are independent (i.e. no relationship or correlation) and the research hypothesis is that the two variables are related. In this case, the specific hypotheses are:
H0: There is no relationship between gender and getting in trouble at school
H1: There is a relationship between gender and getting in trouble at school
As is customary in the social sciences, we'll set our alpha level at 0.05.
Next we need to calculate the expected frequency for each cell. These values represent what we would expect to see if there really were no relationship between the two variables. We calculate the expected frequency for each cell by multiplying the row total by the column total and dividing by the total number of observations. To get the expected count for the upper right cell, we would multiply the row total (117) by the column total (83) and divide by the total number of observations (237). (83 × 117)/237 = 40.97. If the two variables were independent, we would expect 40.97 boys to get in trouble. Or, to put it another way, if there were no relationship between the two variables, we would expect the number of students who got in trouble to be distributed across the genders in proportion to the number of boys and girls in the sample.
We do the same thing for the other three cells and end up with the following expected counts (in parentheses next to each raw score):
With these sets of figures, we calculate the chi-square statistic as follows:
Χ² = Σ (observed frequency − expected frequency)² / expected frequency
For each cell, we square the difference between the observed frequency and the expected frequency and divide that number by the expected frequency. Then we add all of the terms (there will be four, one for each cell) together.
After we've crunched all those numbers, we end up with an obtained statistic of 1.87. (Please note: a chi-square statistic can't be negative because nominal variables don't have directionality. If your obtained statistic turns out to be negative, you might want to check your math.) But before we can come to a conclusion, we need to find our critical statistic, which entails finding our degrees of freedom. In this case, the number of degrees of freedom is equal to the number of rows in the table minus one multiplied by the number of columns in the table minus one, or (r-1)(c-1). In our case, we have (2-1)(2-1), or one degree of freedom.
Finally, we compare our obtained statistic to our critical statistic found on the chi-square table posted in the "Files" section on Canvas. We also need to reference our alpha, which we set at .05. As you can see, the critical statistic for an alpha level of 0.05 and one degree of freedom is 3.841, which is larger than our obtained statistic of 1.87. Because the critical statistic is greater than our obtained statistic, we can't reject our null hypothesis.
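If you don't have the chart handy, a minimal R sketch reproduces both the lookup and the comparison:

qchisq(0.95, df = 1)  # 3.841, the critical statistic for alpha = .05 and df = 1
1.87 < 3.841          # TRUE, so we can't reject the null hypothesis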
The Limitations of the Chi-Square Test
There are two limitations to the chi-square test about which you should be aware. First, the chi-square test is very sensitive to sample size. With a large enough sample, even trivial relationships can appear to be statistically significant. When using the chi-square test, you should keep in mind that "statistically significant" doesn't necessarily mean "meaningful." Second, remember that the chi-square can only tell us whether two variables are related to one another. It does not necessarily imply that one variable has any causal effect on the other. In order to establish causality, a more detailed analysis would be required.
Main Points
- The chi-square distribution is actually a series of distributions that vary in shape according to their degrees of freedom.
- The chi-square test is a hypothesis test designed to test for a statistically significant relationship between nominal and ordinal variables organized in a bivariate table. In other words, it tells us whether two variables are independent of one another.
- The obtained chi-square statistic essentially summarizes the difference between the frequencies actually observed in a bivariate table and the frequencies we would expect to see if there were no relationship between the two variables.
- The chi-square test is sensitive to sample size.
- The chi-square test cannot establish a causal relationship between two variables.
Carrying out the Chi-Square Test in SPSS
To perform a chi square test with SPSS, click "Analyze," then "Descriptive Statistics," and then "Crosstabs." As was the case in the last chapter, the independent variable should be placed in the "Columns" box, and the dependent variable should be placed in the "Rows" box. Now click on "Statistics" and check the box next to "Chi-Square." This test will provide evidence either in favor of or against the statistical independence of two variables, but it won't give you any information about the strength or direction of the relationship.
After looking at the output, some of you are probably wondering why SPSS provides you with a two-tailed p-value when chi-square is always a one-tailed test. In all honesty, I don't know the answer to that question. However, all is not lost. Because two-tailed tests are always more conservative than one-tailed tests (i.e., it's harder to reject your null hypothesis with a two-tailed test than it is with a one-tailed test), a statistically significant result under a two-tailed assumption would also be significant under a one-tailed assumption. If you're highly motivated, you can compare the obtained statistic from your output to the critical statistic found on a chi-square chart.
- Using the World Values Survey data, run a chi-square test to determine whether there is a relationship between sex ("SEX") and marital status ("MARITAL"). Report the obtained statistic and the p-value from your output. What is your conclusion?
- Using the ADD Health data, run a chi-square test to determine whether there is a relationship between the respondent's gender ("GENDER") and his or her grade in math ("MATH"). Again, report the obtained statistic and the p-value from your output. What is your conclusion?
11.3 - Chi-Square Test of Independence
The chi-square (\(\chi^2\)) test of independence is used to test for a relationship between two categorical variables. Recall that if two categorical variables are independent, then \(P(A) = P(A \mid B)\). The chi-square test of independence uses this fact to compute expected values for the cells in a two-way contingency table under the assumption that the two variables are independent (i.e., the null hypothesis is true).
Even if two variables are independent in the population, samples will vary due to random sampling variation. The chi-square test is used to determine if there is evidence that the two variables are not independent in the population using the same hypothesis testing logic that we used with one mean, one proportion, etc.
Again, we will be using the five step hypothesis testing procedure:
The assumptions are that the sample is randomly drawn from the population and that all expected values are at least 5 (we will see what expected values are later).
Our hypotheses are:
\(H_0:\) There is not a relationship between the two variables in the population (they are independent)
\(H_a:\) There is a relationship between the two variables in the population (they are dependent)
Note: When you're writing the hypotheses for a given scenario, use the names of the variables, not the generic "two variables."
The p-value can be found using Minitab. Look up the area to the right of your chi-square test statistic on a chi-square distribution with the correct degrees of freedom. Chi-square tests are always right-tailed tests.
If \(p \leq \alpha\) reject the null hypothesis.
If \(p>\alpha\) fail to reject the null hypothesis.
Write a conclusion in terms of the original research question.
11.3.1 - Example: Gender and Online Learning
A sample of 314 Penn State students was asked if they have ever taken an online course. Their genders were also recorded. The contingency table below was constructed. Use a chi-square test of independence to determine if there is a relationship between gender and whether or not someone has taken an online course.
\(H_0:\) There is not a relationship between gender and whether or not someone has taken an online course (they are independent)
\(H_a:\) There is a relationship between gender and whether or not someone has taken an online course (they are dependent)
Looking ahead to our calculations of the expected values, we can see that all expected values are at least 5. This means that the sampling distribution can be approximated using the \(\chi^2\) distribution.
In order to compute the chi-square test statistic we must know the observed and expected values for each cell. We are given the observed values in the table above. We must compute the expected values. The table below includes the row and column totals.
Note that all expected values are at least 5, thus this assumption of the \(\chi^2\) test of independence has been met.
Observed and expected counts are often presented together in a contingency table. In the table below, expected values are presented in parentheses.
\(\chi^2=\sum \dfrac{(O-E)^2}{E} \)
\(\chi^2=\dfrac{(43-46.586)^2}{46.586}+\dfrac{(63-59.414)^2}{59.414}+\dfrac{(95-91.414)^2}{91.414}+\dfrac{(113-116.586)^2}{116.586}=0.276+0.216+0.141+0.110=0.743\)
The chi-square test statistic is 0.743
\(df=(number\;of\;rows-1)(number\;of\;columns-1)=(2-1)(2-1)=1\)
We can determine the p-value by constructing a chi-square distribution plot with 1 degree of freedom and finding the area to the right of 0.743.

\(p = 0.388702\)
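A short R sketch reproduces this calculation from the observed and expected counts given above:

O <- c(43, 63, 95, 113)                     # observed counts
E <- c(46.586, 59.414, 91.414, 116.586)     # expected counts
chi_sq <- sum((O - E)^2 / E)                # 0.743
pchisq(chi_sq, df = 1, lower.tail = FALSE)  # approximately 0.3887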
\(p>\alpha\), therefore we fail to reject the null hypothesis.
There is not enough evidence to conclude that gender and whether or not an individual has completed an online course are related.
Note that we cannot say for sure that these two categorical variables are independent, we can only say that we do not have enough evidence that they are dependent.
11.3.2 - Minitab: Test of Independence
If you have a data file with the responses for individual cases then you have "raw data" and can follow the directions below. If you have a table filled with data, then you have "summarized data." There is an example of conducting a chi-square test of independence using summarized data on a later page. After data entry the procedure is the same for both data entry methods.
Minitab ® – Chi-square Test Using Raw Data
Research question : Is there a relationship between where a student sits in class and whether they have ever cheated?
- Null hypothesis : Seat location and cheating are not related in the population.
- Alternative hypothesis : Seat location and cheating are related in the population.
To perform a chi-square test of independence in Minitab using raw data:
- Open Minitab file: class_survey.mpx
- Select Stat > Tables > Chi-Square Test for Association
- Select Raw data (categorical variables) from the dropdown.
- Choose the variable Seating to insert it into the Rows box
- Choose the variable Ever_Cheat to insert it into the Columns box
- Click the Statistics button and check the boxes Chi-square test for association and Expected cell counts
- Click OK and OK
This should result in the following output:
Rows: Seating Columns: Ever_Cheat
Chi-square test.
All expected values are at least 5 so we can use the Pearson chi-square test statistic. Our results are \(\chi^2 (2) = 1.539\), \(p = 0.463\). Because our \(p\) value is greater than the standard alpha level of 0.05, we fail to reject the null hypothesis. There is not enough evidence of a relationship in the population between seat location and whether a student has cheated.
11.3.2.1 - Example: Raw Data
Example: dog & cat ownership.
Is there a relationship between dog and cat ownership in the population of all World Campus STAT 200 students? Let's conduct a hypothesis test using the dataset: fall2016stdata.mpx
\(H_0:\) There is not a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students \(H_a:\) There is a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students
Assumption: All expected counts are at least 5. The expected counts here are 176.02, 75.98, 189.98, and 82.02, so this assumption has been met.
Let's use Minitab to calculate the test statistic and p-value.
- After entering the data, select Stat > Tables > Cross Tabulation and Chi-Square
- Enter Dog in the Rows box
- Enter Cat in the Columns box
- Select the Chi-Square button and in the new window check the box for the Chi-square test and Expected cell counts
Rows: Dog Columns: Cat
Since the assumption was met in step 1, we can use the Pearson chi-square test statistic.
\(Pearson\;\chi^2 = 1.771\)
\(p = 0.183\)
Our p value is greater than the standard 0.05 alpha level, so we fail to reject the null hypothesis.
There is not enough evidence of a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students.
11.3.2.2 - Example: Summarized Data
Example: coffee and tea preference.
Is there a relationship between liking tea and liking coffee?
The following table shows data collected from a random sample of 100 adults. Each were asked if they liked coffee (yes or no) and if they liked tea (yes or no).
Let's use the 5 step hypothesis testing procedure to address this research question.
\(H_0:\) Liking coffee and liking tea are not related (i.e., independent) in the population \(H_a:\) Liking coffee and liking tea are related (i.e., dependent) in the population
Assumption: All expected counts are at least 5.
- Select Stat > Tables > Cross Tabulation and Chi-Square
- Select Summarized data in a two-way table from the dropdown
- Enter the columns Likes Coffee-Yes and Likes Coffee-No in the Columns containing the table box
- For the row labels enter Likes Tea (leave the column labels blank)
- Select the Chi-Square button and check the boxes for Chi-square test and Expected cell counts .
Rows: Likes Tea Columns: Worksheet columns
\(Pearson\;\chi^2 = 10.774\)
\(p = 0.001\)
Our p value is less than the standard 0.05 alpha level, so we reject the null hypothesis.
There is evidence of a relationship between liking coffee and liking tea in the population.
11.3.3 - Relative Risk
A chi-square test of independence will give you information concerning whether or not a relationship between two categorical variables in the population is likely. As was the case with the single sample and two sample hypothesis tests that you learned earlier this semester, with a large sample size statistical power is high and the probability of rejecting the null hypothesis is high, even if the relationship is relatively weak. In addition to examining statistical significance by looking at the p value, we can also examine practical significance by computing the relative risk .
In Lesson 2 you learned that risk is often used to describe the probability of an event occurring. Risk can also be used to compare the probabilities in two different groups. First, we'll review risk, then you'll be introduced to the concept of relative risk.
The risk of an outcome can be expressed as a fraction or as the percent of a group that experiences the outcome.
Examples of Risk
60 out of 1000 teens have asthma. The risk is \(\frac{60}{1000}=.06\). This means that 6% of all teens experience asthma.
45 out of 100 children get the flu each year. The risk is \(\frac{45}{100}=.45\), or 45%.
Relative risk is the risk in one group divided by the risk in another group: \(Relative\;risk=\dfrac{risk\;in\;group\;1}{risk\;in\;group\;2}\). Thus, relative risk gives the risk for group 1 as a multiple of the risk for group 2.
Example of Relative Risk
Suppose that the risk of a child getting the flu this year is .45 and the risk of an adult getting the flu this year is .10. What is the relative risk of children compared to adults?
- \(Relative\;risk=\dfrac{.45}{.10}=4.5\)
Children are 4.5 times more likely than adults to get the flu this year.
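As a quick R sketch of the same arithmetic:

risk_children <- 45 / 100    # risk of getting the flu for children
risk_adults <- 0.10          # risk of getting the flu for adults
risk_children / risk_adults  # 4.5: children are 4.5 times more likely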
Watch out for relative risk statistics where no baseline information is given about the actual risk. For instance, it doesn't mean much to say that beer drinkers have twice the risk of stomach cancer as non-drinkers unless we know the actual risks. The risk of stomach cancer might actually be very low, even for beer drinkers. For example, 2 in a million is twice the size of 1 in a million but it would still be a very low risk. The actual risk in the comparison group is known as the baseline with which other risks are compared.

Chi-square test of independence by hand
- Hypothesis test
- Inferential statistics
Introduction
- How the test works
- Observed frequencies
- Expected frequencies
- Test statistic
- Critical value
- Conclusion and interpretation

Chi-square tests of independence test whether two qualitative variables are independent, that is, whether there exists a relationship between two categorical variables. In other words, this test is used to determine whether the values of one of the two qualitative variables depend on the values of the other qualitative variable.
If the test shows no association between the two variables (i.e., the variables are independent), it means that knowing the value of one variable gives no information about the value of the other variable. On the contrary, if the test shows a relationship between the variables (i.e., the variables are dependent), it means that knowing the value of one variable provides information about the value of the other variable.
This article focuses on how to perform a Chi-square test of independence by hand and how to interpret the results with a concrete example. To learn how to do this test in R, read the article “ Chi-square test of independence in R ”.
The Chi-square test of independence is a hypothesis test so it has a null ( \(H_0\) ) and an alternative hypothesis ( \(H_1\) ):
- \(H_0\) : the variables are independent, there is no relationship between the two categorical variables. Knowing the value of one variable does not help to predict the value of the other variable
- \(H_1\) : the variables are dependent, there is a relationship between the two categorical variables. Knowing the value of one variable helps to predict the value of the other variable
The Chi-square test of independence works by comparing the observed frequencies (so the frequencies observed in your sample) to the expected frequencies if there was no relationship between the two categorical variables (so the expected frequencies if the null hypothesis was true).
If the difference between the observed frequencies and the expected frequencies is small, we cannot reject the null hypothesis of independence and thus we cannot conclude that the two variables are related. On the other hand, if the difference between the observed frequencies and the expected frequencies is large, we can reject the null hypothesis of independence and thus we can conclude that the two variables are related.
The threshold between a small and large difference is a value that comes from the Chi-square distribution (hence the name of the test). This value, referred to as the critical value, depends on the significance level \(\alpha\) (usually set equal to 5%) and on the degrees of freedom. This critical value can be found in the statistical table of the Chi-square distribution. More on this critical value and the degrees of freedom later in the article.
For our example, we want to determine whether there is a statistically significant association between smoking and being a professional athlete. Smoking can only be “yes” or “no” and being a professional athlete can only be “yes” or “no”. The two variables of interest are qualitative variables so we need to use a Chi-square test of independence, and the data have been collected on 28 persons.
Note that we chose binary variables (binary variables = qualitative variables with two levels) for the sake of easiness, but the Chi-square test of independence can also be performed on qualitative variables with more than two levels. For instance, if the variable smoking had three levels: (i) non-smokers, (ii) moderate smokers and (iii) heavy smokers, the steps and the interpretation of the results of the test are similar to those with two levels.
Our data are summarized in the contingency table below reporting the number of people in each subgroup, totals by row, by column and the grand total:
Remember that for the Chi-square test of independence we need to determine whether the observed counts are significantly different from the counts that we would expect if there was no association between the two variables. We have the observed counts (see the table above), so we now need to compute the expected counts in the case the variables were independent. These expected frequencies are computed for each subgroup one by one with the following formula:
\[\text{exp. frequencies} = \frac{\text{total # of obs. for the row} \cdot \text{total # of obs. for the column}}{\text{total number of observations}}\]
where obs. correspond to observations. Given our table of observed frequencies above, below is the table of the expected frequencies computed for each subgroup:
Note that the Chi-square test of independence should only be used when the expected frequencies in all cells are equal to or greater than 5. This assumption is met for our example, as the minimum expected frequency is 5. If the condition is not met, Fisher's exact test is preferred.
Talking about assumptions, the Chi-square test of independence requires that the observations are independent. This is usually not tested formally, but rather verified based on the design of the experiment and on the good control of experimental conditions. If you are not sure, ask yourself if one observation is related to another (if one observation has an impact on another). If not, it is most likely that you have independent observations.
If you have dependent observations (paired samples), McNemar's or Cochran's Q test should be used instead. McNemar's test is used when we want to know if there is a significant change between two paired samples (typically in a study with a measurement before and after on the same subjects) when the variables have only two categories. Cochran's Q test is an extension of McNemar's test to more than two related measurements.
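As an illustration of McNemar's test in R, here is a minimal sketch on a hypothetical paired table (the counts below are invented purely for illustration):

```r
# Hypothetical paired data: the same 100 subjects classified before
# and after an intervention (counts invented for illustration only)
paired <- matrix(c(30, 12,
                    5, 53),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(before = c("yes", "no"),
                                 after  = c("yes", "no")))

# McNemar's test focuses on the discordant pairs (here, 12 and 5)
mcnemar.test(paired)
```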
We have the observed and expected frequencies. We now need to compare them to determine whether they differ significantly. The standardized difference between the observed and expected frequencies, referred to as the test statistic and denoted \(\chi^2\), is computed as follows:
\[\chi^2 = \sum_{i, j} \frac{\big(O_{ij} - E_{ij}\big)^2}{E_{ij}}\]
where \(O\) represents the observed frequencies and \(E\) the expected frequencies. We square the differences between the observed and expected frequencies so that negative differences do not cancel out positive ones. The formula looks more complex than it really is, so let's illustrate it with our example. We first compute the contribution of each subgroup one by one according to the formula:
- in the subgroup of athlete and non-smoker: \(\frac{(14 - 9)^2}{9} = 2.78\)
- in the subgroup of non-athlete and non-smoker: \(\frac{(0 - 5)^2}{5} = 5\)
- in the subgroup of athlete and smoker: \(\frac{(4 - 9)^2}{9} = 2.78\)
- in the subgroup of non-athlete and smoker: \(\frac{(10 - 5)^2}{5} = 5\)
and then we sum them all to obtain the test statistic:
\[\chi^2 = 2.78 + 5 + 2.78 + 5 = 15.56\]
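These hand computations can be checked in R; a minimal sketch (the object name dat is ours, and correct = FALSE disables Yates' continuity correction so that the result matches the hand computation):

```r
# Observed frequencies from the example (28 persons)
dat <- matrix(c(14, 4,
                 0, 10),
              nrow = 2, byrow = TRUE,
              dimnames = list(athlete = c("yes", "no"),
                              smoker  = c("non-smoker", "smoker")))

# Expected frequencies under independence: row total * column total / grand total
expected <- outer(rowSums(dat), colSums(dat)) / sum(dat)
expected
#     non-smoker smoker
# yes          9      9
# no           5      5

# Test statistic: sum over all cells of (observed - expected)^2 / expected
sum((dat - expected)^2 / expected)
# [1] 15.55556

# Same result with the built-in test
chisq.test(dat, correct = FALSE)
```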
The test statistic alone is not enough to conclude whether the two variables are independent or dependent. As previously mentioned, this test statistic (which in some sense summarizes the difference between the observed and expected frequencies) must be compared to a critical value to determine whether the difference is large or small. One cannot tell whether a test statistic is large or small without putting it in perspective with the critical value.
If the test statistic is above the critical value, a difference this large between the observed and expected frequencies would be unlikely if the two variables were truly independent. On the other hand, if the test statistic is below the critical value, such a difference is plausible under independence. If the difference is plausible, we cannot reject the hypothesis that the two variables are independent; otherwise, we conclude that there is a relationship between the variables.
The critical value can be found in the statistical table of the Chi-square distribution and depends on the significance level, denoted \(\alpha\), and the degrees of freedom, denoted \(df\). The significance level is usually set at 5%. The degrees of freedom for a Chi-square test of independence are found as follows:
\[df = (\text{number of rows} - 1) \cdot (\text{number of columns} - 1)\]
In our example, the degrees of freedom is thus \(df = (2 - 1) \cdot (2 - 1) = 1\) since there are two rows and two columns in the contingency table (totals do not count as a row or column).
We now have all the necessary information to find the critical value in the Chi-square table ( \(\alpha = 0.05\) and \(df = 1\) ). To find the critical value we need to look at the row \(df = 1\) and the column \(\chi^2_{0.050}\) (since \(\alpha = 0.05\) ) in the picture below. The critical value is \(3.84146\) . 1

Chi-square table - Critical value for alpha = 5% and df = 1
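Equivalently, the critical value can be obtained in R without a printed table; a one-line sketch:

```r
# Critical value of the Chi-square distribution for alpha = 0.05 and df = 1
qchisq(p = 0.95, df = 1)
# [1] 3.841459
```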
Now that we have the test statistic and the critical value, we can compare them to check whether the null hypothesis of independence of the variables is rejected or not. In our example,
\[\text{test statistic} = 15.56 > \text{critical value} = 3.84146\]
As with many statistical tests, when the test statistic is larger than the critical value, we can reject the null hypothesis at the specified significance level.
In our case, we can therefore reject the null hypothesis of independence between the two categorical variables at the 5% significance level.
\(\Rightarrow\) This means that there is a significant relationship between the smoking habit and being an athlete or not. Knowing the value of one variable helps to predict the value of the other variable.
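Equivalently, one can compare the p-value to \(\alpha\) instead of comparing the test statistic to the critical value; a one-line sketch in R:

```r
# p-value: upper-tail probability of the Chi-square distribution at the test statistic
pchisq(15.56, df = 1, lower.tail = FALSE)
# [1] ~8e-05 (well below alpha = 0.05, so the null hypothesis is rejected)
```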
Thanks for reading.
I hope the article helped you to perform the Chi-square test of independence by hand and interpret its results. If you would like to learn how to do this test in R, read the article “ Chi-square test of independence in R ”.
As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.
For readers that prefer to check the \(p\) -value in order to reject or not the null hypothesis, I also created a Shiny app to help you compute the \(p\) -value given a test statistic. ↩︎
Hypothesis Testing - Chi Squared Test
Lisa Sullivan, PhD
Professor of Biostatistics
Boston University School of Public Health

Introduction
This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific tests considered here are called chi-square tests and are appropriate when the outcome is discrete (dichotomous, ordinal or categorical). For example, in some clinical trials the outcome is a classification such as hypertensive, pre-hypertensive or normotensive. We could use the same classification in an observational study such as the Framingham Heart Study to compare men and women in terms of their blood pressure status - again using the classification of hypertensive, pre-hypertensive or normotensive status.
The technique to analyze a discrete outcome uses what is called a chi-square test. Specifically, the test statistic follows a chi-square probability distribution. We will consider chi-square tests here with one, two and more than two independent comparison groups.
Learning Objectives
After completing this module, the student will be able to:
- Perform chi-square tests by hand
- Appropriately interpret results of chi-square tests
- Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples
Tests with One Sample, Discrete Outcome
Here we consider hypothesis testing with a discrete outcome variable in a single population. Discrete variables are variables that take on more than two distinct responses or categories, and the responses can be ordered or unordered (i.e., the outcome can be ordinal or categorical). The procedure described here can be used for dichotomous (exactly 2 response options), ordinal or categorical discrete outcomes, and the objective is to compare the distribution of responses, or the proportions of participants in each response category, to a known distribution. The known distribution is derived from another study or report, and it is important in setting up the hypotheses that the comparator distribution specified in the null hypothesis be a fair comparison. The comparator is sometimes called an external or a historical control.
In one sample tests for a discrete outcome, we set up our hypotheses against an appropriate comparator. We select a sample and compute descriptive statistics on the sample data. Specifically, we compute the sample size (n) and the proportions of participants in each response category.
Test Statistic for Testing H0: p1 = p10, p2 = p20, ..., pk = pk0:
χ² = Σ (O − E)²/E
We find the critical value in a table of probabilities for the chi-square distribution with degrees of freedom (df) = k-1. In the test statistic, O = observed frequency and E = expected frequency in each of the response categories. The observed frequencies are those observed in the sample and the expected frequencies are computed as described below. χ² (chi-square) is another probability distribution and ranges from 0 to ∞. The test statistic formula above is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories.
When we conduct a χ² test, we compare the observed frequencies in each response category to the frequencies we would expect if the null hypothesis were true. These expected frequencies are determined by allocating the sample to the response categories according to the distribution specified in H0. This is done by multiplying the observed sample size (n) by the proportions specified in the null hypothesis (p10, p20, ..., pk0). To ensure that the sample size is appropriate for the use of the test statistic above, we need to ensure the following: min(np10, np20, ..., npk0) ≥ 5.
The test of hypothesis with a discrete outcome measured in a single sample, where the goal is to assess whether the distribution of responses follows a known distribution, is called the χ 2 goodness-of-fit test. As the name indicates, the idea is to assess whether the pattern or distribution of responses in the sample "fits" a specified population (external or historical) distribution. In the next example we illustrate the test. As we work through the example, we provide additional details related to the use of this new test statistic.
A University conducted a survey of its recent graduates to collect demographic and health information for future planning purposes as well as to assess students' satisfaction with their undergraduate experiences. The survey revealed that a substantial proportion of students were not engaging in regular exercise, many felt their nutrition was poor and a substantial number were smoking. In response to a question on regular exercise, 60% of all graduates reported getting no regular exercise, 25% reported exercising sporadically and 15% reported exercising regularly as undergraduates. The next year the University launched a health promotion campaign on campus in an attempt to increase health behaviors among undergraduates. The program included modules on exercise, nutrition and smoking cessation. To evaluate the impact of the program, the University again surveyed graduates and asked the same questions. The survey was completed by 470 graduates and the following data were collected on the exercise question:
Based on the data, is there evidence of a shift in the distribution of responses to the exercise question following the implementation of the health promotion campaign on campus? Run the test at a 5% level of significance.
In this example, we have one sample and a discrete (ordinal) outcome variable (with three response options). We specifically want to compare the distribution of responses in the sample to the distribution reported the previous year (i.e., 60%, 25%, 15% reporting no, sporadic and regular exercise, respectively). We now run the test using the five-step approach.
- Step 1. Set up hypotheses and determine level of significance.
The null hypothesis again represents the "no change" or "no difference" situation. If the health promotion campaign has no impact then we expect the distribution of responses to the exercise question to be the same as that measured prior to the implementation of the program.
H 0 : p 1 =0.60, p 2 =0.25, p 3 =0.15, or equivalently H 0 : Distribution of responses is 0.60, 0.25, 0.15
H 1 : H 0 is false. α =0.05
Notice that the research hypothesis is written in words rather than in symbols. The research hypothesis as stated captures any difference in the distribution of responses from that specified in the null hypothesis. We do not specify a specific alternative distribution, instead we are testing whether the sample data "fit" the distribution in H 0 or not. With the χ 2 goodness-of-fit test there is no upper or lower tailed version of the test.
- Step 2. Select the appropriate test statistic.
The test statistic is:
χ² = Σ (O − E)²/E, with df = k − 1
We must first assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n = 470 and the proportions specified in the null hypothesis are 0.60, 0.25 and 0.15. Thus, min(470(0.60), 470(0.25), 470(0.15)) = min(282, 117.5, 70.5) = 70.5. The sample size is more than adequate, so the formula can be used.
- Step 3. Set up decision rule.
The decision rule for the χ 2 test depends on the level of significance and the degrees of freedom, defined as degrees of freedom (df) = k-1 (where k is the number of response categories). If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ 2 statistic will be close to zero. If the null hypothesis is false, then the χ 2 statistic will be large. Critical values can be found in a table of probabilities for the χ 2 distribution. Here we have df=k-1=3-1=2 and a 5% level of significance. The appropriate critical value is 5.99, and the decision rule is as follows: Reject H 0 if χ 2 > 5.99.
- Step 4. Compute the test statistic.
We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. The computations can be organized as follows.
Notice that the expected frequencies are taken to one decimal place and that the sum of the observed frequencies is equal to the sum of the expected frequencies. The test statistic is computed as follows:
χ² = (255 − 282)²/282 + (125 − 117.5)²/117.5 + (90 − 70.5)²/70.5 = 2.59 + 0.48 + 5.39 = 8.46
- Step 5. Conclusion.
We reject H0 because 8.46 > 5.99. We have statistically significant evidence at α = 0.05 to show that H0 is false, or that the distribution of responses is not 0.60, 0.25, 0.15. From the χ² table with df = 2, the p-value is between 0.01 and 0.025 (p ≈ 0.015).
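The full test can be reproduced in R with chisq.test(). A minimal sketch, using the observed counts recoverable from the text (255 no regular exercise, 90 regular exercise, and therefore 470 − 255 − 90 = 125 sporadic):

```r
observed <- c(255, 125, 90)      # no, sporadic, regular exercise
null_p   <- c(0.60, 0.25, 0.15)  # distribution specified in H0

qchisq(0.95, df = 2)             # critical value: 5.99

chisq.test(x = observed, p = null_p)
# X-squared = 8.4574, df = 2, p-value = 0.0146
```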
In the χ² goodness-of-fit test, we conclude that either the distribution specified in H0 is false (when we reject H0) or that we do not have sufficient evidence to show that the distribution specified in H0 is false (when we fail to reject H0). Here, we reject H0 and conclude that the distribution of responses to the exercise question following the implementation of the health promotion campaign was not the same as the distribution prior to the campaign. The test itself does not provide details of how the distribution has shifted. A comparison of the observed and expected frequencies will provide some insight into the shift (when the null hypothesis is rejected). Does it appear that the health promotion campaign was effective?
Consider the following:
If the null hypothesis were true (i.e., no change from the prior year), we would have expected more students to fall in the "No Regular Exercise" category and fewer in the "Regular Exercise" category. In the sample, 255/470 = 54% reported no regular exercise and 90/470 = 19% reported regular exercise. Thus, there is a shift toward more regular exercise following the implementation of the health promotion campaign. There is evidence of a statistical difference, but is this a meaningful difference? Is there room for improvement?
The National Center for Health Statistics (NCHS) provided data on the distribution of weight (in categories) among Americans in 2002. The distribution was based on specific values of body mass index (BMI) computed as weight in kilograms over height in meters squared. Underweight was defined as BMI< 18.5, Normal weight as BMI between 18.5 and 24.9, overweight as BMI between 25 and 29.9 and obese as BMI of 30 or greater. Americans in 2002 were distributed as follows: 2% Underweight, 39% Normal Weight, 36% Overweight, and 23% Obese. Suppose we want to assess whether the distribution of BMI is different in the Framingham Offspring sample. Using data from the n=3,326 participants who attended the seventh examination of the Offspring in the Framingham Heart Study we created the BMI categories as defined and observed the following:
- Step 1. Set up hypotheses and determine level of significance.
H 0 : p 1 =0.02, p 2 =0.39, p 3 =0.36, p 4 =0.23 or equivalently
H 0 : Distribution of responses is 0.02, 0.39, 0.36, 0.23
H 1 : H 0 is false. α=0.05
The formula for the test statistic is:
χ² = Σ (O − E)²/E, with df = k − 1
We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, np30, np40) ≥ 5. The sample size here is n = 3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min(3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23)) = min(66.5, 1297.1, 1197.4, 765.0) = 66.5. The sample size is more than adequate, so the formula can be used.
Here we have df=k-1=4-1=3 and a 5% level of significance. The appropriate critical value is 7.81 and the decision rule is as follows: Reject H 0 if χ 2 > 7.81.
We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2. We organize the computations in the following table.
The test statistic is computed as follows:
We reject H 0 because 233.53 > 7.81. We have statistically significant evidence at α=0.05 to show that H 0 is false or that the distribution of BMI in Framingham is different from the national data reported in 2002, p < 0.005.
Again, the χ² goodness-of-fit test allows us to assess whether the distribution of responses "fits" a specified distribution. Here we show that the distribution of BMI in the Framingham Offspring Study is different from the national distribution. To understand the nature of the difference, we can compare observed and expected frequencies or observed and expected proportions (or percentages). The frequencies are large because of the large sample size; the observed percentages of patients in the Framingham sample are as follows: 0.6% underweight, 28% normal weight, 41% overweight and 30% obese. In the Framingham Offspring sample there are higher percentages of overweight and obese persons (41% and 30% in Framingham as compared to 36% and 23% in the national data), and lower proportions of underweight and normal weight persons (0.6% and 28% in Framingham as compared to 2% and 39% in the national data). Are these meaningful differences?
In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chi-square goodness-of-fit test can also be used with a dichotomous outcome and the results are mathematically equivalent.
In the prior module, we considered the following example. Here we show the equivalence to the chi-square goodness-of-fit test.
The NCHS report indicated that in 2002, 75% of children aged 2 to 17 saw a dentist in the past year. An investigator wants to assess whether use of dental services is similar in children living in the city of Boston. A sample of 125 children aged 2 to 17 living in Boston was surveyed, and 64 reported seeing a dentist over the past 12 months. Is there a significant difference in the use of dental services between children living in Boston and the national data?
We presented the following approach to the test using a Z statistic.
- Step 1. Set up hypotheses and determine level of significance
H 0 : p = 0.75
H 1 : p ≠ 0.75 α=0.05
We must first check that the sample size is adequate. Specifically, we need to check min(np0, n(1 − p0)) = min(125(0.75), 125(1 − 0.75)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate, so the following formula can be used:
Z = (p̂ − p0) / √(p0(1 − p0)/n)
This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H 0 if Z < -1.960 or if Z > 1.960.
We now substitute the sample data into the formula for the test statistic identified in Step 2. The sample proportion is p̂ = 64/125 = 0.512, so:
Z = (0.512 − 0.75) / √(0.75(0.25)/125) = −0.238/0.0387 = −6.15
We reject H0 because −6.15 < −1.960. We have statistically significant evidence at α = 0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001).
We now conduct the same test using the chi-square goodness-of-fit test. First, we summarize our sample data as follows:
H 0 : p 1 =0.75, p 2 =0.25 or equivalently H 0 : Distribution of responses is 0.75, 0.25
We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20) ≥ 5. The sample size here is n = 125 and the proportions specified in the null hypothesis are 0.75 and 0.25. Thus, min(125(0.75), 125(0.25)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate, so the formula can be used.
Here we have df=k-1=2-1=1 and a 5% level of significance. The appropriate critical value is 3.84, and the decision rule is as follows: Reject H 0 if χ 2 > 3.84. (Note that 1.96 2 = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)
The test statistic is computed as follows:
χ² = (64 − 93.75)²/93.75 + (61 − 31.25)²/31.25 = 9.44 + 28.32 = 37.8
(Note that (−6.15)² = 37.8, where −6.15 was the value of the Z statistic in the test for proportions shown above.)
We reject H 0 because 37.8 > 3.84. We have statistically significant evidence at α=0.05 to show that there is a statistically significant difference in the use of dental service by children living in Boston as compared to the national data. (p < 0.0001). This is the same conclusion we reached when we conducted the test using the Z test above. With a dichotomous outcome, Z 2 = χ 2 ! In statistics, there are often several approaches that can be used to test hypotheses.
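This equivalence is easy to verify in R using the dental-services example; a minimal sketch:

```r
# One-sample Z test for a proportion, reported on the chi-square scale
# (correct = FALSE disables the continuity correction)
prop.test(x = 64, n = 125, p = 0.75, correct = FALSE)
# X-squared = 37.763, df = 1; sqrt(37.763) = 6.145 = |Z|

# Chi-square goodness-of-fit test on the same data gives the identical statistic
chisq.test(x = c(64, 61), p = c(0.75, 0.25))
# X-squared = 37.763, df = 1
```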
Tests for Two or More Independent Samples, Discrete Outcome
Here we extend the application of the chi-square test to the case with two or more independent comparison groups. Specifically, the outcome of interest is discrete with two or more responses, and the responses can be ordered or unordered (i.e., the outcome can be dichotomous, ordinal or categorical). We now consider the situation where there are two or more independent comparison groups and the goal of the analysis is to compare the distribution of responses to the discrete outcome variable among them.
The test is called the χ 2 test of independence and the null hypothesis is that there is no difference in the distribution of responses to the outcome across comparison groups. This is often stated as follows: The outcome variable and the grouping variable (e.g., the comparison treatments or comparison groups) are independent (hence the name of the test). Independence here implies homogeneity in the distribution of the outcome among comparison groups.
The null hypothesis in the χ 2 test of independence is often stated in words as: H 0 : The distribution of the outcome is independent of the groups. The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups (i.e., that the distribution of responses "depends" on the group). In order to test the hypothesis, we measure the discrete outcome variable in each participant in each comparison group. The data of interest are the observed frequencies (or number of participants in each response category in each group). The formula for the test statistic for the χ 2 test of independence is given below.
Test Statistic for Testing H0: Distribution of outcome is independent of groups:
χ² = Σ (O − E)²/E
and we find the critical value in a table of probabilities for the chi-square distribution with df = (r − 1)(c − 1).
Here O = observed frequency, E=expected frequency in each of the response categories in each group, r = the number of rows in the two-way table and c = the number of columns in the two-way table. r and c correspond to the number of comparison groups and the number of response options in the outcome (see below for more details). The observed frequencies are the sample data and the expected frequencies are computed as described below. The test statistic is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories in each group.
The data for the χ 2 test of independence are organized in a two-way table. The outcome and grouping variable are shown in the rows and columns of the table. The sample table below illustrates the data layout. The table entries (blank below) are the numbers of participants in each group responding to each response category of the outcome variable.
Table - Possible outcomes are listed in the columns; the groups being compared are listed in the rows.
In the table above, the grouping variable is shown in the rows of the table; r denotes the number of independent groups. The outcome variable is shown in the columns of the table; c denotes the number of response options in the outcome variable. Each combination of a row (group) and column (response) is called a cell of the table. The table has r*c cells and is sometimes called an r x c ("r by c") table. For example, if there are 4 groups and 5 categories in the outcome variable, the data are organized in a 4 X 5 table. The row and column totals are shown along the right-hand margin and the bottom of the table, respectively. The total sample size, N, can be computed by summing the row totals or the column totals. Similar to ANOVA, N does not refer to a population size here but rather to the total sample size in the analysis. The sample data can be organized into a table like the above. The numbers of participants within each group who select each response option are shown in the cells of the table and these are the observed frequencies used in the test statistic.
The test statistic for the χ 2 test of independence involves comparing observed (sample data) and expected frequencies in each cell of the table. The expected frequencies are computed assuming that the null hypothesis is true. The null hypothesis states that the two variables (the grouping variable and the outcome) are independent. The definition of independence is as follows:
Two events, A and B, are independent if P(A|B) = P(A), or equivalently, if P(A and B) = P(A) P(B).
The second statement indicates that if two events, A and B, are independent then the probability of their intersection can be computed by multiplying the probability of each individual event. To conduct the χ 2 test of independence, we need to compute expected frequencies in each cell of the table. Expected frequencies are computed by assuming that the grouping variable and outcome are independent (i.e., under the null hypothesis). Thus, if the null hypothesis is true, using the definition of independence:
P(Group 1 and Response Option 1) = P(Group 1) P(Response Option 1).
The above states that the probability that an individual is in Group 1 and that their outcome is Response Option 1 is computed by multiplying the probability that a person is in Group 1 by the probability that a person is in Response Option 1. To conduct the χ² test of independence, we need expected frequencies and not expected probabilities. To convert the above probability to a frequency, we multiply by N. Consider the following small example.
The data shown above are measured in a sample of size N=150. The frequencies in the cells of the table are the observed frequencies. If Group and Response are independent, then we can compute the probability that a person in the sample is in Group 1 and Response category 1 using:
P(Group 1 and Response 1) = P(Group 1) P(Response 1),
P(Group 1 and Response 1) = (25/150) (62/150) = 0.069.
Thus if Group and Response are independent we would expect 6.9% of the sample to be in the top left cell of the table (Group 1 and Response 1). The expected frequency is 150(0.069) = 10.4. We could do the same for Group 2 and Response 1:
P(Group 2 and Response 1) = P(Group 2) P(Response 1),
P(Group 2 and Response 1) = (50/150) (62/150) = 0.138.
The expected frequency in Group 2 and Response 1 is 150(0.138) = 20.7.
Thus, the formula for determining the expected cell frequencies in the χ 2 test of independence is as follows:
Expected Cell Frequency = (Row Total * Column Total)/N.
The above computes the expected frequency in one step rather than computing the expected probability first and then converting to a frequency.
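A quick check in R of the two cells computed above also shows why the one-step formula is preferable: it avoids rounding the probability first.

```r
N <- 150
(25 * 62) / N  # 10.33: expected frequency for Group 1, Response 1
(50 * 62) / N  # 20.67: expected frequency for Group 2, Response 1
# The two-step versions, 150 * 0.069 and 150 * 0.138, give 10.35 and 20.7
# because 0.069 and 0.138 were already rounded probabilities
```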
In a prior example we evaluated data from a survey of university graduates which assessed, among other things, how frequently they exercised. The survey was completed by 470 graduates. In the prior example we used the χ 2 goodness-of-fit test to assess whether there was a shift in the distribution of responses to the exercise question following the implementation of a health promotion campaign on campus. We specifically considered one sample (all students) and compared the observed distribution to the distribution of responses the prior year (a historical control). Suppose we now wish to assess whether there is a relationship between exercise on campus and students' living arrangements. As part of the same survey, graduates were asked where they lived their senior year. The response options were dormitory, on-campus apartment, off-campus apartment, and at home (i.e., commuted to and from the university). The data are shown below.
Based on the data, is there a relationship between exercise and a student's living arrangement? Do you think where a person lives affects their exercise status? Here we have four independent comparison groups (living arrangement) and a discrete (ordinal) outcome variable with three response options. We specifically want to test whether living arrangement and exercise are independent. We will run the test using the five-step approach.
H 0 : Living arrangement and exercise are independent
H 1 : H 0 is false. α=0.05
The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.
- Step 2. Select the appropriate test statistic.
The condition for appropriate use of the above test statistic is that each expected frequency is at least 5. In Step 4 we will compute the expected frequencies and we will ensure that the condition is met.
The decision rule depends on the level of significance and the degrees of freedom, defined as df = (r − 1)(c − 1), where r and c are the numbers of rows and columns in the two-way data table. The row variable is the living arrangement and there are 4 arrangements considered, thus r = 4. The column variable is exercise and 3 responses are considered, thus c = 3. For this test, df = (4 − 1)(3 − 1) = 3(2) = 6. Again, with χ² tests there are no upper-, lower- or two-tailed versions of the test. If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. The rejection region for the χ² test of independence is always in the upper (right-hand) tail of the distribution. For df = 6 and a 5% level of significance, the appropriate critical value is 12.59 and the decision rule is as follows: Reject H0 if χ² > 12.59.
We now compute the expected frequencies using the formula,
Expected Frequency = (Row Total * Column Total)/N.
The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency. The expected frequencies are shown in parentheses.
Notice that the expected frequencies are taken to one decimal place and that the sums of the observed frequencies are equal to the sums of the expected frequencies in each row and column of the table.
Recall in Step 2 a condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 9.6) and therefore it is appropriate to use the test statistic.
We reject H0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that H0 is false, or that living arrangement and exercise are not independent (i.e., they are dependent or related), p < 0.005.
Again, the χ 2 test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups. Here we rejected H 0 and concluded that the distribution of exercise is not independent of living arrangement, or that there is a relationship between living arrangement and exercise. The test provides an overall assessment of statistical significance. When the null hypothesis is rejected, it is important to review the sample data to understand the nature of the relationship. Consider again the sample data.
Because there are different numbers of students in each living situation, comparing exercise patterns on the basis of frequencies alone is difficult. The following table displays the percentages of students in each exercise category by living arrangement. The percentages sum to 100% in each row of the table. For comparison purposes, percentages are also shown for the total sample along the bottom row of the table.
From the above, it is clear that higher percentages of students living in dormitories and in on-campus apartments reported regular exercise (31% and 23%) as compared to students living in off-campus apartments and at home (10% each).
Test Yourself
Pancreaticoduodenectomy (PD) is a procedure that is associated with considerable morbidity. A study was recently conducted on 553 patients who had a successful PD between January 2000 and December 2010 to determine whether their Surgical Apgar Score (SAS) is related to 30-day perioperative morbidity and mortality. The table below gives the number of patients experiencing no, minor, or major morbidity by SAS category.
Question: What would be an appropriate statistical test to examine whether there is an association between Surgical Apgar Score and patient outcome? Using 14.13 as the value of the test statistic for these data, carry out the appropriate test at a 5% level of significance. Show all parts of your test.
In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable and two independent comparison groups. We presented a test using a test statistic Z to test for equality of independent proportions. The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent.
In the prior module, we considered the following example. Here we show the equivalence to the chi-square test of independence.
A randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever designed to reduce pain in patients following joint replacement surgery. The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). A total of 100 patients undergoing joint replacement surgery agreed to participate in the trial. Patients were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery and were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a scale of 0-10 with higher scores indicative of more pain. Each patient was then given the assigned treatment and after 30 minutes was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points (defined by clinicians as a clinically meaningful reduction). The following data were observed in the trial.
We tested whether there was a significant difference in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points) using a Z statistic, as follows.
H 0 : p 1 = p 2
H 1 : p 1 ≠ p 2 α=0.05
Here the new or experimental pain reliever is group 1 and the standard pain reliever is group 2.
We must first check that the sample size is adequate. Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group, or that: min(n1p̂1, n1(1 − p̂1), n2p̂2, n2(1 − p̂2)) ≥ 5.
In this example, we have
Therefore, the sample size is adequate, so the following formula can be used:
Z = (p̂1 − p̂2) / √(p̂(1 − p̂)(1/n1 + 1/n2))
where p̂ is the overall (pooled) proportion of successes.
Reject H 0 if Z < -1.960 or if Z > 1.960.
We now substitute the sample data into the formula for the test statistic identified in Step 2. We first compute the overall proportion of successes, p̂ = (x1 + x2)/(n1 + n2):
We now substitute to compute the test statistic.
- Step 5. Conclusion.
We reject H0 because 2.53 > 1.960. We have statistically significant evidence at α = 0.05 to show that there is a difference in the proportions of patients reporting a meaningful reduction in pain between the new and the standard pain reliever.
We now conduct the same test using the chi-square test of independence.
H 0 : Treatment and outcome (meaningful reduction in pain) are independent
H 1 : H 0 is false. α=0.05
The formula for the test statistic is:
χ² = Σ (O − E)²/E
For this test, df=(2-1)(2-1)=1. At a 5% level of significance, the appropriate critical value is 3.84 and the decision rule is as follows: Reject H0 if χ 2 > 3.84. (Note that 1.96 2 = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)
We now compute the expected frequencies using:
Expected Frequency = (Row Total * Column Total)/N.
The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency. The expected frequencies are shown in parentheses.
A condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 22.0) and therefore it is appropriate to use the test statistic.
(Note that (2.53) 2 = 6.4, where 2.53 was the value of the Z statistic in the test for proportions shown above.)
Chi-Squared Tests in R
The video below by Mike Marin demonstrates how to perform chi-squared tests in the R programming language.
Answer to Problem on Pancreaticoduodenectomy and Surgical Apgar Scores
We have 3 independent comparison groups (Surgical Apgar Score) and a categorical outcome variable (morbidity/mortality). We can run a Chi-Squared test of independence.
H 0 : Apgar scores and patient outcome are independent of one another.
H A : Apgar scores and patient outcome are not independent.
Chi-squared = 14.13
With df = (3 − 1)(3 − 1) = 4 and α = 0.05, the critical value is 9.49. Since 14.13 is greater than 9.49, we reject H0.
There is an association between Apgar scores and patient outcome. The lowest Apgar score group (0 to 4) experienced the highest percentage of major morbidity or mortality (16 out of 57=28%) compared to the other Apgar score groups.
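The critical value and p-value used in this answer can be reproduced in R; a minimal sketch:

```r
qchisq(0.95, df = 4)                       # critical value: 9.49
pchisq(14.13, df = 4, lower.tail = FALSE)  # p-value: ~0.007
```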
