Statistic of the Week
Bi-Variate Statistics
Contingency Tables & the Chi Square Statistic
7400.685.080 Research Methods in HE/FE - Inst: D. Witt
Up to now we have been looking at univariate statistics (one variable at a time). This is fine for t-tests and for the assessment of skewness and kurtosis in all our variables of interest.
Now we can begin to take a look at real analysis with two variables. This is bivariate statistical analysis. (By the end of the semester, we will be doing multivariate statistics!!!)
There are several statistics that operate under the bivariate label. The first one is the Chi Square statistic. But first ... we have to have an understanding of Contingency Tables (also known as crosstabulations, or crosstabs for short).
Crosstabulations show how the values of two variables are distributed jointly, that is, what share of the sample falls into each combination of values.
For example: Suppose we have a theory that hypothesizes a relationship between:
Gender (1=male/2=female) and Attitudes Toward Gun Control (1=favor/2=don't favor).
That is, we hypothesize that men are more in favor of gun control than women.
So, Hyp. 1: Men are more in favor of gun control than women.
We construct a questionnaire that includes these two concepts, distribute it, and code our data into the computer.
When we ask the computer for a Frequency Distribution of the variables, we can fill in a table that reflects the following:
for Gender: 30% of the sample are men - 70% of the sample are women
for Gun Control: 60% of the sample favor control - 40% of the sample don't favor control
If we were to ask the computer to spit out a Contingency table using the variables Gender and Gun Control we'd get this:
Contingency Table
| Favor Gun Control | Gender: Male | Gender: Female | marginals |
| yes | Cell1 Observed 8% | Cell2 Observed 52% | 60% |
| no | Cell3 Observed 22% | Cell4 Observed 18% | 40% |
| marginals | 30% | 70% | 100% |
NOTICE: The computer gives us the Observed cell percentages, and:
-as you add down the Male column of cells, 8% + 22% = 30%
-as you add across the Yes row of cells, 8% + 52% = 60%
-as you add down the row marginals (the right-hand column), 60% + 40% = 100%, and
-as you add across the column marginals (the bottom row), 30% + 70% = 100%
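The marginal checks in the NOTICE list can be reproduced in a few lines of Python. This is only a sketch: the variable names (`observed`, `row_marginals`, `col_marginals`, `grand_total`) are illustrative and not part of the handout, and the percentages are treated as plain numbers.

```python
# Observed cell percentages from the 2x2 table.
# Rows: favor gun control (yes, no); columns: gender (male, female).
observed = [
    [8, 52],   # yes: male, female
    [22, 18],  # no:  male, female
]

# Row marginals: add across each row of cells.
row_marginals = [sum(row) for row in observed]

# Column marginals: add down each column of cells.
col_marginals = [sum(col) for col in zip(*observed)]

# Either set of marginals adds up to the grand total.
grand_total = sum(row_marginals)

print(row_marginals)  # [60, 40]
print(col_marginals)  # [30, 70]
print(grand_total)    # 100
```

If the two sets of marginals do not add to the same grand total, a cell has been entered incorrectly.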
The table above, children, is a two by two (2x2) contingency table, the simplest form of one. It tells us what we really observed, but doesn't say much about whether our observations differ from what we would normally expect to see ====> We have to put Expected percentages alongside the Observed percentages in our Contingency Table to do that:
Knowing the Observed Scores (Fo) from the table above, we calculate the Expected Scores (Fe) for each cell (k) in the contingency table using this formula:
Fek = (row total x column total)/grand total
| We have four cells, so we calculate four Fe's: | We already know: |
| Fe1 = (60 x 30)/100 = 18% | Fo1 = 8% |
| Fe2 = (60 x 70)/100 = 42% | Fo2 = 52% |
| Fe3 = (40 x 30)/100 = 12% | Fo3 = 22% |
| Fe4 = (40 x 70)/100 = 28% | Fo4 = 18% |
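The four expected values can also be computed in one pass over the marginals. The sketch below assumes the marginals from the gun control example; the names `row_marginals`, `col_marginals`, and `expected` are illustrative only.

```python
# Marginals from the gun control example (percentages).
row_marginals = [60, 40]   # favor gun control: yes, no
col_marginals = [30, 70]   # gender: male, female
grand_total = 100

# Fe for each cell = (row total x column total) / grand total.
expected = [
    [r * c / grand_total for c in col_marginals]
    for r in row_marginals
]

print(expected)  # [[18.0, 42.0], [12.0, 28.0]]
```

The same two nested loops work for larger tables, since every cell uses only its own row total and column total.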
Now we can insert Expected frequencies in the contingency table:
Contingency Table
| Favor Gun Control | Gender: Male | Gender: Female | marginals |
| yes | Cell1 - Exp. Fe1 18%, Obs. Fo1 8% | Cell2 - Exp. Fe2 42%, Obs. Fo2 52% | 60% |
| no | Cell3 - Exp. Fe3 12%, Obs. Fo3 22% | Cell4 - Exp. Fe4 28%, Obs. Fo4 18% | 40% |
| marginals | 30% | 70% | 100% |
To calculate the Chi2 statistic, use the formula (summing over all cells k):
$$\chi^2 = \sum_{k} \frac{(Fo_k - Fe_k)^2}{Fe_k}$$
and by substitution:
$$\chi^2 = \frac{(8-18)^2}{18} + \frac{(52-42)^2}{42} + \frac{(22-12)^2}{12} + \frac{(18-28)^2}{28} = 19.84$$
So the Obtained (calculated) Chi2 value is 19.84.
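The substitution can be checked with a short Python sketch. It treats the percentages as counts, as the handout does, which amounts to assuming a sample of 100 respondents; with a different sample size the statistic computed on actual counts would scale accordingly. The variable names are illustrative.

```python
# Observed (Fo) and expected (Fe) values for Cell1..Cell4.
observed = [8, 52, 22, 18]
expected = [18, 42, 12, 28]

# Chi square: sum over cells of (Fo - Fe)^2 / Fe.
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

print(round(chi2, 2))  # 19.84
```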
But is 19.84 a statistically significant chi square value?
We need to look in the chi square distribution table (below) ... and ... we need to know the Degrees of Freedom (df's).
For a 2x2 table the df is 1 because, once the marginals are fixed, as soon as one cell is filled all the others are determined. (In general, df = (number of rows - 1) x (number of columns - 1).)
Look in the Distribution of Chi Square Table (below) and find the place where df = 1.
Follow from left to right until you find a listed "critical" value of Chi2 that is bigger than your calculated one, or until you run out of values.

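For this example, df = 1 has a Critical Chi2 value of 10.827 at the 99.9th percentile, or a probability level of p<.001 (that's 100% - 99.9% = .1%, or p<.001). Our obtained value of 19.84 is bigger than 10.827, so the relationship is statistically significant at p<.001.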
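The table lookup can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not a replacement for the printed table: the function names `degrees_of_freedom` and `p_value_df1` are hypothetical, and the `math.erfc` shortcut holds only for df = 1 (a chi square variable with 1 df is the square of a standard normal variable). Tables with more than one degree of freedom need the printed table or a statistics library.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df for a contingency table: (rows - 1) x (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def p_value_df1(chi2):
    """Upper-tail probability of a chi square value, valid for df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2))

chi2_obtained = 19.84   # calculated value from the example
critical_001 = 10.827   # critical value for df = 1 at p = .001, from the table

df = degrees_of_freedom(2, 2)
p = p_value_df1(chi2_obtained)

print(df)                            # 1
print(chi2_obtained > critical_001)  # True
print(p < 0.001)                     # True
```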
Interpreting these data, we can say either:
-men differ significantly from women when it comes to favoring gun control.
or -people who favor gun control are more likely to be women than men.
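To see the direction of the difference, it can help to convert the cell percentages into percentages within each gender. This is a minimal sketch; the dictionary layout and the name `share_favoring` are illustrative and not part of the handout.

```python
# Observed cell percentages, grouped by gender.
observed = {
    "male": {"yes": 8, "no": 22},
    "female": {"yes": 52, "no": 18},
}

# Share of each gender that favors gun control (percent of that gender).
share_favoring = {
    gender: 100 * cells["yes"] / (cells["yes"] + cells["no"])
    for gender, cells in observed.items()
}

for gender, share in share_favoring.items():
    print(gender, round(share, 1))
# male 26.7
# female 74.3
```

With these numbers, roughly 27% of the men and 74% of the women favor gun control, so the significant difference runs toward women, which is the opposite direction from Hyp. 1.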
Name ______________________________________________________
Homework Assignment: Contingency Tables and the Chi Square Statistic
1. You are given the following data concerning the relationship between education and the type of community where subjects were raised.
Community in which respondent lived most of the time from age 13 to 19.
| Education | Rural | Urban | Row Marginals |
| 12 years or less | Cell1 Exp. _____ Obs. 35% | Cell2 Exp. _____ Obs. 20% | 55% |
| Over 12 years | Cell3 Exp. _____ Obs. 15% | Cell4 Exp. _____ Obs. 30% | 45% |
| Column Marginals | 50% | 50% | 100% |
a. Fill in the table above with Fek values (expected values). Show your work!
b. What is the Calculated Chi2 value?
c. Look up the "critical" Chi2 value in the table of the chi square distribution.
d. Summarize the nature of the relationship in a few sentences.
e. Generalize from the data.
2. Let us suppose you are interested in studying the relationship between Intelligence and Memory.
You design a study that measures IQ for intelligence, and Cognitive Retention of Beatles Lyrics (i.e., Fill in the blank "I'll buy you a ________ ________ my friend if it makes you feel alright.").
Your hypothesis:
The more intelligent the respondent, the more Beatles lyrics that can be remembered.
To test this hypothesis, you select a random sample of dormitory residents at the UofA and provide them with a cassette tape with 30 songs on it. You ask respondents to listen to the tape once each day for seven days. At the end of the week you administer the test and measure their intelligence.
Here are the results in crosstab form:
Number of Beatles Lyrics Remembered
| IQ Scores | 0-15 lyrics remembered | Over 15 lyrics remembered | Row Marginals |
| Under 100 | Cell1 Exp. ____ Obs. 25% | Cell2 Exp. ____ Obs. 10% | 35% |
| 100-129 | Cell3 Exp. ____ Obs. 15% | Cell4 Exp. ____ Obs. 20% | 35% |
| Over 129 | Cell5 Exp. ____ Obs. 5% | Cell6 Exp. ____ Obs. 25% | 30% |
| Column Marginals | 45% | 55% | 100% |
a. Fill in the table above with Expected Values. Show your work!
b. What is the calculated/obtained Chi2 value?
c. Look up the critical Chi2 value in the table of the chi square distribution.
d. Summarize the nature of the relationship in a couple of sentences.
e. Generalize the findings.