Question 1
A study examined the relationship between pet ownership (dog, cat, no pet) and allergy status (yes, no) in a random sample of 600 people. In the sample, 250 people owned a dog, 150 owned a cat, and 200 had no pet. A total of 180 people reported having allergies.
Which of the following expressions correctly calculates the expected number of people who own a cat and have allergies, under the null hypothesis of no association?
- (150)(180)/600
- (150)(250)/600
- (180)(200)/600
- (150)(180)(600)
Explanation: The formula for the expected count in a cell of a two-way table is (row total × column total) / grand total. Here, the row total for 'owning a cat' is 150, the column total for 'having allergies' is 180, and the grand total (the sample size) is 600. The correct expression is (150)(180)/600.
Question 2
Data was collected to see if there is an association between the day of the week and the number of customers at a restaurant. A random sample of 200 days was observed. The totals are: 80 weekdays and 120 weekend days. The customer traffic was categorized as Low, Medium, or High. There were 50 days with Low traffic, 90 with Medium, and 60 with High. On weekdays, there were 30 days with Low traffic.
What is the expected number of weekdays with Low customer traffic if traffic level is independent of the type of day?
- 20
- 30
- 40
- 50
Explanation: The expected count is (row total × column total) / grand total. The row total for weekdays is 80. The column total for Low traffic is 50. The grand total is 200. The expected count is (80×50)/200=4000/200=20. The value of 30 is the observed count, which is a common distractor.
Question 3
In a chi-square test for independence, if the observed count for a particular cell is much larger than its corresponding expected count, what does this suggest?
- The calculation of the expected count must be incorrect.
- This cell provides evidence against the null hypothesis of independence.
- This cell provides evidence in favor of the null hypothesis of independence.
- The sample size was not large enough to perform the test.
Explanation: The chi-square test statistic measures the total discrepancy between observed and expected counts. A large difference between an observed and expected count for a cell increases the value of the chi-square statistic. A large test statistic provides evidence against the null hypothesis. Therefore, a cell where the observed count is very different from the expected count suggests the variables may not be independent.
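As a rough illustration of how a single cell feeds into the statistic, the sketch below computes one cell's contribution, (observed − expected)² / expected, using the weekday/Low-traffic cell from Question 2 (observed 30, expected 20). The function name `cell_contribution` is made up for this example; the full statistic would be the sum of this quantity over every cell in the table.

```python
def cell_contribution(observed: float, expected: float) -> float:
    """Return one cell's contribution to the chi-square statistic: (O - E)^2 / E."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) ** 2 / expected


# Weekday / Low traffic cell from Question 2: observed 30, expected 20.
contribution = cell_contribution(30, 20)
print(contribution)  # 5.0

# A cell whose observed count matches its expected count adds nothing.
print(cell_contribution(20, 20))  # 0.0
```

The further an observed count sits from its expected count, the more that cell adds to the total, which is why such a cell counts as evidence against independence.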
Question 4
A research study produced the following two-way table of counts for two categorical variables, A and B. A total of 200 subjects were studied. For variable A, the counts are 80 for level 1 and 120 for level 2. For variable B, the counts are 100 for level X and 100 for level Y.
For a chi-square test of independence, which of the following comparisons between expected counts is correct?
- The expected count for (Level 1, Level X) is equal to the expected count for (Level 2, Level X).
- The expected count for (Level 1, Level X) is less than the expected count for (Level 1, Level Y).
- The expected count for (Level 2, Level X) is greater than the expected count for (Level 1, Level X).
- The expected count for (Level 1, Level Y) is greater than the expected count for (Level 2, Level Y).
Explanation: First, calculate the four expected counts using the formula (row total × column total) / grand total. The expected count for (Level 1, Level X) is (80×100)/200=40. The expected count for (Level 1, Level Y) is (80×100)/200=40. The expected count for (Level 2, Level X) is (120×100)/200=60. The expected count for (Level 2, Level Y) is (120×100)/200=60. Comparing the values as per the choices, only C is correct because the expected count for (Level 2, Level X), which is 60, is greater than the expected count for (Level 1, Level X), which is 40.
Question 5
In the context of a chi-square test for independence between two categorical variables, the calculation for expected counts is based on which assumption?
- The two variables are independent, which is the assumption of the null hypothesis.
- The two variables are dependent, which is the assumption of the alternative hypothesis.
- The sample size is large enough for the Central Limit Theorem to apply to the counts.
- The distribution of each variable is approximately normal within the population.
Explanation: Expected counts for a chi-square test of independence are calculated under the assumption that the null hypothesis is true. The null hypothesis for this test states that there is no association (i.e., the variables are independent) between the two categorical variables.
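One way to see why independence leads to the usual formula: if the variables are independent, the probability of landing in a cell is the product of the two marginal proportions, so the expected count is grand total × (row total / grand total) × (column total / grand total), which simplifies to (row total × column total) / grand total. The sketch below checks this with the Question 1 totals (150 cat owners, 180 people with allergies, 600 people). The function names are illustrative only, and `Fraction` is used so the comparison is exact.

```python
from fractions import Fraction


def expected_from_proportions(row_total: int, col_total: int, grand_total: int) -> Fraction:
    """Expected count as n * P(row) * P(column), which holds under independence."""
    p_row = Fraction(row_total, grand_total)
    p_col = Fraction(col_total, grand_total)
    return grand_total * p_row * p_col


def expected_from_formula(row_total: int, col_total: int, grand_total: int) -> Fraction:
    """Expected count as (row total * column total) / grand total."""
    return Fraction(row_total * col_total, grand_total)


# Question 1 totals: 150 cat owners, 180 with allergies, 600 people sampled.
a = expected_from_proportions(150, 180, 600)
b = expected_from_formula(150, 180, 600)
print(a, b, a == b)  # 45 45 True
```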
Question 6
A study was conducted to investigate whether the genre of a movie seen (Action, Comedy, Drama) is independent of the age group of the moviegoer (Child, Teen, Adult). Data was collected from a random sample of 300 moviegoers.
Under the null hypothesis that movie genre preference is independent of age group, how is the expected number of teens who prefer action movies calculated?
- By multiplying the total number of teens by the total number of people who prefer action movies, then dividing by the total number of moviegoers.
- By dividing the number of teens who were observed to prefer action movies by the total number of teens.
- By multiplying the proportion of all moviegoers who are teens by the total number of action movies available.
- By averaging the number of moviegoers across all combinations of age group and genre.
Explanation: The expected count for a cell under the null hypothesis of independence is calculated by the formula: (row total × column total) / grand total. In this context, this corresponds to (total number of teens × total number of people who prefer action movies) / total number of moviegoers.
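The question gives no marginal totals, so the sketch below uses hypothetical ones (invented purely for illustration, chosen to sum to 300) to show how the same formula fills in every cell of the age group by genre table. The helper `expected_counts` is also an assumption of this example, not part of the question.

```python
def expected_counts(row_totals: dict, col_totals: dict) -> dict:
    """Return the expected count for every (row, column) cell under independence."""
    grand_total = sum(row_totals.values())
    if grand_total != sum(col_totals.values()):
        raise ValueError("row totals and column totals must sum to the same grand total")
    return {
        (row, col): row_total * col_total / grand_total
        for row, row_total in row_totals.items()
        for col, col_total in col_totals.items()
    }


# Hypothetical marginal totals for 300 moviegoers (not given in the question).
age_totals = {"Child": 60, "Teen": 90, "Adult": 150}
genre_totals = {"Action": 120, "Comedy": 100, "Drama": 80}

table = expected_counts(age_totals, genre_totals)
print(table[("Teen", "Action")])  # 36.0, i.e. (90 * 120) / 300
```

With these made-up totals, the expected counts across all nine cells add back up to 300, as they must.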
Question 7
A gym tracked 500 members by whether they attend group classes and whether they renewed their membership. Assuming independence, which expression calculates the expected count for the Group Classes & Renewed cell?
- (200)(350)/500
- 200/500
- 140
- 350/500
- (200)(350)
Explanation: To find expected counts assuming independence, we apply (row total × column total) ÷ grand total. For Group Classes & Renewed, we multiply members attending group classes (200) by members who renewed (350), then divide by all 500 members. Choice A shows this correctly: (200)(350)/500. Choice C gives 140, which is the calculated result but not the expression itself. Choices B and D show individual proportions, while E shows the product without division. The expected count formula helps us test whether the observed counts differ significantly from what independence would predict.
Question 8
A researcher classified 120 plants by whether they received fertilizer and whether they bloomed. Under the assumption of independence, which expression calculates the expected count for the No Fertilizer & Bloomed cell?
- (50)(70)/120
- 50/120
- 30
- 70/120
- (50)(70)
Explanation: Expected counts in two-way tables use the formula (row total × column total) ÷ grand total. For No Fertilizer & Bloomed, we multiply plants without fertilizer (50) by plants that bloomed (70), then divide by all 120 plants. Choice A correctly shows (50)(70)/120. Choice C shows 30, which might be an observed count but isn't the expression. Choices B and D show marginal proportions that don't calculate expected counts, while E multiplies totals without dividing, yielding 3,500 instead of the reasonable expected count of about 29.
Question 9
A clinic categorized 180 patients by whether they received a flu shot and whether they later reported flu symptoms. Under the assumption of independence, which expression calculates the expected count for the Shot & Symptoms cell?
- (120)(45)/180
- 45/180
- 120/180
- 20
- (120)(45)
Explanation: This problem requires calculating the expected count for a cell in a two-way table assuming independence between variables. The expected count formula is (row total × column total) ÷ grand total. For the Shot & Symptoms cell, we multiply the total who got shots (120) by the total with symptoms (45), then divide by all 180 patients. Choice A correctly represents this: (120)(45)/180. Choice D shows 20, which might be an observed count, while B and C show individual proportions. Choice E multiplies the totals but forgets the crucial step of dividing by the grand total.
Question 10
A study categorized 90 commuters by whether they bike to work and whether their commute is under 5 miles or 5 miles and over. Assuming independence, which expression calculates the expected count for the Bike & Under 5 miles cell?
- 50/90
- (36)(50)/90
- 36/90
- 18
- (36)(50)
Explanation: Expected counts under independence use the formula (row total×column total)÷grand total. For Bike & Under 5 miles, we multiply commuters who bike (36) by those with commutes under 5 miles (50), then divide by all 90 commuters. Choice B shows this correctly: (36)(50)/90. Choice D gives 18, which might be an observed count rather than the expression. Choices A and C show individual proportions, while E shows only the product. Understanding this formula is essential for testing whether categorical variables are associated or independent.
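As a quick arithmetic check, the sketch below evaluates the correct expression for Questions 7 through 10 using the totals stated in their explanations. The function name and the dictionary labels are illustrative only.

```python
def expected_count(row_total: float, col_total: float, grand_total: float) -> float:
    """Expected cell count under independence: (row total * column total) / grand total."""
    return row_total * col_total / grand_total


# (row total, column total, grand total) as stated in each explanation.
cases = {
    "Q7 Group Classes & Renewed": (200, 350, 500),
    "Q8 No Fertilizer & Bloomed": (50, 70, 120),
    "Q9 Shot & Symptoms": (120, 45, 180),
    "Q10 Bike & Under 5 miles": (36, 50, 90),
}

results = {label: expected_count(*totals) for label, totals in cases.items()}
for label, value in results.items():
    print(f"{label}: {value:.2f}")
```

The values come out to 140, about 29.17, 30, and 20. Only in Question 7 does the bare-number choice (140) equal the expected count; in Questions 8 to 10 the bare numbers (30, 20, 18) differ from it.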