Representing a Categorical Variable with Tables
Help Questions
A researcher collects data on the preferred type of movie from a sample of moviegoers. The categories are Action, Comedy, Drama, and Sci-Fi. Which of the following is NOT information that can be obtained from a frequency table of this data?
The total number of moviegoers in the sample.
The movie type that was most frequently preferred.
The number of moviegoers who prefer Comedy.
The average rating moviegoers gave for their preferred movie type.
Explanation
A frequency table for a categorical variable like 'preferred movie type' can show counts for each category, the total count, and the most frequent category (the mode). It cannot provide information about a quantitative variable like an 'average rating,' as that data is not part of the categorical frequency count.
Which of the following statements best describes the relationship between a frequency and a relative frequency for a given category?
The relative frequency is the frequency divided by the total number of categories.
The relative frequency is the frequency divided by the total number of observations.
The frequency is the relative frequency divided by the total number of observations.
The frequency is the relative frequency multiplied by 100.
Explanation
Relative frequency is calculated by taking the count (frequency) for a specific category and dividing it by the total sample size (total number of observations). The other options misstate this fundamental relationship.
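The relationship described above can be written compactly. The symbols $$f$$ (the frequency of one category) and $$n$$ (the total number of observations) are notation introduced here for illustration, not taken from the question:

$$\text{relative frequency} = \frac{f}{n}, \qquad f = \text{relative frequency} \times n$$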
A library tracks the genres of books checked out by patrons. A relative frequency table shows the following: Fiction: 0.52, Non-Fiction: 0.28, Young Adult: 0.15. The rest are Children's books. If 400 books were checked out in total, what is the frequency of Children's books?
5
20
95
380
Explanation
First, find the relative frequency of Children's books. The sum of relative frequencies must be 1. $$1 - (0.52 + 0.28 + 0.15) = 1 - 0.95 = 0.05$$. Then, find the frequency by multiplying this relative frequency by the total number of books: $$0.05 \times 400 = 20$$.
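The two steps in this explanation can be checked with a short Python sketch. The variable names are illustrative and not part of the original problem; `round` is used because floating-point subtraction can leave a tiny error.

```python
# Relative frequencies given in the problem; Children's books are the remainder.
known = {"Fiction": 0.52, "Non-Fiction": 0.28, "Young Adult": 0.15}
total_books = 400

# Relative frequencies must sum to 1, so the missing share is the complement.
children_rel = 1 - sum(known.values())

# Frequency = relative frequency x total; round to absorb float error.
children_count = round(children_rel * total_books)

print(round(children_rel, 2), children_count)
```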
Which of the following represents the relative frequency table for the pollution levels of the 150 rivers?
Low: 0.42, Medium: 0.30, High: 0.28
Low: 63, Medium: 45, High: 42
Low: 0.28, Medium: 0.30, High: 0.42
Low: 0.63, Medium: 0.45, High: 0.42
Explanation
First, find the frequency for High: $$150 - 63 - 45 = 42$$. Then, calculate relative frequencies: Low is $$63/150 = 0.42$$, Medium is $$45/150 = 0.30$$, and High is $$42/150 = 0.28$$. Choice A correctly lists these relative frequencies. Choice B lists frequencies. Choices C and D use incorrect calculations.
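As a rough check of the arithmetic above, the following Python sketch derives the missing count and converts every count to a relative frequency. The counts for Low and Medium are the ones quoted in the explanation; the dictionary layout is only an assumption made for illustration.

```python
# Counts quoted in the explanation; High is whatever is left of the 150 rivers.
total_rivers = 150
counts = {"Low": 63, "Medium": 45}
counts["High"] = total_rivers - sum(counts.values())

# Relative frequency = count / total number of observations.
rel_freq = {level: round(n / total_rivers, 2) for level, n in counts.items()}

print(counts)
print(rel_freq)
```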
A record store manager creates a frequency table of music genres sold in a day. The manager notes that the relative frequency for the Rock genre is 0.25. Which statement must be true?
Rock was the most popular genre sold that day.
One-quarter of all albums sold that day were in the Rock genre.
The total number of albums sold was 25.
Exactly 25 Rock albums were sold that day.
Explanation
A relative frequency of 0.25 means that the category accounts for $$1/4$$, or one-quarter, of the total observations. Without knowing the total number of albums sold, we cannot determine any exact count, so the statements that 25 albums were sold in total and that exactly 25 Rock albums were sold are not necessarily true. We also cannot conclude that Rock was the most popular genre, since another genre could have a relative frequency greater than 0.25.
A city planner claims that cars are the primary mode of transport for a majority of commuters. Which of the following statements correctly evaluates this claim using the provided data?
The claim is incorrect because the frequency for Car (165) is not more than twice the next highest frequency (75).
The claim is incorrect because 135 commuters use a mode of transport other than a car.
The claim is correct because the frequency for Car (165) is the highest among all categories.
The claim is correct because the relative frequency for Car is 0.55, which is greater than 0.50.
Explanation
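The term 'majority' means more than 50%. To evaluate the claim, we need the relative frequency. The relative frequency for Car is $$165/300 = 0.55$$. Since $$0.55 > 0.50$$, the claim is supported by the data. The statement that Car has the highest frequency is true, but being the mode does not by itself establish a majority. The two choices that call the claim incorrect use flawed reasoning: a majority does not require more than twice the next highest frequency, and the 135 commuters who use other modes are fewer than half of the 300 commuters.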
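The majority test can be expressed as a one-line check. This is a minimal sketch: `is_majority` is a hypothetical helper name, and the values 165 and 300 are the ones quoted in the explanation.

```python
def is_majority(count: int, total: int) -> bool:
    """True when a category holds strictly more than half of the observations."""
    # Comparing 2 * count with total avoids floating-point division.
    return 2 * count > total

car, total = 165, 300  # values quoted in the explanation
print(car / total, is_majority(car, total))
```

Having the largest count (being the mode) is a weaker condition: a category can lead every other category and still fall below half of the total.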
A fitness center manager is researching which class type members most want added to the schedule. From the population of current members, a random sample of 160 members is surveyed and each member selects a preferred new class type (a categorical variable). The distribution is shown below.
Which statement is supported by the data?
| Preferred new class type | Frequency | Relative frequency |
|---|---|---|
| Yoga | 52 | 0.325 |
| Strength training | 44 | 0.275 |
| Cycling | 36 | 0.225 |
| Dance | 28 | 0.175 |

Dance is preferred by 28% of surveyed members.
Cycling is the most preferred new class type in the sample.
Strength training is preferred by about 44% of surveyed members.
More than half of surveyed members preferred cycling or dance.
Yoga is preferred by the largest proportion of surveyed members, about 32.5%.
Explanation
This question tests reading a frequency and relative frequency table for a categorical variable, here the preferred new class type among 160 surveyed members. The supported statement is that yoga is preferred by the largest proportion, about 32.5%: its relative frequency is 52/160 = 0.325, ahead of strength training at 0.275. The claim that strength training is preferred by about 44% mistakes the frequency (44) for a percentage; the actual figure is 27.5%. The claim that dance is preferred by 28% makes the same mistake with dance's frequency of 28; the actual figure is 17.5%. Cycling (0.225) is not the most preferred type, and cycling and dance together account for 0.225 + 0.175 = 0.400, which is less than half. The transferable lesson is to distinguish frequencies from relative frequencies when making proportion-based claims, and to identify the mode by comparing values within a single column.
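The checks in this explanation can be reproduced with a small Python sketch built from the frequency column of the table. The variable names are illustrative assumptions.

```python
# Frequency column from the table.
counts = {"Yoga": 52, "Strength training": 44, "Cycling": 36, "Dance": 28}
n = sum(counts.values())  # 160 surveyed members

# Relative frequency of each category.
rel = {name: count / n for name, count in counts.items()}

# The mode is the category with the largest count.
mode = max(counts, key=counts.get)

# Combined share of two categories: add their relative frequencies.
combined = rel["Cycling"] + rel["Dance"]

print(mode, round(rel[mode], 3), round(combined, 3))
```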
A city transportation office is studying commuting habits of adults who live within city limits. A random sample of 200 adults is asked to report their primary commute mode (a categorical variable). The distribution is shown below.
Which statement is supported by the data?
| Primary commute mode | Frequency | Relative frequency |
|---|---|---|
| Drive alone | 98 | 0.490 |
| Carpool | 34 | 0.170 |
| Public transit | 46 | 0.230 |
| Bike/Walk | 22 | 0.110 |

About 49% of sampled adults reported driving alone, which is the largest category.
Public transit is the most common commute mode in the sample.
The frequency for bike/walk is 0.110 adults.
Carpool and bike/walk together account for more than half of the sample.
Fewer than one-fourth of sampled adults reported public transit as their primary mode.
Explanation
This question tests interpreting frequencies and relative frequencies for commute modes in a sample of 200 adults. The statement that about 49% of sampled adults reported driving alone, and that this is the largest category, is supported: the frequency is 98, and 98/200 = 0.490, which exceeds public transit at 0.230 and every other category. The claim that the frequency for bike/walk is 0.110 adults mixes up the two columns: 0.110 is the relative frequency, while the frequency is 22. The claim that carpool and bike/walk together account for more than half fails because 0.170 + 0.110 = 0.280, which is less than 0.5. Public transit, at 0.230, is not the most common mode. The transferable lesson is to verify proportions by dividing frequencies by the total, and to add the relevant rows before making a claim about combined categories.
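One way to confirm that the two columns of such a table agree is a consistency check like the sketch below. The tolerances are assumptions chosen to allow for rounding to three decimal places.

```python
import math

# (frequency, relative frequency) pairs copied from the table.
table = {
    "Drive alone": (98, 0.490),
    "Carpool": (34, 0.170),
    "Public transit": (46, 0.230),
    "Bike/Walk": (22, 0.110),
}
n = 200  # sample size stated in the question

# Frequencies should add up to the sample size.
counts_ok = sum(count for count, _ in table.values()) == n

# Each listed relative frequency should match count / n up to rounding.
rel_ok = all(math.isclose(count / n, rel, abs_tol=0.0005) for count, rel in table.values())

# Relative frequencies should add up to 1, up to rounding.
sum_ok = math.isclose(sum(rel for _, rel in table.values()), 1.0, abs_tol=0.005)

print(counts_ok, rel_ok, sum_ok)
```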
A school counselor wants to understand how 9th-grade students at Central High prefer to receive academic support. The counselor surveys a random sample of 120 ninth graders and records each student’s preferred support method (a categorical variable). Results are summarized below.
Which statement is supported by the data in the table?
| Preferred support method | Frequency | Relative frequency |
|---|---|---|
| One-on-one tutoring | 42 | 0.350 |
| Small-group sessions | 30 | 0.250 |
| Online resources | 24 | 0.200 |
| Teacher office hours | 18 | 0.150 |
| Peer mentoring | 6 | 0.050 |

About 35% of surveyed students preferred one-on-one tutoring, the highest proportion.
A majority of surveyed students preferred either online resources or teacher office hours.
More students preferred small-group sessions than online resources.
Teacher office hours were preferred by about 18% of surveyed students.
Peer mentoring was preferred by about 6% of surveyed students.
Explanation
This question tests interpreting frequencies and relative frequencies to identify a supported statement about students' preferred academic support methods. The data support the statement that about 35% of surveyed students preferred one-on-one tutoring, the highest proportion: its relative frequency is $0.350$, which exceeds $0.250$, $0.200$, $0.150$, and $0.050$. The table shows that 42 of 120 students chose this method, and 42/120 is exactly 35%, ahead of small-group sessions at 25%. A common distractor claims that about 18% preferred teacher office hours, but the relative frequency is $0.150$, or 15%; the 18 is the frequency count, not a percentage. Another distractor states about 6% for peer mentoring, but the relative frequency is $0.050$, or 5%; again, the 6 is a count. Online resources and teacher office hours together account for $0.200 + 0.150 = 0.350$, which is not a majority. The key lesson is to use relative frequencies for proportion claims, compare them directly to rank categories, and check that each one equals the frequency divided by the total sample size.
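The count-versus-percent trap is easy to see when both are printed side by side. The following is a minimal sketch using the frequency column of the table; the names are illustrative.

```python
# Frequency column from the table.
counts = {
    "One-on-one tutoring": 42,
    "Small-group sessions": 30,
    "Online resources": 24,
    "Teacher office hours": 18,
    "Peer mentoring": 6,
}
n = sum(counts.values())  # 120 students

# Percent of the sample in each category; it would equal the count only if n were 100.
percents = {method: round(100 * count / n, 1) for method, count in counts.items()}

for method, count in counts.items():
    print(f"{method}: count = {count}, percent = {percents[method]}%")
```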
A community health clinic is studying how patients prefer to be reminded about appointments. From the population of patients who had an appointment last month, a random sample of 110 patients is surveyed and each chooses a preferred reminder method (a categorical variable). The distribution is shown below.
Which statement is supported by the data?
| Preferred reminder method | Frequency | Relative frequency |
|---|---|---|
| Text message | 55 | 0.500 |
| Phone call | 22 | 0.200 |
| Email | 27 | 0.245 |
| No reminder | 6 | 0.055 |

No reminder is preferred by about 6% of the sampled patients.
Email is preferred by more patients than phone calls in the sample.
Phone calls are preferred by about 22.0% of the sampled patients.
Text messages are preferred by about 55% of the sampled patients.
A majority of the sampled patients preferred either email or phone calls.
Explanation
This question tests analyzing frequencies and relative frequencies for preferred reminder methods among 110 sampled patients. The statement that email is preferred by more patients than phone calls is supported: email has a frequency of 27 and phone calls 22, and 27 > 22. A common distractor claims that text messages are preferred by about 55%, but 55 is the frequency; the relative frequency is 0.500, or 50%. Similarly, the claim that phone calls are preferred by about 22.0% mistakes the frequency (22) for the relative frequency (0.200, or 20%). Email and phone calls together account for 0.245 + 0.200 = 0.445, which is not a majority. The lesson for categorical tables is to compare frequencies for count-based claims and relative frequencies for proportion claims, and to add the relevant rows before asserting a majority.
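Because 110 does not divide evenly into every count, the relative frequencies in this table are rounded to three decimal places. The sketch below reproduces that rounding and the combined email and phone share; the three-decimal rounding is inferred from the table, and the names are illustrative.

```python
# Frequencies for the four reminder methods (email's count of 27 is stated in the explanation).
counts = {"Text message": 55, "Phone call": 22, "Email": 27, "No reminder": 6}
n = sum(counts.values())  # 110 patients

# Relative frequencies rounded to three decimals, as in the table.
rounded = {method: round(count / n, 3) for method, count in counts.items()}

# Count-based comparison and the combined share of two categories.
email_beats_phone = counts["Email"] > counts["Phone call"]
email_or_phone = (counts["Email"] + counts["Phone call"]) / n

print(rounded)
print(email_beats_phone, round(email_or_phone, 3))
```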