Representing a Categorical Variable with Graphs

Questions 1 - 4
1

A marketing firm surveyed 500 shoppers about their primary reason for choosing a particular grocery store. The reasons were categorized as Price, Convenience, Quality, and Service. Which of the following graphical displays is most appropriate for visualizing the distribution of these reasons?

A dotplot, as it can represent each of the 500 individual shopper responses as a separate point.

A bar chart, because the variable 'reason' is categorical and this graph displays the frequency for each category.

A histogram, because it shows the shape of the distribution by grouping shopper responses into bins.

A stem-and-leaf plot, because it is useful for showing the distribution of a variable with a small to moderate number of observations.

Explanation

A bar chart is the appropriate graphical display for a single categorical variable. The variable 'reason' has distinct categories (Price, Convenience, etc.), and a bar chart effectively shows the count or proportion in each category. Histograms, dotplots, and stem-and-leaf plots are used for quantitative data, not categorical data.
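
As a rough sketch of what a bar chart summarizes, the Python snippet below tallies a categorical variable into counts and proportions and prints a crude text bar chart. The counts are hypothetical, chosen only so that they sum to the 500 shoppers in the question; the survey's actual results are not given.

```python
# Hypothetical counts for illustration only; they sum to the 500 shoppers surveyed.
counts = {"Price": 200, "Convenience": 150, "Quality": 100, "Service": 50}

total = sum(counts.values())

# Proportion (relative frequency) of responses in each category.
proportions = {reason: n / total for reason, n in counts.items()}

# Text bar chart: one bar per category, one '#' per 10 shoppers.
for reason, n in counts.items():
    bar = "#" * (n // 10)
    print(f"{reason:<12} {bar} {n} ({proportions[reason]:.0%})")
```

Each bar stands for a whole category rather than a numeric interval, which is why the bars can be listed in any order without changing the meaning of the display.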

2

A city official wants to display the sources of city tax revenue for the past year. The sources are Property Tax, Sales Tax, Income Tax, and Other Fees. Which of the following graphs would be best to emphasize each source's contribution to the total revenue?

A side-by-side bar chart, to compare each revenue source to the others.

A pie chart, because it visually represents parts of a whole.

A histogram, because tax revenue is a quantitative variable.

A dotplot, to show the distribution of the different revenue amounts.

Explanation

A pie chart is specifically designed to show how a whole amount is divided into parts. It is ideal for displaying the proportion of the total tax revenue that comes from each categorical source, thus emphasizing each source's contribution to the whole.
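
The part-to-whole idea can be sketched in Python as follows. The revenue figures are invented for illustration (the question gives none); the snippet converts each source's amount into its share of the total and the central angle its slice would occupy in a pie chart.

```python
# Hypothetical revenue in millions of dollars; these amounts are not from the question.
revenue = {
    "Property Tax": 40.0,
    "Sales Tax": 30.0,
    "Income Tax": 20.0,
    "Other Fees": 10.0,
}

total = sum(revenue.values())

# For each source: (share of the whole, central angle of its slice in degrees).
slices = {
    source: (amount / total, 360 * amount / total)
    for source, amount in revenue.items()
}

for source, (share, angle) in slices.items():
    print(f"{source:<13} {share:.1%} of total, {angle:.0f} degree slice")
```

The shares add up to 1 and the angles to 360 degrees, which is the reason a pie chart only makes sense when the categories together account for the whole.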

3

A sociologist wants to compare the distribution of educational attainment (e.g., high school, bachelor's degree, master's degree) for residents of two different cities. Which graphical display would be most effective for a direct comparison between the two cities?

Two separate pie charts, one for each city's educational attainment.

A histogram of educational attainment, treating the categories as numerical values.

A side-by-side bar chart showing the educational attainment categories for each city.

A single bar chart showing the combined educational attainment for both cities.

Explanation

A side-by-side bar chart is designed to compare the distribution of a categorical variable across two or more groups. It places the bars for the two cities next to each other within each category, allowing for easy comparison. A single combined bar chart would lose the city-to-city comparison. Two separate pie charts make direct comparison of categories difficult. A histogram is inappropriate for categorical data.
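
The grouping that a side-by-side bar chart uses can be sketched in Python. The two cities and all counts below are hypothetical, and `percentages` is a helper name introduced only for this example. Percentages within each city are used so that cities of different sizes can be compared fairly.

```python
# Hypothetical counts of residents by highest level of education; both cities are invented.
city_counts = {
    "City A": {"High school": 50, "Bachelor's": 30, "Master's": 20},
    "City B": {"High school": 90, "Bachelor's": 80, "Master's": 30},
}


def percentages(counts):
    """Convert one city's counts to percentages of that city's total."""
    total = sum(counts.values())
    return {level: 100 * n / total for level, n in counts.items()}


city_pcts = {city: percentages(counts) for city, counts in city_counts.items()}

# Group the bars by category, with the two cities adjacent within each group.
for level in city_counts["City A"]:
    print(level)
    for city, pcts in city_pcts.items():
        bar = "#" * round(pcts[level] / 5)  # one '#' per 5 percentage points
        print(f"  {city}: {bar} {pcts[level]:.0f}%")
```

Because the two cities' bars for the same category sit next to each other, differences between the cities can be read off category by category.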

4

Data on the preferred social media platform were collected from a sample of 200 teenagers and a separate sample of 300 adults. To compare the distributions of preferences between teenagers and adults, what is the most significant disadvantage of using two frequency bar charts instead of two relative frequency bar charts?

Frequency bar charts cannot be used for categorical data; only relative frequency bar charts can.

The direct comparison of bar heights between the two charts could be misleading due to the different sample sizes.

It would be impossible to determine the total number of people who prefer a certain platform from the frequency charts.

The frequency bar charts would not show the most popular platform for each group.

Explanation

When group sizes are unequal, comparing raw counts (frequencies) can be misleading. A category might have a higher count in the adult group simply because there are more adults in the sample. Relative frequency bar charts standardize the comparison by showing proportions within each group, allowing for a valid comparison of preferences regardless of sample size.
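
A small numeric sketch in Python makes the point concrete. The sample sizes (200 teenagers, 300 adults) come from the question; the counts for a single platform are hypothetical and chosen to show how counts and proportions can point in opposite directions.

```python
# Sample sizes from the question.
teens_n, adults_n = 200, 300

# Hypothetical counts of respondents who prefer one particular platform.
teens_count, adults_count = 80, 90

# Relative frequency = count in the category / size of that group.
teens_prop = teens_count / teens_n
adults_prop = adults_count / adults_n

print(f"Teenagers: count = {teens_count}, relative frequency = {teens_prop:.0%}")
print(f"Adults:    count = {adults_count}, relative frequency = {adults_prop:.0%}")
```

With these made-up numbers, the adult bar would be taller on a frequency chart even though the platform is proportionally more popular among teenagers, which is the kind of misreading that relative frequencies avoid.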