Describing the Distribution of Quantitative Variables

Help Questions

Questions 1 - 10
1

When describing the distribution of a quantitative variable, why is it essential to address unusual features like outliers or gaps?

They can significantly influence non-resistant summary statistics and may represent important contextual information.

They confirm that the distribution is approximately normal, which is a key assumption for many procedures.

They are almost always the result of measurement or data entry errors and should be immediately removed.

They provide the only way to determine the shape of the distribution, such as skewness.

Explanation

The correct answer is A. Unusual features must be described because they can strongly affect non-resistant summary statistics such as the mean and standard deviation. They can also carry important information about the data, representing unique cases or subgroups that should be investigated rather than ignored.
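To make the effect on non-resistant statistics concrete, here is a minimal Python sketch. The values are invented for illustration and do not come from any question on this page. It compares the mean, sample standard deviation, and median before and after a single extreme value is added.

```python
import statistics

# Hypothetical data, invented for illustration only.
typical = [48, 50, 51, 52, 49, 50]
with_outlier = typical + [120]  # one extreme value added


def summarize(values):
    """Return (mean, sample standard deviation, median)."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        statistics.median(values),
    )


mean_before, sd_before, median_before = summarize(typical)
mean_after, sd_after, median_after = summarize(with_outlier)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"sd:     {sd_before:.1f} -> {sd_after:.1f}")
print(f"median: {median_before:.1f} -> {median_after:.1f}")
```

With these made-up values the mean moves from 50 to 60 and the standard deviation grows sharply, while the median stays at 50.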

2

A local gym records the ages of its members. The data reveal a large number of members between 20 and 35 years old and another large number of members between 55 and 70 years old, with very few members in between these two ranges. Which unusual feature is present in this distribution?

A symmetric shape, because the two groups of members are balanced.

A gap, representing the range of ages where there are very few members.

A uniform shape, because all age groups are equally represented.

A single significant outlier representing an extremely old member.

Explanation

The correct answer is B. A gap is a region in a distribution where there are no, or very few, data values. The description of 'very few members in between' two concentrated groups of ages indicates a gap. The distribution is also likely bimodal with two clusters, but the gap is the feature that describes the space between them.

3

The distribution of the number of hours students at a large high school spent on homework last week is unimodal and skewed to the right. Which of the following is a plausible explanation for the shape of this distribution?

There are two distinct groups of students: those who do a lot of homework and those who do very little.

Most students spend many hours on homework, while only a few students spend very little time on it.

The number of hours spent on homework is roughly the same for all students surveyed.

Most students spend a relatively small to moderate number of hours, but a few students spend an exceptionally large number of hours.

Explanation

The correct answer is D. A distribution that is skewed to the right has its main body of data on the left (lower values) and a tail extending to the right (higher values). This shape corresponds to a situation where most students spend a small to moderate amount of time on homework and a few students spend a very large amount of time, creating the right tail.
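A small Python sketch can illustrate this shape. The homework hours below are hypothetical, chosen only to mimic a right-skewed pattern. In such data the mean is typically pulled toward the long right tail and ends up above the median; this is a common rule of thumb, not a definition of skewness.

```python
import statistics

# Hypothetical weekly homework hours, invented for illustration only:
# most values are small to moderate, and two are exceptionally large.
hours = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 15, 20]

mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)

# Crude text dotplot: one row per distinct value, one star per student.
for value in sorted(set(hours)):
    print(f"{value:2d} | {'*' * hours.count(value)}")

print(f"mean = {mean_hours:.2f}, median = {median_hours}")
```

For this list the median is 3.5, and the mean is noticeably larger because of the two large values in the tail.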

4

Which value in this dataset is best described as a potential outlier?

72, because it is the second lowest score and establishes the beginning of the main cluster.

79.5, the mean of the dataset, because it represents the central tendency.

88, because it is the maximum value and is therefore an extreme point in the dataset.

35, because it is unusually low compared to the other scores which are clustered together.

Explanation

The correct answer is D. Most of the scores (72, 75, 78, 81, 83, 85, 88) are clustered in the 70s and 80s. The score of 35 is far below this cluster, making it a potential outlier. Being the maximum or minimum value does not automatically make a point an outlier; separation from the rest of the data is the key factor.
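One common way to check a suspected outlier numerically is the 1.5 × IQR rule, which the question itself does not mention. The sketch below applies it, assuming the dataset consists of exactly the eight scores named in the explanation (the original data display is not reproduced on this page). Quartiles are computed as medians of the lower and upper halves; other quartile conventions can give slightly different fences.

```python
import statistics

# Assumption: the dataset is exactly the scores named in the explanation.
scores = sorted([35, 72, 75, 78, 81, 83, 85, 88])


def quartiles(sorted_values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    For an odd number of values, the overall median is left out of
    both halves.
    """
    half = len(sorted_values) // 2
    lower = sorted_values[:half]
    upper = sorted_values[-half:]
    return statistics.median(lower), statistics.median(upper)


q1, q3 = quartiles(scores)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are flagged as potential outliers.
outliers = [x for x in scores if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {lower_fence} to {upper_fence}")
print("potential outliers:", outliers)
```

Under these assumptions the fences are 57.75 and 99.75, so 35 is flagged and the maximum, 88, is not.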

5

When describing a distribution of quantitative data that is strongly skewed with several outliers, which measures of center and variability are generally most appropriate to use?

Median and range, as the median is resistant and the range captures the full effect of the outliers.

Median and interquartile range (IQR), because they are resistant to the influence of outliers.

Mean and interquartile range (IQR), as this combination provides the most detail about the skewness.

Mean and standard deviation, because they incorporate information from every data point.

Explanation

The correct answer is B. The median and IQR are based on the relative positions of data values and are not significantly affected by extreme values. Therefore, they are considered 'resistant' measures. In contrast, the mean, standard deviation, and range are all sensitive to outliers and may not accurately represent the typical center and spread of a skewed distribution.
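The contrast between the range and the IQR can be sketched in Python. The two lists below are hypothetical and differ only in their largest value. Quartiles are taken as medians of the lower and upper halves, which is one of several conventions.

```python
import statistics

# Hypothetical data, invented for illustration only.
base = [10, 12, 13, 15, 16, 18, 20, 22]
extreme = [10, 12, 13, 15, 16, 18, 20, 90]  # same data, largest value replaced


def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)


def iqr(values):
    """IQR using medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])
    q3 = statistics.median(ordered[-half:])
    return q3 - q1


for name, data in (("base", base), ("extreme", extreme)):
    print(f"{name:8s} range = {value_range(data):3d}   IQR = {iqr(data)}")
```

Replacing the largest value changes the range from 12 to 80, while the IQR stays at 6.5 because the middle half of the data is untouched.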

6

A dataset of the last digit of 1,000 telephone numbers is collected. Which of the following best describes the likely shape of the distribution of these digits?

Approximately uniform, because each digit from 0 to 9 is expected to occur with roughly equal frequency.

Skewed to the left, because smaller digits are typically less common in phone numbers.

Skewed to the right, because larger digits are typically more common in phone numbers.

Unimodal and symmetric, because the middle digits such as 4, 5, and 6 are most likely to occur.

Explanation

The correct answer is A. In a large, random set of telephone numbers, there is no reason to believe any one digit (0-9) would appear more frequently than another as the last digit. Therefore, the distribution of these digits is expected to be approximately uniform, with each digit having a roughly equal count.
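A quick simulation can show what "approximately uniform" looks like in a sample of 1,000. The sketch below does not use real telephone numbers; it assumes each last digit is equally likely and independent of the others, which is the same assumption the explanation makes.

```python
import random
from collections import Counter

# Assumption: last digits behave like independent, equally likely draws.
rng = random.Random(0)  # fixed seed so the run is repeatable
digits = [rng.randrange(10) for _ in range(1000)]

counts = Counter(digits)

# Text histogram: one star per 5 occurrences of each digit.
for d in range(10):
    print(f"{d} | {'*' * (counts[d] // 5)} ({counts[d]})")
```

The counts will not all be exactly 100, but the bars should be of similar length with no clear peak or tail.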

7

A survey of residents in a large city asked for their daily commute time to work. The data show two distinct peaks: one around 20 minutes and another around 50 minutes. Which term best describes the shape of this distribution?

Skewed, as there will likely be a tail of very long commute times for some residents.

Bimodal, as there are two distinct clusters of data suggesting two common commute times.

Unimodal, as commute time is a single variable being measured.

Uniform, as all commute times in a large city are generally equally likely.

Explanation

The correct answer is B. A distribution with two distinct peaks is called bimodal. This shape often suggests that the data come from two different subgroups within the population, for example, people using different modes of transportation or living in different parts of the city.
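As a rough illustration, the sketch below bins a hypothetical set of commute times (invented to match the description, not taken from any survey) and reports bins that are taller than both neighbors. The apparent number of peaks depends on the bin width, so this is a visual aid rather than a formal test for bimodality.

```python
from collections import Counter

# Hypothetical commute times in minutes, invented for illustration only.
times = [15, 18, 18, 20, 20, 20, 22, 22, 25, 30, 35,
         45, 48, 50, 50, 50, 52, 52, 55, 60]

BIN_WIDTH = 10  # assumed bin width; other widths may show a different picture

# Count how many times fall in each bin, keyed by the bin's lower edge.
counts = Counter((t // BIN_WIDTH) * BIN_WIDTH for t in times)
bins = list(range(min(counts), max(counts) + BIN_WIDTH, BIN_WIDTH))

for start in bins:
    print(f"{start:2d}-{start + BIN_WIDTH - 1:2d} | {'*' * counts[start]}")

# A bin is a peak if it is strictly taller than both neighbors
# (a missing neighbor counts as height 0).
peaks = [
    start for start in bins
    if counts[start] > counts[start - BIN_WIDTH]
    and counts[start] > counts[start + BIN_WIDTH]
]
print("peak bins start at:", peaks)
```

For this made-up data the peak bins are 20-29 and 50-59, matching the two clusters described in the question.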

8

A botanist measures the number of seeds produced by each of 36 plants grown under the same conditions. The dotplot shows values from 90 to 130 seeds, with the highest concentration around 108–112, and roughly similar frequencies on both sides of that center; there are no isolated points far from the rest.

Which statement best describes the distribution?

The distribution is approximately symmetric and unimodal, centered near about 110 seeds, with no apparent outliers.

The distribution is strongly left-skewed because the smallest values are below 100.

The distribution is bimodal because there are values on both sides of 110.

The distribution is strongly right-skewed because 130 is larger than 90.

The distribution is uniform because the range is about 40 seeds.

Explanation

This question tests describing a quantitative distribution (here, seed counts shown in a dotplot) in terms of shape, center, spread, and outliers. The correct answer is A: the distribution is approximately symmetric and unimodal, centered near 110 seeds, with similar frequencies on both sides across the range from 90 to 130 and no outliers. Distractor D incorrectly calls it strongly right-skewed because 130 > 90, but skewness requires a longer tail on one side, not merely a maximum that is larger than the minimum. Mini-lesson: A symmetric distribution has roughly mirror-image halves, and a unimodal distribution has one peak. A bimodal distribution has two peaks, a uniform distribution has roughly equal frequencies across its range, and a left-skewed distribution has a longer tail toward low values. Always check the balance around the center and look for detached points that may be outliers.

9

A teacher recorded quiz scores (out of 10 points) for 35 students. The dotplot below shows the distribution of scores. Which statement best describes the distribution?

The distribution is uniform because all integer scores from 2 to 10 appear.

The distribution is approximately symmetric because the most common score is 8.

The distribution is left-skewed, with many high scores and a tail toward lower scores.

The distribution is bimodal because there is a score of 2.

The distribution is right-skewed, with many low scores and a tail toward higher scores.

Explanation

This question tests identifying the shape of a distribution from a dotplot of quiz scores. The dotplot shows that many students scored high (8-10 points), with progressively fewer students earning lower scores, creating a tail that extends to the left (toward lower scores). This pattern indicates left-skewness, where the tail points toward smaller values (choice C). The distribution is not right-skewed (choice E) because the tail extends left rather than right, not symmetric (choice B) despite having a mode at 8, not uniform (choice A) because the frequencies vary greatly, and not bimodal (choice D) because one low score does not create a second mode. Quiz scores often show left-skewness when most students perform well but a few struggle.

10

A biology teacher recorded the number of seeds germinated (out of 50) for each of 32 trays under the same conditions. The dotplot below shows the distribution of germinated-seed counts. Which statement best describes the distribution?

The distribution is bimodal because there is an outlier at 28.

The distribution is uniform because counts occur at many different values from 28 to 47.

The distribution is strongly right-skewed because most trays have high counts and a few have much lower counts.

The distribution is strongly left-skewed because most trays have high counts and a few have much lower counts.

The distribution is approximately symmetric and unimodal, centered around about 41–43 seeds, with one unusually low tray near 28.

Explanation

This question asks for a description of a distribution of seed germination counts shown in a dotplot. The data show that most trays had 41-43 of the 50 seeds germinate, forming a symmetric, bell-like shape around this center, with one unusually low value at 28 that stands apart as an outlier. The result is an approximately symmetric, unimodal distribution with an outlier (choice E). The distribution is not right-skewed (choice C) or left-skewed (choice D) because the main cluster is symmetric, not uniform (choice B) because the values cluster rather than spread evenly, and not bimodal (choice A) because a single outlier does not create a second mode. When describing distributions, identify outliers separately from the overall shape of the main data cluster.
