Using Programs with Data
Help Questions
A scientist has a large dataset of daily temperature readings from a weather station. The scientist wants to analyze only the days where the temperature exceeded 30 degrees Celsius. Which of the following data processing techniques is most appropriate for this task?
Filtering the dataset to create a new dataset containing only records with a temperature value greater than 30.
Combining the dataset with a separate dataset of daily precipitation levels to create a more comprehensive climate record.
Visualizing the entire dataset as a line graph to observe the overall temperature trend across all recorded days.
Transforming each temperature value in the dataset by adding 30, creating an adjusted set of readings for analysis.
Explanation
Filtering is the process of selecting a subset of data that meets a specific criterion. In this case, the criterion is a temperature greater than 30, so filtering is the correct technique to isolate the desired records for analysis.
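The filtering described above can be sketched in a few lines of Python; the readings below are made-up sample values for illustration:

```python
# Hypothetical daily readings in degrees Celsius.
readings = [28.5, 31.2, 29.9, 33.0, 30.0, 34.7]

# Keep only the records whose temperature is strictly greater than 30.
hot_days = [t for t in readings if t > 30]
```

Note that a reading of exactly 30.0 is excluded, because the criterion is "exceeded 30", not "30 or more".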
A financial analyst is using a program to process a list of stock prices. The program iterates through the list and converts each price from US dollars to Euros by multiplying by the current exchange rate. This is an example of which data process?
Visualizing a data set to create a graph of stock price changes over time.
Transforming every element of a data set to create a new representation of the data.
Combining data to find the single highest stock price in the entire list.
Filtering a data set to keep only prices that are within a certain desired range.
Explanation
Transforming data involves applying a calculation or modification to each element in a dataset. In this case, every stock price (each element) is being multiplied by the exchange rate, thereby transforming the entire dataset from one currency to another.
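A minimal sketch of this transformation in Python, using made-up prices and an assumed exchange rate:

```python
# Hypothetical prices in US dollars and an assumed USD-to-EUR rate.
usd_prices = [10.00, 25.50, 99.99]
usd_to_eur = 0.90  # illustrative rate, not a real quote

# Apply the same calculation to every element of the list.
eur_prices = [round(p * usd_to_eur, 2) for p in usd_prices]
```

Every element is modified by the same rule, which is the hallmark of a transformation.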
A researcher is analyzing a dataset of website user activity, which includes a timestamp for every click. The researcher uses a program to transform the raw timestamps into broader categories like "Morning", "Afternoon", and "Evening". What is the primary benefit of this transformation for generating new knowledge?
It can help reveal patterns in user engagement based on the time of day, which are not immediately obvious from precise timestamps.
It filters out all user activity that occurs overnight, which is considered irrelevant to the primary research questions.
It cleans the data by automatically removing any incorrect or invalid timestamp entries that were recorded due to system errors.
It significantly reduces the storage size of the dataset by using shorter text strings instead of long numerical timestamp values.
Explanation
Transforming data into categories is a way to abstract details and reveal higher-level patterns. By grouping precise timestamps into general time-of-day categories, the researcher can more easily identify trends, such as whether users are more active in the evening.
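One way this categorization might look in Python; the hour cutoffs and click times below are illustrative assumptions:

```python
def time_of_day(hour):
    """Map a 24-hour clock hour to a broad category (cutoffs are illustrative)."""
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    return "Evening"

# Hypothetical click hours extracted from raw timestamps.
click_hours = [9, 14, 20, 23, 7]
categories = [time_of_day(h) for h in click_hours]
```

The precise timestamps are abstracted away, making time-of-day patterns easier to count and compare.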
A city's transportation department collects data on traffic flow from road sensors and separate data on public transit usage from fare card systems. By using a program to combine these two datasets, what new insight could the department most likely gain?
The total number of vehicles registered in the city, which is only available from a separate vehicle registration database.
The daily weather patterns for the last year, which would require a different dataset from a meteorological service.
An understanding of how traffic congestion levels correlate with the usage of specific public transit routes.
The specific manufacturing date of each bus in the transit fleet, which would be found in internal maintenance records.
Explanation
Combining datasets allows for the discovery of relationships and patterns between them. By merging traffic flow data and transit usage data, analysts can investigate whether higher transit usage corresponds with lower traffic congestion, providing valuable insight for urban planning.
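A small sketch of combining two datasets by a shared key, here a date string; the counts are hypothetical:

```python
# Hypothetical daily records keyed by date.
traffic = {"2024-05-01": 1200, "2024-05-02": 950}   # vehicles counted by sensors
transit = {"2024-05-01": 430, "2024-05-02": 610}    # fare card taps

# Merge the two datasets into one combined record per shared date.
combined = {
    date: {"vehicles": traffic[date], "riders": transit[date]}
    for date in traffic.keys() & transit.keys()
}
```

With both measurements in one record per day, the department can look for days where high ridership coincides with lower vehicle counts.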
A wearable collects numerical step counts and heart rate readings and stores them in a weekly summary. A program compares the user’s activity to personal goals, calculates progress percentages, and sends reminders when progress falls behind. The reminders help users adjust habits over time. Based on the scenario above, what is a potential limitation of the data processing method described?
It depends on sentiment analysis of posts to compute step-count progress.
It always improves health outcomes because reminders guarantee behavior change.
It cannot store numerical data, so weekly summaries must be text-only.
If sensor readings are inaccurate, progress percentages and reminders may be misleading.
Explanation
This question tests AP Computer Science Principles skills: using programs to process data. Programs that process data collect, manipulate, and analyze it to achieve specific outcomes, but the accuracy of the results depends on the quality of the input data. In the scenario, the program compares activity to goals, calculates progress percentages, and sends reminders based on those calculations. The correct choice identifies a fundamental limitation: if the sensor readings are inaccurate, every downstream calculation (the progress percentages) and output (the reminders) may be misleading, potentially harming rather than helping users. The claim that reminders guarantee behavior change is incorrect because it is unrealistic: it ignores both data accuracy issues and human psychology. To help students, discuss how data quality affects all downstream processing and outcomes, emphasize the principle of "garbage in, garbage out", and encourage critical thinking about the reliability of data collection.
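A tiny illustration of the "garbage in, garbage out" point, with made-up numbers: the calculation itself is correct either way, but a miscounted reading changes the conclusion the user sees.

```python
def progress_percent(steps, goal):
    """Progress toward a daily step goal, as a percentage."""
    return round(100 * steps / goal, 1)

goal = 10000
accurate = progress_percent(6500, goal)  # true activity: well behind goal
inflated = progress_percent(9800, goal)  # same day, overcounted by a faulty sensor
```

With the inflated reading, no reminder would fire even though the user is actually far behind their goal.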
A store app collects transaction records with categorical product IDs and numerical quantities, saved in a customer history. A program finds frequently repeated purchases, ranks related items, and displays suggestions during checkout. The goal is to improve shopping efficiency for returning customers. Based on the scenario above, what is the main purpose of the program in the described scenario?
To recommend products by analyzing patterns in past transaction records.
To block all purchases until the database is completely empty.
To predict tomorrow’s weather using product IDs and quantities.
To convert customer histories into unrelated random numbers for entertainment.
Explanation
This question tests AP Computer Science Principles skills: using programs to process data. Programs that process data collect, manipulate, and analyze it to achieve specific outcomes, often focusing on pattern recognition for practical applications. In the scenario, the program processes transaction data by finding frequently repeated purchases, ranking related items, and displaying suggestions to improve shopping efficiency for returning customers. The correct choice accurately identifies the program's purpose: analyzing patterns in past transaction records to recommend products, which matches the described functionality. The weather-prediction option is incorrect because it applies product IDs and quantities to a completely unrelated data domain, confusing the data's type with its processing goal. To help students, emphasize matching the program's stated goal with its processing methods, and practice identifying logical connections between data types, processing steps, and intended outcomes.
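A minimal sketch of the pattern-finding step, using a hypothetical purchase history and Python's standard-library `Counter` to rank frequently bought products:

```python
from collections import Counter

# Hypothetical customer history: one list of product IDs per transaction.
history = [["A12", "B07"], ["A12", "C03"], ["A12", "B07"]]

# Count how often each product appears across all transactions,
# then suggest the most frequently purchased ones.
counts = Counter(pid for basket in history for pid in basket)
suggestions = [pid for pid, _ in counts.most_common(2)]
```

A real recommender would also rank related items, but counting repeat purchases is the simplest version of the pattern analysis described.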
A city collects numerical speed and car-count data from cameras at intersections and stores it in time-stamped logs. A program detects congestion by comparing current speeds to typical speeds for that time of day. It then adjusts traffic light durations to improve flow during rush hour. Based on the scenario above, what is a potential limitation of the data processing method described?
It requires sentiment labels like “positive” and “negative” for each car.
It prevents any data collection because logs cannot store time stamps.
It may miss sudden events if typical-speed comparisons lag behind real-time changes.
It guarantees perfect traffic flow because averages always match every situation.
Explanation
This question tests AP Computer Science Principles skills: using programs to process data. Programs that process data collect, manipulate, and analyze it to achieve specific outcomes, but every method has limitations rooted in its design. In the scenario, the program detects congestion by comparing current speeds to typical speeds for specific times of day, so it relies on historical patterns. The correct choice identifies a real limitation: because detection depends on comparisons to typical patterns, the system may lag behind sudden, unexpected events such as accidents. The option claiming the system guarantees perfect traffic flow is incorrect; no data processing system can achieve that given the complexity and variability of real-world traffic. To help students, discuss how data processing methods involve trade-offs between accuracy, speed, and adaptability, and encourage critical thinking about scenarios that would challenge a system's assumptions.
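A small sketch of the typical-speed comparison; the speeds and the 75% threshold are illustrative assumptions:

```python
# Hypothetical typical speeds (km/h) by hour of day, from historical logs.
typical_speed = {8: 45, 9: 40, 17: 35}

def is_congested(hour, current_speed, threshold=0.75):
    """Flag congestion when the current speed falls well below the typical speed."""
    return current_speed < typical_speed[hour] * threshold

morning = is_congested(8, 30)   # 30 < 45 * 0.75, so flagged as congested
evening = is_congested(17, 34)  # 34 > 35 * 0.75, so not flagged
```

Notice the limitation the question targets: a sudden incident that has not yet dragged the measured speed below the threshold goes undetected.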
A school weather station collects numerical temperature and humidity readings each hour for a month and stores them in a spreadsheet-like table. A program cleans missing entries, calculates daily highs and lows, and looks for repeating patterns across weeks. It uses those patterns to estimate likely conditions for the next few days. Based on the scenario above, what is the main purpose of the program in the described scenario?
To create social media posts that advertise the weather station.
To predict upcoming weather conditions from temperature and humidity trends.
To translate the spreadsheet into another language for international visitors.
To recommend new clothing brands to students based on their purchases.
Explanation
This question tests AP Computer Science Principles skills: using programs to process data. Programs that process data collect, manipulate, and analyze it to achieve specific outcomes, often through pattern recognition and predictive analysis. In the scenario, the program processes weather data by cleaning missing entries, calculating statistical measures (daily highs and lows), and identifying patterns across weeks to make estimates. The correct choice accurately identifies the program's purpose: using temperature and humidity trends to predict upcoming weather conditions, which aligns with the described pattern analysis. The other options introduce goals, such as advertising, translation, or clothing recommendations, that have no basis in the weather station context; choosing them is a common error when students do not read the scenario carefully. To help students, emphasize matching the program's purpose to its described actions and outputs, and practice distinguishing the main goal from intermediate steps.
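The cleaning and summarizing steps can be sketched briefly in Python; the readings below are hypothetical, with `None` marking a missing entry:

```python
# Hypothetical hourly temperature readings for two days; None marks a missing entry.
log = {
    "2024-03-01": [12.0, None, 15.5, 18.2, 17.0],
    "2024-03-02": [11.0, 13.4, None, 16.8, 14.9],
}

# Clean missing entries, then compute each day's high and low.
daily = {}
for day, temps in log.items():
    cleaned = [t for t in temps if t is not None]
    daily[day] = {"high": max(cleaned), "low": min(cleaned)}
```

These per-day summaries are the intermediate products the program would then scan for repeating weekly patterns.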
A retail company has a list of prices for all its products. To account for a new tax, the company needs to increase every price by 5%. Which of the following describes the most direct way a program could accomplish this?
Transforming the dataset by iterating through the list and multiplying every price by 1.05 to calculate the new price.
Visualizing the prices in a pie chart to see the distribution across different product categories before the tax is applied.
Combining the prices to calculate the average product price, which is then used for inventory valuation reports.
Filtering the dataset to remove any prices that are below a certain minimum value before applying the tax.
Explanation
Transforming a dataset involves applying an operation to every element. To increase each price by 5%, a program would iterate through the list and multiply each value by 1.05. This modifies every element as required.
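The price update is a one-line transformation in Python; the prices below are hypothetical:

```python
prices = [19.99, 5.00, 120.00]  # hypothetical product prices in dollars

# Multiply every price by 1.05 to apply the 5% tax.
taxed = [round(p * 1.05, 2) for p in prices]
```

The `round(..., 2)` keeps the results at two decimal places, as currency values usually require.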
A teacher has a list of student scores from a recent exam. To understand the overall performance, the teacher wants to find the highest score achieved. Which data processing technique would a program use to extract this specific piece of information?
Combining or comparing data by iterating through the list to identify and store the single maximum value.
Transforming the data by converting each numerical score into a letter grade based on a predefined scale.
Visualizing the data by creating a histogram that shows the frequency distribution of all the scores.
Filtering the data to remove all scores that are below the class average to focus only on high-performing students.
Explanation
Finding the highest score requires comparing elements within the dataset. A program would iterate through the list, comparing each score to a variable holding the current maximum, and updating it when a higher score is found. This is a form of combining/comparing data to produce a single result.
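The iterate-and-compare process described above, written out explicitly in Python with hypothetical scores (Python's built-in `max` does the same thing internally):

```python
scores = [78, 92, 85, 67, 99, 88]  # hypothetical exam scores

# Track the running maximum while iterating through the list.
highest = scores[0]
for score in scores[1:]:
    if score > highest:
        highest = score
```

Each element is compared against the current maximum, and the variable is updated whenever a larger value is found.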