Box plots

A complete ACT guide to Box plots — covering key concepts, exam-focused explanations, and high-yield FAQs.

Overview

Box plots (also called box-and-whisker plots) are visual representations of a data distribution that display five key statistical measures at once. On the ACT Math test, box plot questions assess a student's ability to interpret statistical data, understand measures of center and spread, and extract meaningful information from graphical representations. These questions typically appear 1-2 times per test and are considered medium difficulty, making them high-yield targets for score improvement.

Understanding box plots is essential for the ACT because they efficiently communicate complex statistical information in a single diagram. The ACT frequently uses box plots to test whether students can identify quartiles, medians, ranges, and outliers—all fundamental concepts in data analysis. Box plots also serve as a bridge between raw numerical data and statistical interpretation, requiring students to think both visually and analytically.

Box plots connect to broader mathematical concepts including data analysis, percentiles, measures of spread (range and interquartile range), and measures of center (median). They often appear alongside other statistical representations like histograms, dot plots, and frequency tables, requiring students to translate between different data visualization methods. Mastering box plots strengthens overall statistical literacy and prepares students for questions involving data comparison, distribution analysis, and real-world problem-solving scenarios that are increasingly common on standardized tests.

Learning Objectives

  • Identify when box plots are being tested
  • Explain the core rule or strategy behind box plots
  • Apply box plots to ACT-style questions accurately
  • Construct a box plot from a given data set and identify all five key values
  • Compare two or more box plots to determine which data set has greater spread or higher median
  • Calculate the interquartile range (IQR) from a box plot and use it to identify potential outliers
  • Interpret the shape of a distribution (symmetric, skewed left, or skewed right) from a box plot

Prerequisites

  • Basic statistics terminology: Understanding terms like mean, median, mode, and range is essential because box plots specifically display median and range values
  • Ordering numbers: The ability to arrange data in ascending order is necessary for determining quartiles and constructing box plots
  • Number line interpretation: Box plots are drawn on number lines, so comfort with reading and plotting values on a number line is fundamental
  • Percentages and fractions: Understanding that quartiles divide data into fourths (25% segments) helps interpret what each section of a box plot represents

Why This Topic Matters

Box plots appear in real-world applications across numerous fields including business analytics, medical research, quality control, and social sciences. Companies use box plots to compare sales performance across regions, researchers use them to display experimental results, and educators use them to analyze test score distributions. The visual efficiency of box plots makes them invaluable for presenting complex data to diverse audiences who need to quickly grasp central tendencies and variability.

On the ACT Math test, box plot questions appear with moderate frequency—typically 1-2 questions per exam—but they represent a high-yield study opportunity because they follow predictable patterns. According to ACT test analysis, approximately 12-15% of Statistics and Probability questions involve graphical data interpretation, with box plots being one of the most common graph types. These questions often appear in the middle-to-later portion of the test (questions 30-50) and are worth the same point value as any other question, making them excellent targets for strategic preparation.

Common ACT question formats include: identifying specific values from a box plot (median, quartiles, range), comparing two data sets represented by box plots, determining which statements about a data set are true based on its box plot, and occasionally constructing a box plot from given data. Questions may also combine box plots with other statistical concepts like mean, standard deviation, or probability, testing multiple skills simultaneously.

Core Concepts

The Five-Number Summary

A box plot is constructed using the five-number summary of a data set, which consists of:

  1. Minimum: The smallest value in the data set (excluding outliers)
  2. First Quartile (Q1): The median of the lower half of the data; 25% of data falls below this value
  3. Median (Q2): The middle value that divides the data set in half; 50% of data falls below this value
  4. Third Quartile (Q3): The median of the upper half of the data; 75% of data falls below this value
  5. Maximum: The largest value in the data set (excluding outliers)

These five values completely define the box plot's structure. The "box" portion extends from Q1 to Q3, with a line marking the median inside. The "whiskers" extend from the box to the minimum and maximum values.

Anatomy of a Box Plot

The visual structure of a box plot contains specific components that each convey distinct information:

| Component | Location | Information Conveyed |
| --- | --- | --- |
| Left whisker | Line from minimum to Q1 | Range of lowest 25% of data |
| Left edge of box | Vertical line at Q1 | 25th percentile boundary |
| Line inside box | Vertical line at median | 50th percentile; center of data |
| Right edge of box | Vertical line at Q3 | 75th percentile boundary |
| Right whisker | Line from Q3 to maximum | Range of highest 25% of data |
| Box width | Distance from Q1 to Q3 | Interquartile range (IQR); middle 50% spread |

The interquartile range (IQR) is calculated as Q3 - Q1 and represents the spread of the middle 50% of the data. This measure is resistant to outliers, making it more reliable than range for describing typical data spread.

Reading Values from a Box Plot

To extract information from a box plot on the ACT:

  1. Identify the scale: Check the number line beneath the box plot to understand the units and intervals
  2. Locate the five key points: Find where the whisker ends, box edges, and median line align with the scale
  3. Read values precisely: Align each feature vertically with the number line below
  4. Calculate derived values: Use the five-number summary to compute range (max - min) and IQR (Q3 - Q1)

Constructing a Box Plot

When given a data set, construct a box plot by following these steps:

  1. Order the data: Arrange all values from smallest to largest
  2. Find the median (Q2): Identify the middle value (or the average of the two middle values if there is an even number of data points)
  3. Find Q1: Determine the median of all values below Q2
  4. Find Q3: Determine the median of all values above Q2
  5. Identify minimum and maximum: Note the smallest and largest values
  6. Draw the plot: Create a number line with appropriate scale, then plot the five values and connect them according to box plot structure

Example: For the data set {3, 5, 7, 8, 9, 11, 15, 16, 20}

  • Minimum = 3
  • Q1 = 6 (median of {3, 5, 7, 8})
  • Median = 9
  • Q3 = 15.5 (median of {11, 15, 16, 20})
  • Maximum = 20
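The construction steps above can be sketched in a few lines of Python. This is a minimal illustration of the median-of-halves method described in this guide; the function name `five_number_summary` is hypothetical, and the sketch assumes at least two data values. Note that statistical software often uses slightly different quartile conventions, so its output may not match hand calculations exactly.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)                # step 1: order the data
    n = len(data)
    half = n // 2
    lower = data[:half]                  # values below the median position
    upper = data[half + (n % 2):]        # skips the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary([3, 5, 7, 8, 9, 11, 15, 16, 20])
print(summary)  # (3, 6.0, 9, 15.5, 20)
```

The printed values match the example: Q1 = 6, median = 9, and Q3 = 15.5.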

Comparing Box Plots

The ACT frequently asks students to compare two or more box plots displayed on the same scale. Key comparison points include:

  • Center: Which data set has a higher median?
  • Spread: Which has a larger IQR or range?
  • Symmetry: Which distribution is more symmetric?
  • Overlap: Do the boxes overlap, indicating similar middle 50% ranges?

When comparing spread, remember that a wider box or longer whiskers indicate greater variability in the data.

Distribution Shape from Box Plots

Box plots reveal the shape of a distribution:

  • Symmetric: Median is centered in the box, whiskers are approximately equal length
  • Skewed right (positively skewed): Right whisker is longer than left, median is closer to Q1
  • Skewed left (negatively skewed): Left whisker is longer than right, median is closer to Q3

Understanding skewness helps predict where the mean falls relative to the median: in right-skewed distributions, the mean typically exceeds the median; in left-skewed distributions, the mean is typically less than the median.
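As a rough sketch, the shape rules above can be turned into a small check. The function name `describe_shape` and the 25% `tolerance` for "approximately equal" are assumptions made for illustration, not an ACT rule; the sketch also simplifies by comparing the total distance on each side of the median (whisker plus box half) rather than testing the whiskers and box halves separately.

```python
def describe_shape(minimum, q1, med, q3, maximum, tolerance=0.25):
    """Classify a box plot's shape from its five-number summary (crude heuristic)."""
    left_side = med - minimum     # left whisker + left half of the box
    right_side = maximum - med    # right half of the box + right whisker
    longer = max(left_side, right_side)
    # Treat the sides as "approximately equal" if they differ by at most
    # `tolerance` (an assumed cutoff) times the longer side.
    if longer == 0 or abs(right_side - left_side) <= tolerance * longer:
        return "approximately symmetric"
    return "skewed right" if right_side > left_side else "skewed left"

# Hypothetical summaries chosen to illustrate each case
print(describe_shape(10, 12, 14, 20, 40))  # skewed right
print(describe_shape(10, 30, 36, 38, 40))  # skewed left
print(describe_shape(10, 20, 25, 30, 40))  # approximately symmetric
```

On the test itself, a visual comparison of whisker lengths and the median's position is enough; no numeric cutoff is needed.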

Outliers and Modified Box Plots

An outlier is a data point that falls far from the rest of the data. The standard criterion uses the IQR:

  • Lower outlier boundary: Q1 - 1.5(IQR)
  • Upper outlier boundary: Q3 + 1.5(IQR)

Values beyond these boundaries are considered outliers. In modified box plots, outliers are shown as individual points, and whiskers extend only to the most extreme non-outlier values. While less common on the ACT, recognizing this variation is important.
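The 1.5 × IQR rule can be sketched as follows. The function names are hypothetical, and the quartiles are taken from the construction example earlier in this guide (Q1 = 6, Q3 = 15.5); the value 40 in the second call is an invented data point added only to show an outlier being flagged.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) outlier boundaries from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

def find_outliers(values, q1, q3):
    """Return the values that fall outside the outlier boundaries."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]

data = [3, 5, 7, 8, 9, 11, 15, 16, 20]
print(outlier_fences(6, 15.5))              # (-8.25, 29.75)
print(find_outliers(data, 6, 15.5))         # []
print(find_outliers(data + [40], 6, 15.5))  # [40]
```

The last call reuses the original quartiles for simplicity; adding a value to a real data set would also shift the quartiles, so they should be recomputed before the rule is applied.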

Concept Relationships

The five-number summary serves as the foundation for all box plot interpretation. From this summary, the median provides the measure of center, while the IQR (derived from Q1 and Q3) provides the measure of spread. These two concepts work together to describe the middle 50% of the data—the most stable and representative portion of any distribution.

The relationship flows as follows: Raw data → Ordered data → Five-number summary → Box plot construction → Visual interpretation. Each step builds on the previous, transforming numerical information into visual form and then back into statistical insights.

Box plots connect to percentiles because Q1, median, and Q3 represent the 25th, 50th, and 75th percentiles respectively. This connection allows students to answer questions about what percentage of data falls above or below certain values. For example, knowing that Q3 represents the 75th percentile means that 25% of data values exceed Q3.

The concept of range (maximum - minimum) relates to box plots as the total span of the whiskers and box combined, while the IQR represents only the box width. Understanding that IQR is a more robust measure of spread than range (because it excludes extreme values) helps students evaluate data variability more accurately.

Box plots also connect to data comparison skills. When two box plots appear on the same scale, students can directly compare centers (medians), spreads (IQRs), and overall distributions. This visual comparison is more efficient than comparing two lists of numbers and represents a higher-level statistical thinking skill that the ACT values.

High-Yield Facts

  • The median is always represented by the vertical line inside the box, not the center of the box itself
  • The box width represents the IQR, which contains the middle 50% of all data values
  • Q1 marks the 25th percentile, meaning 25% of data falls below this value and 75% falls above it
  • The range equals maximum minus minimum and represents the total spread of the data
  • To find IQR, calculate Q3 - Q1; this value is resistant to outliers
  • The whiskers extend from the box edges to the minimum and maximum values (or to the most extreme non-outlier values in modified box plots)
  • If the median line is closer to Q1, the distribution is skewed right; if closer to Q3, it's skewed left
  • Box plots do not show the mean, mode, or individual data values—only the five-number summary
  • When comparing two box plots, the one with the larger box width has greater variability in its middle 50% of data
  • A symmetric box plot has the median centered in the box and whiskers of approximately equal length
  • You cannot determine the exact number of data points from a box plot alone
  • The minimum and maximum shown on a box plot may not be the true extreme values if outliers exist and are plotted separately

Common Misconceptions

Misconception: The median is always at the center of the box.

Correction: The median is marked by a line inside the box, but its position varies based on the distribution. In skewed distributions, the median line appears closer to one edge of the box.

Misconception: The box plot shows the mean of the data set.

Correction: Box plots display the median (Q2), not the mean. These are different measures of center, and the mean cannot be determined from a box plot alone.

Misconception: Longer whiskers always mean the data set has greater overall spread.

Correction: While longer whiskers contribute to greater range, the IQR (box width) is often more important for understanding typical spread. A data set can have long whiskers but a narrow box, indicating extreme values but consistent middle data.

Misconception: A longer section of the box plot (a longer whisker or a wider half of the box) contains more data points than a shorter one.

Correction: Each section represents 25% of the data values, but the physical length of each section shows the spread of that 25%, not the count. A longer section means those values are more spread out, not more numerous.
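A small sketch makes this concrete. The data set below is hypothetical and was chosen so that all four sections hold the same number of values while their lengths differ widely; the function name `section_stats` is also an assumption, and the sketch handles only data sets with an even number of values.

```python
from statistics import median

def section_stats(values):
    """Return (length, count) for each of the four box plot sections.

    Assumes an even number of values, so the halves split cleanly.
    """
    data = sorted(values)
    half = len(data) // 2
    q1, med, q3 = median(data[:half]), median(data), median(data[half:])
    cuts = [data[0], q1, med, q3, data[-1]]
    stats = []
    for low, high in zip(cuts, cuts[1:]):
        count = sum(low <= v <= high for v in data)  # values inside this section
        stats.append((high - low, count))
    return stats

for length, count in section_stats([1, 2, 3, 4, 5, 10, 20, 40]):
    print(f"length {length}: {count} values")
```

Every section holds two of the eight values, yet the right whisker is far longer than the left one because its two values (20 and 40) are much more spread out.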

Misconception: You can determine the exact data values from a box plot.

Correction: Box plots show only the five-number summary. You cannot identify individual data points or determine how many data points exist unless additional information is provided.

Misconception: A wider box always means a larger data set.

Correction: Box width (IQR) indicates the spread of the middle 50% of data, not the number of data points. A small data set can have a wide box if its values are spread out, and a large data set can have a narrow box if its values are clustered together.

Misconception: The range and IQR are the same thing.

Correction: Range = maximum - minimum (total spread), while IQR = Q3 - Q1 (middle 50% spread). The IQR is always less than or equal to the range and is more resistant to extreme values.

Worked Examples

Example 1: Reading and Interpreting a Box Plot

Problem: The box plot below represents the test scores of 40 students in a mathematics class. The five-number summary shows: minimum = 62, Q1 = 74, median = 81, Q3 = 88, maximum = 98.

Questions:

a) What is the range of test scores?

b) What is the IQR?

c) What percentage of students scored above 88?

d) Is the distribution symmetric, skewed left, or skewed right?

Solution:

a) Range = maximum - minimum = 98 - 62 = 36 points

The range represents the total spread from the lowest to highest score.

b) IQR = Q3 - Q1 = 88 - 74 = 14 points

The IQR shows that the middle 50% of students' scores span 14 points, indicating moderate variability in the central data.

c) Q3 represents the 75th percentile, meaning 75% of students scored at or below 88. Therefore, 25% of students scored above 88.

This equals 0.25 × 40 = 10 students, though the question asks for percentage.

d) To determine distribution shape, examine the median's position within the box and compare whisker lengths:

  • Distance from Q1 to median: 81 - 74 = 7
  • Distance from median to Q3: 88 - 81 = 7
  • Left whisker length: 74 - 62 = 12
  • Right whisker length: 98 - 88 = 10

The median is centered in the box (equal distances of 7 on each side), and the whiskers are approximately equal length. The distribution is approximately symmetric.
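The arithmetic in parts (a) through (d) can be double-checked with a short sketch that uses only the values stated in the problem; the variable names are chosen here for illustration.

```python
# Five-number summary and class size as stated in Example 1
minimum, q1, med, q3, maximum = 62, 74, 81, 88, 98
n_students = 40

score_range = maximum - minimum          # part (a)
iqr = q3 - q1                            # part (b)
students_above_q3 = 0.25 * n_students    # part (c): about a quarter of the class

# part (d): the four section lengths, from left whisker to right whisker
sections = [q1 - minimum, med - q1, q3 - med, maximum - q3]

print(score_range, iqr, students_above_q3, sections)
# 36 14 10.0 [12, 7, 7, 10]
```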

Example 2: Comparing Two Box Plots

Problem: Two box plots are shown on the same scale, representing daily high temperatures (in °F) for City A and City B during June.

City A: min = 68, Q1 = 74, median = 78, Q3 = 82, max = 90

City B: min = 70, Q1 = 76, median = 82, Q3 = 86, max = 88

Which statements are true?

I. City B has a higher median temperature than City A

II. City A has greater temperature variability than City B

III. At least 25% of days in City B had temperatures above 86°F

Solution:

Statement I: Compare medians directly.

  • City A median = 78°F
  • City B median = 82°F

City B's median is 4 degrees higher. Statement I is TRUE.

Statement II: Compare measures of spread.

  • City A range = 90 - 68 = 22°F
  • City B range = 88 - 70 = 18°F
  • City A IQR = 82 - 74 = 8°F
  • City B IQR = 86 - 76 = 10°F

City A has a larger range (22 > 18), but City B has a larger IQR (10 > 8), so the two measures of spread disagree. The IQR describes typical variability in the middle 50% of the data, while the range describes total variability. Reading "temperature variability" as total spread, City A's larger range makes Statement II TRUE. This is debatable, since the conclusion depends on which measure of spread is prioritized; when a question leaves the measure unspecified, check both before choosing an answer.

Statement III: Q3 = 86°F represents the 75th percentile for City B. Under the standard reading that each section of a box plot holds about 25% of the data, roughly 75% of days had temperatures at or below 86°F and roughly 25% had temperatures above 86°F. The statement says "at least 25%," which includes exactly 25%. Statement III is TRUE.

Answer: All three statements (I, II, and III) are true.

This example demonstrates the importance of carefully reading what each quartile represents and understanding that "at least" includes the boundary value.
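The comparisons behind Statements I and II can be reproduced with a brief sketch. The dictionary layout and the helper name `spread` are illustrative choices; the numbers are the ones given in the problem.

```python
city_a = {"min": 68, "q1": 74, "median": 78, "q3": 82, "max": 90}
city_b = {"min": 70, "q1": 76, "median": 82, "q3": 86, "max": 88}

def spread(summary):
    """Return both measures of spread for a five-number summary."""
    return {
        "range": summary["max"] - summary["min"],
        "iqr": summary["q3"] - summary["q1"],
    }

print(city_b["median"] - city_a["median"])  # 4
print(spread(city_a))  # {'range': 22, 'iqr': 8}
print(spread(city_b))  # {'range': 18, 'iqr': 10}
```

Computing both measures side by side shows that they point in opposite directions here, which is why the measure of spread being used matters.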

Exam Strategy

When approaching ACT box plot questions, follow this systematic process:

  1. Identify what's being asked first: Before analyzing the box plot, read the question carefully to know whether you need to find a specific value, compare data sets, or determine a percentage.
  2. Locate the five key values: Quickly identify minimum, Q1, median, Q3, and maximum by aligning the box plot features with the number line scale.
  3. Watch for trigger words:
       • "Median" → look for the line inside the box
       • "Range" → calculate maximum - minimum
       • "Interquartile range" or "IQR" → calculate Q3 - Q1
       • "Middle 50%" → refers to the box (from Q1 to Q3)
       • "At least 75%" → refers to values at or below Q3
       • "Greater variability" or "more spread" → compare IQR or range
  4. Use process of elimination: For multiple-choice questions with several statements about a box plot, evaluate each statement independently. Eliminate obviously false statements first, then carefully verify remaining options.
  5. Draw quick sketches if needed: If asked to identify which box plot matches given data, quickly sketch the five-number summary on your test booklet to compare with answer choices.
  6. Check percentile relationships: Remember that Q1 = 25th percentile, median = 50th percentile, Q3 = 75th percentile. Use these to answer questions about what percentage of data falls above or below certain values.
  7. Time allocation: Box plot questions typically require 30-60 seconds. If a question asks you to construct a box plot from raw data, budget up to 90 seconds for ordering data and finding quartiles.

Exam Tip: If you see two box plots on the same scale, the question almost certainly asks you to compare them. Immediately calculate and compare their medians and IQRs before reading the answer choices.

Exam Tip: When a question asks "which statement must be true," be cautious about statements involving the mean or mode, since these cannot be determined from a box plot alone.

Memory Techniques

Mnemonic for the Five-Number Summary: "My Quarterback Might Quit Monday"

  • Minimum
  • Q1 (First Quartile)
  • Median
  • Q3 (Third Quartile)
  • Maximum

Visualization Strategy: Picture a box plot as a "data sandwich":

  • The bread slices are the minimum and maximum (outer boundaries)
  • The box is the "filling" containing the middle 50% of data
  • The median line is the "center ingredient" that divides everything in half

Quartile Percentile Connection: Remember "Quarter = 25" to recall that:

  • Q1 = 25th percentile (1 quarter)
  • Q2 = 50th percentile (2 quarters = half)
  • Q3 = 75th percentile (3 quarters)

IQR Memory Aid: Think of the IQR as the "Inner Quartile Range" (the actual term is interquartile range) to remember that it is the range of the inner box (Q3 - Q1), not the total range.

Skewness Memory Aid: Think of the longer whisker as a "tail" pointing in the direction of the skew:

  • Long right whisker = right skew (tail points right)
  • Long left whisker = left skew (tail points left)

Summary

Box plots are essential graphical tools for displaying the five-number summary of a data set: minimum, Q1, median, Q3, and maximum. On the ACT Math test, students must be able to read values from box plots, calculate derived measures like range and IQR, compare multiple data sets using box plots, and interpret distribution characteristics. The box itself represents the middle 50% of data (from Q1 to Q3), with its width indicating the IQR—a robust measure of spread. The median line inside the box marks the 50th percentile, while Q1 and Q3 represent the 25th and 75th percentiles respectively. Whiskers extend to the minimum and maximum values, showing the full data range. To succeed on ACT box plot questions, students must recognize that these graphs display only the five-number summary and cannot reveal means, modes, or individual data values. Comparing box plots requires analyzing both center (median) and spread (IQR and range), while distribution shape can be determined by examining the median's position within the box and the relative lengths of the whiskers.

Key Takeaways

  • Box plots display the five-number summary: minimum, Q1, median, Q3, and maximum
  • The box width equals the IQR (Q3 - Q1) and contains the middle 50% of all data values
  • The median line inside the box represents the 50th percentile, not necessarily the center of the box
  • Q1 and Q3 represent the 25th and 75th percentiles, allowing percentage calculations
  • Range (maximum - minimum) measures total spread, while IQR measures the spread of the middle 50%
  • Box plots cannot show the mean, mode, or individual data values
  • When comparing box plots, examine both center (median) and spread (IQR and range) to make complete comparisons

Histograms and Frequency Distributions: Understanding how data can be grouped into intervals and displayed as bars helps connect box plots to other visualization methods. Both show distribution shape, but histograms reveal more detail about frequency within ranges.

Measures of Central Tendency: Deepening knowledge of mean, median, and mode relationships helps interpret when box plots (which show median) are more appropriate than other statistical summaries.

Standard Deviation and Variance: These measures of spread complement the IQR by describing variability using all data points rather than just quartiles, providing a more complete picture of data dispersion.

Percentiles and Quartiles: Advanced understanding of how data is divided into percentage-based segments enables more sophisticated interpretation of box plots and other statistical displays.

Data Analysis and Probability: Box plots serve as a foundation for more complex statistical reasoning, including hypothesis testing and probability calculations based on data distributions.


Frequently Asked Questions