Data sets accessible through the Data Analysis App or the Spreadsheet App.

## Univariate > Categorical

1. **2000 State Motorcycle Statistics**

**Description**: Number of motorcycle registrations, helmet requirements, and numbers of fatalities in 2000 by state.

**Source**: U.S. Federal Highway Administration

**Uses**: Compare the fatality rates of states with helmet requirements to those of states without requirements; and construct a bar graph comparing the states.

2. **Chicago White Sox**

**Description**: 1919 season batting averages and 1919 World Series batting averages for Chicago White Sox players who had 10 or more at bats in the World Series, and whether each player was accused of throwing the Series.

**Source**: www.baseball-reference.com/postseason/1919_WS.shtml

**Uses**: Compute each player's change in batting average and compare it to whether or not the player was accused of throwing the Series; and construct a bar graph comparing the change in average with whether or not the player was accused.

## Univariate > Quantitative

1. **2000 State Motorcycle Statistics**

**Description**: Number of motorcycle registrations, helmet requirements, and number of fatalities in 2000.

**Source**: U.S. Federal Highway Administration

**Uses**: Find a fatality rate; construct a box plot or histogram; identify outliers; estimate centers and spread; and identify skewed distributions.
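One common way to identify outliers in a data set like this is the 1.5 × IQR rule used for box plots. The sketch below is only an illustration: the function name `iqr_outliers`, the choice of the "inclusive" quartile method, and the rate values are assumptions made for the example, not the actual motorcycle data or the method the apps use.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    # Quartiles computed with the "inclusive" method; other conventions
    # can give slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Made-up fatality rates (fatalities per 10,000 registrations), NOT real data.
rates = [4.1, 5.0, 5.6, 6.2, 6.8, 7.3, 7.9, 8.4, 19.5]
print(iqr_outliers(rates))
```

Different software packages compute quartiles in slightly different ways, so a borderline value may be flagged by one tool and not by another.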

2. **Achievement Test Scores**

**Description**: Achievement test scores for all ninth graders in one high school.

**Uses**: Construct a histogram or box plot; identify a mound shaped distribution; estimate centers and spread; and investigate the percent of data within 1, 2, or 3 standard deviations from the mean.
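The "percent of data within 1, 2, or 3 standard deviations" investigation can be sketched in a few lines. This is a minimal illustration under stated assumptions: `percent_within` is a hypothetical helper, the sample standard deviation is used, and the scores are invented rather than taken from the data set.

```python
import statistics

def percent_within(values, k):
    """Percent of values within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    inside = sum(1 for v in values if abs(v - mean) <= k * sd)
    return 100 * inside / len(values)

# Invented test scores for illustration only, NOT the actual data set.
scores = [52, 58, 61, 63, 65, 66, 68, 68, 70, 71, 72, 74, 75, 77, 79, 82, 85, 91]
for k in (1, 2, 3):
    print(k, round(percent_within(scores, k), 1))
```

For a mound-shaped distribution, the three percentages are often compared against the rough benchmarks of 68%, 95%, and 99.7%.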

3. **Apartment Temperatures**

**Description**: Variation in an apartment temperature (in degrees F) with its thermostat set to 70 degrees Fahrenheit each day at noon.

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; identify skewed distributions; and compare sample mean to the hypothesized mean of 70.
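Comparing a sample mean to a hypothesized mean of 70 is typically done with a one-sample t statistic when the population standard deviation is unknown. The sketch below assumes that approach; the helper name `one_sample_t` and the temperature readings are invented for the example.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """t statistic for testing whether the mean of values differs from mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error

# Invented noon temperatures (degrees F), NOT the actual data set.
temps = [69.5, 70.2, 71.0, 68.8, 70.6, 71.3, 69.9, 70.4]
print(round(one_sample_t(temps, 70), 3))
```

The resulting statistic would be compared with a t distribution with n - 1 degrees of freedom.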

4. **Battery Life**

**Description**: Life in hours of two brands of batteries.

**Source**: *Navigating Through Data Analysis in Grades 6–8* (NCTM, 2003)

**Uses**: Construct comparative box plots and histograms, and use plots to determine if battery life is different between the two brands.

5. **Best Actress**

**Description**: Birthdays and ages of actresses whose performances won in the Best Leading Actress category at the annual Academy Awards (Oscars) and the year the award was given.

**Sources**: www.oscars.com and www.imdb.com

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of removing 80 and 74); estimate centers and spread; identify skewed distributions; and look for trends over time (e.g., Are there predictable patterns in the age of the winners?).

6. **Certificate Perimeters**

**Description**: Perimeter measurements (mm) of the border of a certificate measured by 131 students of a Wisconsin high school.

**Uses**: Construct a box plot or histogram; identify outliers; and estimate centers and spread.

7. **Concord & Portland Monthly Precipitation**

**Description**: Normal monthly precipitation (rain and snow) in inches for Concord, New Hampshire and for Portland, Oregon.

**Source**: National Climate Data Center, 2005

**Uses**: Construct comparative box plots; investigate measures of variability (IQR and standard deviation); estimate centers and spread; and investigate whether there is a statistical difference between the mean monthly precipitation for the two cities.

8. **Dissolution Times**

**Description**: Variation in times (in seconds) for a solute to dissolve.

**Source**: This is student-collected data from a chemistry experiment.

**Uses**: Construct a histogram or box plot; identify a mound shaped distribution; estimate centers and spread; and use the histogram to investigate the percent of data within 1, 2, or 3 standard deviations from the mean.

9. **Fastest Growing Franchises**

**Description**: Rank, franchise name, type of service, minimum and maximum start-up costs (in $1,000s) for the 100 fastest growing franchises in the U.S. The ranking is based on the number of new franchise units added from 2005 to 2007.

**Source**: www.entrepreneur.com/franzone/rank/o,6584,12-12-F5-2006-7-0.html

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; identify skewed distributions; and investigate which types of franchises have the largest difference between minimum and maximum start-up costs.

10. **Gas Mileage**

**Description**: Variation in gas mileage of a car over a 25-week span.

**Uses**: Construct a histogram or box plot; identify a mound shaped distribution; estimate centers and spread; use the histogram to investigate percent of data within 1, 2, or 3 standard deviations from the mean; and examine the trend in gas mileage over time.

11. **Heights of Students and Basketball Players**

**Description**: Heights (cm) for a group of middle school students and heights of 25 professional basketball players.

**Source**: *Navigating Through Data Analysis in Grades 6–8* (NCTM, 2003)

**Uses**: Construct and analyze comparative box plots; construct histograms; identify mound shaped distributions; and estimate centers and spread.

12. **Heights of Young Adults**

**Description**: Heights (in inches) of 1,000 males and 1,000 females. The heights have been rounded to the nearest inch.

**Uses**: Construct and analyze comparative box plots; identify outliers; construct histograms; identify mound shaped distributions; and estimate centers and spread.

13. **January Sunshine**

**Description**: Average percent of sunshine for the month of January (up to 2002) at 174 major weather-observing stations in all 50 states, Puerto Rico, and the Pacific Islands. The percent of sunshine is the percentage of time that sunshine reaches the surface of the Earth. The two stations with the highest percentages are Tucson and Yuma, Arizona. The station with the lowest percentage is Quillayute, Washington.

**Source**: www.ncdc.noaa.gov/oa/climate/online/ccd/avgsun.html

**Uses**: Construct a histogram or box plot; identify outliers; identify a mound shaped distribution; estimate centers and spread; and use the histogram to investigate the percent of data within 1, 2, or 3 standard deviations from the mean.

14. **Land Use**

**Description**: Number of acres (in 1,000s) of urban areas in 1960 and 2002 and number of acres of forest in 1959 and 2002 for each of the 48 continental U.S. states and the District of Columbia (excludes Alaska and Hawaii).

**Source**: www.ers.usda.gov/Data/MajorLandUses/

**Uses**: Construct comparative box plots (e.g., compare acres of urban area in 1960 and 2002, compare acres of forest in 1959 and 2002); create and examine histograms of skewed and mound shaped distributions; examine the effect of outliers on measures of center and spread; and investigate whether there is a statistical difference between the mean acreages for the two years.

15. **Los Angeles Rainfall**

**Description**: Rainfall (in inches) in Los Angeles for the 129 years from 1878 through 2006.

**Source**: National Weather Service

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of the 1884 rainfall amount on measures of center and spread); estimate centers and spread; and identify skewed distributions.

16. **Manufactured Nails**

**Description**: Nail length (in inches) for 10 nails made by a machine that is set to have a mean length of 2 inches and a standard deviation of 0.03 inches.

**Uses**: Compare the sample mean to the hypothesized mean of 2 inches.
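Because the machine's standard deviation is stated (0.03 inches), one reasonable way to compare the sample mean to the hypothesized mean of 2 is a z statistic. The sketch below assumes that approach and a two-sided test; the helper name `z_test` and the ten nail lengths are invented for illustration.

```python
import math
from statistics import NormalDist, mean

def z_test(values, mu0, sigma):
    """Return (z, two-sided p-value) for a known population standard deviation."""
    z = (mean(values) - mu0) / (sigma / math.sqrt(len(values)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Invented nail lengths (inches), NOT the actual data set.
nails = [2.01, 1.98, 2.03, 2.00, 1.97, 2.02, 2.04, 1.99, 2.01, 2.00]
z, p = z_test(nails, mu0=2.0, sigma=0.03)
print(round(z, 3), round(p, 3))
```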

17. **Mean Hourly Earnings**

**Description**: Mean hourly earnings (in dollars) for 70 different occupations in the United States. Earnings are for all full-time, nonmilitary workers and do not include benefits, overtime, vacation pay, nonproduction bonuses, or tips.

**Source**: U.S. Department of Labor, National Compensation Survey: Occupational Earnings in the United States, Table 3.

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of CEO and physician earnings on measures of center and spread); estimate centers and spread; and identify skewed distributions.

18. **Meaningful Words**

**Description**: Two lists of 20 three-letter “words.” One list contained meaningful words (e.g., CAT, DOG), whereas the other list contained nonsense words (e.g., ATC, ODG). A ninth-grade class of thirty students was randomly divided into two groups of fifteen students. One group was asked to memorize the list of meaningful words; the other group was asked to memorize the list of nonsense words. The number of words correctly recalled by each student was tabulated.

**Source**: *Focus in High School Mathematics: Reasoning and Sense Making* (NCTM, 2009)

**Uses**: Construct comparative box plots; and introduce the randomization test.
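A randomization test for two groups can be sketched as follows: pool the scores, repeatedly reshuffle them into two groups of the original sizes, and count how often the reshuffled difference in means is at least as large as the observed one. This is a minimal one-sided sketch, not the procedure from the source text; `randomization_test`, the trial count, the seed, and the recall scores are all assumptions made for illustration.

```python
import random

def randomization_test(group_a, group_b, trials=10_000, seed=0):
    """Return (observed difference in means, one-sided approximate p-value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # random reassignment of scores to the two groups
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / trials

# Invented recall counts for two groups of 15, NOT the actual data set.
meaningful = [12, 15, 12, 11, 10, 6, 8, 11, 9, 14, 9, 10, 13, 7, 13]
nonsense = [4, 6, 6, 5, 7, 5, 4, 8, 9, 10, 4, 8, 7, 3, 6]
diff, p_value = randomization_test(meaningful, nonsense)
print(round(diff, 2), p_value)
```

A small p-value suggests that a difference as large as the observed one would rarely arise from the random assignment alone.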

19. **Migraines**

**Description**: Time passed to get relief from a migraine headache for two different medications.

**Source**: *Navigating Through Data Analysis in Grades 6–8* (NCTM, 2003)

**Uses**: Construct comparative box plots and histograms; refer to the comparative plots and discuss whether there is a difference between the two medications; and introduce the randomization test.

20. **Min and Max Temperatures**

**Description**: Maximum and minimum temperatures (in degrees F) on record at 289 major U.S. weather-observing stations in all 50 states, Puerto Rico, and the Pacific Islands.

**Source**: www.ncdc.noaa.gov/oa/climate/online/ccd/

**Uses**: Construct a histogram or box plot; identify outliers; and determine the minimum, the maximum, and the difference between the minimum and maximum temperatures.

21. **Nickel Weights**

**Description**: Weight, to the nearest hundredth of a gram, of a sample of 100 new nickels.

**Uses**: Construct a histogram or box plot; identify a mound shaped distribution; and estimate centers and spread.

22. **Non-Normal Distribution**

**Description**: Random set of numbers generated from a non-normal distribution.

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; and identify skewed distributions.

23. **Number of Marriages**

**Description**: Marriage rate per 1,000 people for 50 U.S. states in 2004. The District of Columbia had a rate of 4.5.

**Source**: Division of Vital Statistics, National Center for Health Statistics, www.cdc.gov/nchs/data/nvss/marriage90_04.pdf

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of Nevada); estimate centers and spread; and identify skewed distributions.

24. **Number of Video Games**

**Description**: Number of video games available on 43 selected platforms.

**Source**: www.mobygames.com/moby_stats

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of Windows and DOS); estimate centers and spread; and identify skewed distributions.

25. **Old Faithful**

**Description**: Each column is a set of consecutive eruption wait times (in minutes) for the Old Faithful geyser in Yellowstone National Park, collected in 1985.

**Source**: *Focus in High School Mathematics: Reasoning and Sense Making in Statistics and Probability* (NCTM, 2009)

**Uses**: Construct histograms, box plots, and a time series plot (observation number on the *x*-axis and wait time on the *y*-axis).

26. **PSU Women Heights**

**Description**: Heights (in inches) of 123 women in a statistics class at Penn State University in the 1970s.

**Source**: Joiner, Brian L. “Living Histograms.” *International Statistical Review* 3 (1975): 339–340.

**Uses**: Construct a histogram or box plot; identify a mound shaped distribution; estimate centers and spread; and use the histogram to investigate percent of data within 1, 2, or 3 standard deviations from the mean.

27. **Random Rectangles**

**Description**: Area of 100 rectangles placed randomly on a sheet of paper.

**Source**: *Navigating Through Data Analysis in Grades 9–12* (NCTM, 2003)

**Uses**: Construct histograms; estimate center and spread; and conduct trials with the Distribution of Sample Custom App.

28. **Ratings of Movie Showings**

**Description**: How a student at the University of Alabama, Huntsville, rated the projection quality of nearby movie theaters. For each showing, a point was deducted for such things as misalignment, misframing, or an audio problem. He visited one theater in Huntsville 92 times in the first five-and-a-half years it was open. The data give the number of points deducted per showing.

**Source**: home.hiwaay.net/~criswell/theatre/generated_subpages/ratings_table/ratings_table.html

**Uses**: Construct a box plot or histogram; identify outliers (e.g., examine the effect of the two 12s); estimate centers and spread; and identify skewed distributions.

29. **Roller Coasters**

**Description**: Greatest drop in feet of 55 major roller coasters in the U.S.

**Source**: *Navigating Through Data Analysis in Grades 6–8* (NCTM, 2003)

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; determine effect of outliers on the mean; and identify skewed distributions.

30. **Study Time**

**Description**: Number of hours 36 members of a high school softball team reported that they studied in a typical week.

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; and investigate the effect an outlier has on measures of center and spread.

31. **Sunshine for All Months**

**Description**: Average percent of the maximum possible sunshine, by month (through 2002), for 174 selected cities around the United States.

**Source**: www.ncdc.noaa.gov/oa/climate/online/ccd/avgsun.html

**Uses**: Construct stacked box plots or histograms to compare the shape, center, and spread of the distributions by month.

32. **TV Watching**

**Description**: Number of hours watching TV for one week as reported by a group of seventh graders.

**Source**: *Navigating Through Data Analysis in Grades 6–8* (NCTM, 2003)

**Uses**: Construct a box plot or histogram; identify outliers; and estimate centers and spread.

33. **U.S. Census 2000 & 2010**

**Description**: The spreadsheet contains data about apportionment populations and political representation in the U.S. House of Representatives for the 50 states from the censuses conducted in 2000 and 2010.

**Sources**: http://2010.census.gov/2010census/data/ and www.census.gov/population/apportionment/data/2010_apportionment_results.html

**Uses**: Make comparisons across population and political representation data for 2000 and 2010; and investigate change over time (e.g., Which states gained members in the House of Representatives and which states lost members? Are there any regional patterns of change? Is there consistent representation across the states’ varied populations?).

34. **Vertical Jumps**

**Description**: Vertical jump height (in inches) of 27 basketball players in an NBA draft.

**Uses**: Construct a box plot or histogram; identify outliers; estimate centers and spread; and use the histogram to investigate the percent of data within 1, 2, or 3 standard deviations from the mean.

35. **Walking Speeds**

**Description**: These data show the mean times (in seconds) to walk 60 feet in various cities of the world.

**Source**: www.britishcouncil.org/paceoflife.pdf

**Uses**: Construct a box plot or histogram; identify outliers; and estimate centers and spread.
