Probability & Statistics Honors Unit Test #2

_____ / 41

 1 The number of trees planted in New York City by the Port Authority of New York and New Jersey is recorded each year. A sample of these measurements is shown above. Describe the distribution. (5)

The distribution of the number of trees planted is skewed right and centered near 35 trees. The range is somewhere between 110 (10-120) and 130 (0-130) trees. There are two potential high outliers between 100 and 130 trees.

 2 Benji’s ACT composite score is at the 75th percentile. Explain what this means. (3)

Benji’s score is as high as or higher than 75% of all ACT composite scores.

 7.23 15.44 12.66 7.97 14.37 11.11 5.67 7.83 13.12 10.84 11.71 13.04 12.73 15.35 11.22 11.38 13.75 14.24 13.21 10.95

 3 To test an oyster sorting machine, a sample of oysters was obtained and the volume (in cubic centimeters) of each was measured. The collected data are shown above. Determine whether there are any outliers in the data. (5)

Tukey’s Rule of Thumb indicates outliers are lower than ${Q}_{1}-1.5\left(IQR\right)$ or higher than ${Q}_{3}+1.5\left(IQR\right)$. For these data, ${Q}_{1}=10.895$ and ${Q}_{3}=13.48$, so $IQR=13.48-10.895=2.585$. The lower limit for outliers would be $10.895-1.5\left(2.585\right)=7.0175$ and the upper limit would be $13.48+1.5\left(2.585\right)=17.3575$. There is one datum lower than 7.0175 (at 5.67 cc). There are no data above 17.3575. Thus, there is one outlier in this data set.
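The fence arithmetic can be checked with a short Python sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, a convention that reproduces the quartile values quoted in this answer; other software may use a different quartile rule and give slightly different fences. The function names `quartiles` and `tukey_outliers` are illustrative only.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: median-of-halves convention; needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return median(lower), median(upper)

def tukey_outliers(values):
    """Return the lower fence, upper fence, and values outside the fences."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return low_fence, high_fence, outliers

# Oyster volumes (cubic centimeters) from the data listed for question 3.
oyster_volumes = [7.23, 15.44, 12.66, 7.97, 14.37, 11.11, 5.67, 7.83, 13.12, 10.84,
                  11.71, 13.04, 12.73, 15.35, 11.22, 11.38, 13.75, 14.24, 13.21, 10.95]

low_fence, high_fence, outliers = tukey_outliers(oyster_volumes)
print(f"fences: {low_fence:.4f} to {high_fence:.4f}")
print("outliers:", outliers)
```

Note that the upper fence is built from ${Q}_{3}$ and the lower fence from ${Q}_{1}$; mixing the two up is an easy slip when doing the arithmetic by hand.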

The City of Los Angeles obtains water from two sources—fresh water (from rainfall and streams) and recycled water (water that has been used in the city once already). The data below are the amount of recycled water used in Los Angeles (in acre-feet) for a random selection of months.

 1125 874 1006 835 336 980.2 833 860 760 230 921 715.1 782 1104 665 815 1159 1035 880 1223

 4 Find the mean of the data. (2)

856.915 acre-feet

 5 Find the median of the data. (2)

867 acre-feet

 6 Find the range of the data. (2)

993 acre-feet

 7 Find the standard deviation of the data. (2)

248.3002 acre-feet

 8 Find ${Q}_{1}$ for these data. (2)

771 acre-feet

 9 Find ${Q}_{3}$ for these data. (2)

1020.5 acre-feet

 10 Find the interquartile range of these data. (2)

249.5 acre-feet
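As a cross-check on questions 4 through 10, the following sketch recomputes each statistic with Python's standard `statistics` module. It assumes the standard deviation is the sample standard deviation (divisor $n-1$) and that quartiles are the medians of the lower and upper halves of the sorted data; both assumptions are consistent with the answers given, but a different convention would change the results slightly. The variable names are illustrative.

```python
from statistics import mean, median, stdev

# Recycled water used in Los Angeles (acre-feet), from the data above.
recycled_water = [1125, 874, 1006, 835, 336, 980.2, 833, 860, 760, 230,
                  921, 715.1, 782, 1104, 665, 815, 1159, 1035, 880, 1223]

data = sorted(recycled_water)
half = len(data) // 2
q1 = median(data[:half])    # median of the lower half (assumed convention)
q3 = median(data[-half:])   # median of the upper half (assumed convention)

summary = {
    "mean": mean(data),
    "median": median(data),
    "range": max(data) - min(data),
    "sample_sd": stdev(data),   # sample standard deviation, divisor n - 1
    "Q1": q1,
    "Q3": q3,
    "IQR": q3 - q1,
}

for name, value in summary.items():
    print(f"{name}: {value:.4f} acre-feet")
```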

 11 Determine which measures of center and spread are best for these data. Justify your answer. (4)

Let’s look at the distribution first.

The distribution is approximately symmetric with no obvious outliers, so the better measures of center and spread would be the mean and standard deviation.

The California Department of Transportation routinely collects weather data in the city of Pasadena. The data shown below are the daily high temperatures (°F) for a random sample of days.

 77 91 98 83 87 80 80 77 83 73 96 73 54 94 86 83 60 90 79 85

 12 Construct a boxplot of these data. (5)
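A boxplot is drawn from the five-number summary, so one way to prepare for the drawing is to compute those values first. The sketch below is only an aid under stated assumptions: it uses the median-of-halves quartile convention and the 1.5 × IQR rule for flagging outliers (a "modified" boxplot, where whiskers stop at the most extreme values inside the fences). The function name `boxplot_summary` is hypothetical.

```python
from statistics import median

def boxplot_summary(values):
    """Five-number summary plus whisker ends and outliers for a modified boxplot.

    Assumptions: quartiles are medians of the lower and upper halves,
    and outliers lie more than 1.5 * IQR beyond the quartiles.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "min": data[0],
        "Q1": q1,
        "median": median(data),
        "Q3": q3,
        "max": data[-1],
        "whisker_low": inside[0],     # smallest value that is not an outlier
        "whisker_high": inside[-1],   # largest value that is not an outlier
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

# Daily high temperatures (°F) in Pasadena, from the data above.
temperatures = [77, 91, 98, 83, 87, 80, 80, 77, 83, 73,
                96, 73, 54, 94, 86, 83, 60, 90, 79, 85]

box = boxplot_summary(temperatures)
for name, value in box.items():
    print(f"{name}: {value}")
```

The box spans Q1 to Q3 with a line at the median, the whiskers run to `whisker_low` and `whisker_high`, and any values in `outliers` are plotted as individual points.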

Ecologists in Brazil measured the diameter (cm) of a sample of trees from the Tapajos National Forest in two different years. Samples from the data collected each year are given below.

2001 Measurements

 14.9 53.2 85.7 38.7 17.8 16 14 44.3 27.2 24.3

2005 Measurements

 78 23 67.7 11.4 16.3 142.3 200.4 15.3 12.9 57.4

 13 Compare the distributions of these two samples. (5)

Let’s look at the graphs.

The center of the 2001 distribution is clearly lower than that of the 2005 distribution. The spread of diameters in 2001 is also smaller than that in 2005. Both distributions are skewed right. Both distributions may have one high outlier.
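The comparison can be backed with numbers. The sketch below, which is an illustration and not part of the original answer, uses the median for center and the IQR for spread (reasonable choices for skewed samples) and compares the mean with the median as a rough indicator of right skew. Quartiles again follow the median-of-halves assumption, and the function name `describe` is hypothetical.

```python
from statistics import mean, median

def describe(values):
    """Return (median, IQR, mean) using the median-of-halves quartile convention."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return median(data), q3 - q1, mean(data)

# Tree diameters (cm) from the two samples above.
diameters_2001 = [14.9, 53.2, 85.7, 38.7, 17.8, 16, 14, 44.3, 27.2, 24.3]
diameters_2005 = [78, 23, 67.7, 11.4, 16.3, 142.3, 200.4, 15.3, 12.9, 57.4]

median_2001, iqr_2001, mean_2001 = describe(diameters_2001)
median_2005, iqr_2005, mean_2005 = describe(diameters_2005)

print(f"2001: median={median_2001:.2f}, IQR={iqr_2001:.2f}, mean={mean_2001:.2f}")
print(f"2005: median={median_2005:.2f}, IQR={iqr_2005:.2f}, mean={mean_2005:.2f}")

# A mean well above the median is a rough hint of right skew.
print("2001 mean > median:", mean_2001 > median_2001)
print("2005 mean > median:", mean_2005 > median_2005)
```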

Page last updated 15:46 2021-09-09