Objective: This example shows how inferences are made about a single population mean for continuous data when the sample size is small. The t-test and the associated confidence interval are the standard tools for making statements about the population mean, but they require the data to be sampled independently from a normal distribution. This example goes through the steps of checking the assumptions and constructing the confidence interval. Example #1 in the hypothesis testing module will revisit this problem to test a hypothesis about the population mean.
Problem Description: Data was collected on the breadth-to-length ratios of beaded rectangles used as decorations on leather goods by the Shoshoni American Indians. Are these rectangles consistent with the golden rectangles of the ancient Greeks (i.e., 1:0.618034)?
The histogram of the ratios shows that the data is positively skewed. Click on the right Animate button to increase the number of bins by 1. Then click on the Moments triangular reveal button to see the skewness value of 0.632.
Despite the lack of normality, we will proceed with inferences based on the normal assumption, but we must be cautious in our interpretation. We will set a confidence interval on the population mean ratio. This can be done by clicking on the Confidence Interval reveal button on the histogram plot.
These moment summaries, shown in the top histogram above, help to clarify the interpretation of the confidence band. The sample mean is 0.66, which is represented graphically by the center line of the normal density. Note that the sample mean line passes through the center of the 95% confidence band. Both the location and the length of the confidence band are compromised by the skewness (and outliers) of the distribution. Specifically, the sample mean is larger than the sample median of 0.641, as can be seen in the normal quantile plot. The outliers not only increase the mean, but they also increase the standard deviation, which in turn lengthens the confidence band.
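The effect of the outliers on the mean and standard deviation can be checked with a few lines of Python. This is only a sketch under stated assumptions: the 20 ratios below are the values commonly reproduced for this dataset in statistics texts and are not listed in this example (they do reproduce the mean of 0.66 and the median of 0.641 quoted above), and the two outliers are assumed to be the two largest ratios.

```python
from statistics import mean, median, stdev

# ASSUMPTION: these 20 ratios are the commonly published Shoshoni values;
# they are not given in this example itself.
ratios = [
    0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628, 0.609, 0.844,
    0.654, 0.615, 0.668, 0.601, 0.576, 0.670, 0.606, 0.611, 0.553, 0.933,
]

# ASSUMPTION: the two outliers are taken to be the two largest ratios.
trimmed = sorted(ratios)[:-2]

# Summaries for the full sample.
full_mean = mean(ratios)
full_median = median(ratios)
full_sd = stdev(ratios)  # sample standard deviation (n - 1 denominator)

# The same summaries with the two largest values set aside.
trim_mean = mean(trimmed)
trim_sd = stdev(trimmed)

print(f"all values:       mean={full_mean:.4f}  median={full_median:.4f}  sd={full_sd:.4f}")
print(f"without outliers: mean={trim_mean:.4f}  sd={trim_sd:.4f}")
```

Under these assumptions the mean exceeds the median, and setting aside the two largest values lowers both the mean and the standard deviation, which is the pattern described above.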
The golden rectangle ratio of 0.618 is within the 95% confidence band, which shows it is a plausible value for the population mean. However, 0.618 is only just inside the interval, so we must be careful in stating our conclusions. Because the assumptions are violated, the evidence is inconclusive. The outliers work in two opposing directions. On the one hand, they pull the sample mean upward, away from 0.618. On the other hand, they increase the width of the confidence band, which makes it easier for the interval to cover 0.618. The confidence level can be changed by selecting other menu items from the Level popup menu. A 99% confidence interval gives more confidence at the expense of a wider interval. The golden rectangle ratio lies inside the 99% interval, but outside the 90% confidence interval.
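The comparison across confidence levels can be sketched as follows. The interval is the usual t interval, mean plus or minus t times s divided by the square root of n. As assumptions, the 20 ratios are the commonly published values for this dataset (they are not listed in this example), the critical values are the standard two-sided table values for 19 degrees of freedom, and the names `t_interval` and `T_CRIT_DF19` are hypothetical helpers introduced here.

```python
from math import sqrt
from statistics import mean, stdev

# ASSUMPTION: these 20 ratios are the commonly published Shoshoni values;
# they are not given in this example itself.
ratios = [
    0.693, 0.662, 0.690, 0.606, 0.570, 0.749, 0.672, 0.628, 0.609, 0.844,
    0.654, 0.615, 0.668, 0.601, 0.576, 0.670, 0.606, 0.611, 0.553, 0.933,
]

GOLDEN = 0.618034  # golden rectangle ratio from the problem description

# Two-sided t critical values for n - 1 = 19 degrees of freedom,
# taken from a standard t table (valid only for a sample of size 20).
T_CRIT_DF19 = {0.90: 1.729, 0.95: 2.093, 0.99: 2.861}


def t_interval(values, t_crit):
    """Return (lower, upper) for mean +/- t_crit * s / sqrt(n)."""
    n = len(values)
    half_width = t_crit * stdev(values) / sqrt(n)
    center = mean(values)
    return center - half_width, center + half_width


intervals = {level: t_interval(ratios, t) for level, t in T_CRIT_DF19.items()}

for level, (lower, upper) in intervals.items():
    covered = lower <= GOLDEN <= upper
    print(f"{level:.0%} interval: ({lower:.4f}, {upper:.4f})  covers 0.618: {covered}")
```

Under these assumptions the 95% interval runs from roughly 0.617 to 0.704, so 0.618 falls inside it by less than 0.001, while the narrower 90% interval excludes it and the wider 99% interval includes it.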
Since the normal assumption is violated, it would be prudent to find a normalizing transformation of the data values prior to making inferences. Alternatively, a nonparametric approach can be used or the outlying values can be deleted. We will examine this latter option.
A histogram of the ratios with the two outliers removed is shown in the plot below.