## NORMAL DISTRIBUTION

The normal curve was developed mathematically in 1733 by De Moivre as an approximation to the binomial distribution. His paper went unnoticed until Karl Pearson discovered it in 1924. Laplace used the normal curve in 1783 to describe the distribution of errors. Subsequently, in 1809, Gauss used the normal curve to analyze astronomical data. The normal curve is often called the Gaussian distribution, and in everyday usage it is known as the bell-shaped curve.

The normal distribution is the most widely used statistical distribution. The principal reasons are:

1. Normality arises naturally in many physical, biological, and social measurement situations.
2. Normality is important in statistical inference.

The applet below plots the normal curve.

The displayed curve is called the normal density and is denoted by f(x). The density curve is shown by default, but if it is hidden, it can be made visible by selecting the f(x) radio button in the bottom panel. The normal density is used to compute probabilities.

The normal distribution is characterized by two parameters: the mean µ and the standard deviation sigma. The mean is a measure of location or center, and the standard deviation is a measure of scale or spread. The mean can be any real number (anywhere between -infinity and +infinity), and the standard deviation must be positive. Each pair of values of µ and sigma defines a specific normal distribution, and collectively all possible normal distributions make up the normal family.

Any member of the normal family can be displayed by changing µ and sigma in the above applet. This is done by clicking on the right or left arrow animation buttons of µ or sigma in the bottom panel. Clicking on the right (left) arrow button of µ moves the curve to the right (left) without changing the spread. Clicking on the right (left) arrow button of sigma flattens (contracts) the normal curve without changing the location. As µ and sigma are changed, the curve can march off the display area. The image can be centered by clicking on the Rescale button. Once you are finished experimenting with the effects of changing µ and sigma, click on the Reset button to get the standard normal density back.
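The effect of the two parameters can also be checked numerically. The sketch below writes out the standard formula for the normal density, f(x) = exp(-(x - µ)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), which is not stated explicitly in the text above. The function name `normal_density` and the sample values are illustrative choices, not part of the applet.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """Normal density f(x) for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the curve at its center for three members of the normal family.
peak_standard = normal_density(0.0)             # mu = 0, sigma = 1
peak_shifted = normal_density(3.0, mu=3.0)      # moved right, same spread
peak_wide = normal_density(0.0, sigma=2.0)      # same center, twice the spread

print(peak_standard, peak_shifted, peak_wide)
```

Changing µ moves the peak without changing its height, while doubling sigma halves the peak height, which is the flattening seen in the applet.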

The standard (or canonical) normal distribution is a special member of the normal family that has a mean of 0 and a standard deviation of 1. The standard normal random variable is denoted by Z as will be discussed later.

The standard normal distribution is important because the probabilities and quantiles of any normal distribution can be computed from it, provided µ and sigma are known.

Often we want to compute the probability that our random outcome is within a specified interval, i.e., P(a <= X <= b), where a could be -infinity and/or b could be +infinity. For continuous random variables, this probability corresponds to the area under the curve bounded by a and b. The probability that X takes a specific value, i.e., P(X = x), is 0 since there is no area above a single point. It follows that P(a <= X <= b) = P(a < X <= b) = P(a <= X < b) = P(a < X < b).

Probabilities are computed by selecting the appropriate menu item from the Prob menu. In this case, a corresponds to the lower limit and b corresponds to the upper limit. For example, to compute P(0 <= Z <= 1): 1) select a <= x <= b from the Prob menu (assuming µ is set to 0 and sigma to 1); 2) type 0 as the Lower Limit and 1 as the Upper Limit; and 3) click ok. The probability is 0.3413.
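The same calculation can be reproduced outside the applet. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `interval_probability` is an assumption made for illustration. It computes the area between a and b as F(b) - F(a).

```python
from statistics import NormalDist

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b) for X normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    # Area under the density between a and b, as a difference of CDF values.
    return dist.cdf(b) - dist.cdf(a)

p = interval_probability(0, 1)        # P(0 <= Z <= 1) for the standard normal
print(round(p, 4))                    # 0.3413

# An infinite limit is passed as float("-inf") or float("inf").
p_left_half = interval_probability(float("-inf"), 0)   # P(Z <= 0)
```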

Probabilities can be computed directly in the above applet simply by changing the values of µ and sigma and entering the lower (a) and/or upper (b) limits when requested. For example, if X is normally distributed with a mean of 10 and a standard deviation of 2, then P(9 <= X <= 12) = 0.5328. This is computed by: 1) changing µ to 10 and sigma to 2; 2) selecting a <= x <= b from the Prob menu; 3) entering 9 as the Lower Limit and 12 as the Upper Limit; and 4) clicking ok. The mean and standard deviation can be changed by clicking on the left or right arrow buttons or by clicking directly on their symbols to display a dialog box.

This probability can also be computed from the standard normal distribution as P(-0.5 <= Z <= 1) = 0.5328, since 12 is (12 - 10)/2 = 1 standard deviation above µ and 9 is (9 - 10)/2 = -0.5, i.e., 0.5 standard deviations below µ. More generally, the z-value is computed as z = (x - µ)/sigma, and this is called the Z score transform.
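The equivalence of the two routes can be checked with a short sketch. It again assumes Python's standard `statistics.NormalDist`; the function name `z_score` is illustrative. The limits 9 and 12 are standardized and the probability is computed once on the Z scale and once on the original scale.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Z score transform: number of standard deviations x lies from mu."""
    return (x - mu) / sigma

mu, sigma = 10, 2
z_low = z_score(9, mu, sigma)      # -0.5
z_high = z_score(12, mu, sigma)    # 1.0

# Route 1: standardize the limits, then use the standard normal distribution.
standard = NormalDist()            # mean 0, standard deviation 1
p_standard = standard.cdf(z_high) - standard.cdf(z_low)

# Route 2: work directly with the normal distribution of X.
original = NormalDist(mu, sigma)
p_direct = original.cdf(12) - original.cdf(9)

print(p_standard, p_direct)        # the two values agree up to rounding error
```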

The qth normal quantile is the value x_q defined by:

q = P(X <= x_q) = F(x_q),
where F(x) denotes the cumulative normal distribution function. The cumulative normal distribution can be displayed by clicking on the F(x) radio button in the bottom panel.

Normal quantiles are computed from the normal CDF: the qth quantile is the x value at which F(x) equals q, so finding a quantile amounts to inverting the CDF.
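As a rough illustration of inverting the CDF, the sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module. The helper `normal_quantile` and the choice q = 0.975 are assumptions for the example; the mean of 10 and standard deviation of 2 are reused from the earlier probability example. It also shows how a quantile of any normal distribution follows from the standard normal quantile by reversing the Z score transform, x_q = µ + sigma * z_q.

```python
from statistics import NormalDist

standard = NormalDist()                  # standard normal: mean 0, sd 1

# Quantile of the standard normal: the z value with F(z) = q.
q = 0.975
z_q = standard.inv_cdf(q)                # approximately 1.96

def normal_quantile(q, mu, sigma):
    """qth quantile of a normal distribution, from the standard normal quantile."""
    return mu + sigma * NormalDist().inv_cdf(q)

x_q = normal_quantile(q, 10, 2)          # quantile for mean 10, sd 2

# Check: plugging x_q back into the CDF should return q.
check = NormalDist(10, 2).cdf(x_q)
print(z_q, x_q, check)
```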

Example #1 shows how probabilities and quantiles are computed for the LDL cholesterol distribution of male adults.

Example #2 computes probabilities and quantiles for the distribution of male heights which are assumed to be normally distributed.

Exercises

Exercise #1 requires you to compute probabilities and quantiles for the distribution of machined cylinder diameters, assumed to be normally distributed.

Exercise #2 requires you to compute probabilities and quantiles for the distribution of annual rainfall in a region, assumed to be normally distributed.

Exercise #3 requires you to compute probabilities and quantiles for the distribution of daily absences per 100 employees at a large corporation, before and after a health improvement program. The absences are assumed to be normally distributed.

Exercise #5 requires you to compute quantiles for the distribution of scholastic test scores which are assumed to be normally distributed.