"The price tag is easy to read, but an understanding of quality requires education."

W. Edwards Deming

9.2 SAMPLE PROPORTIONS (Pages 472 - 477)

OVERVIEW: The normal distribution curve is often extremely useful in analyzing sample proportions. This section provides insights into the circumstances that allow for use of normal distribution properties.

Consider a simple random sample (SRS) of 1,000 people from a large population. If X represents the number in this sample who are Republicans, then there are 1,001 possible values of X, namely 0, 1, 2, 3, ..., 998, 999, 1000. If p(hat) represents the possible sample proportions of Republicans in the sample, then there are 1,001 possible values of p(hat), namely 0/1000, 1/1000, 2/1000, ..., 998/1000, 999/1000, 1000/1000. For a given sample, we might find p(hat) = .56. For another sample, we might find p(hat) = .52. We could choose many SRS's and calculate a p(hat) for each sample. In general, we would expect the distribution of p(hat) to be approximately normal.
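The idea of choosing many SRS's and calculating a p(hat) for each can be tried out with a short simulation. The sketch below is only an illustration: the population proportion of .55, the 2,000 repetitions, the seed, and the function name simulate_p_hats are all assumptions made for the example, and each person is drawn independently, which approximates an SRS only when the population is much larger than the sample.

```python
import random
from statistics import mean, stdev

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the p(hat) of each.

    Each person is treated as an independent draw with probability p of
    having the characteristic (an approximation to an SRS from a large
    population).
    """
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        count = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(count / n)
    return p_hats

# Hypothetical values chosen only for illustration.
p_hats = simulate_p_hats(p=0.55, n=1000, num_samples=2000)
print("mean of the p(hat) values:", round(mean(p_hats), 4))
print("standard deviation of the p(hat) values:", round(stdev(p_hats), 4))
```

A histogram of the simulated p(hat) values should look roughly bell-shaped and centered near the assumed population proportion.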

If we choose an SRS of size n from a large population with population proportion p having some characteristic of interest, and if p(hat) is the proportion of the sample having that characteristic, then

The sampling distribution of p(hat) is approximately normal.

The mean of the sampling distribution is p (the population parameter).

The standard deviation of the sampling distribution is sqrt[p(1-p)/n].

It is reasonable to use the above statements when

-the population is at least 10 times as large as the sample (Rule of Thumb 1).

-np is at least 10 and n(1-p) is at least 10 (Rule of Thumb 2).
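The two rules of thumb and the mean and standard deviation formulas can be bundled into one small helper. This is a minimal sketch: the function name p_hat_sampling_summary and its dictionary keys are made up for illustration, and the inputs (p = .6, n = 1000, population of 20,000 as a lower bound) are borrowed from the example in these notes.

```python
import math

def p_hat_sampling_summary(p, n, population_size):
    """Return the mean and standard deviation of p(hat), plus rule checks."""
    return {
        "mean": p,                                   # mean of p(hat) is p
        "sd": math.sqrt(p * (1 - p) / n),            # sqrt[p(1-p)/n]
        # Rule of Thumb 1: population at least 10 times the sample size.
        "rule_of_thumb_1": population_size >= 10 * n,
        # Rule of Thumb 2: np and n(1-p) are both at least 10.
        "rule_of_thumb_2": n * p >= 10 and n * (1 - p) >= 10,
    }

summary = p_hat_sampling_summary(p=0.6, n=1000, population_size=20000)
for name, value in summary.items():
    print(name, "=", value)
```

If either rule check comes back False, the normal approximation should be used with caution.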

Example:
Suppose it is known that 60% of the registered voters in a district of over 20,000 people are Republicans. If you choose an SRS of 1000 registered voters,

(a) what is the probability that the proportion of Republicans in the sample is between 58% and 62%?

(b) what is the probability that the sample will contain no more than 550 Republicans?

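First, note that both rules of thumb are satisfied: the population (over 20,000) is at least 10 times the sample size of 1000, and np = 600 and n(1-p) = 400 are both at least 10. The sample proportion p(hat) has mean = .6 and standard deviation = sqrt[(.6)(.4)/1000] = .0155.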

Response to (a): Using the TI-83, normalcdf(.58, .62, .60, .0155) = .8031, or 80.31%.
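For readers without a TI-83, the same normal-curve area can be checked with Python's standard library. This is a sketch, not part of the textbook's method: NormalDist from the statistics module stands in for normalcdf, and the variable names are chosen for illustration. The first result uses the rounded standard deviation .0155 as in the calculator entry; the second uses the unrounded value, which changes the answer only slightly.

```python
import math
from statistics import NormalDist

p, n = 0.60, 1000

# Same inputs as the calculator entry: mean .60, standard deviation .0155.
rounded = NormalDist(mu=p, sigma=0.0155)
prob_a_rounded = rounded.cdf(0.62) - rounded.cdf(0.58)

# Same calculation with the unrounded standard deviation sqrt[p(1-p)/n].
exact_sd = math.sqrt(p * (1 - p) / n)
unrounded = NormalDist(mu=p, sigma=exact_sd)
prob_a_unrounded = unrounded.cdf(0.62) - unrounded.cdf(0.58)

print("P(.58 < p(hat) < .62), sd = .0155:", round(prob_a_rounded, 4))
print("P(.58 < p(hat) < .62), unrounded sd:", round(prob_a_unrounded, 4))
```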

Response to (b): 550/1000 = 0.55. The probability that the sample proportion of Republicans is at most .55 is normalcdf(-1E99, .55, .60, .0155) = 0.000628, or about 0.0628%.

Things we might note: the z-score for .55 is z = (.55 - .60)/.0155 = -3.226. That is, a proportion of .55 is more than 3 standard deviations below the mean. This represents a rather rare score in a normal distribution N(.60, .0155). Also, 0.000628 is approximately 1/1592. In other words, if we had around 1,600 random samples of size 1000, we would "expect" only one of them to have 550 or fewer Republicans.
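The z-score, the tail probability for part (b), and the "one in about 1,600" figure can be reproduced in a few lines. As before, this is only a sketch: NormalDist from Python's statistics module is used in place of the calculator's normalcdf, and the variable names are illustrative.

```python
from statistics import NormalDist

mean_p_hat, sd_p_hat = 0.60, 0.0155   # values from the example above
cutoff = 550 / 1000                   # 550 Republicans out of 1000

# z-score: how many standard deviations .55 lies from the mean.
z = (cutoff - mean_p_hat) / sd_p_hat

# Area under N(.60, .0155) to the left of .55.
prob_b = NormalDist(mu=mean_p_hat, sigma=sd_p_hat).cdf(cutoff)

# Roughly how many samples we would need to "expect" one this extreme.
samples_per_occurrence = 1 / prob_b

print("z =", round(z, 3))
print("P(p(hat) <= .55) =", round(prob_b, 6))
print("about 1 in", round(samples_per_occurrence))
```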