
**This is an e-mail note sent to students in
Advanced Placement Statistics.**

Hi AP STATers...

This is an attempt to give you a better feel for the power of statistics and to summarize some of the sophisticated topics we have experienced in the last few weeks.

=====================

* You take random samples and examine a
**statistic** from
the sample in an attempt to obtain some useful information about the
population from which the sample came. Generally, you are attempting
to gain some knowledge about an unknown population **parameter**.

* In a properly-conducted survey, you often work
with **proportions**. While you can use sample **counts**, proportions are perhaps
easier to read and interpret. For instance, if 823 from a random
sample of 2,054 voters favor Herkimer in an upcoming election, the 823
is a sample count, and one could work with it. However, it is usually
easier to discuss and interpret the proportion 823/2054, which is
about .40, or 40%.

This situation is modeled by a **binomial distribution**, but it can be approximated by a **normal distribution**. We can, for instance, easily determine the approximate probability that a sample of size 2054 would contain more than 830 Herkimer supporters, or less than 38% Herkimer supporters. (The first situation involves a count; the second involves a proportion. They are, of course, related. You simply must be consistent in your analysis.)

You can construct a **95% confidence interval**, which is quite meaningful for a properly-obtained sample proportion. This is what is reported (very indirectly) in political polls. If a properly-conducted survey has Herkimer favored by 57% of the voters in a sample with a margin of error of 4%, then the 95% CI is 53% to 61%. Interpretation (assuming all the sampling was done properly): The method used to calculate the interval produces an interval containing the true proportion of Herkimer voters in the population for about 95% of all possible samples. It is therefore reasonable to assume that the population parameter is between 53% and 61%. Note that you are providing statistical information about an unknown population parameter. And, you are not saying anything definite. You have used the statistic 57% to make an **inference** about a population parameter.
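Here is a minimal Python sketch of the calculations just described. It is only an illustration: the note does not say what the true population proportion is, so the value `p = 0.40` below is an assumption, and the sketch uses the plain normal approximation with no continuity correction.

```python
from math import sqrt
from statistics import NormalDist

n = 2054   # sample size from the Herkimer example
p = 0.40   # ASSUMED true population proportion (not given in the note)

# Count scale: number of supporters is approximately normal
# with mean n*p and standard deviation sqrt(n*p*(1-p)).
mean_count = n * p
sd_count = sqrt(n * p * (1 - p))
prob_more_than_830 = 1 - NormalDist(mean_count, sd_count).cdf(830)

# Proportion scale: sample proportion is approximately normal
# with mean p and standard deviation sqrt(p*(1-p)/n).
sd_prop = sqrt(p * (1 - p) / n)
prob_below_38pct = NormalDist(p, sd_prop).cdf(0.38)

# 95% confidence interval built from the observed sample proportion.
p_hat = 823 / n
z_star = NormalDist().inv_cdf(0.975)              # about 1.96
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
ci = (p_hat - margin, p_hat + margin)

print(f"P(count > 830)      = {prob_more_than_830:.3f}")
print(f"P(proportion < .38) = {prob_below_38pct:.3f}")
print(f"95% CI for p        = {ci[0]:.3f} to {ci[1]:.3f}")
```

Notice that the count calculation and the proportion calculation use different means and standard deviations. That is what "be consistent in your analysis" means in practice.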

* In a somewhat different vein, you have what I
like to call **quality control situations**. Sometimes you examine sample
means and make **inferences** from them. This is done in business, industry, research,
etc. In this situation, you have a population with known mean m and
standard deviation s. You take a sample of size n from this population,
calculate the mean (a statistic), and attempt to determine whether you
would statistically conclude that it came from the described
population. You set up a **null hypothesis**, H_{0}, and an
**alternate hypothesis**, H_{a}.

H_{0}: Sample came from a population with mean m.

H_{a}: Sample did not come from a population with mean m.

In industry, if you reject H_{0}, this might suggest your
equipment needs repair, adjustment, etc. In research, if you want to
establish that you have done something that makes a difference, you
would hope to reject H_{0}. Note that we are running a test just on means, and making
no attempt to account for a possible change in standard deviation or
variance.

OK, you have a sample mean, x(bar). We know that
this single statistic is part of a normal distribution if
H_{0} is true. Again, if H_{0} is true, then, by the
**Central Limit Theorem**, x(bar) should be a number in an
(approximately) normal distribution with mean m and standard deviation
s/sqrt(n).
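A small Python sketch of that sampling distribution may help. The numbers (m = 50, s = 8, n = 64) are made up purely for illustration, and `sampling_distribution` is a hypothetical helper name, not something from our textbook.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(m, s, n):
    """Normal model for x(bar) when H0 is true: mean m, sd s/sqrt(n)."""
    return NormalDist(mu=m, sigma=s / sqrt(n))

# Hypothetical population values, chosen only for illustration.
xbar_dist = sampling_distribution(m=50, s=8, n=64)

# Probability that a sample mean lands within 2 units of m.
within_2 = xbar_dist.cdf(52) - xbar_dist.cdf(48)

print(f"mean of x(bar) = {xbar_dist.mean}")
print(f"sd of x(bar)   = {xbar_dist.stdev}")
print(f"P(48 < x(bar) < 52) = {within_2:.4f}")
```

The point to notice is that the standard deviation of x(bar) is 8/sqrt(64) = 1, much smaller than the population standard deviation of 8. Sample means vary far less than individual observations do.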

We ask "How likely is it that we would get x(bar)
if H_{0} is true?" We calculate a **P-value**. This is the probability that we
would get a statistic at least as extreme as the one we did if H_{0} is true. If the P-value is
small, we might be tempted to reject H_{0}. If we reject
H_{0}, then the
sample statistic is statistically significant. That is, it is
considerably different from what we might expect if
H_{0} is true.
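As a rough sketch, a two-sided P-value for a sample mean could be computed in Python like this. The helper name `two_sided_p_value` and the numbers in the example are hypothetical. The test is two-sided because H_{a} above only says the sample did not come from a population with mean m, without naming a direction.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(x_bar, m, s, n):
    """P-value for H0: mean = m against Ha: mean != m, with s known."""
    z = (x_bar - m) / (s / sqrt(n))          # standardized sample mean
    tail = 1 - NormalDist().cdf(abs(z))      # area beyond |z| in one tail
    return 2 * tail                          # both tails count as "extreme"

# Hypothetical example: known m = 50, s = 8; sample of n = 64 has mean 52.5.
p_value = two_sided_p_value(x_bar=52.5, m=50, s=8, n=64)
print(f"P-value = {p_value:.4f}")
```

In this made-up example the sample mean sits 2.5 standard deviations of x(bar) away from m, so the P-value is small (a bit over 1%).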

But, what is statistically significant?

Common levels of significance are 5% and 1%. If
you test at the 5% level, you are saying that H_{0} will be rejected if the
P-value is less than 5%. That is, if there is less than a 5%
probability that you would get a statistic as extreme as the one you did if
H_{0} is true,
then you will reject H_{0}. In essence, you are saying that there is strong evidence
to suggest that this sample did not come from a population with mean
m. Of course, if H_{0} is actually true, there is still a 5% chance
that you will incorrectly reject it. When you
incorrectly reject a null hypothesis, you are making a
**Type I Error**.
When you set a level of significance at 5%, you are allowing a 5%
chance of making a Type I error.
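You can see that 5% figure for yourself with a simulation. The sketch below is only an illustration under assumptions that are not in the note: a normal population with m = 50 and s = 8, samples of size n = 25, and a hypothetical helper called `type_i_error_rate`. Every sample really does come from the H_{0} population, so every rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(m, s, n, alpha=0.05, trials=4000, seed=1):
    """Fraction of samples drawn under a TRUE H0 that still get rejected."""
    rng = random.Random(seed)     # fixed seed so the run is repeatable
    se = s / sqrt(n)              # standard deviation of x(bar)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the population described by H0.
        sample = [rng.gauss(m, s) for _ in range(n)]
        z = (fmean(sample) - m) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:       # reject H0 even though it is true
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(m=50, s=8, n=25)
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to 0.05, though not exactly on it, because the simulation itself is subject to sampling variability.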

OK, STATers... Lots to digest here. But, it will come together if you stick with it.

STATISTICAL POWER AND REASONING IS AWESOME.


Copyright © 2003-2009 Sanderson Smith