Statistics
There are some "classic" examples of the scientific method at
work. Mendeleev noticed patterns in the properties of the known elements and
proposed the periodic table, which contained gaps that were later
filled by the discovery of new elements. Einstein started with a
theoretical hypothesis on the nature of gravity, which was confirmed
by experimental observations only years later.
Often, though, it is hard to discern from experimental data whether
some scientific principle is at work and, if so, what it
might be. Suppose you asked a group of 20 students to tell you the
season they were born in and whether they own a car.
You might get data like that in Table 1.1.
Table 1.1: 20 student sample size
Birth season | Owns a car | Does not own a car
Fall         | 3          | 1
Winter       | 1          | 2
Spring       | 6          | 0
Summer       | 6          | 1
From this data we might conclude that
- a student is more likely to be born in the spring or summer,
- a good majority of students own cars,
- a student born in the spring or summer is more likely to own a car than one born in the fall or winter.
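The arithmetic behind these three tentative conclusions can be checked directly from the counts in Table 1.1. The short Python sketch below is only an illustration. The helper names `fraction_born_in` and `fraction_owning` are our own and are not part of the original notes.

```python
# Counts from Table 1.1: season -> (owns a car, does not own a car)
table_1_1 = {
    "Fall": (3, 1),
    "Winter": (1, 2),
    "Spring": (6, 0),
    "Summer": (6, 1),
}


def fraction_born_in(table, seasons):
    """Fraction of all students in the table born in the given seasons."""
    total = sum(own + not_own for own, not_own in table.values())
    born = sum(sum(table[s]) for s in seasons)
    return born / total


def fraction_owning(table, seasons):
    """Fraction of students born in the given seasons who own a car."""
    owners = sum(table[s][0] for s in seasons)
    group = sum(sum(table[s]) for s in seasons)
    return owners / group


warm = ["Spring", "Summer"]
cold = ["Fall", "Winter"]

print(f"born in spring or summer:    {fraction_born_in(table_1_1, warm):.2f}")       # 13/20 = 0.65
print(f"own a car (all students):    {fraction_owning(table_1_1, warm + cold):.2f}")  # 16/20 = 0.80
print(f"own a car (spring/summer):   {fraction_owning(table_1_1, warm):.2f}")         # 12/13 = 0.92
print(f"own a car (fall/winter):     {fraction_owning(table_1_1, cold):.2f}")         # 4/7  = 0.57
```

With only 20 students, each of these fractions rests on a handful of answers, so a change of one or two responses would shift them noticeably.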
However, if we went out and questioned
20,000 students, we might obtain the following data:
Table 1.2: 20,000 student sample size
Birth season | Owns a car | Does not own a car
Fall         | 2334       | 2425
Winter       | 2456       | 2534
Spring       | 2581       | 2533
Summer       | 2625       | 2512
The apparent trends we saw with 20 students have now disappeared, and
we find a nearly even split among the birth seasons and between owning
and not owning a car. This probably agrees with our expectations on this
question (although one should be objective!), but in any case this data
is contrived: it was in fact generated from a sequence of random
numbers by a computer program.
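The program used to generate the tables is not given in these notes, but the idea is easy to sketch. The Python example below is a minimal illustration under our own assumptions: each student is assigned one of the four seasons with equal probability, and car ownership is an independent 50/50 choice. The function names and the seed are hypothetical.

```python
import random

SEASONS = ["Fall", "Winter", "Spring", "Summer"]


def simulate_survey(n_students, rng):
    """Return season -> [owns a car, does not own a car] for random students.

    Assumption (ours, not the notes'): seasons are equally likely and car
    ownership is an independent 50/50 choice.
    """
    counts = {season: [0, 0] for season in SEASONS}
    for _ in range(n_students):
        season = rng.choice(SEASONS)
        owns_car = rng.random() < 0.5
        counts[season][0 if owns_car else 1] += 1
    return counts


def ownership_fraction(counts):
    """Fraction of all surveyed students who own a car."""
    owners = sum(own for own, _ in counts.values())
    total = sum(own + not_own for own, not_own in counts.values())
    return owners / total


rng = random.Random(1999)  # arbitrary seed so that a run can be repeated
for n in (20, 20_000):
    counts = simulate_survey(n, rng)
    print(f"{n} students, fraction owning a car: {ownership_fraction(counts):.3f}")
    for season in SEASONS:
        own, not_own = counts[season]
        print(f"  {season:<6} | {own:>5} | {not_own:>5}")
```

Running this with several different seeds shows the point of the two tables: the 20-student counts vary widely from run to run and often suggest a "trend", while the 20,000-student counts stay close to an even split.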
The lesson to be learned is that one should be very careful in interpreting
statistics for trends, and that a small sample can lead one to
mistaken conclusions. One less trivial example of this that
we shall encounter later concerns the dangers of living near power lines.
Recent statistical studies have come to opposite conclusions, although in
this case there are some plausible scientific reasons that might explain
a correlation between a possible health hazard and living near power lines.
Another example is the link between lung cancer and smoking. This too
started as a statistical study, which many dismissed as statistically
insignificant, but today the dangers are accepted by the vast
majority.
modtech@theory.uwinnipeg.ca
1999-09-29