Logical

Random Sampling

When determining cause-and-effect relationships, and when answering many other kinds of questions, science requires random sampling.

"Random" here means that every individual in a population is equally likely to be included in any sample taken, or at least that individuals with a certain interesting property are no more likely to be included than individuals without that property. That's all it means.
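To make the definition concrete, here is a small Python sketch. It is only an illustration: the population, the `has_property` flag, and the function names are all made up. It draws a simple random sample, in which every individual has the same chance of being picked, and compares the share of individuals with the property in the sample against the share in the whole population.

```python
import random

def property_rate(people):
    """Fraction of people who have the interesting property."""
    return sum(p["has_property"] for p in people) / len(people)

def draw_random_sample(population, size, seed=None):
    """Simple random sample: every individual is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, size)

# Hypothetical population: 10,000 people, 10% of whom have the property.
population = [{"id": i, "has_property": i % 10 == 0} for i in range(10_000)]

sample = draw_random_sample(population, 1_000, seed=1)
print(f"population rate: {property_rate(population):.3f}")
print(f"sample rate:     {property_rate(sample):.3f}")
```

Because nothing about the property influences who gets picked, the sample rate should land close to the population rate, differing only by chance.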

Say we're trying to assess the side effects of Upslap, a drug that is widely used in the treatment of mopery. Some doctors have noticed that some of their patients who take Upslap also experience chronic hiccups. Scientists decide to find out if Upslap really causes hiccups and, if so, how strongly it does so.

Three studies were conducted. Each one studied a small proportion of the people who take Upslap, compared their hiccuppery with the hiccuppery of the general population, and then generalized its result to all Upslap takers.

Study one found that their sample of Upslap takers had 20% more chronic hiccupers than the general population, and concluded that Upslap takers in general were 20% more likely to have hiccups than non-Upslap takers.

Study two found that their sample of Upslap takers had 3% more chronic hiccupers than the general population, and concluded that Upslap takers in general were 3% more likely to have hiccups than non-Upslap takers.

Study three found that their sample of Upslap takers had no chronic hiccupers, and concluded that Upslap actually prevented hiccups.

Upon investigation, it was found that:

Study one obtained its sample by asking randomly selected consumers of Coldkey, a popular remedy for grouchery, snappishness and hiccups, if they also took Upslap, and included them in its sample if they did.

Study two obtained its sample by asking randomly selected residents of Bakersfield if they took Upslap, and included them in its sample if they did.

Study three obtained its sample by asking randomly selected professional actors if they took Upslap, and included them in its sample if they did.

So which study is random with respect to incidence of hiccups?

Study one is bad because it picks out people who are likely to have hiccups already, so it runs the risk of including people who have hiccups from some cause other than Upslap.

Study three is bad because people who suffer from hiccups have a very hard time working as professional actors, so sampling only actors effectively excludes hiccupers from the sample.

Study two is good because Bakersfield has no known relationship to hiccups, so its sample is random with respect to hiccups.

This is what random sampling means.
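The three sampling procedures can be imitated in a rough Python simulation. Every number in it is an invented assumption, not data: the share of people who take Upslap, the hiccup rates, and how strongly hiccups affect buying Coldkey or working as an actor. The point is only to show how the same true effect looks different depending on where the sample comes from.

```python
import random

rng = random.Random(42)

def make_person():
    """Build one hypothetical person; all probabilities are assumptions."""
    takes_upslap = rng.random() < 0.30
    # Assumed true effect: Upslap raises hiccup risk from 10% to 13%.
    hiccups = rng.random() < (0.13 if takes_upslap else 0.10)
    return {
        "upslap": takes_upslap,
        "hiccups": hiccups,
        # Hiccupers are much more likely to buy a hiccup remedy.
        "coldkey": rng.random() < (0.60 if hiccups else 0.10),
        # Hiccupers rarely manage to work as professional actors.
        "actor": rng.random() < (0.001 if hiccups else 0.05),
        # Living in Bakersfield is unrelated to hiccups.
        "bakersfield": rng.random() < 0.10,
    }

population = [make_person() for _ in range(200_000)]

def hiccup_rate(people):
    """Fraction of the given people who have hiccups."""
    return sum(p["hiccups"] for p in people) / len(people)

def upslap_takers_in(group_key):
    """Upslap takers found within one group (e.g. Coldkey consumers).

    The whole group is used rather than a random subset of it; that
    reduces noise but does not change the bias being illustrated.
    """
    return [p for p in population if p[group_key] and p["upslap"]]

true_rate = hiccup_rate([p for p in population if p["upslap"]])
rate_coldkey = hiccup_rate(upslap_takers_in("coldkey"))          # like study one
rate_bakersfield = hiccup_rate(upslap_takers_in("bakersfield"))  # like study two
rate_actor = hiccup_rate(upslap_takers_in("actor"))              # like study three

print(f"all Upslap takers:        {true_rate:.3f}")
print(f"found via Coldkey buyers: {rate_coldkey:.3f}")
print(f"found via Bakersfield:    {rate_bakersfield:.3f}")
print(f"found via actors:         {rate_actor:.3f}")
```

Under these assumptions, only the Bakersfield sample should come out close to the rate among all Upslap takers. The Coldkey sample overstates it and the actor sample understates it, because in both cases having hiccups changes a person's chance of ending up in the sample.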

Copyright © 2005 by Martin C. Young
