That’s So Random: Getting sampling right

On Wednesday, we talked about sample bias, or ways to really screw up the results of a survey or study. So how can researchers avoid this problem? By being random.

There are several kinds of samples, from simple random samples to convenience samples, and the type chosen affects how reliable the data are. In general, the more random the selection, the more reliable the results. Here’s a rundown of several different types:

Simple Random Sample: The most reliable option, the simple random sample works well because each member of the population has the same chance of being selected. There are several ways to select the sample, from a lottery to a random number table to computer-generated values. Each selected member can either be returned to the pool, where it might be picked a second time (sampling with replacement), or held out, so that there are no duplicate selections (sampling without replacement).
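For the computer-generated route, here is a minimal sketch in Python using the standard library's random module. The function name simple_random_sample and its parameters are made up for illustration; this is one way to do it, not the only one.

```python
import random

def simple_random_sample(population, k, replace=False, seed=None):
    """Draw k members so that each has the same chance of selection.

    Hypothetical helper for illustration only.
    replace=True  -> a member can be picked more than once.
    replace=False -> each pick is held out, so there are no duplicates.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    members = list(population)
    if replace:
        return rng.choices(members, k=k)  # with replacement
    return rng.sample(members, k)         # without replacement

# Pick 10 of 100 numbered lottery tickets, no duplicates.
tickets = range(1, 101)
print(simple_random_sample(tickets, 10, seed=42))
```

Without replacement, asking for more members than the population holds raises an error, which is a handy sanity check.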

Stratified Sample: In some cases it makes sense to divide the population into subgroups (strata) and then draw a random sample from each subgroup. This method helps researchers highlight a particular subgroup in a sample, which can be useful when observing the relationship between two or more subgroups. The share of the sample drawn from each subgroup must match that subgroup’s share of the larger population.

What the heck does that mean? Let’s say a researcher is studying glaucoma progression and eye color. If 25% of the population has blue eyes, 25% of the sample must, too. If 40% of the population has brown eyes, so must 40% of the sample. Otherwise, the conclusions may be unreliable, because the sample does not reflect the entire population.
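The eye-color example can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the stratified_sample helper, the made-up population of 100 people, and the "other" group standing in for the remaining 35%.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Randomly sample each subgroup in proportion to its population share.

    Hypothetical helper for illustration only. Group sizes are rounded,
    so the total can be off by a member or two when shares don't divide evenly.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    total = sum(len(group) for group in strata.values())
    sample = []
    for group in strata.values():
        k = round(sample_size * len(group) / total)  # this subgroup's share
        sample.extend(rng.sample(group, k))          # random within the subgroup
    return sample

# Made-up population: 25% blue eyes, 40% brown, 35% everything else.
people = ([("blue", i) for i in range(25)]
          + [("brown", i) for i in range(40)]
          + [("other", i) for i in range(35)])

sample = stratified_sample(people, lambda person: person[0], 20, seed=1)
print(Counter(color for color, _ in sample))  # 5 blue, 8 brown, 7 other
```

A sample of 20 ends up with 5 blue-eyed, 8 brown-eyed and 7 other members, the same 25/40/35 split as the population.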

Then there are the samples that don’t provide such reliable results:

Quota Sample: In this scenario, the researcher deliberately sets a quota for a certain stratum. When done honestly, this allows for representation of minority groups of the population. But it does mean that the sample is no longer random. For example, if you wanted to know how elementary-school teachers feel about a new dress code developed by the school district, a random sample may not include any male teachers, because there are so few of them. However, requiring that a certain number of male teachers be included in the sample ensures that male teachers are represented, even though the sample is no longer random.

Purposeful Sample: When it’s difficult to identify members of a population, researchers may include any member who is available. And when those already selected for the sample recommend other members, this is called a Snowball Sample. While this type is not random, it is a way to look at less visible issues, including sexual assault and illness.

Convenience Sample: When you’re looking for quick and dirty, a convenience sample is it. Remember when survey companies stalked folks at the mall? That’s a convenience or accidental sample. These depend on someone being at the right (wrong?) place at the right (wrong?) time. When people volunteer for a sample, that’s also a convenience sample.

So whenever you’re looking at data, consider how the sample was formed. If the results look funny, it could be because the sample was off.

On Monday, I’ll tackle sample size (something that I had hoped to include today, but didn’t get to). In the meantime, if you have questions about how sampling is done, ask away!