The **population** or target population is the entire group of items or cases about which you want to gather data. Approaches to defining it are discussed on a separate page, titled Demarcating the Study.

In an empirical study, the population usually consists of physical objects such as people or products, or of events. In a case study it contains just one object or event, but in theoretically oriented research it can be infinite, i.e. you want to know something that is true for every object or event of the given type in the universe.

In some projects every specimen or event of the population is actually measured or recorded. Such a **total study** gives an excellent description of the population, but it is possible only if the population is not too large and if all the objects are available for study.

A total study is a relatively expensive method, because empirical work takes time and often involves apparatus, travel and other costs. Remember, too, that the objectives of a research project do not always require an absolutely exact account of the entire population; a trustworthy approximation would often suffice. It is therefore quite common to measure or record only as many units of the population as you can afford and as are necessary for reaching the goals of the project. To this end, several strategies are available. Some are listed below.

- Sometimes **one single case** can represent all the specimens or events in the population, if all of them are identical. Physical and chemical processes are generally assumed to function similarly everywhere in the universe, so you can study such a process simply in your laboratory and assert that the results are valid also on the Moon, should that be needed.
- **Studying one case from each class** is possible if you know for certain that the population consists of a few classes of identical objects. In a study of mobile phones, whose population is counted in billions, you can thus start by finding out which models of phone have been made, and then study just one specimen of each model.
- **Sampling** means deliberately limiting the number of cases in the study. It involves the risk that the findings will not be true for some of the left-out cases, but this risk can often be calculated and restricted to a tolerable level.

Note that sampling does not mean that you are not equally interested in all the items in the population. On the contrary, you would like to study all of them, but you pick a sample for practical reasons. Perhaps you have a population of millions of objects and it is impossible to reach even a major part of them. Even in those cases (with populations of, say, up to 10,000) where you might choose to study every object, a sampling study may be the prudent choice, because it saves time which you can then use to study the sampled items more carefully.

Above, we stated that in sampling research we are interested not in the sample but in the population; more exactly, in the properties of the items of the population. When studying the items in the sample, we would like the averages of their attributes to be the same as, or very near, the averages in the population. If that is the case, our sample is **representative.**

There are two alternative principles which you may use when selecting a sample:

- random sampling, where only chance determines which items are selected (figure on the left),
- non-random sampling, where a particular criterion or a non-random procedure selects the objects to be studied (on the right). It is also possible that the researcher deliberately selects the items for the sample.

The act of sampling itself generates two types of disagreement between the target population and the sample:

- **Random divergence:** a few cases with unusual properties fall into the sample by accident, which in turn distorts the summarized data (e.g. averages) from the sample. If a random sample is large enough, divergences in opposite directions mostly cancel each other out.
- **Bias,** a systematic error or a constant difference between the data from the population and from the sample, occurs often in non-random sampling. It is caused by the method of selection, which often inadvertently favors some types of items over others. Bias can reduce the representativeness of a sample many times more than random divergence can.

You might wonder why non-random sampling is used at all, since it involves the risk of bias, a seemingly unnecessary source of disagreement with the population. There are several possible motivations:

- The population is infinite or near it. You cannot enumerate the cases and create a list for making a random selection.
- Sometimes you cannot reach some items in the population. A random selection would not be meaningful, because only part of it could be executed, and this would bias the selection.
- The objectives of the study do not require exact results. Non-random sampling is usually cheaper and quicker.
- The project includes an efficient control procedure at a later stage. For example, it could be difficult to persuade randomly selected customers to participate in the procedure of creating new product concepts and testing the proposals, which normally takes several days. Instead, it can be easier to use a non-random sample of volunteers for the initial work groups of product development projects. Later on, the final product proposals will be tested with proper random samples from the population of target customers.

If a random sample (also called "probability sample") is properly made, it contains no systematic bias and it is therefore relatively representative of the population. Of course, you can never be 100% certain that the results measured from the sample are also true in the population. However, for practical purposes it is often enough if you know that the risk of a deviation from the population is, say, 1%. You will be able to make such statements that are based on probability calculus if you have used a random sample.

The principle in selecting the items for a random sample is the same as when casting lots: all the objects of the population shall have an equal probability of being selected into the sample. This probability is called the *sampling ratio,* and it equals the number of items in the sample divided by the number of items in the population.

There are alternative methods of creating a random sample. In the following diagrams, items of the original population are presented as small dots or as other small symbols, and items selected in the sample are shown as bold symbols.

**1. Simple random sample.** The sample is drawn by lot, for example by picking numbered tags from a hat. If you have a list of the population as a computer file, you can let the computer do the random selection. When the population is very large and already consists of clusters whose items are listed in a file, it can be practical to do the sampling in stages as *cluster sampling,* i.e. first select a sample of clusters and then, from the items in these clusters, select the final sample. For example, if the population consists of all the people in a country, you can first randomly select a few subdivisions of the country and then select the final sample among the people in these subdivisions. If you intend to interview these people in their homes, you will thus save much travelling time.
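The two procedures above can be sketched in Python; the population, cluster structure and sample sizes here are invented for illustration:

```python
import random

# Hypothetical population: a list of 10,000 item identifiers.
population = list(range(10_000))

# Simple random sample: every item has an equal chance of selection.
sample = random.sample(population, k=100)

# Two-stage cluster sampling: the population is already grouped into
# clusters (e.g. subdivisions of a country); first draw a sample of
# clusters, then draw the final sample from the items in those clusters.
clusters = {c: [c * 100 + i for i in range(100)] for c in range(100)}
chosen_clusters = random.sample(sorted(clusters), k=10)
cluster_sample = [item
                  for c in chosen_clusters
                  for item in random.sample(clusters[c], k=10)]

print(len(sample), len(cluster_sample))  # 100 100
```

Note that the cluster sample is cheaper to collect (only 10 subdivisions are visited) but is random only within the chosen clusters.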

**2. Systematic sample.** If the intended sampling ratio is 1/n, you can start by choosing the first item at random among the first *n* objects in the list of the population, and after that pick every n-th object. The procedure is very easy even without a computer, and the result is just as representative, except in the unusual situation that an important property of the objects repeats at every n-th case.
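A minimal sketch of the procedure, assuming a population list of 1,000 items and a sampling ratio of 1/10:

```python
import random

def systematic_sample(items, n):
    """Choose a random start among the first n items, then pick every n-th."""
    start = random.randrange(n)
    return items[start::n]

# Hypothetical list of 1,000 population items, sampling ratio 1/10.
population = list(range(1000))
sample = systematic_sample(population, 10)
print(len(sample))  # 100
```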

**3. Weighted random sample.** When the population is known to include a very small but essential group, there is the risk that no members of this group will fall into a random sample. Among the users of products, such important groups include people with impaired sight, hearing or motor ability. Other often significant minorities are based on religion, nationality and language.

In order to guarantee that at least some cases from an important minority (marked with **x** in the picture on the right) get into the sample, you can deliberately increase the sampling ratio for this group. This will of course generate imbalance in the measurements, but it is easy to restore the original balance later: when you combine the results, e.g. by calculating the mean of all measurements, you give the measurements from each group the weight corresponding to the group's genuine percentage in the population.
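Restoring the balance amounts to a weighted mean; the group shares and measurements below are invented:

```python
# Suppose the minority x is 2% of the population but was deliberately
# oversampled; the group means measured from the sample are:
share_x, share_rest = 0.02, 0.98   # true shares in the population
mean_x, mean_rest = 8.0, 5.0       # group means measured from the sample

# An unweighted sample mean would overstate the minority's influence.
# Weighting each group mean by its true population share restores balance:
weighted_mean = share_x * mean_x + share_rest * mean_rest
print(round(weighted_mean, 2))  # 5.06
```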

Non-random (or "nonprobability") samples are selected by any kind of procedure that does not give all cases in the population equal chances to fall into the sample. Sometimes the context of the study allows or facilitates using a certain method of sampling, sometimes the researcher has the possibility of selecting the method. Various such procedures will be discussed below.

Whatever the procedure, it is always possible that it will favor certain types of cases in the population more than others; in other words, the data from the sample will be biased.

In *descriptive* studies the presence of bias is usually a grave handicap, because it can prevent generalizing the results to the population. This is a difficulty that you will meet later in your project, when Assessing Non-Random Sampling and when writing the final chapter of your report, so it can be prudent to think about it in advance, when selecting the sampling method.

When assessing a non-random sample you should ask yourself: Will the results from the sample be the same as you would get from the population? Is it certain that the criterion you have used in selecting the sample (e.g. people's willingness to participate) has no relationship with the variables that you want to record from the sample? If there is a correlation, your sample will be biased and you should consider constructing a new sample with less correlation.

By contrast, in *research and development* projects the risks of using non-random samples are smaller, because a bias can often be compensated for later. For example, it is common to use convenience sampling when selecting potential customers for a think-tank developing an early product concept. The selection of persons will probably be biased, as will the proposals from the think-tank, but normally the proposals will be rectified at a later stage, when they are evaluated anew by another, larger group of people.

Common types of non-random samples include:

**1. Convenience sample.** A coincidental group, e.g. people at a meeting, might be specified as a sample. More exactly, the sample contains those persons in the group who are willing to take part. Such a sample is often heavily biased, but this can be accepted if the data obtained from the sample are not really going to be used, as is the case in demonstrations in survey method classes at universities. Likewise, this is a possible method when you need a few potential customers to assist in product development, on the condition that the obtained results will later be tested with a better sample of the intended customers.

**2. Sample of volunteers** is created when all the members of a population have the opportunity to participate in the sample, and all volunteers are accepted. If you insert a survey form on an Internet page and ask people to give their opinions on a topic, you will get this type of sample from all the readers of the page. Similarly, the persons who spontaneously send customer feedback to a company are a sample of volunteers from all the customers.

A sample of volunteers can be a practical alternative when there is no list of the members of the population from which a random sample could be drawn, or when it is difficult to contact the people in a sample because their addresses are not known. The disadvantage is that it is difficult to assess the presence of bias, i.e. whether the opinions or other interesting properties of the volunteers deviate from those of the population. Two questions help when considering this:

- What is the population that you are aiming at? Do all the members of the target population have equal chances to be included in the sample?
- Is there any reason why the volunteers should differ from the rest of the population? For example, do they, or at least some of them, have a special reason for volunteering?

If you, for example, want a sample of the people who have bought your latest product, you can include in the product package a postage-paid form on which buyers can give their names and addresses. What would happen if you additionally asked the respondents for their opinions of the product? Quite probably you would get answers mostly from people who have a strong opinion of your product, either positive or negative. People with no definite view of the product would more rarely bother to answer. The sample would thus risk being biased, and you would have to consider whether such a bias is acceptable for your purposes.

**3. Snowball sample.** When interviewing members of a population, you can ask the interviewed persons to nominate other individuals who could be asked for information or opinions on the topic. You then interview these new individuals and continue in the same way until the material is saturated, i.e. you get no new viewpoints from new persons.

Snowball sampling is a good method for populations that are not well delimited or well enumerated, for example the homeless. The drawback is that you get no exact idea of the actual distribution of opinions in the target population. Besides, people usually propose people whom they know well and who share their own views, which means that small interest groups are often passed over unnoticed. One way to compensate for this is to ask people to nominate *both* persons who share their views *and* persons of the opposite opinion. Another is to start the snowball chain from not one but several different people, perhaps from different social groups.
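Mechanically, the referral chain works like a graph traversal that stops when no new names appear; the names and nominations below are invented:

```python
from collections import deque

# Hypothetical referral network: who each person nominates when interviewed.
referrals = {
    "Ann": ["Ben", "Cem"],
    "Ben": ["Ann", "Dia"],
    "Cem": ["Dia"],
    "Dia": ["Eve"],
    "Eve": [],
}

def snowball(seeds, referrals):
    """Interview the seeds, then everyone they nominate, until saturation."""
    interviewed, queue = set(), deque(seeds)
    while queue:
        person = queue.popleft()
        if person in interviewed:
            continue
        interviewed.add(person)
        queue.extend(referrals.get(person, []))
    return interviewed

print(sorted(snowball(["Ann"], referrals)))
# ['Ann', 'Ben', 'Cem', 'Dia', 'Eve']
```

Starting the traversal from several seeds simply means passing more names in the `seeds` list, which is one of the compensation methods mentioned above.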

**4. Sample that consists of all the available cases.** Sometimes the researcher is interested in a population of which only a few cases or specimens are available for study, and these then must serve as a sample of the population. Typical such samples are:

- 4a. Surviving cases.
- 4b. Permitted cases.

**Surviving cases** among historical or archaeological material, where a large part of the once relevant material has disappeared before researchers get at it, can be regarded as a kind of convenience sample, even though it is historical reality rather than convenience that selects the sample. Both kinds of sample involve a similar handicap for research: if the disappearance of material before the study has been neither random nor proportional, but somehow partial or selective, the remaining material will be biased, and the researcher should try to assess the likely bias. You should ask yourself whether any of the following factors have affected the preservation of different sorts of material differently:

- Has the material sometimes been selected for any purpose, for example in order to be kept in archives, libraries or museums?
- Have some objects in the material sometimes been replaced with new ones?
- What sorts of things have, in previous times, been regarded as rubbish, or as worthy and proper to preserve?
- Are there physical factors which may have affected the preservation of various groups of material differently?

**Permitted cases.** When studying private enterprises it often happens that the management will not allow recording information from certain units in the organization. The management's decision is perhaps motivated by their judgement about the objectives of the study, but from the researcher's scientific point of view such a sample will often seem seriously biased.

**Overstepping the limits of the population.** You must not include in your sample items that are not members of the defined population. For example, in snowball sampling it often happens that some interviewed people nominate candidates who do not belong to the population. Of course, you often have the option of altering your original delimitations.

**Sample of typical cases.** Often the goal when studying a heterogeneous group is to find what is common and typical of the majority of cases in the group. To this end, sampling has sometimes been done so that the most typical cases are selected into the sample and all the extraordinary cases are left out. In the figure on the right, typical cases are marked with dots, and exceptional cases with the symbols + and x.

The selection of "typical" cases is not quite commendable, because the researcher's prejudices influence the final results of the inquiry too much. The researcher can, without noticing it, mostly select cases which corroborate his preconceptions or hypotheses. If you want to point out the average or the most common cases in the population, a better method is to classify all items of the population, or a random sample of it, and note the most frequent type. When necessary, you can then continue studying this class, which thereafter becomes the new population of the study.

**Sample of specialists.** It might look like a sensible idea to ask directly the usually few people who know a lot about the topic, instead of a large sample of randomly selected laymen whose knowledge can be sporadic and whose opinions may diverge. In this way, we might, for example:

- Investigate consumer preferences for household devices by interviewing salespersons.
- Study life styles of tenants through a questionnaire to house managers or landlords.
- Test a new family car model by asking celebrated racing drivers to try and evaluate it.
- Assess the working atmosphere in a company by interviewing the managers.

The advantage of interviewing specialists is that you need to interview just a few people, and in the discussion you get to the point quickly. Nevertheless, you should not think that a sample of "specialists" can stand for a sample of "non-specialists". These are two different populations. You should not generalize the results from the specialists to any other population than the population of specialists, whoever they may be.

You can of course still choose to interview specialists. If you then additionally want to gather the opinions of average consumers, you should define these as a second population and select a suitable sample from it, too. One possibility is to make the two surveys in succession: you could use the results from the specialists as new hypotheses to be tested with a sample of the consumer population. In other words, the interview of the specialists would serve as a preliminary study only. Or the other way round: you can first consult ordinary consumers and then the specialists.

**Normative sampling.** A normative aspect is acceptable in development projects which aim at improving similar objects in the future, but it is better to keep it out of sampling, because it is not compatible with the principles of representativeness and generalization.

Studying normatively only a "sample of the best exemplars" is quite a tradition in art history: you take into account only the great works of art. The idea is that the best cases are closest to the ideals that artists held in their time, and in this sense they represent the truest art of the era. They also had the greatest influence on later development. However, it is self-evident that the best works are not typical of the era and do not represent average works of art. This does not mean that you should not study them, but if you do, do not call them a "sample" if you mean that the *population* of your study is the great masters. Cf. the discussion under Demarcating the Study.

Later in the project, when analyzing the data, you can easily uphold the normative aspect if it is needed, by using the methods of Normative Case Study, Normative Comparison, Normative Classification, and Normative Study of Development, so there is no need to mix up the sampling procedure with normative considerations.

The main purpose of sampling is to reduce the need for empirical operations which entail labor and cost. How small can a sample then be without losing its usability? In other words, what is the smallest number of cases that still give us reliable enough data about the population?

Data that we get from a sample normally differ slightly from those of the population. The reason is that random selection brings into the sample not only average items of the population but also a few more or less exceptional items. How many of them there will be can be anticipated by the calculus of probability, which can also tell us how large the risk is of getting erroneous data because of these exceptional cases. The risk is roughly proportional to the variances of the variables and inversely related to sample size.

If we use the formula the other way round and know the desired level of statistical significance of the data we wish to record from the sample, we can calculate the required sample size on the basis of the number of variables and their variances. The variances are often not known in advance, but an approximation can be used instead.

Suppose, for example, that you have measured two variables from a small sample and found that their correlation is 0.26. It is always possible that such a correlation has arisen in the sample purely by accident and is not true in the population. You want the probability of such an accident to be less than 1%. If you consult the table presented under t-test, you will find that a sample of 100 cases is needed before the probability of accidentally getting a 0.26 correlation diminishes to 1%.
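The check behind such a table can be sketched with the usual t statistic for a correlation coefficient; the 1% critical value below is the approximate tabulated figure for about 98 degrees of freedom:

```python
import math

def correlation_t(r, n):
    """t statistic for testing whether a correlation r, measured from a
    sample of n cases, could have arisen by chance from a population
    where the true correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t = correlation_t(0.26, 100)
# The tabulated two-tailed 1% critical value for ~98 degrees of freedom
# is roughly 2.63, so with n = 100 the statistic just clears the 1% level.
print(t > 2.63)  # True
```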

Another example: you are studying percentages, and you want to be 95% certain that the percentage you have measured from a sample holds in the population as well. You can use the formula of the confidence interval:

confidence interval = p ± 1.96 √( p (100 − p) / n )

where

p = percentage calculated from the sample

n = sample size.

If the confidence interval, according to the formula, is too wide, you can narrow it by using a larger sample. From the formula you can infer that if you multiply the sample size by four, the confidence interval will shrink to half. Note that the formula is independent of the size of the population.
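A sketch of the confidence-interval formula in Python, with invented figures (a measured percentage of 40% from samples of 400 and 1,600 cases):

```python
import math

def confidence_interval_95(p, n):
    """Approximate 95% confidence interval for a percentage p
    measured from a sample of n cases."""
    margin = 1.96 * math.sqrt(p * (100 - p) / n)
    return p - margin, p + margin

low, high = confidence_interval_95(40, 400)
print(round(low, 1), round(high, 1))    # 35.2 44.8

# Quadrupling the sample size halves the width of the interval:
low4, high4 = confidence_interval_95(40, 1600)
print(round(low4, 1), round(high4, 1))  # 37.6 42.4
```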

The formulas for calculating statistical significances are exact but somewhat cumbersome to use, because almost every type of statistic has its own formula. That is why these formulas are not presented here. A very rough rule of thumb says that for analysis of variance you will need 30 cases, for regression analysis 40 cases multiplied by the number of variables, and for a chi-squared test at least five cases in each cell of the distribution table. In important projects with ample resources, a statistician is usually consulted for calculating the size of a sample. In a research project with limited resources, the rule of thumb is: use as large a sample as you can afford.

There is no formula for determining the size of a non-random sample. Often, especially in qualitative research, you may simply enlarge your sample gradually and analyse the results as they come. When new cases no longer yield new information, you may conclude that your sample is *saturated* and finish the job. This method is, however, very sensitive to biased sampling, so you should be careful to make sure that you do not omit any groups of your population.

Remember also that if a sample is biased it does not help to increase the sample size. The added sample will be just as biased if you use the same method of selection as for the original sample.

If you can afford to make a second sample, try creating it with another method of selection. Keep the data from the two samples initially separate; comparing them gives you an excellent means of judging the presence of bias in either of them.

Before deciding the size of a non-random sample, you might want to read how to assess the results from a non-random sample. Otherwise you might experience quite a nasty surprise when trying, too late, to define the field where your results could be declared valid.

It often happens that some cases in the sample turn out to be fruitless: they cannot be reached, measurements fail, interviewees refuse to co-operate, etc. The most usual remedy is to over-dimension the sample slightly and then simply disregard the failing cases.

If, however, you want to do the sampling very carefully, you should ask yourself: is it probable or possible that the failing cases differ from the successful ones in any respect that is interesting to your project? Only if the answer is *no* can you be sure that the absence of these cases will not introduce bias in the results. If, on the contrary, you think that the failing cases systematically differ from the rest, you can try to neutralize the bias by giving different weights to the data that arrive on time and to those that arrive only after a reminder. The method is explained in The Problem of No-Reply.

August 3, 2007.


Original location: http://www2.uiah.fi/projects/metodi