Experimentation means testing a causal hypothesis empirically. On the basis of the hypothesis, you design the experiment so that your object of study has a chance either to behave according to your hypothesis or not. The method is thus solidly anchored in existing theory, and it is possible only when you already know your object quite well at the outset and only want to refine your knowledge, e.g. by establishing quantitative associations between variables.
In the following three diagrams, the hypothesis includes only one independent variable and one dependent variable. In actual research projects, the hypothesis often contains several variables of both types.
In the experimental method, the researcher's starting point is a theoretical hypothesis. The first task is to translate the hypothesis into an empirical, experimental design, where the independent variable appears as a stimulus that is applied to the object of study. The object may or may not react to this stimulus in the way specified in the hypothesis; the reaction is then measured, and this gives us the value of the dependent variable.
The experiment must be duplicated at least once, so that you can test at least two different values of the independent variable and learn whether they have any effect on the dependent variable. You may need far more experiments if your hypothesis includes more than two variables, if the causal relation you are testing is very weak, or if there are disturbances.
In the real world, the object of study usually receives other influences besides the intended stimulus, influences which have not been included in your hypothesis. From the point of view of your research project, these other influences are disturbances, because they may cause unwanted variation in the reactions (Fig. on the left). The extraneous influences are sometimes called noise.
One way of eliminating the unwanted variation is to regard it as "random variation" and eliminate it by increasing the number of experiments and calculating the mean of the results. However, this method may compel you to arrange thousands of experiments just to test a single hypothesis.
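A rough sense of why averaging can demand so many experiments can be had from a small calculation. The Python sketch below assumes that the disturbances are independent, have zero mean and a constant standard deviation; the text does not state this, and real noise may behave differently. Under that assumption the standard error of the mean of n results shrinks only in proportion to the square root of n. The function names and the numbers are hypothetical and purely illustrative.

```python
import math


def standard_error(noise_sd: float, n_experiments: int) -> float:
    """Standard error of the mean of n independent, equally noisy results."""
    if n_experiments < 1:
        raise ValueError("need at least one experiment")
    return noise_sd / math.sqrt(n_experiments)


def experiments_needed(noise_sd: float, target_se: float) -> int:
    """Smallest n for which the mean has a standard error of at most target_se."""
    if noise_sd < 0 or target_se <= 0:
        raise ValueError("noise_sd must be >= 0 and target_se must be > 0")
    return max(1, math.ceil((noise_sd / target_se) ** 2))


# Illustrative numbers only: reactions scatter with a standard deviation
# of 20 units, and the mean should be known to within about 0.25 units.
n = experiments_needed(noise_sd=20.0, target_se=0.25)
print(n)  # 6400
```

Halving the acceptable error quadruples the number of experiments, which is why shielding the object is often the more economical option.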
Another method of eliminating the disturbances and the superfluous independent variables is to cut them away by shielding your object of study. This is usually done by moving the experiment into a closed room in a laboratory environment and blocking all extra stimuli (as in fig. on the right).
On the other hand, shielding makes the conditions of the object less natural and less authentic, and thus it may lessen the "ecological validity" of your experiment and your results. This risk is significant especially when the study object is a human being or an animal; the risk is often negligible when the object is dead material. If you are in doubt whether shielding has an influence on the reaction or not, you may first test its effects as a separate project.
When selecting the object or objects to be studied in the experiment, the first task is to define the population to which you want to generalize your results. You have to ask yourself: in which population is your hypothesis supposed to hold true? Once this question is answered, you can proceed to selecting the specimen(s) from that population, perhaps by sampling.
As it is desirable to get as little variation as possible between the duplicated experiments (apart from the deliberate variation of the independent variable), it would be practical to use only one test object in each of the repeated experiments. However, this will only be feasible if the experiment brings about no permanent changes in the object of study. In some cases, the object will be deformed or even destroyed in the course of the experiment, e.g. when we are studying the strength of
Likewise, when the object of study is a person, we cannot fully replicate the experiment, because the experience of the first experiment will modify the person's reaction the next time.
So in many cases you are compelled to have a different test object in each experiment, each of them being selected preferably by random sampling.
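Selecting a fresh test object for each experiment by random sampling might look like the following Python sketch. The population list, the specimen labels and the function name `draw_sample` are hypothetical; the sketch assumes that every item of the population can be listed in advance, which is not always the case in practice.

```python
import random


def draw_sample(population, sample_size, seed=None):
    """Pick sample_size distinct items at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), sample_size)


# Hypothetical population of 500 numbered specimens;
# one fresh specimen is drawn for each of 12 experiments.
population = [f"specimen-{i:03d}" for i in range(500)]
test_objects = draw_sample(population, sample_size=12, seed=2007)
print(test_objects)
```

Because the draw is made without replacement, no specimen is used twice, so an object altered by one experiment never reappears in another.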
If all the items in the population are practically identical, everything is fine. But if the objects are dissimilar, e.g. people, this variance will bring about quite a lot of unwanted variation in the reactions. One way to compensate for this is to increase the number of subjects in each experimental design. This means that we administer each different type of stimulus not only to one subject but to a whole group of them.
In the historically earliest experiments, there were usually only two groups: the experimental (or test) group, which received the stimulus, and the control group, which did not.
The parallel experimental groups should be as similar as possible, so that there are no differences between them other than the ones caused by the differences of the stimuli. Some random variation between different test persons cannot be avoided, but the researcher must take care not to cause a systematic difference between the test groups. To this end, you can draw lots to place the individuals in the groups.
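Drawing lots can be imitated with a shuffle, as in this minimal Python sketch. The subject labels and the function name `assign_by_lot` are hypothetical, and the seed is fixed only so that the example is repeatable.

```python
import random


def assign_by_lot(subjects, n_groups=2, seed=None):
    """Shuffle the subjects, then deal them into groups like a pack of cards."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Every n-th subject of the shuffled list goes to the same group,
    # so group sizes differ by at most one.
    return [shuffled[i::n_groups] for i in range(n_groups)]


subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical test persons
group_a, group_b = assign_by_lot(subjects, n_groups=2, seed=42)
print(group_a)
print(group_b)
```

Since chance alone decides the placement, neither the researcher's preferences nor the subjects' own choices can produce a systematic difference between the groups.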
Another way to promote the similarity of the test groups is to use a matched-subjects design. First, the subjects are matched in pairs on the basis of similarity. Then, for the purpose of the test, each matched pair is split and its members are placed in different groups. The traditional matched pairs in research have been identical twins.
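One simple way of forming matched pairs is sketched below in Python, under the assumption that similarity can be judged from a single measured characteristic (here, hypothetically, age). Subjects are sorted by that characteristic, neighbours are paired, and a coin toss decides which member of each pair goes to which group. Real matching often uses several characteristics at once; the names and data here are invented for illustration.

```python
import random


def matched_pairs(subjects, key, seed=None):
    """Pair subjects with neighbouring values of `key`, then split each pair by lot.

    Returns (group_a, group_b, leftover); leftover holds the one unpaired
    subject when the number of subjects is odd.
    """
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)
    group_a, group_b = [], []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # the coin toss within the pair
        group_a.append(pair[0])
        group_b.append(pair[1])
    leftover = ordered[-1:] if len(ordered) % 2 else []
    return group_a, group_b, leftover


# Hypothetical subjects as (name, age) tuples, matched on age.
people = [("Anna", 23), ("Ben", 61), ("Carla", 25),
          ("Dev", 58), ("Eli", 40), ("Fay", 42)]
group_a, group_b, leftover = matched_pairs(people, key=lambda p: p[1], seed=7)
print(group_a, group_b, leftover)
```

The members at the same position in the two returned groups belong to the same pair, which makes it easy to compare their reactions pair by pair afterwards.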
In experimentation, it is not always possible to use the real object of research. This is the case when the object of interest is a product which is still being designed: it does not yet exist at the time of the experiment, and it has to be simulated by replacing it with a model of the real object. The alternatives for making prototypes and other kinds of models of products that do not yet exist are discussed under the title Prototyping.
Another aspect which complicates and sometimes prohibits arranging an experiment is related to ethics. If a living creature is to be involved in an experiment, the process must not cause it excessive pain or distress. To guarantee this, many fields of research have their own ethical guidelines.
There are also many phenomena which we cannot reproduce as experiments at all. These include, for example, the important decisions in human life (choosing a school, a career, a spouse or a place to live) and detrimental events such as natural disasters, traffic accidents and crimes. Nevertheless, if these events and their conceivable causes or consequences have been recorded in sufficient detail, it is sometimes possible to build a quasi-experimental design from the records with the so-called ex post facto method.
In an experimental design, the researcher translates the theoretical hypothesis into events in the empirical world. In this translation, the task of the stimulus is to represent the variation of the independent variable. The simplest possible experimental design consists of just two experiments, in which the stimulus has two different values (or is wholly absent in one of them).
How should a stimulus be designed, and how realistic should it be? There are conflicting requirements with respect to this question.
Let us take an example from the study of art. If we want to find out whether people appreciate works of art that follow the proportion of the so-called Golden Section (approximately 1:1.6), the researcher has to choose between the alternatives presented in the following, starting from the most realistic approach and ending with the most abstract one. The most realistic one (number 1) would even require that the works of art be presented in their natural environment, for example at an art exhibition. In the other alternatives, however, the test is assumed to be carried out in an arranged environment, for example in a laboratory.
Levels 2 and 3 of abstraction are common in studies that aim to assist the design of new products. An example of abstraction on level 3 was Riitta Brusila's study of the effect of colours in a newspaper. She made two prints of the same day's paper, one in colour and one in black and white.
Likewise, Minna Uotila (Arki & Image, 1992) studied what effect the texture of the fabric has on the general impression given by a garment. In her experiment she showed test subjects pictures of a woman's dress. The only variation in the pictures, and the independent variable in Uotila's hypothesis, was the pattern of the material: stripes, dots or small squares. Her purpose was to study the effect of the texture on the overall impression of the dress; she found, however, that the effect of the texture was smaller than the variance between the subjects.
Some stimuli cannot be totally regulated by the researcher. Such stimuli include natural phenomena: direction of the sun, the temperature outside etc. Then the researcher has to adapt to the natural rhythm of variation of the stimulus.
Sometimes it would be necessary to survey people's reactions to an object, a building or an environment which is only being planned. Presenting a genuine stimulus is then impossible; instead, a substitute, for example a drawing, a cardboard model or some other construction set up in the laboratory or in the real environment, must be used.
The substitute stimulus should be presented so clearly that the subject reacts to it in the same way as to a real stimulus. The researcher should test the presentation in advance. The black and white technical drawings and miniatures favoured by designers are often too complicated for laymen to understand. By contrast, a colour TV picture is nowadays familiar and comprehensible to almost everybody, and it is preferable for simulating machines and environments. For the various alternatives, see Prototyping here as well.
When people's behaviour in social situations is studied in a laboratory, the ecological validity of the experiment may suffer, because people seldom behave naturally in a laboratory. The remedy might be organizing the experiment as systematic observation in a natural environment; however, this method usually involves a higher level of disturbances.
In behavioral experiments we often use stimuli in which a research assistant plays a role agreed on in advance. The "performance" is made slightly different for each experimental group, and the difference in the reactions of the groups is recorded. It is by no means unusual that the subjects are deliberately deceived in the experiment in order to make them react genuinely. However, this involves ethical problems.
An alternative to using assistant actors is the role playing method. The stimulus is usually a short narrative, or a picture, devised by the researcher. The subjects are asked to tell what happened before the narrated situation took place (or alternatively, how the situation would continue). The stories given to the subjects in the two groups are identical except for the one detail that the researcher wants to investigate.
In some experiments with people, for instance in research on the effects of a medicine, it is important that the subjects do not know which group they belong to. Such a blind experiment can be arranged by making all the test drugs look the same, although those given to one of the groups are placebos. If the research assistant carrying out the experiment is also kept unaware of which medicines are which, we are dealing with a double-blind experiment.
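The bookkeeping behind a double-blind arrangement can be sketched in Python as follows. This is only an illustration under stated assumptions: the treatment names, the code format and the function `blind_allocation` are hypothetical, and a real trial would use a properly secured randomization procedure rather than Python's general-purpose `random` module. The idea is that the assistant receives a sheet with nothing but opaque codes, while the key linking codes to treatments is kept by someone outside the experiment until the measurements are complete.

```python
import random


def blind_allocation(subjects, treatments=("drug", "placebo"), seed=None):
    """Return (assistant_sheet, sealed_key).

    assistant_sheet: subject -> opaque code printed on the identical-looking package
    sealed_key:      code -> treatment, to be withheld until data collection ends
    """
    rng = random.Random(seed)
    # Balanced plan: cycle through the treatments, then shuffle the order.
    plan = [treatments[i % len(treatments)] for i in range(len(subjects))]
    rng.shuffle(plan)
    # Distinct six-digit numbers, so that no two packages share a code.
    codes = [f"K{n}" for n in rng.sample(range(100000, 1000000), len(subjects))]
    assistant_sheet = dict(zip(subjects, codes))
    sealed_key = dict(zip(codes, plan))
    return assistant_sheet, sealed_key


patients = [f"P{i}" for i in range(1, 9)]  # 8 hypothetical subjects
sheet, key = blind_allocation(patients, seed=11)
print(sheet)  # what the assistant sees: subjects and codes only
```

Only when `sheet` and `key` are joined after the experiment can each recorded reaction be attributed to the drug or to the placebo.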
Only those reactions of the object that bear on the hypothesis under study are recorded.
If the reaction is an action of some kind, you can usually use the methods documented under the heading Systematic observation. If verbal reactions of the subjects are wanted, the methods are given under Interrogating research. These two can also be combined: the subject can be asked to "think aloud" while performing the task; the danger, however, is that thinking aloud changes the subject's performance.
If the experiment is carried out in a laboratory, it is possible to organize complicated measurements if necessary. The equipment in every laboratory includes a range of meters and gauges; these can often be linked to a local computer in order to record the results automatically. Physical reactions such as the pulse of the subjects can be measured, as can their eye movements, which indicate, for example, which details in an advertisement attracted most of the subjects' attention.
Some reactions may be inherently too weak or slow to allow reliable measurement. In some cases, the researcher could resort to reinforcing the stimulus by making it stronger than it naturally is. The ageing of materials has for example been studied by raising the temperature, the humidity or the amount of harmful substances or organisms. The reliability of the method is, however, rather doubtful and it should be tested in advance.
Once the empirical measurements or estimations have been recorded, the study can proceed to the phase of analysis.
August 3, 2007.
Original location: http://www2.uiah.fi/projects/metodi