This is one of my mantras that gets repeated time and again to students. Sometimes it feels like a catch-phrase, one of a small number of formulaic pronouncements that come out when you pull the string on my back. Why do I keep saying it, and why should anyone pay attention?
Let me give an example*. Imagine you’re setting up an experiment into how leaf damage by herbivores influences chemical defence in plant tissues. You might start by assuming that the more damage you inflict, the greater the response induced in the plant. This sounds perfectly sensible. So are you ready to get going and launch an experiment? No, absolutely not. First let’s turn your rather flimsy hypothesis into an actual expectation.
Why is this a straight line? Because you’ve not specified any other shape for the relationship. A straight line has to be the starting point, and if you collect the data and plug them directly into a generalised linear model (GLM) without thinking about it, this is exactly what you’re assuming. But is there any reason why it should be a straight line? All sorts of other relationships are possible. It could be a saturating curve, where the plant reaches some maximum response; this is more likely than a continuous increase. Alternatives include an accelerating response, or a step change at a certain level of damage.
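If it helps to make those alternatives concrete, here is a minimal sketch of the four candidate shapes as plain Python functions. Everything in it is an assumption for illustration: the function names, the Michaelis-Menten-style form chosen for the saturating curve, the power-law form for the accelerating one, and every parameter value. None of it comes from real data.

```python
# Candidate shapes for the damage-response relationship.
# All functional forms and parameter values are arbitrary placeholders
# chosen for illustration, not estimates from any experiment.

def linear(damage, slope=1.0):
    """Response rises in direct proportion to damage."""
    return slope * damage

def saturating(damage, maximum=1.0, half_sat=0.2):
    """Response levels off at `maximum`; half of it is reached at `half_sat`."""
    return maximum * damage / (half_sat + damage)

def accelerating(damage, scale=1.0, power=2.0):
    """Response grows faster and faster as damage increases (power > 1)."""
    return scale * damage ** power

def step(damage, threshold=0.5, low=0.0, high=1.0):
    """Response switches from `low` to `high` once damage reaches `threshold`."""
    return high if damage >= threshold else low

CANDIDATES = {
    "linear": linear,
    "saturating": saturating,
    "accelerating": accelerating,
    "step": step,
}

def sketch(levels):
    """Return the predicted response under each candidate shape."""
    return {name: [round(f(d), 3) for d in levels]
            for name, f in CANDIDATES.items()}

if __name__ == "__main__":
    # Damage expressed as a proportion of leaf area removed (an assumption).
    levels = [i / 10 for i in range(11)]
    for name, values in sketch(levels).items():
        print(f"{name:>12}: {values}")
```

Printing the four rows side by side is a crude substitute for a pencil sketch, but it makes the same point: the curves only separate clearly in particular regions of the x-axis.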
Perhaps what you need to do is take a step back and think about that x-axis. What levels of damage are you measuring, and why? If you expect an asymptotic response then you’re going to need to sample quite a wide range of damage levels, but there’s no need to continue to extremes because after a certain point nothing more is going to happen. If it’s a step change then your whole design should concentrate on sampling intensively around the point where the shift occurs so as to identify that parameter accurately. And so on. Drawing this first sketch has already forced you to think more carefully about your experimental design.
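The contrast between those two designs can be sketched in a few lines. This is only an illustration under stated assumptions: damage is scaled from 0 to 1, the helper names `even_levels` and `levels_around_threshold` are hypothetical, and the guessed threshold and window width are placeholders you would have to justify from theory or pilot work.

```python
def even_levels(low, high, n):
    """Return n damage levels evenly spaced from low to high inclusive."""
    if n < 2:
        raise ValueError("need at least two levels")
    spacing = (high - low) / (n - 1)
    return [low + i * spacing for i in range(n)]

def levels_around_threshold(guess, half_width, n, low=0.0, high=1.0):
    """Return n levels: one at each extreme, the rest packed around `guess`.

    `guess` is where you expect the step change; `half_width` is how far
    either side of it you want to sample intensively.
    """
    if n < 4:
        raise ValueError("need at least four levels")
    inner = even_levels(max(low, guess - half_width),
                        min(high, guess + half_width),
                        n - 2)
    # A set removes duplicates if the window happens to touch an extreme.
    return sorted({low, high, *inner})

if __name__ == "__main__":
    # Asymptotic expectation: a wide, even spread that stops short of the extreme.
    print([round(d, 2) for d in even_levels(0.0, 0.8, 7)])
    # Step-change expectation: most of the effort near the suspected threshold.
    print([round(d, 2) for d in levels_around_threshold(0.5, 0.1, 7)])
```

Both designs use seven treatment levels, yet they would look completely different on the bench. Which one you need depends entirely on the shape you sketched.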
Let’s not stop there though. Look at the y-axis, which rather blandly promises to measure the induced defence response. This isn’t an easy thing to pinpoint. Presumably you don’t expect an immediate response; it will take time for the plant to metabolise and mobilise its chemical defences. How long will they take to reach their maximum point? After that it’s unlikely that the plant will maintain unnecessarily high levels of defences once the threat of damage recedes. Over time the response might therefore be humped. Will defences increase and decrease at the same rate?
This then turns into a whole new set of questions for your experimental design. How many sample points do you need to characterise the response of a single plant? What is your actual response variable: the maximum level of defences measured? When does this occur? What if the maximum doesn’t vary between plants but instead the rate of response increases with damage?
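To see why the choice of response variable and sampling schedule matters, here is a small sketch of a humped time course. The curve `humped` is an assumed form picked purely because it rises to a known peak and then decays; the names, the time units and all parameter values are hypothetical rather than measured.

```python
import math

def humped(t, peak, t_peak):
    """Assumed humped response: rises to `peak` at time `t_peak`, then declines."""
    return peak * (t / t_peak) * math.exp(1.0 - t / t_peak)

def summarise(times, values):
    """Reduce one plant's time series to candidate response variables."""
    i = max(range(len(values)), key=values.__getitem__)
    t_max, v_max = times[i], values[i]
    # Mean rate of increase from time zero up to the observed maximum.
    rate = v_max / t_max if t_max > 0 else float("nan")
    return {"max": v_max, "time_of_max": t_max, "rate_to_max": rate}

if __name__ == "__main__":
    dense = list(range(11))   # sampled every time unit
    sparse = [0, 5, 10]       # sampled only three times

    # Two hypothetical plants with the same maximum but different speeds.
    fast = [humped(t, 1.0, 2.0) for t in dense]
    slow = [humped(t, 1.0, 4.0) for t in dense]
    print("fast plant:", summarise(dense, fast))
    print("slow plant:", summarise(dense, slow))

    # The same fast plant, measured too infrequently to catch its peak.
    fast_sparse = [humped(t, 1.0, 2.0) for t in sparse]
    print("fast plant, sparse sampling:", summarise(sparse, fast_sparse))
```

With dense sampling the two plants have identical maxima and differ only in how quickly they get there, so a design that records nothing but the maximum would find no effect. With sparse sampling the peak of the fast plant falls between measurements, and the observed maximum understates the true one.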
This thought process can feel like a step backwards. You started with an idea and were all set to launch into an experiment, but I’ve stopped you and riddled the plan with doubt. That uncertainty was already embedded in the design though, in the form of unrecognised assumptions. In all likelihood these would come back to bite you at a later date. If you were lucky you might spot them while you were collecting data. More likely you would only realise once you came to analyse and write things up**.
This is why I advocate drawing your dream figure right at the outset. Not just before you start analysing the data, but before you even begin collecting it. The principle applies to experimental data, field sampling, even to computer simulations. If you can’t sketch out what you expect to find then you don’t know what you’re doing, and that needs to be resolved before going any further***.
If you’re in the position where you genuinely don’t know the answers to the types of questions above then there are three possible solutions:
- Read the literature, looking for theoretical predictions that match your system and give you something to aim for. Even if the theory doesn’t end up being supported, you’ve still conducted a valid test.
- Look at previous studies and see what they found. Note that this isn’t a substitute for a good theoretical prediction; “Author (Date) found this therefore I expect to find it too” is a really bad way to start an investigation. More important is to see why they found what they did and use that insight to inform your own study.
- Invest some time in preliminary investigations. You still have to avoid the circularity of saying “I found this in preliminary studies and therefore expected to find it again”. If you genuinely don’t know what’s going to happen then try, find out, and think about a robust predictive theory that might account for your observations. Then test that theory in a full and properly-designed experiment.
Scientists are all impatient. Sitting around dreaming about what we hope will happen can sound like an indulgence when there’s the real work of measurement to be done. But specifying exactly what you expect will greatly increase your chances of eventually finding it.
* This is not an entirely random example. I set up a very similar experiment to this in my PhD which consumed at least a month’s effort in the field. You’ll also find that none of the data are published, nor even feature in my thesis, because I found absolutely nothing. This post is in part an explanation of why.
** There are two types of research students. There are those who realise all-too-late that there were critical design flaws in some of their experiments. The rest are liars.
*** Someone will no doubt ask “what about if you don’t know at all what will happen”. In that case I would query why you’re doing it in the first place. So-called ‘blue skies research’ is never entirely blind but begins with a reasonable expectation of finding something. That might include several possible outcomes that can be predicted then tested against one another. I would argue that truly surprising discoveries arise through serendipity while looking for something else. If you really don’t know what might happen then stop, put the sharp things down and go and ask a grown-up for advice first.