Tag Archives: research

In praise of backwards thinking

What is science? This is a favourite opening gambit of some external examiners in viva voce examinations. PhD students, be warned! Imagine yourself in that position, caught off-guard, expected to produce some pithy epithet that somehow encompasses exactly what it is that we do.

It’s likely that in such a situation most of us would jabber something regarding the standard narrative progression from observation to hypothesis then testing through experimentation. We may even mumble about the need for statistical analysis of data to test whether the outcome differs from a reasonable null hypothesis. This is, after all, the sine qua non of scientific enquiry, and we’re all aware of such pronouncements on the correct way to do science, or at least some garbled approximation of them.* It’s the model followed by multiple textbooks aimed at biology students.

Pause and think about this in a little more depth. How many great advances in ecology, or how many publications on your own CV, have come through that route? Maybe some, and if so then well done, but many people will recognise the following routes:

  • You stumble upon a fantastic data repository. It takes you a little while to work out what to do with it (there must be something…) but eventually an idea springs to mind. It might even be your own data — this paper of mine only came about because I was learning about a new statistical technique and remembered that I still had some old data to play with.
  • In an experiment designed to test something entirely different, you spot a serendipitous pattern that suggests something more interesting. Tossing away your original idea, you analyse the data with another question in mind.
  • After years of monitoring an ecological community, you commence descriptive analyses with the aim of getting something out of it. It takes time to work out what’s going on, but on the basis of this you come up with some retrospective hypotheses as to what might have happened.

Are any of these bad ways to do science, or are they just realistic? Purists may object, but I would say that all of these are perfectly valid and can lead to excellent research. Why is it then that, when writing up our manuscripts, we feel obliged — or are compelled — to contort our work into a fantasy in which we had the prescience to sense the outcome before we even began?

We maintain this stance despite the fact that most major advances in science have not proceeded through this route. We need to recognise that descriptive science is both valid and necessary. Parameter estimation and refinement often have more impact than testing a daring new hypothesis. I for one am entranced by a simple question: over what range do individual forest trees compete with one another? The question is one that can only be answered with an empirical value. To quote a favourite passage from a review:

“Biology is pervaded by the mistaken idea that the formulation of qualitative hypotheses, which can be resolved in a discrete unequivocal way, is the benchmark of incisive scientific thinking. We should embrace the idea that important biological answers truly come in a quantitative form and that parameter estimation from data is as important an activity in biology as it is in the other sciences.”
Brookfield (2010)


Over what distance do these Betula ermanii trees in Kamchatka compete with one another? I reckon around three metres but it’s not straightforward to work that out. That’s me on the far left, employing the most high-tech equipment available.

It might appear that I’m creating a straw man of scientific maxims, but I’m basing this rant on tenets I’ve received from reviewers of manuscripts or grant applications, or been given as advice in person. Here are some things I’ve been told repeatedly:

  • Hypotheses should precede data collection. We all know this is nonsense. Take, for example, the global forest plot network established by the Center for Tropical Forest Science (CTFS). When Steve Hubbell and Robin Foster set up the first 50 ha plot on Barro Colorado Island, they did it because they needed data. The plots have led to many discoveries, with new papers coming out continuously. Much the same could be said of other fields, such as genome mapping. It would be absurd to claim that all the hypotheses should have been known at the start. Many people would refine this to say that the hypothesis should precede data analysis (as in most of macroecology), but that’s still not the way our papers are structured.
  • Observations are not as powerful as experiments. This view is perhaps shifting with the acknowledgement that sophisticated methods of inference can extract patterns from detailed observations. For example, this nice paper uses Bayesian analyses of a global dataset of tropical forests to discern the relationship between wood density and tree mortality. Ecologists frequently complain that there isn’t enough funding to produce long-term or large-scale datasets; we need to demonstrate that these are just as valuable as experiments, and recognising the importance of post-hoc explanations is an essential part of making this case. Perfect experimental design isn’t the ideal metric of scientific quality either; even weak experiments can yield interesting findings if interpreted appropriately.
  • Every good study should be a hypothesis test. We need to get over this idea. Many of the major questions in ecology are not hypothesis tests.** Over what horizontal scales do plants interact? To my mind the best element of this paper by Nicolas Barbier was that he determined the answer for desert shrubs empirically, by digging them up. If he’d tried to publish using that as the main focus, I doubt it would have made it into a top ecological journal. Yet that was the real, lasting contribution.

Still wondering what to say when the examiner turns to you and asks what science is? My answer would be: whatever gets you to an answer to the question at hand. I recommend reading up on the anarchistic model of science advocated by Paul Feyerabend. That’ll make your examiner pause for thought.


* What I’ve written is definitely a garbled approximation of Popper, but the more specific and doctrinaire one gets, the harder it becomes to achieve any form of consensus. Which is kind of my point.

** I’m not even considering applied ecology, where a practical outcome is in mind from the outset.

EDIT: added the direct quotation from Brookfield (2010) to make my point clearer.

Two lumps please

Here’s a quick thought experiment. Imagine you have a spare flowerbed in your garden, in which you scatter a handful of seeds across the bare ground. You then ignore them, and come back some months later. What will have happened?* Your expectation might be that you will have a healthy patch of plants, all about the same size. Some might be larger or smaller than average, but overall you’d expect them to be pretty similar; they have, after all, experienced identical conditions. This is known as a unimodal size distribution.

You’d be wrong. In fact, it’s more likely that your plants will have separated into two or more size groupings. There will be a set of larger plants, spread apart from one another, and which dominate the newly-formed canopy. In between them will be scattered other plants of smaller size. This results in a bimodal (or multimodal) size distribution. There isn’t a standard, expected size; instead there will be different size classes present.


A normal, unimodal distribution of sizes (left) is what you might expect to see when all plants are the same age and growing in the same conditions. In fact it’s more common to see a bimodal size distribution (right), or something even more complicated.

This observation is nothing new. Much was written about the issue from the 1950s through to the 70s, particularly in the context of forest stands. The phenomenon was widely recognised but remained paradoxical.

I stumbled upon this old literature back in 2010 when I published a small paper based on a birch forest in Kamchatka which showed a clearly bimodal size distribution. I didn’t need to go all the way to Kamchatka to find a stand with this feature, but since I had the data it made sense to use it. I used the spatial pattern of stems to infer that the bimodality was the result of asymmetric competition (i.e. that large trees obtain disproportionately more resources than small trees, which is definitely true in terms of light capture). All the trees were the same age, but the larger stems were spread out, with the smaller stems in the interstices between them. Had the bimodality been the result of environmental drivers we would expect there to be patches of large and small stems, but in fact they were all mixed together.
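The spatial logic can be made concrete with a toy example. This Python sketch is invented purely for illustration, and is not the analysis from the paper: it simply shows that a spread-out set of points has a larger mean nearest-neighbour distance than a randomly scattered one, which is the kind of signature that points towards competitive spacing rather than environmental patchiness.

```python
import math
import random

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        nearest = min(math.hypot(x1 - x2, y1 - y2)
                      for j, (x2, y2) in enumerate(points) if j != i)
        total += nearest
    return total / len(points)

random.seed(42)

# Hypothetical "large" stems: evenly spread on a grid (overdispersed).
large = [(float(x), float(y)) for x in range(5) for y in range(5)]

# Hypothetical "small" stems: scattered at random over the same area.
small = [(random.uniform(0, 4), random.uniform(0, 4)) for _ in range(25)]
```

Here `mean_nn_distance(large)` is exactly 1 (the grid spacing), while the random scatter gives a much smaller value; real analyses would use proper point-process statistics, but the contrast is the same.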

White birch forest, central Kamchatka

This is the stand of Betula platyphylla with a bimodal size distribution that was described in Eichhorn (2010). If it looks familiar, it’s because the strapline of this blog is a picture of us surveying it. The white lights in the photo aren’t faeries; they’re the reflections of mosquito wings in the camera flash. So many mosquitoes.

Three things struck me when I was reading the literature. The first was that hardly anyone had thought about multimodal size distributions in cohorts for several decades**. This was a forgotten problem. The second was that the last major review of the phenomenon back in 1987 had concluded that asymmetric competition was the least likely cause — which conflicted with my own conclusions. Finally, I had no difficulty in finding other examples of multimodal size distributions in the literature, but authors kept dismissing them as anomalous. I wasn’t convinced.

Analysing spatial patterns is all well and good, but if you want to really demonstrate that a particular process is important, you need to create a model. Enter Jorge Velazquez, who was a post-doc with me at the time but now has a faculty position in Mexico. He built a simple model in which trees occupy fixed positions in space and can only obtain resources from the area immediately around themselves. Larger trees can obtain resources from a greater area. When two trees are close to one another, their intake areas overlap, leading to competition for resources.


When there are two individual trees (i and j), each of which obtains resources from within a radius proportional to its size m, the overlap is determined by the distance d between them. Within the area of overlap the amount of resources that each receives depends on the degree of asymmetric competition, i.e. how much of an advantage one gets by being larger than the other. This is included in the model as a parameter described below.

This is where asymmetric competition is introduced as a parameter p. When p = 0, competition is symmetric, and resources are evenly divided between two trees when their intake areas overlap. When p = 1, each tree receives resources in direct proportion to its size (i.e. a tree that’s twice as large as its competitor will receive two thirds of the available resources). Increasing p makes competition ever more asymmetric, such that the larger competitor receives a greater fraction of the resources being competed for. In nature we expect asymmetric competition to be strong because a taller tree will capture most of the light and leave very little for those beneath it.
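As a sketch of how these two ingredients fit together, here is a minimal Python version of the overlap geometry and the resource split. The circle-overlap calculation is the standard lens-area formula; the sharing rule, weighting each tree by its size raised to the power p, is my own assumption, chosen only because it reproduces the behaviour described above (p = 0 gives an even split, p = 1 a split in direct proportion to size), and the published model may use a different functional form.

```python
import math

def lens_area(r1, r2, d):
    """Area of overlap between two circles of radii r1 and r2 whose
    centres are a distance d apart (standard circle-circle lens formula)."""
    if d >= r1 + r2:           # intake areas don't touch
        return 0.0
    if d <= abs(r1 - r2):      # smaller circle entirely inside the larger
        return math.pi * min(r1, r2) ** 2
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                          * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - tri

def share(m_i, m_j, p):
    """Fraction of the contested resources that a tree of size m_i takes
    from a competitor of size m_j. p = 0 splits evenly; p = 1 splits in
    direct proportion to size; larger p is ever more asymmetric."""
    return m_i**p / (m_i**p + m_j**p)
```

For example, share(2.0, 1.0, 1.0) returns 2/3, matching the twice-as-large case above, while share(2.0, 1.0, 0.0) returns 0.5.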

We applied the model to data from a set of well-studied forest plots in New Zealand. Not only did we discover that two thirds of these plots had multimodal size distributions, but also that our model could reproduce them.

We then started running our own thought experiments. What if you changed the starting patterns, making them clustered, random or dispersed? That turned out to have very little effect on size distributions. What about completely regular patterns? That’s when things started to get really interesting.

By testing the model with different patterns we discovered four important things:

  • Asymmetric competition is the only process which consistently causes multimodal size distributions within simulated cohorts of plants. Nothing else we tried worked.
  • Asymmetric competition is the cause, not the consequence of size differences in the population.
  • The separation of modes is determined by the length of time it takes for competition in the cohort to start, which usually reflects the distance between individuals.
  • The number of modes reflects the effective number of competitors that each individual has.
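The second point, that asymmetric competition causes size differences rather than following from them, can be illustrated with a deliberately simplified one-dimensional toy: plants sit on a ring, each draws resources from an interval proportional to its size, and overlapping intervals are shared by weighting each plant by its size raised to a power p (a hypothetical rule of my own, not necessarily the published one; all parameter values are likewise invented). A 2% head start for alternate plants is amplified into two clearly separated size groups.

```python
def simulate_ring(n=8, spacing=1.0, p=4.0, radius_per_size=0.6,
                  growth=0.02, steps=40):
    """Toy 1-D analogue of the competition model: n plants on a ring,
    each taking resources from an interval whose radius is proportional
    to its size. Contested overlap is shared using a hypothetical rule:
    plant i's share is m_i**p / (m_i**p + m_j**p). Even-numbered plants
    start with a 2% size head start."""
    sizes = [1.02 if i % 2 == 0 else 1.0 for i in range(n)]
    for _ in range(steps):
        radii = [radius_per_size * m for m in sizes]
        intake = [2.0 * r for r in radii]  # whole interval, pre-competition
        for i in range(n):
            for j in range(i + 1, n):
                d = min(abs(i - j), n - abs(i - j)) * spacing  # ring distance
                overlap = min(max(0.0, radii[i] + radii[j] - d),
                              2.0 * min(radii[i], radii[j]))
                if overlap > 0.0:
                    share_i = sizes[i]**p / (sizes[i]**p + sizes[j]**p)
                    intake[i] -= overlap * (1.0 - share_i)  # ceded to j
                    intake[j] -= overlap * share_i          # ceded to i
        # Growth proportional to intake; fully suppressed plants stop growing.
        sizes = [m + growth * max(0.0, inc) for m, inc in zip(sizes, intake)]
    return sizes
```

Running this, every even-numbered plant ends up larger than every odd-numbered one, and the gap between the two groups is wider than the 2% it started at: the competition amplified a negligible initial difference into distinct size modes.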

What does all this mean? Given that asymmetric competition is normal for plants, I would argue that we should expect to see multimodal size distributions everywhere. In fact, seeing unimodal size distributions should be a surprise. Don’t believe me? Grab some seeds, give it a go, and tell me if I’m wrong.

You can read our new paper on the subject here. If you can’t get hold of a copy then let me know.


* Luckily this is a thought experiment, because in my garden the usual answer is ‘everything has been eaten by slugs’.

** I should stress here that I’m specifically referring to multimodality in size distributions of equal-aged cohorts. When several generations overlap, the distribution of sizes reflects the ages of the individuals. If multiple species are present this adds further complications, and in fact size distributions of species across communities have been a hot topic in the literature of late. This is very interesting, but a completely different set of processes is at work.

We’re all stupid to someone

I spend an increasing proportion of my time collaborating with engineers and theoretical physicists. It keeps me on my toes and I’ve had to adjust to very different research cultures. The engineers, for example, get particularly excited by designing a technical solution to a problem. The long haul of data collection and statistical analysis has less appeal; once they’ve proven it can be done then they’re itching to move on to the next challenge. Likewise physicists genuinely do spend meetings in front of whiteboards sketching equations, which leaves me feeling a bit frazzled. Nevertheless, I’ve learnt that if an idea can’t be expressed mathematically then it hasn’t been properly defined. That turns out to apply to a lot of verbal models in ecology.

Both engineers and physicists are ready to publish at an earlier stage than most ecologists would, and their papers are a model of efficiency in preparation. Not for them a lengthy waffle of an introduction, followed by an even more prolonged and rambling discussion. Cut to the point, make it clearly, then wrap up. It makes me wonder whether we’re doing something wrong in ecology. I certainly don’t enjoy either reading or writing long papers, and I can’t fully justify our practice.

I also find myself fielding questions or tackling issues that would never come up when chatting to an ecologist. One of the misapprehensions I’ve had to counter is the idea that trees are lollipops. It might be more computationally efficient to assume that a tree is a sphere of leaves on a stick, and it can lead to some elegant mathematical solutions, but the outcomes will depart from natural systems pretty rapidly. Our disciplinary training leads us to consider particular assumptions perfectly reasonable, despite them sounding ridiculous to others or bearing little resemblance to the real world. (Even within their own field, forest ecologists are not immune to this syndrome.)

Understanding how another researcher arrived at their assumptions can be informative — sometimes it boils down to analytical frameworks, computational efficiency or technological limitations, all of which are valid reasons to consider accepting a proposition that on first hearing might sound far-fetched. Likewise it helps to have our own assumptions challenged. Sometimes we are able to justify and defend them. Other times they leave us exposed, which is when we know we’re onto something important.

It’s also a sad but common trait within all social groups to mock outsiders for making mistakes about things that appear self-evident to those on the inside. Ecologists can easily play the same game, but make no friends by doing so. I had a chat with one of my collaborators this week who was itching to find a small tree on campus, scan it using ground-based LiDAR, then strip and record the sizes of all its leaves. It’s a perfectly reasonable idea (if a lot of hard work). The main stumbling block is that it’s the middle of February and we’re a good three months at least from having full leaf canopies to play with. An obvious problem? Only to someone who spends their life thinking about trees. We had a laugh about it then moved back to our simulations, which have the considerable benefit of not shedding their leaves seasonally.

This kind of interaction only makes me wonder what crazy things I’m responsible for coming out with in our meetings. It also makes me grateful to my collaborators for their patience in humouring me, because I’m pretty sure that I come across as an idiot more often than I realise. This to me is the greatest pleasure of interdisciplinary collaborations. We could all spend the rest of our careers treading the same academic paths, publishing in the same journals, and not need to stretch ourselves quite as far. By heading way outside our comfort zones we all end up learning more than we expected to, so long as we don’t mind feeling stupid every now and again (which happens every time I get tangled in algebra). If you’re not willing to be wrong then you’re not willing to learn. And if I end up the subject of an amusing anecdote at a theoretical physics meeting? That’s fine by me. I hope it raises a good laugh. As a wise man once said, ridicule is nothing to be scared of.