Gordon Strachan, genetics of height and fitba

The Scotland football manager Gordon Strachan has made headlines this week, after his team narrowly failed to reach the World Cup Qualification Play-offs. In the crucial game against Slovenia, Scotland drew 2-2 when they needed to win. Afterwards Strachan lamented that Scotland’s opponents were bigger and stronger, and that Scotland were ‘Genetically behind’ their opponents (http://www.bbc.co.uk/sport/football/41546717), even going as far as saying “Maybe we get big women and men together and see what we can do”. Presumably the latter comment was said tongue in cheek.

Strachan has been much derided for his comments, not least because so many of Scotland’s greatest players have been short of stature e.g. Jimmy Johnstone, Kenny Dalglish, Archie Gemmill (scorer of the wonder goal responsible for the misplaced sex tape in Trainspotting) and of course ….. Gordon Strachan. Others have pointed out that Spain have a lot of short players and are an outstanding football team. Does he have a point though about nations differing in height for genetic reasons, rather than say, due to nutrition?

Human height is almost certainly the best understood quantitative genetic trait. It is highly heritable and highly polygenic (many genes of small effect). We know the identity of many of the genes that cause this heritable variation, but it is also the case that there are likely to be many thousands of unidentified additional genes with effects so tiny we may never detect them. I won’t provide a full bibliography here, although interested readers should check out the publications by Peter Visscher, Naomi Wray and Jian Yang on the subject – http://cnsgenomics.com/publications.html. In recent years, researchers have begun to look at how height varies between countries. Do populations from different countries differ in height, are those differences genetic, and is natural selection the cause of those differences? Matt Robinson and colleagues have shown that the answer to all three questions is ‘Yes’. In fact, some* of the inter-population differences are due to genetics rather than, say, differences in nutrition or diet. Details are in Robinson et al (2015).

Ok, so nations vary in height, but does that affect their football prowess? In the Robinson paper, the genetic heights of 14 different European countries were estimated using the genes known to affect height. Unfortunately, the UK data were not broken down into the different nations, so we cannot answer the question of whether Scottish people are genetically small. However, the UK was genetically the 5th tallest of the 14 countries and in general Northern European populations have ‘taller genes’ than more Southern nations. For a bit of fun, I’ve looked at the relationship between a nation’s genetic height and its current FIFA football ranking (disclaimer: I don’t believe for one second that any relationship is causal). I’ve used England’s ranking here for the UK, simply because I think most of the genetic data in UK cohort studies of height come from England rather than Scotland, Wales or Northern Ireland.


Amusingly, there is a marginally significant negative relationship (r = -0.53, P = 0.048) between genetic height and football ranking – genetically smaller countries are better at football. Therefore, whatever the reason for Scotland’s agonising miss, it wasn’t because they were too small. Alternative explanations, like having a really weak domestic league, seem more plausible (but of course we knew that already!).
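For anyone who wants to see just how marginal that P value is, here is a minimal Python sketch. It assumes the reported r is a Pearson correlation across the 14 countries and that the test is two-sided; neither is stated explicitly above, and the function names are made up for illustration. Because r is only quoted to two decimal places, the recomputed P value need not match the reported one exactly.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r (n points) against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P value, integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0  # probability that |T| < |t|
    return 1.0 - central_mass


r, n = -0.53, 14  # values quoted in the post; Pearson r is an assumption
t = t_from_r(r, n)
p = two_sided_p(t, n - 2)
print(f"t = {t:.2f} on {n - 2} df, two-sided P = {p:.3f}")
```

With only 12 degrees of freedom, a small change in r moves the P value across the 0.05 line, which is another reason not to read much into the result.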

Finally, it’s often commented that the Dutch are the tallest nation (with honourable mentions to Lithuania and Montenegro), and in this dataset they were the tallest. Perhaps that explains the surprising failure of the Netherlands to qualify for the World Cup. They need to ignore Gordon Strachan’s advice and breed their footballers smaller!


Robinson et al (2015). Population genetic differentiation of height and body mass index across Europe. Nature Genetics 47:1357-1362 (behind a paywall, unfortunately).

*I’ve edited this because Graham Coop (@Graham_Coop) has pointed out to me that actually in Robinson et al. the height-related SNPs only explain small inter-population differences in height. This will be an underestimate, because not all height-related genes were considered/are known. The extent to which genes explain population differences in height is not fully resolved.

A moan about ‘Executive Search Firms’ and academic posts

Whenever university or research institutions create new positions, for example, due to retirements, people moving or expansion, the existing staff get quite excited. “Who can we recruit?”, “What fields do we want to grow our expertise in?”, “Where are we under-represented?”, “Who will we get on with?”, “Who will get on with us!”

It’s not hard to come up with a long list of stellar people that would be great additions to the hiring institution. After that, a few informal chats might be had to encourage people to apply (‘tapping-up’ in football-transfer speak), adverts will be placed, shortlisting and interviews will proceed, and at the end of it all, hopefully an appointment will be made that results in (i) a better department and, (ii) a happy appointee. It’s not that complicated – although discussions about who to appoint can be – and the process is driven by the members of the recruiting department.

At least, that is how it has seemed to work for most of my career in academia. However, there seems to be a new trend to involve third parties in the process. In the last year or so I have been approached about 5 times by ‘Executive Search Firms’ acting on behalf of universities and research institutes. They are contracted by the hiring body to identify candidates for the post, and then perform the search for them. “What’s wrong with that?”, you may ask? “They’re professionals, they do this kind of thing all the time. It frees up time for staff to get on with their teaching/research/admin/etc”.

I do have a problem with this trend though, for various reasons. The main one is how the process actually happens. The search firms typically cover a very broad remit (well outside of academia), and simply cannot have the appropriate expertise in whatever post is being filled. They recognise this, and so approach academics, asking them who would be appropriate for the post. In effect, they are asking us to do their job. Presumably the search firm charges the hiring institution a fairly hefty fee to find a candidate. Of course, none of this fee gets passed on to the experts providing the search firm with the information they need to make sensible recommendations. To my mind, this seems like a business model where there is only one winner – the search firm. I cannot believe that the hiring body gets better staff as a result of the process. The search firm is unaffected if the wrong person is appointed, and may not be bothered about appointing people from under-represented groups. I don’t believe it saves time either. As a community, we will spend just as much time thinking about suitable candidates (usually for an institution other than our own) as we would if there were no search firm involved. Financially it makes little sense. The money that universities spend on search firms could surely be better deployed elsewhere – on labs, libraries, widening participation funds, subsidised field courses, pump-priming research grants, etc.

The last time I had a request to help – ironically, it arrived while I was pondering this post, and looking through old emails at how many times I’d had these approaches – I tried a different kind of reply. I explained that I was grateful that the search firm recognised my expertise, and of course I was happy to help, but I would expect a consultation fee for my time and judgement. Of course, the response was entirely predictable – ‘I am not in a position with sufficient authority to engage you in such a contract’. The reality is that I’m not especially interested in doing this kind of thing, even for a fee. I did want to send a message though, that I thought it was wrong for a firm to expect that we give up our time to help them make a profit. If I had an approach directly from the hiring body, I wouldn’t hesitate to help and I wouldn’t expect any payment. I just resent the growth of what seems to be another business model that drains time and money from academia, but has no obvious benefits. To me it seems analogous to dubious publication models that turn a tidy profit by exploiting the good will of academics.

I’d be interested in what others think about this. Perhaps I’m just being a grumpy old fart, who thinks change is bad. Perhaps there are benefits of using search firms that I’ve missed? For now though, I’m not going to help search firms, and I hope others share that view. I also hope that people in a position to decide whether to use search firms think twice before spending money (some of which will come from tuition fees) on them.


Zebra finch supergene and sperm

We’ve just published a paper on the genetics of sperm morphology and motility in zebra finches. This was the culmination of 4 years’ work, mostly carried out by Kang-Wook Kim, where we investigated the genetic architecture and gene expression of sperm traits, and we were amazed when it turned out that an inversion polymorphism explained most of the heritable variation. Perhaps we shouldn’t have been so surprised, given the recent burst of cool papers showing how inversions can cause complex phenotypes e.g. in fire ants, ruffs (here and here), monkeyflowers, Heliconius butterflies, etc.

Anyway, the journal where we published the work has a rather nice ‘Behind the Paper’ feature, where I go into a bit of background about how the work came about, and what it shows. Rather than repeat it verbatim, I’ll just say that the link to the blog is here.

The video below shows how different the sperm of the different inversion polymorphism ‘genotypes’ are. Note how fast and straight the AB sperm are (you might need to loop the video).






Comparing PhD Vivas/defences across Europe

A couple of weeks ago I was in Germany to take part in a PhD defence. It was a very enjoyable experience, and the student performed impressively well. There was never any doubt he would pass with flying colours, but what was nicest about the event was that it took place in front of his family and friends (as well as colleagues). It’s great that after 4 years of graft, he could show how much he had learned and could demonstrate his expertise in what probably seemed (to the non-biologists there) a very esoteric area of endeavour. The biologists all appreciated how cool Nicaraguan crater lake cichlids are for studying speciation, but that is something that is probably lost on the wider public (sorry Andi!).

Anyway, while I was there, it got me thinking that we really seem to miss a trick when we examine PhDs in the UK. A typical UK viva involves an internal and external examiner  grilling the student behind closed doors for 3 hours, with a trip to the pub afterwards. The process can be quite stressful (for examiners as well as the candidate and supervisor) and tears are not uncommon (more often from the candidate, but occasionally from the examiners as well!). I don’t doubt the process is appropriately rigorous, but there is no sense of occasion, or opportunity for the student to showcase the end result of all that hard work. To me, this seems a real shame, especially as friends and family don’t get the opportunity to share in the celebration in the same way.

The experience I had in Germany was by no means unique. I’ve examined PhDs in Finland, Sweden, Switzerland and Germany, and in all cases there was some form of public defence. This may sound more daunting than the behind-closed-doors scenario, but in fact it isn’t. The thesis has already been examined by experts, and the student has already made revisions to it. The defence is not technically a formality, but in reality any problems have been addressed by the candidate before the big day, and it is exceptionally rare for candidates to receive anything other than an outright pass. More importantly, the defence feels like a big occasion that everybody gets involved in and can celebrate. The degree of formality seems to vary between countries, but the more formal vivas actually feel more theatrical and that makes them even more of an occasion. Of the ones I’ve been involved in, they rank Finland > Sweden > Switzerland > Germany in terms of decreasing formality. The Finnish and Swedish vivas involved sit-down meals afterwards that actually felt more like a wedding; contrast that with the UK trip to the local. For the Finnish one, I arrived in my best suit, thinking I was dressed to the nines, only to be promptly taken to an ‘outfitters’ to be measured up for a suit (including tails, separate daytime and evening waistcoats and two different pairs of shoes). I looked like that notorious picture of Cameron, Osborne and co, only without the (alleged) offshore bank account. It seems to me that there are several advantages to the mainland European defence.

1) Family and friends get to celebrate in a much more involved way.

2) The candidate gives a public talk about their work, which will stand them in good stead for job interviews, conference presentations, etc. It’s not that unusual for a UK student to never give a talk between the start of their PhD and being awarded it, and there is no requirement to present their PhD work to an audience.

3) The public talk may even lead to a job offer there and then. In some countries, there are several opponents (examiners) and it is quite likely that an opportunity to work for or collaborate with one of them will arise.

4) After the defence, everything is done and dusted. In the UK, a common outcome is ‘minor changes’ with a window of 3 months to make them. In other words, the degree isn’t actually awarded right after the viva.

I don’t want to sound too negative about PhD vivas in the UK relative to in mainland Europe (even if it is like comparing a soggy sandwich with a Michelin starred meal). They can often be a rewarding chat (for everybody) and usually the candidate’s friends will make a real effort to celebrate the PhD award with them. A recent trend in Sheffield has been for ‘study organism cakes’, which has led to some genuinely brilliant creations worthy of Bake-Off ‘Showstoppers’ (see below). Of course, there is always the graduation ceremony as well, although that can take place months after the viva. In fact, I didn’t go to mine, as I was living 8000 miles away by the time it came around. I do think that public defences help celebrate the event a bit more, and they are certainly personal to the candidate.

Anyway, I’ve been lucky to have examined 4 such excellent and hospitable students (Paula Lehtonen, Maja Tarka, Pirmin Nietlisbach and Andi Kautt). Perhaps by extolling the virtues of European vivas, I can make it onto a Daily Mail ‘dangerous’ list. I can only hope anyway.

Some viva/defence pictures

Study Organism Cakes – Great tits with large brood (Jenny Armstrong viva); Timema with dark morph (Aaron Commeault viva); bedbug (@Toby_Fountain viva, cake by Sophie Webster @SE_Webster)


Study organism hats and beards seem to be more common in Switzerland and Germany.

Mandarte Island song sparrows (Pirmin Nietlisbach, Zurich) and Nicaraguan crater lake cichlids and lots of seemingly unrelated organisms (Andi Kautt, Konstanz).


Typical attire for a Finnish viva (Paula Lehtonen, Turku). I’m the bald bloke with glasses ….. the one on the right.



Why I’m wary of candidate gene studies

Candidate genes

The candidate gene approach to finding genes that determine phenotypic variation is hugely attractive. Rather than performing whole genome scans, involving large amounts of time and money (and a statistical burden of multiple testing), why not just look at those genes which are good candidates? Candidates can be identified by looking through the literature to find genes that influence similar traits in other organisms such as humans, mice, fruit flies or Arabidopsis. There is no doubt that the candidate gene approach has yielded some nice results, perhaps most obviously in studies of plumage and coat colour polymorphism in vertebrates [1-3]. However, at least in the evolutionary genetics literature, most successes are for traits with a simple Mendelian basis, and  where there is a well-established pathway of which genes interact to produce the phenotype of interest. In other words, the candidate gene approach works well for simple traits.

Of course, most traits are complex, and are determined by many genes of small effect. I worry – only a little bit, not in the ‘I can’t sleep at night’ sense – that the candidate gene approach applied to these traits can send people on a wild goose chase. As I explain below, I think there are already a number of studies in the molecular ecology field where people are investing too much effort into following up candidate gene results which are, at best, equivocal. The example I’m going to use comes from studies of the DRD4 gene and its possible role in behavioural variation in wild birds. I’ve chosen this example because I think the authors of the original studies have been extremely careful. If the results are not clear cut when this much care is taken, I think the problems I describe will be exacerbated in less rigorous studies (of which there are plenty).

Population structure – the known unknown

There are two main statistical problems when assessing whether candidate gene associations are significant. The first is the well-known issue that population structure can lead to false positive associations, unless properly controlled for. In a study involving many thousands of tests, it is easy to see when population structure is an issue. A comparison of observed test statistics (or P values) against an expected distribution shows when datasets are prone to false positives. The slope of the regression of observed against expected values gives an inflation factor (lambda), which is a measure of the degree of the problem. The plot below shows two lines, each with 100,000 statistical tests – one where there is no issue (lambda = 1) and another where there is a lot of inflation (lambda = 3), caused by population structure.
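To make the inflation factor concrete, here is a minimal Python sketch that estimates lambda in two ways: as the slope (through the origin) of sorted observed statistics on their expected chi square quantiles, as described above, and as the ratio of medians, which is a common shortcut I'm adding for comparison. It assumes 1 df tests and mimics structure crudely by multiplying null statistics by lambda; the function names, the seed and the plotting positions are my own choices, not taken from any of the papers discussed.

```python
import random
import statistics

NORMAL = statistics.NormalDist()
CHI2_1DF_MEDIAN = NORMAL.inv_cdf(0.75) ** 2  # about 0.455


def simulate_chi2(n, lam, rng):
    """Draw n null chi square (1 df) statistics and inflate them by a factor lam."""
    return [lam * rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]


def expected_chi2(n):
    """Expected chi square (1 df) quantiles for n ordered test statistics."""
    return [NORMAL.inv_cdf(0.5 + (i + 0.5) / (2 * n)) ** 2 for i in range(n)]


def lambda_from_slope(observed):
    """Slope through the origin of sorted observed on expected statistics."""
    obs = sorted(observed)
    expd = expected_chi2(len(obs))
    return sum(o * e for o, e in zip(obs, expd)) / sum(e * e for e in expd)


def lambda_from_median(observed):
    """Ratio of the observed median to the theoretical chi square (1 df) median."""
    return statistics.median(observed) / CHI2_1DF_MEDIAN


rng = random.Random(1)  # arbitrary seed so the sketch is repeatable
estimates = {}
for true_lambda in (1.0, 3.0):
    stats = simulate_chi2(100_000, true_lambda, rng)
    estimates[true_lambda] = (lambda_from_slope(stats), lambda_from_median(stats))
    print(true_lambda, [round(x, 2) for x in estimates[true_lambda]])
```

Both estimators should land close to the simulated value. The median version is less sensitive to a handful of genuinely associated loci in the tail, which is why it is often preferred on real data.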


You might be thinking that an inflation factor of 3 is pretty extreme – it probably is – but it certainly isn’t unprecedented. Susan Johnston (@SuseJohnston) published a very nice paper [4] a few years ago where she performed a GWAS for the age at which wild salmon migrated back to freshwater from the sea. In that study, lambda was 3.24, and without controlling for the genetic structure there would have been many false positive results. We can see the effect of failing to control for structure below:

Here, I have simulated 100,000 datapoints (imaginary SNPs), assuming either no structure (lambda = 1) or salmon-like structure (lambda = 3). The plot below shows the distribution of chi square values following association testing. For clarity, I’ve restricted the x-axis to range from 0.5 to 20, but in fact the difference between the histograms is more extreme than it looks, because the structured dataset has a long tail, while more than half of the lambda = 1 datapoints have chi square values < 0.5. In the structured dataset, if we (naively) assumed our test statistic followed a chi square distribution with 1 df we would have a massive false positive problem. Almost 26% of the datapoints are ‘significant’ at P < 0.05, 13.7% are significant at P < 0.01 and 5.7% are significant at P < 0.001. In other words, we are in real danger of assigning biological importance to what are actually chance results.
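Those percentages can also be worked out without simulating anything. The sketch below is a minimal Python version, assuming that structure simply multiplies a null chi square (1 df) statistic by lambda; that is a simplification of what real structure does, and the function names are invented for illustration.

```python
import math
import statistics

NORMAL = statistics.NormalDist()


def chi2_1df_critical(alpha):
    """Chi square (1 df) value exceeded with probability alpha under the null."""
    return NORMAL.inv_cdf(1.0 - alpha / 2.0) ** 2


def chi2_1df_tail(x):
    """P(chi square with 1 df > x), via its link to the standard normal."""
    return math.erfc(math.sqrt(x / 2.0))


def false_positive_rate(alpha, lam):
    """Share of null tests called significant at alpha when statistics are inflated by lam."""
    return chi2_1df_tail(chi2_1df_critical(alpha) / lam)


for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: {100 * false_positive_rate(alpha, 3.0):.1f}% 'significant'")

# Share of uninflated (lambda = 1) statistics that fall below 0.5
print(f"{100 * (1.0 - chi2_1df_tail(0.5)):.1f}% of null statistics are < 0.5")
```

Setting lam to 1 returns alpha itself, which is a handy sanity check on the arithmetic.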


Fortunately, when we perform GWAS studies, there are methods for controlling for the population genetic structure – the simplest is just dividing the test statistic by the inflation factor before calculating the P value. There are other more sophisticated approaches, but they all do a good job of dealing with the effects of structure. So what’s the problem for candidate gene studies then?
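The simplest correction mentioned above, dividing by the inflation factor, looks like this as a minimal Python sketch. It assumes a 1 df chi square test; the function names and the example statistic of 6.0 are made up for illustration, and never deflating when lambda is below 1 is a common convention that I am assuming rather than something from the studies discussed here.

```python
import math


def p_value_chi2_1df(x):
    """P value for a chi square statistic with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def genomic_control_p(chi2, lam):
    """P value after dividing the statistic by lambda (assumed convention: lambda < 1 is treated as 1)."""
    return p_value_chi2_1df(chi2 / max(lam, 1.0))


chi2 = 6.0  # hypothetical test statistic, not from any real dataset
naive_p = p_value_chi2_1df(chi2)
corrected_p = genomic_control_p(chi2, 3.0)
print(f"naive P = {naive_p:.4f}, corrected P (lambda = 3) = {corrected_p:.4f}")
```

A statistic that looks comfortably significant when structure is ignored can be unremarkable once it is rescaled, which is the whole point of the correction.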

The problem is that we only know we have a problem of inflated test statistics when we have performed genomewide scans with many independent datapoints. When we do a candidate gene study, we typically look at one or a few genes, and so it is virtually impossible to know whether our significant results are real or simply an artefact of unappreciated genetic structure causing false positives.

So, one of the problems with candidate gene studies is that too few statistical tests have been performed for us to be able to say whether a gene really does have an effect on the trait. Strangely, the second statistical problem can arise when we have performed relatively few, yet still too many, tests and the multiple testing problem can make it difficult to know whether a result is statistically significant or not. Typically, this happens when multiple SNPs are examined, sometimes in tandem with several traits and several different genetic models. The DRD4 studies are informative in this regard.

DRD4 in wild birds – background

The paper that started the enthusiasm for studying DRD4 in birds is by Fidler and colleagues and was published in 2007 in Proc. R. Soc. B. [5]. The paper has been very influential; at the time of writing it has been cited > 80 times. Briefly, the authors studied novelty-seeking behaviour in great tits, which was recorded by observing how long it took birds to explore novel objects in a cage or room. Data were collected on hand-reared wild birds, and on birds from selection lines for fast or slow ‘early exploratory behaviour’ (EEB). EEB had previously been shown to be heritable, and to respond to artificial selection in this population, hence the existence of divergent lines. The authors reasoned that the dopamine receptor D4 gene (DRD4) was a good candidate for explaining variation in behaviour because it has been associated with novelty-seeking behaviour in humans (see refs within Fidler).

The authors sequenced the full coding sequence of DRD4 in great tits – no easy task. In total they found 73 polymorphisms (66 SNPs and 7 indels). However, most of their attention was on the 3rd exon, because it was this region that was associated with relevant traits in other vertebrates. They first showed that the exon contained a single synonymous SNP (hereafter SNP830) which differed in frequency between the fast (n=29) and slow (n=21) lines. As the authors pointed out, genetic drift could cause these kinds of effects. Therefore, they analysed a further 91 hand-reared but unselected birds that were wild-caught as nestlings, from 17 nests.

The results of an association study on the wild birds are shown below – this figure is taken directly from the paper. Genotype TT is associated with the highest EEB score, which is consistent with the trend from the selection lines. This study was conducted in the days before many of the ways we correct for population structure were fully developed, but the authors were well aware of the problem. It is unlikely that there was much structure in the data because great tits have a very large effective population size, there is very low genetic structure across Europe, and the birds were sampled from a single location. However, by sampling 91 birds from just 17 nests, structure is introduced by the inclusion of relatives. The authors addressed this problem by fitting nest as a random effect in their models. Today, we might take a slightly different approach by fitting a relationship matrix (derived from the pedigree or markers) in the model, but the approach Fidler and colleagues took here probably deals with the problem ok, and it certainly demonstrates an awareness of the problem and how to deal with it.


At face value, this looks like a pretty good story. The selection line and wild bird analyses are consistent, and the SNP is in the functionally relevant part of the gene. It’s worth stating here though, that the authors actually fitted three different genetic models, and they also typed an indel in the wild birds, so they actually performed 6 different (but perhaps not independent) tests on the data, which does make the statistical significance pretty marginal (even ignoring the possibility of inflated lambda). To their credit, Fidler and colleagues raised the possibility that the results could be an artefact, caused by population genetic structure. I have no beef with this paper – it was ground-breaking, potentially important, and yet circumspect. However, my worry is that subsequent studies  seem to make an assumption that the DRD4 story is both robust and repeatable elsewhere. I don’t think it is.
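To put a rough number on what six tests do to a nominal P < 0.05, here is a minimal Python sketch of the usual per-test thresholds. It treats the six tests as independent, which (as noted above) they probably are not, so these thresholds are if anything too strict; the function names are mine.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test threshold giving exactly alpha family-wise, if tests are independent."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests run at alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 6  # three genetic models x two polymorphisms, as described above
print(f"Bonferroni threshold: {bonferroni_threshold(0.05, n_tests):.4f}")
print(f"Sidak threshold:      {sidak_threshold(0.05, n_tests):.4f}")
print(f"Chance of >= 1 false positive at 0.05: {familywise_error(0.05, n_tests):.3f}")
```

If the tests are positively correlated, the true family-wise error rate sits somewhere between 0.05 and the independent-tests figure printed here.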

DRD4 in great tits – followup studies

The first follow-up paper to Fidler was by the same group – this time led by Peter Korsten [6]. This is a really nice example of an attempt to replicate the original results. Not only do the authors type a new set of wild birds from the original population (Westerheide, in the Netherlands), but they also look at three other great tit populations – Wytham Woods in Oxford, Boshoek in Belgium and Lauwersmeer in the Netherlands. Similar tests of exploratory behaviour were carried out in each population. The original SNP – SNP830 – and the indel that significantly differed between the fast and slow EEB lines in Fidler were typed in all 4 populations.

The results from Korsten et al are quite striking. First of all, the SNP830 result in Westerheide was replicated in the new set of 77 birds sampled from the wild. Second, the associations could not be replicated in any of the other 3 populations. An analysis pooling data from all 4 populations did not show an association between SNP830 and exploratory behaviour. There was a significant interaction term between SNP genotype and population, but this could mean either there are genuine gene by environment interactions, or the original Westerheide result was a false positive and the others are non-significant. Again, the association between SNP830 and exploratory behaviour in Westerheide is only nominally significant, and if a correction for multiple testing across the number of loci, populations or genetic models were applied, it would be non-significant. Regardless though, the evidence for SNP830 being associated with EEB in this population is strengthened by this study.

One of the puzzles from Korsten et al is why the original result wasn’t replicated elsewhere. One possibility that the authors suggested is that SNP830 isn’t the causal mutation, but is in LD with an unknown causal mutation close by. If the LD structure varied between populations, then one might not expect to see similar associations with SNP830 in other populations. Another possibility is that there is no real association in any of the populations, including Westerheide.

In 2013, the same authors (led by Jakob Mueller) typed a much larger number of polymorphisms in the DRD4 region, in the same birds as used by Korsten [7]. Again, this is a very substantial and impressive piece of work. This time, around 98 polymorphisms were studied, which made it possible to define linkage disequilibrium around DRD4 and to compare patterns of LD between populations. In addition, there was now a large enough number of tests to get some feel for whether test statistics were inflated due to e.g. population structure or the presence of relatives. In all 4 populations, LD declined quite rapidly, but patterns were quite similar across populations. This means that (i) the different SNPs can be regarded as fairly independent tests (especially if they are in different haploblocks) and (ii) the suggestion that Westerheide shows an association with SNP830, but the others don’t, due to heterogeneity in the LD between SNP830 and a causal variant, is not really supported.

The data from Mueller et al are on Dryad, and so I have had a look at them in a little more detail. I was interested to know whether (i) there is much evidence for test statistic inflation (i.e. lambda >1), which could be driving false positive results and (ii) whether the significant results look very compelling. For simplicity, I have only looked at additive models for a couple of reasons. First, the original SNP830 association in Westerheide was more significant using this model compared to models that assumed dominance or overdominance. Second, I don’t find the reasons for fitting overdominance models very compelling, although the reasons for that can wait for a future post.

I followed the model fitting procedure used by Mueller; SNP genotypes were coded 0, 1 or 2 and fitted as a covariate in models of association with personality score. For the Westerheide population, brood was fitted as a random effect. Plots showing observed against expected P values (assuming a uniform distribution of P values between 0 and 1) are shown below. Each SNP is a separate datapoint and SNP830 is shown as a red dot.
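For readers who haven’t made one of these plots, the expected values come from ranking the observed P values and pairing each with the matching quantile of a uniform distribution. Here is a minimal Python sketch. The P values in it are random draws standing in for null results, not the Mueller data, and the (rank - 0.5)/n plotting position and the function name are my own assumptions rather than necessarily what I (or anyone else) used for the real plots.

```python
import math
import random


def qq_points(p_values):
    """Pair sorted observed P values with expected uniform quantiles, both as -log10."""
    n = len(p_values)
    observed = sorted(p_values)
    expected = [(rank - 0.5) / n for rank in range(1, n + 1)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]


rng = random.Random(7)  # arbitrary seed; these are placeholder null P values
n_snps = 98  # roughly the number of polymorphisms typed by Mueller et al
toy_p_values = [1.0 - rng.random() for _ in range(n_snps)]  # uniform on (0, 1]

points = qq_points(toy_p_values)
for expected, observed in points[:3]:
    print(f"expected {expected:.2f}, observed {observed:.2f}")
```

With no inflation and no true associations the points scatter around the 1:1 line; a locus like SNP830 in Westerheide would show up as the top point sitting above it.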




Let’s first look at the Westerheide population – the combined data from both years (i.e. the Fidler study and the Korsten study) shows that SNP830 was more strongly associated with exploratory behaviour than any other locus. Not only that, but the data fit pretty well on the lambda = 1 line, but with SNP830 some way above it. This looks pretty convincing to me, although compared to the most significant loci in a genome wide association study, it is not very significant. It’s also noteworthy that when the data are restricted to birds sampled in 2007 – these are the birds in the Korsten follow-up study – SNP830 is still the most significantly associated SNP.

In the three other populations, SNP830 is unremarkable. There is no evidence whatsoever that it is associated with exploratory behaviour. The other populations do not show serious problems with inflation of test statistics, although there is a hint of it in the Wytham Woods population. The most significant SNP in Lauwersmeer is more significant than expected by chance, but not by much. It should be said that none of these SNPs reach significance after applying a Bonferroni correction or similar procedure.

On balance then, I think the SNP830 association in Westerheide looks reasonably convincing, but I don’t think associations between DRD4 and exploratory behaviour in other great tit populations have been demonstrated. The authors of these papers have always been very careful to highlight alternative explanations, and overall it’s an impressive body of work. What I find surprising is that despite these fairly equivocal results, there seems to be a slew of follow-up DRD4 candidate gene studies in other bird species, with what look to me to be fairly wishful interpretations of the results. I worry that a narrative is building up that DRD4 explains lots of variation in personality traits in lots of different bird species. What do these other papers show?

DRD4 in other bird populations

I’ll briefly discuss some of the other studies.

Gillingham and colleagues looked at DRD4 in relation to body condition in flamingoes [8]. They argue that DRD4 genotypes are associated with the phenotype, but it is pretty hard to tell whether there is much going on. They use an unconventional (for association studies) approach of AIC-based multi-model parameter estimation and they only test DRD4. I dislike this approach for association studies, because by studying a single locus (as is the case here) or fitting one locus at a time it is really easy for a locus that doesn’t affect a trait to appear to explain a few % of the variation. The authors actually typed 10 microsatellites, and it is my guess that fitting genotypes at the microsatellites would have yielded similar results to DRD4. Certainly, I think the claim that “This is to our knowledge, the first study to show an association between exon 3 DRD4 polymorphism and body condition in non-human animals” needs some more validation before it can be believed.

In a study of adaptation to urban environments in blackbirds [9], Mueller et al (2013) examined 16 candidate polymorphisms (including 5 in DRD4) in 12 populations, but there was no compelling evidence that DRD4 played a role in adaptation.

Mueller and colleagues also scored the effects of DRD4 polymorphisms on exploratory behaviour in yellow-crowned bishops [10]. They replicated the study in Portuguese and Spanish populations. The sample sizes were fairly small, and there was no control for population structure. A large proportion of SNPs were non-significant, but two SNPs were associated with behavioural variation. One of these SNPs was significant in both populations; the other was significant in one population and approached significance in the other. The directions of the associations were the same in both populations. If both SNPs had been significant in both populations (they narrowly missed), the result would have been experiment-wide significant. This study looks worthy of follow-up, although I’d be a little wary of population structure being an issue given the demographic history of this species and the way the birds were sampled/obtained. The jury’s still out for me.

Garamszegi and colleagues looked at two DRD4 SNPs in relation to three behavioural traits in collared flycatchers [11]. One association was significant and another marginally so – again though, we know very little about what test statistics would be generated by other loci and whether population structure is a confounding issue.

In a study of invasive starling populations in Australia [12], Rollins and colleagues used propensity to enter a trap as a proxy for boldness. There were no significant associations between DRD4 and boldness – sample sizes were quite large in this study.

There has been a new great tit study this year [13], with an Estonian population typed at SNP830 and measured for feeding times/delays when presented with a novel object near the nest. In males, birds with the CC genotype delayed feeding longer than other birds, but the effect was not observed in females. The sample sizes were modest. It’s a tantalising result.

Finally, an attempt to examine DRD4 in relation to personality traits in Seychelles Warblers found no DRD4 polymorphisms [14].

Final Thoughts

Overall, I think some of the follow-up studies provide some tentative evidence for associations between DRD4 and personality traits, but I’m not convinced that a randomly chosen gene would have given much weaker evidence. Would I take a candidate gene approach to studying behavioural variation in birds? Probably not. If we cannot replicate results across different great tit populations (where genetic structure is low), it doesn’t seem likely that we can easily detect variation in other species. Is there good evidence that DRD4 explains behavioural variation in other bird species? I don’t think so.

I do think that the main problem with candidate gene studies is the more general one, that we just don’t know how significant something is until we have seen data from across the genome. Fortunately, with whole genome scans becoming ever easier, the need to even perform candidate gene studies is diminishing. For what it’s worth, I would avoid them unless the trait has a really simple Mendelian basis and the genetic pathways underlying the trait are well understood.


1. Nachman et al. (2003) The genetic basis of adaptive melanism in pocket mice. PNAS 100: 5268–5273.
2. Mundy et al. (2004) Conserved genetic basis of a quantitative plumage trait involved in mate choice. Science 303: 1870–1873.
3. Gratten et al. (2007) Compelling evidence that a single nucleotide polymorphism in TYRP1 is responsible for coat colour polymorphism in a free-living population of Soay sheep. Proc. R. Soc. Lond. B 274: 619–626.
4. Johnston et al. (2014) Genome-wide SNP analysis reveals a genetic basis for sea-age variation in a wild population of Atlantic salmon (Salmo salar). Molecular Ecology 23: 3452–3468.
5. Fidler et al. (2007) Drd4 gene polymorphisms are associated with personality variation in a passerine bird. Proc. R. Soc. Lond. B 274: 1685–1691.
6. Korsten et al. (2010) Association between DRD4 gene polymorphism and personality variation in great tits: a test across four wild populations. Molecular Ecology 19: 832–843.
7. Mueller et al. (2013) Haplotype structure, adaptive history and associations with exploratory behaviour of the DRD4 gene region in four great tit (Parus major) populations. Molecular Ecology 22: 2797–2808.
8. Gillingham et al. (2012) Genetic polymorphism in dopamine receptor D4 is associated with early body condition in a large population of greater flamingos, Phoenicopterus roseus. Molecular Ecology 21: 4024–4037.
9. Mueller et al. (2013) Candidate gene polymorphisms for behavioural adaptations during urbanization in blackbirds. Molecular Ecology 22: 3629–3637.
10. Mueller et al. (2014) Behaviour-related DRD4 polymorphisms in invasive bird populations. Molecular Ecology 23: 2876–2885.
11. Garamszegi et al. (2014) The relationship between DRD4 polymorphisms and phenotypic correlations of behaviors in the collared flycatcher. Ecology and Evolution 4: 1466–1479.
12. Rollins et al. (2015) Is there evidence of selection in the dopamine receptor D4 gene in Australian invasive starling populations? Current Zoology 61: 505–519.
13. Timm et al. (2015) DRD4 gene polymorphism in great tits: gender-specific association with behavioural variation in the wild. Behavioural Ecology and Sociobiology 69: 729–735.
14. Edwards et al. (2015) No Association between Personality and Candidate Gene Polymorphisms in a Wild Bird Population. PloS One 10: e0138439.

In defence of Paula Radcliffe and her biological passport

Most posts on here will be about evolutionary genetics, but here’s a diversion into athletics and Sports Science. Paula Radcliffe, the marathon world record holder and UK running legend, has been in the news here (and I guess elsewhere) following a Parliamentary Culture, Media and Sport Committee Hearing on doping in sport. Although she wasn’t named, a comment was made that could only really point the finger at her and nobody else. She released a press statement later that morning denying that she has ever cheated, and explaining both her anger, and the reasons for her unusual tests – more details below.

The current process for trying to detect drug cheats in sport involves regularly blood testing athletes, in order to build up a picture of their ‘biological passport’. Various blood parameters are measured, and the idea is that a ‘clean’ athlete will show pretty consistent readings for these parameters while a cheat will show much more variation between samples. The data are formally analysed using a Bayesian framework. Testers can then identify suspected cheats and test them more rigorously. The Sunday Times, one of the UK broadsheet (i.e. not tabloid) papers, has been running a series of investigations about doping in sport, especially athletics. They obtained access to biological passport data from 2001 and 2002, and without naming any athletes, suggested that unusual biological passports were very common, and that doping was rife in endurance sports such as long-distance running. They argued that the IAAF, the body which governs world athletics, was ignoring many of these data, and that many Olympic and World Championship medallists showed these patterns, among them (shock, horror) athletes from the UK, including one very well-known one. There has been some speculation on social media and elsewhere about who this unnamed athlete was, and what the results showed. Well, now we know who it was, and we know a fair bit about the samples.

What happened yesterday:
During the Parliamentary committee meeting yesterday (available here at about 12:18 – http://goo.gl/0Gicie) the chair essentially revealed the identity of that well known athlete as Paula Radcliffe. He implied that the list of ‘suspected’ athletes included London marathon winners, including a British one. Paula Radcliffe is the only British winner of the London marathon since (well before) 2001. Because the meeting took place in parliament, he is protected from legal action by ‘Parliamentary Privilege’. I didn’t actually notice any reporting of his comments or of people putting two and two together, until Paula Radcliffe released a statement condemning his ‘implication’ that she was cheating. The statement gives quite a bit of background to her samples, including the revelation she has produced three ‘abnormal’ readings that were seen by the Sunday Times.

Athletes releasing their Passport data:
In the months following The Sunday Times article and a BBC Panorama documentary on doping in athletics (where it was alleged that Mo Farah’s coach was involved in helping athletes to cheat), there has been quite a bit of discussion about whether athletes should make their passport data available to the public to ‘prove’ that they are clean. In fact, Mo Farah and 9 other top UK athletes released their data. Each of them showed very little fluctuation over the time series of their readings and their parameters were well within the normal boundaries. Paula Radcliffe didn’t release hers (she had no obligation to), but she did point out (correctly) that a biological passport could not prove that somebody was clean, and perhaps more worryingly, clean athletes might still produce abnormal readings which, although explainable, would look suspect if taken out of context. It’s pretty clear now that she falls into this latter category.

What do Paula Radcliffe’s samples show, and what does the science say?
We can only really glean what was ‘abnormal’ about Paula Radcliffe’s samples from her statement yesterday – it’s in full here, http://www.paularadcliffe.com/statement-september-2015 – because the data are not available, and anyway they are probably impossible to analyse in isolation. What is clear though, is that the three ‘abnormal’ samples were taken following periods of altitude training, and that one of them was taken right after a race held in high temperatures. The questions I had, when I read this defence, were ‘Do these things matter?; Do they affect the readings?’ A quick delve into the scientific literature and it’s pretty clear that they do matter…… a lot. As she writes in her statement, experts have analysed her data and concluded there is no case to answer.

I don’t claim this to be an exhaustive review, but it is a summary of a fair bit of the most recent, most relevant and most cited literature on biological passports. For a couple of decent reviews of the Athlete Biological Passport (ABP), check out [1, 2]. They explain how the ABP works and what parameters are fitted in the model (in case you are wondering, they are gender, ethnicity, age, altitude – both at the time of the sample and beforehand, type of sport and the make/model of the analysis equipment). The model estimates the expected range of each blood parameter, based on both previous readings and the parameters above, and then reports whether scores exceed, for example, the 99% or 99.9% specificity threshold. In PR’s statement she says one of her readings was above the 1 in 100 false positive threshold (99% specificity) but below the 1 in 1000 threshold (99.9% specificity). Although the ABP model takes into account altitude and type of sport, it does this in a fairly crude way. For example, it fits into the model whether it was an endurance sport or not, rather than distinguishing between, say, a 5000m race and a marathon. With altitude, it simply fits whether the training or test were conducted above or below a certain elevation e.g. 1000m; it doesn’t distinguish between having just trained at altitude and having returned some time ago from training at altitude. The extent to which a sample will be an outlier will also depend on how many previous tests were carried out in similar (or different) conditions. There is strong evidence that readings are sensitive to the time since competition or intense training [3]. In fact, it is recommended that samples are taken at least two hours after a race [note Paula Radcliffe’s comment that hers was taken right after a half marathon race].
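To give a feel for what “expected range based on previous readings” means, here is a toy Python sketch. It is emphatically not the ABP model, which is a more elaborate Bayesian model with the covariates listed above. It is a simple normal-normal update that I am assuming purely for illustration, and the haemoglobin-like numbers (population mean, between-athlete and within-athlete standard deviations) are made up.

```python
from statistics import NormalDist

def predictive_range(previous, pop_mean, between_sd, within_sd, coverage=0.99):
    """Two-sided range expected to contain the athlete's next reading.

    Toy model: the athlete's true mean is drawn from Normal(pop_mean, between_sd),
    and each reading is Normal(true mean, within_sd).
    """
    n = len(previous)
    prior_precision = 1.0 / between_sd ** 2
    data_precision = n / within_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (pop_mean * prior_precision + sum(previous) / within_sd ** 2)
    # Uncertainty about the athlete's mean plus sample-to-sample noise.
    pred_sd = (post_var + within_sd ** 2) ** 0.5
    z = NormalDist().inv_cdf(1 - (1 - coverage) / 2)
    return post_mean - z * pred_sd, post_mean + z * pred_sd

# Hypothetical values (g/dL), chosen only to show the behaviour of the model.
no_history = predictive_range([], pop_mean=14.0, between_sd=1.0, within_sd=0.4)
with_history = predictive_range([14.1, 13.9, 14.3, 14.0], 14.0, 1.0, 0.4)
print(no_history, with_history)
```

The point of the sketch is that the range narrows around the athlete’s own baseline as readings accumulate, so a sample taken under unusual conditions (altitude, or immediately after a race) is more likely to fall outside it if the earlier samples were taken under ordinary conditions.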

A couple of studies have actually looked at how often ‘abnormal’ results are returned when elite athletes are sampled while competing or training at altitude. The sample sizes are small and the studies were done on swimmers and cyclists rather than runners (although both are endurance sports, like marathon running). The athletes were all thought to be clean, based on long-term monitoring of their samples. The findings are remarkable though. The study of swimmers showed that after altitude training, 6 out of 10 swimmers showed ABP scores above the 99% threshold [4]. Among cyclists competing in a multiple day race at altitude, 5 out of 25 exceeded the 99% threshold (1 in 100) and 2 of them were above the 99.9% threshold [5]. The combination of being at altitude and competing intensely gave results that, taken in isolation, would be regarded as abnormal or even highly abnormal. In light of these findings, Paula Radcliffe’s results look far from abnormal; they actually look pretty typical.
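A quick back-of-the-envelope calculation shows just how far these numbers are from what the nominal thresholds imply. The Python sketch below assumes, as a simplification of my own, that each clean athlete independently has a 1 in 100 chance of exceeding the 99% threshold, and it ignores the fact that athletes in these studies were sampled more than once.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) when X follows a Binomial(n, p) distribution."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance of the observed counts if the nominal 1% false positive rate held.
swimmers = prob_at_least(6, 10, 0.01)   # 6 or more of 10 swimmers
cyclists = prob_at_least(5, 25, 0.01)   # 5 or more of 25 cyclists
print(f"swimmers: {swimmers:.2e}, cyclists: {cyclists:.2e}")
```

Both probabilities come out far below one in ten thousand. Under those assumptions, the only sensible conclusion is that the nominal false positive rate simply does not apply to clean athletes sampled at altitude or straight after hard racing.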

I don’t believe for one second that Paula Radcliffe is a drugs cheat. Perhaps more than any other athlete she has taken a strong and clear stance against cheating in her sport. Most famously perhaps, she and a team-mate held up a sign reading ‘EPO Cheats Out’ at the 2001 World Championships, after an athlete who had failed a drugs test was reinstated to the 5000m. Bringing that attention on herself is not the action of a cheat. It looks to me as though the Sunday Times dataset has been analysed in a less than robust way, and therefore the number of ‘abnormal results’ far exceeds the number of genuinely illegal samples. I don’t doubt that cheating is pretty common in athletics, and there is evidence that systematic doping is happening in some countries; see [6] for example. The science behind the ABP looks pretty robust, and the leading experts are well aware that the underlying models need constant refining e.g. “In particular, modifications of haematological parameters during and after exposure to different altitudes/hypoxic protocols need to be properly included within detection models.”[7]. In fact, I was impressed with the sophistication of the analyses in some of the papers I read – I was prejudiced about the rigour of sports science. The ABP is a vital tool in catching cheats, but misusing it can have devastating consequences for clean athletes. In the case of Paula Radcliffe, even if subsequent expert opinions clear her of wrong-doing (as I’m sure they will), some mud is sure to stick. People will remember the controversy now, far more than any subsequent lower profile ‘actually the samples look ok’ story. My hope is that her reputation is not tarnished, and that ABP testing continues to be refined, because there is no doubt that when used properly, it is a powerful weapon to catch cheats.

1. Sanchis-Gomar, F., et al., Current limitations of the Athlete’s Biological Passport use in sports. Clin Chem Lab Med, 2011. 49(9): p. 1413-5.
2. Sottas, P.-E., et al., The Athlete Biological Passport. Clinical Chemistry, 2011. 57(7): p. 969-976.
3. Schumacher, Y.O. and G. d’Onofrio, Scientific Expertise and the Athlete Biological Passport: 3 Years of Experience. Clinical Chemistry, 2012. 58(6): p. 979-985.
4. Bonne, T.C., et al., Altitude training causes haematological fluctuations with relevance for the Athlete Biological Passport. Drug Testing and Analysis, 2015. 7(8): p. 655-662.
5. Schumacher, Y.O., et al., High altitude, prolonged exercise, and the athlete biological passport. Drug Testing and Analysis, 2015. 7(1): p. 48-55.
6. Sottas, P.-E., et al., Prevalence of Blood Doping in Samples Collected from Elite Track and Field Athletes. Clinical Chemistry, 2011. 57(5): p. 762-769.
7. Sanchis-Gomar, F., et al., Altitude exposure in sports: the Athlete Biological Passport standpoint. Drug Testing and Analysis, 2014. 6(3): p. 190-193.


Wild Animal Genomics Meeting – Trondheim

The plan is to get into a (fairly) regular habit of blogging about evolutionary genetics research (and perhaps a bit of running and biking stuff). Anyway, the first post is about a meeting I recently attended in Trondheim.

The first Wild Animal Genomics meeting was run by Josephine Pemberton and me, using funds from our ERC grants. That took place in Scotland three years ago, and was aimed at bringing together researchers who were using, or planned to use, genomics tools to study quantitative genetics, selection and microevolution in wild populations. In June this year Henrik Jensen and Arild Husby put together a follow-up meeting, which was held just outside Trondheim. I really like these kinds of meetings – fairly small (~30-35 people) and everybody working on similar topics. The meetings are modelled on the WAMBAM meetings which have been running for around 10 years and have proven hugely successful in bringing together a community of researchers using ‘animal models’ to study quantitative genetics in pedigreed populations.

One of the most striking things at the Trondheim meeting was how quickly research in the field has moved on. Three years ago, we were only just starting GWAS analyses, virtually nobody had full genome assemblies, and a state-of-the-art SNP chip had 10,000 markers. There was a recognition that QTL mapping by linkage analysis was hugely under-powered, but a (in hindsight very naive) hope that association scans would get around this, and that many of the quantitative traits we were studying would harbour some loci of moderate-large effect that might explain measurable variation in fitness. At the recent meeting, there were examples where researchers had sequenced whole genomes of >1000 individuals, and a number of presentations showing GWAS results with high density SNP chips or genotyping-by-sequencing methods. In general (and there were some exceptions) the attempts to find loci underpinning quantitative variation hadn’t found much. Lessons from studies of complex traits in humans suggest this isn’t surprising, although there are clearly some wild populations where patterns of high LD and recent admixture mean that some loci will be found.

So does this all mean that the genomic approach to understanding genetic variation in wild populations has failed? I don’t think so. If one regards ‘gene hunting’ as the main reason for doing this work, then sure, there will be some disappointed researchers. However, the most exciting thing about the recent advances in molecular quantitative genetics has been some of the cool research that has gone beyond a GWAS and given us genuine insight into the genetic architecture of quantitative traits. It isn’t necessary to find the loci that explain significant variation in order to perform analyses such as chromosome partitioning, genomic prediction, regional heritability mapping, and inbreeding coefficient estimation. Not only that, but these analyses do not require pedigrees and are robust to different kinds of genetic architecture. With that in mind, I still think these are exciting times for evolutionary genomics of wild populations. In the next few years we should expect to see research that describes the architecture of traits, identifies genetic changes in response to selection, gives us new insight into the mechanisms of inbreeding depression, helps us understand sexually dimorphic traits and no doubt addresses a whole load of other questions. Not only that, but we can start to do so in any population where phenotypes and DNA can be collected; the constraint of needing decades’ worth of pedigree data is no longer there, and so we should get broader taxonomic representation in this field. The biggest challenge will probably be in collecting good phenotypic data in large enough numbers to perform robust analyses. In other words, the most important researchers in the genomics era will still be the dedicated natural historians and ecologists who have underpinned so much of what we do today.
