To explain the complication, I'll first give some background on the different types of natural selection that can act on a gene.
Types of natural selection
Natural selection (hereafter simply "selection") is broadly divided into three major categories: negative, positive and balancing selection.
Negative selection is the most common, and probably the least interesting. It acts on genetic variants that reduce the reproductive fitness of organisms, decreasing their frequency until they (usually) disappear completely from the population.
Positive selection is much less common, and much more interesting. It occurs in the rare case that a genetic variant (or allele) actually increases the fitness of organisms that carry it. This means that carriers tend to have more offspring, on average, than non-carriers - so positively selected alleles will tend to increase in frequency over time until they reach "fixation" (that is, they're carried by every single individual in the population). Almost all of the recent genome-wide scans for selection in humans (for instance here, here and here) have been looking for examples of positive selection.
Balancing selection is a special case of selection, which results in multiple versions of the same gene all being maintained at some frequency within the population, sometimes over a very long period of time. Classically, balancing selection occurs in two situations:
- heterozygote advantage, when the fitness of heterozygotes (individuals carrying two different alleles) is higher than that of either of the two homozygotes (individuals who carry two identical alleles); or
- frequency-dependent selection, when genetic variants are advantageous if they're at low frequency, but become less beneficial as they reach a higher frequency.
Frequency-dependent selection appears to play a role in maintenance of extreme levels of genetic diversity in the human major histocompatibility complex (MHC) genes. I'm not going to discuss it much here since it's not really relevant to the rest of the post, but anyone interested in more details can check out good old Wikipedia.
The critical point I want to make is that while positive selection will usually tend to increase the frequency of an allele until it reaches 100% frequency, balancing selection can result in a situation where an allele reaches a stable frequency that is less than 100%. For a case of heterozygote advantage, the stable frequency will be the point at which the selective advantage of heterozygotes is cancelled out by the selective disadvantage of homozygotes.
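For anyone who wants to see where that balance point sits, here is the standard textbook calculation. It is a sketch under assumptions I haven't spelled out above: a single gene with two alleles (A and a), random mating, a very large population, and fitnesses that stay constant over time. The symbols s and t (the fitness costs of the two homozygotes relative to the heterozygote) are introduced here purely for illustration.

```latex
% Assumed relative fitnesses (illustrative notation, not from the paper):
%   w_{AA} = 1 - s, \quad w_{Aa} = 1, \quad w_{aa} = 1 - t, \qquad s, t > 0
% p = frequency of allele A, q = 1 - p.
\begin{align}
  \bar{w}  &= p^2 (1 - s) + 2pq + q^2 (1 - t) = 1 - s p^2 - t q^2 \\
  \Delta p &= \frac{p\,(1 - s p)}{\bar{w}} - p = \frac{p q \,(t q - s p)}{\bar{w}} \\
  \Delta p = 0 \text{ with } 0 < p < 1
           &\;\Longrightarrow\; t\,(1 - p^{*}) = s\,p^{*}
            \;\Longrightarrow\; p^{*} = \frac{t}{s + t}
\end{align}
```

Because s and t are both positive, p* always lies strictly between 0 and 1, which is the "stable frequency less than 100%" described above. If the population is below p*, the allele increases; if it is above p*, the allele decreases.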
How does this apply to the climate genes paper?
The authors of the paper assessed whether selection for climate adaptation had acted on each genetic variant by asking a simple question: does the frequency of this variant correlate with the climate experienced by each population? In other words, if you drew a graph with allele frequency on one axis and some climate variable on the other, would there be a significant linear relationship between the two parameters?
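As a rough illustration of that question, here is a minimal Python sketch that computes a plain Pearson correlation between allele frequency and a climate variable. To be clear about what is assumed: the function name, the variable names and all of the numbers are made up for this example, and the paper's actual statistics are not described in this post and were presumably more careful than this (for instance, in handling the shared ancestry of neighbouring populations).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers purely for illustration: one entry per hypothetical population.
winter_temp_c = [-20.0, -5.0, 5.0, 15.0, 25.0]
allele_freq = [0.80, 0.62, 0.45, 0.30, 0.12]

r = pearson_r(winter_temp_c, allele_freq)
print(f"r = {r:.3f}")  # strongly negative for these invented values
```

A value of r close to +1 or -1 corresponds to the kind of tight linear trend the authors were looking for; a value near 0 means allele frequency tells you little about climate.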
An example of one of the correlations they saw in their study is shown below: the pie charts on the map show the frequency of a variant in the RAPTOR gene, while the map is coloured by winter maximum temperature; the graph shows allele frequency (on the y axis) plotted against a summary statistic of winter climate (on the x axis), with populations colour-coded by geographical region. You can see that there's a reasonable linear trend on the graph.

The authors, quite reasonably, interpret a significant linear relationship as evidence of climate-driven adaptation - in other words, the frequency of these variants has been altered by natural selection as they provided an advantage or disadvantage under particular climatic conditions.
But what type of selection are we talking about? The authors suggest that their data are "consistent with a signal of spatially varying positive selection" - in other words, positive selection for something like, say, cold resistance, which will obviously vary in intensity between different geographical regions. Under this model, a cold-resistance variant will tend to increase faster in populations living in colder areas than in those living in warmer areas, and if a cold-adapted population moves into warmer climates (like those Siberians moving down into South America) the variant will presumably tend to decrease in frequency again as the old, warm-adapted allele increases.
What this means is that once a population settles down in a particular region, a cold-resistance allele will either head steadily towards 100% (if the region is cold), move steadily down towards 0% (if the region is warm), or drift around slowly under no particular selection until it reaches 0% or 100% by chance (if the region is somewhere in the middle).
Would this type of selection actually result in a neat linear trend, like that seen for the RAPTOR gene? Well, it might, if the timing was just right, but it's by no means a necessary outcome. There are at least three variables in play here, each of which will have some effect on the current frequency of a positively selected allele: the strength of selection, the starting frequency of the allele in that population, and the amount of time the population has existed in its current environment. For positive selection to result in a clean linear correlation between allele frequency and a climate variable, the latter two factors would have to have had a negligible impact, so that most of the variation is determined by selection intensity.
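To see how those three variables interact, here is a small Python sketch of a deliberately simple model of positive selection. The assumptions are mine, not the paper's: selection acts on each allele copy independently (so the odds p/(1-p) are multiplied by 1+s every generation), the population is effectively infinite (no drift), and there is no migration. The function name and all parameter values are arbitrary choices for illustration.

```python
def frequency_after(p0, s, generations):
    """Allele frequency after some generations of deterministic positive selection.

    p0: starting frequency of the favoured allele
    s: selective advantage per allele copy (assumed constant)
    generations: number of generations spent under this selection
    """
    p = p0
    for _ in range(generations):
        # Favoured copies are weighted by (1 + s); renormalise to get a frequency.
        p = p * (1 + s) / (1 + s * p)
    return p


# Hypothetical population 1: stronger selection, but only 200 generations in place.
strong_short = frequency_after(p0=0.05, s=0.02, generations=200)

# Hypothetical population 2: weaker selection, but 1000 generations in place.
weak_long = frequency_after(p0=0.05, s=0.01, generations=1000)

# Hypothetical population 3: same selection and time as population 1,
# but the allele happened to start out more common.
strong_short_common = frequency_after(p0=0.30, s=0.02, generations=200)

for label, p in [
    ("strong selection, short time", strong_short),
    ("weak selection, long time", weak_long),
    ("strong selection, short time, common start", strong_short_common),
]:
    print(f"{label}: {p:.3f}")
```

In this toy model the population under weaker selection ends up with the higher allele frequency simply because it has had longer to get there, and changing the starting frequency alone also shifts the result. Current frequency on its own is therefore a poor readout of selection strength.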
I think that's pretty unlikely given what we know about human population history: Native Americans, for instance, are the descendants of a cold-adapted population living in Siberia that only relatively recently moved down into the warmer climates of Central America; selection has not yet had much time to act in these populations. In contrast, humans in Southern Asia have been in their current climate much longer, giving selection more time to do its work. Thus for variants under positive selection, current frequency will be substantially affected by historical contingencies, and the correlation between allele frequency and selective strength will be rough at best.
Another (less convincing) reason to be sceptical of positive selection is that the genes reported in this paper don't seem to show the classic genetic signatures of this type of selection. Positive selection can be detected using a measure known as the integrated haplotype score (iHS), as described in this study. iHS for most of the genes in the human genome in Europeans, East Asians and West Africans can be analysed using this handy online browser; of the sixteen genes that are most significant in the climate genes study, only one (LEPR) shows significant evidence of positive selection, and one other (MAPK1) is a marginal case. The others show no striking evidence of recent positive selection.
The iHS test is imperfect in a number of ways - it certainly misses some types of positive selection, and it may not be sensitive enough to pick up very local selection (e.g. selection acting on just a few smaller populations within East Asia, rather than East Asia as a whole). But the absence of clear signals of positive selection for these genes adds to the case that simple positive selection may not be the major player here1.
If not positive selection, then what?
You've probably already guessed my hypothesis: at least some of the genes pulled out from this study (and probably the ones with the tightest correlations) have been the targets of balancing selection. Balancing selection could be acting on climate genes in different ways, but my first guess would be heterozygote advantage.
In this model, individuals who carry one copy of the (say) cold-resistance allele - that is, heterozygotes - are favoured in cold climates, but those who carry two copies (homozygotes) suffer from some type of deficit that makes them less fit than the heterozygotes. There are various plausible ways that this could happen: for instance, in creating its cold-resistance effects the variant might disrupt an important normal function of the gene, so that people need to have at least one copy of the "normal" version to be healthy.
Under this scenario the frequency of the cold-adapted allele will increase in cold climates, until it reaches a high enough frequency that the benefits of being cold-adapted are outweighed by the increased risk of having sick homozygote kids (since at high allele frequencies, you're more likely to mate with another heterozygote). The equilibrium frequency will depend on two factors: the fitness benefit of being a heterozygote, and the fitness cost of being a homozygote. Since heterozygotes will be fitter in colder climates, this equilibrium frequency will be higher in those places than in warm climates2.
This model would result in each population reaching a stable allele frequency that is correlated with the local temperature, regardless of its starting frequency and how long the population had been subjected to that particular environment - so long as there has been enough time for the population to reach equilibrium. This scenario is much more likely to result in a linear correlation between allele frequency and climate variables than a simple positive selection model.
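Here is a minimal Python sketch of the heterozygote advantage model, just to check that it behaves the way I've described. Everything specific in it is an assumption made for illustration: the three genotype fitnesses are set to 1 ("normal" homozygote), 1 + benefit (heterozygote) and 1 - cost (cold-allele homozygote), mating is random, the population is effectively infinite, and the names `benefit` and `cost` and their values are hypothetical. With those fitnesses, the standard overdominance algebra gives an equilibrium frequency of benefit / (2 * benefit + cost).

```python
def next_frequency(p, benefit, cost):
    """One generation of selection on the cold-adapted allele (frequency p)."""
    q = 1.0 - p
    w_hom_cold = 1.0 - cost      # two copies: pays the (climate-independent) cost
    w_het = 1.0 + benefit        # one copy: gets the climate-dependent benefit
    w_hom_normal = 1.0           # no copies: baseline fitness
    mean_w = p * p * w_hom_cold + 2 * p * q * w_het + q * q * w_hom_normal
    # Share of next-generation allele copies that are the cold-adapted allele.
    return (p * p * w_hom_cold + p * q * w_het) / mean_w


def simulate(p0, benefit, cost, generations):
    """Iterate the deterministic recursion from starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, benefit, cost)
    return p


def predicted_equilibrium(benefit, cost):
    """Equilibrium frequency implied by the assumed fitnesses."""
    return benefit / (2 * benefit + cost)


cost = 0.10  # hypothetical cost to homozygotes, the same in every climate
for benefit in (0.01, 0.03, 0.05):  # hypothetical: larger benefit = colder climate
    from_rare = simulate(0.01, benefit, cost, generations=5000)
    from_common = simulate(0.90, benefit, cost, generations=5000)
    print(
        f"benefit={benefit:.2f}: from rare {from_rare:.4f}, "
        f"from common {from_common:.4f}, "
        f"predicted {predicted_equilibrium(benefit, cost):.4f}"
    )
```

In this toy version a population ends up at the same frequency whether the allele starts rare or common, and that frequency rises with the heterozygote benefit. The relationship between benefit and equilibrium is monotonic rather than strictly linear, so how straight the line looks on a real graph would also depend on how fitness scales with the climate variable.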
So what type of "deficit" might homozygotes suffer from, for the genes explored in this paper? Well, remember what class of genes were specifically targeted in this analysis: genes known to be associated with metabolic disease. This gene set may well be enriched for variants which provide an adaptive boost to novel climates or food resources in the heterozygous state, but which increase the risk of metabolic disease (be it mild or severe) when homozygous.
So what?
Note that I'm not saying that positive selection hasn't played a role in climate adaptation - in fact, I'm sure it's played a much bigger role than balancing selection - but rather that the straightforward correlation of allele frequency with climate variables used in this paper has much more power to detect balancing selection, so many of the genes they pick up are likely to be due to balanced polymorphisms. Efficiently detecting genetic adaptation to climate driven by positive selection will require a more complex approach - but this is a topic for another day.
A heterozygote advantage situation should be seen as an imperfect and temporary solution to the problem of adaptation, one that evolution uses during the earliest phase of adaptation in the absence of anything better. Ultimately, human populations are likely to stumble across more elegant long-term solutions: fixation (by positive selection) of more subtle "tweaking" mutations that adapt the population to its local climate, without the troublesome handicap of also causing disease in homozygotes. Once this has happened, studies such as this one based on simple allele frequency would be completely unable to uncover climate adaptation genes.
Update: Razib at Gene Expression comments. Hi to new readers from ScienceBlogs - if you like what you see, you can always subscribe to Genetic Future.
1 The lack of an iHS signal also suggests that these alleles have been hanging around in the human population for quite a while; the early phase of balancing selection on a newly arisen allele (a sharp increase in frequency before equilibrium is reached) will create a signal that is identical to that created by positive selection. So if this is balancing selection, either it acted recently on standing neutral variation, or it's been acting for a long time - longer than the 10,000-50,000 years required for an iHS signal to decay to background noise.
2 This assumes that the fitness cost to homozygotes is independent of climate, which will often but not always be true.
5 comments:
"Balancing selection is a special case of selection, which results in multiple versions of the same gene all being maintained at some frequency within the population, sometimes over a very long period of time. Classically, balancing selection occurs in two situations:"
i was taught that environmental & temporal heterogeneity could also do the trick.
Heh, I'm currently struggling through the literature on precisely that issue. I left it out of the post (and used the weasel word "classically") because fluctuating selection is a much more contentious mechanism than overdominance or frequency-dependent selection. However, I'm trying to get to grips with it because it seems like it might be highly relevant to climate adaptation.
The literature is actually surprisingly scarce on this issue, possibly because the added variables (the temporal and spatial scale of variation) make it tough to model. Hopefully I'll be knowledgeable enough to post on it over the next couple of weeks.
check out the chapters in Genetics of Populations by Hedrick.
I liked your post a lot and I find your hypothesis more than convincing...
I'm struggling with clines and selection myself and I would like to ask you a question, which is merely about definitions and terminology: do you think that spatially varying positive selection can be defined as balancing selection, if you take into account the whole population, from North to South? From this point of view the system is maintaining different alleles which are better adapted to different environments, which is exactly what balancing selection is supposed to do.
Thanks a lot!
Hi Valeria,
Yes, I think spatially varying selection absolutely qualifies as a form of balancing selection. As I mentioned to razib above, I didn't discuss this possibility as I wasn't able to find much information on the specific requirements (levels of gene flow, strength of selection) for this type of balancing selection to operate, so I'm unsure how realistic it is for the human population. Razib has referred me to a book chapter that might help once I track it down.
If this is in fact the type of balancing selection that is operating, then my entire post is basically an exercise in semantics - the authors were totally correct when they talked about "spatially varying positive selection", and I'm also correct when I refer to it as "balancing selection"!