Sunday, April 6, 2008

The elusive genetics of bipolar disorder

Bipolar disease is a common and profoundly debilitating mood disorder, with a remarkably strong genetic component. These features have made bipolar an appealing target for geneticists; yet despite three large genome-wide association studies, the genetic basis of this disease remains as elusive as ever.

This post serves as an introduction to the complex genetics of bipolar disease. In later posts, I'll be discussing the scientific basis of commercial tests for bipolar disease offered by the company Psynomics, and the implications of the genetic architecture of bipolar and other mental illnesses for normal variation in human personality traits.

The goal
Identifying individuals at serious risk of developing mental health problems later in life raises the possibility of early interventions, which might save at-risk individuals - and society as a whole - from the worst effects of mental illness. However, in order to develop predictive tests we first need to characterise the underlying risk factors, including causative genes.

Bipolar disease is a serious mood disorder that affects somewhere between 1 and 2% of individuals of European descent. Surprisingly, around 85% of the variation in risk for this disease is determined by heritable factors (i.e. genes), meaning that genetic approaches seem likely to be a fruitful way to develop predictive tests. Accordingly, bipolar has now been a target for three large genome-wide association studies (GWAS) involving a total of 4,684 patients and 6,447 healthy controls.

And the results of these studies have been - well, almost nothing. Thus far, not a single bipolar marker has been convincingly replicated in more than one of these studies. Bipolar disease thus serves as an unfortunate but illuminating poster-child for the limitations of genome-wide association studies that I discussed last week.

The WTCCC analysis

The largest genome-wide association study of bipolar conducted to date was the Wellcome Trust Case Control Consortium (WTCCC) analysis of 3,000 healthy individuals and 2,000 bipolar patients.

The WTCCC study is a truly remarkable piece of science: simultaneous genome-wide analysis in seven different common diseases (bipolar, coronary artery disease, Crohn's disease, hypertension, rheumatoid arthritis, type 1 diabetes and type 2 diabetes).

The diagram below illustrates the results of the genome scans performed by the WTCCC in all seven common diseases (each with ~2,000 patients compared to a shared set of ~3,000 controls). Each dot represents a different genetic variant found in one of our 23 chromosomes (numbered below the line), with its height above the horizontal line indicating how closely it was associated with that disease. Green dots indicate variants that were associated with each disease with a reasonable degree of statistical confidence; blue dots represent variants that couldn't be distinguished from random noise.


Hopefully you can see that while most of the other diseases show a series of nice green peaks indicating regions of the genome that are associated with disease, bipolar disease - and, curiously, hypertension - are relatively flat and featureless wastelands. If you peer closely at the bipolar dots you'll see a few green markers here and there that peek above the statistical noise of the rest of the genome, but nothing like the green towers that furnish the other diseases (for the curious, that massive signal on chromosome 6 in both the rheumatoid arthritis and type 1 diabetes samples is the MHC cluster of immune genes).

In terms of risk genes identified by the study, bipolar disease performed worse than five of the other six diseases, with only one of those green dots reaching convincing genome-wide significance (as opposed to, say, nine in Crohn's disease and seven in type 1 diabetes). And it gets worse: that single significant risk variant disappeared when the researchers used an expanded reference group approach, essentially comparing bipolar samples with a pool of the controls plus the other six disease groups (which can, fairly reasonably, be considered "controls" for this comparison). The authors identified a few other regions with a weak association with the disease, but nothing further that satisfied their stringent criteria for a convincing signal.

Other genome-wide studies
The two other bipolar GWAS performed within the last twelve months haven't made the picture any less murky.

The first ever genome-wide association study of bipolar (published online in May 2007) used a DNA-pooling strategy that is more cost-effective but significantly less powerful than the traditional GWAS approach, followed by replication studies of interesting-looking variants. The authors of this study reported (with surprising confidence) an association between bipolar and a variant in the DGKH gene - the senior author even remarked to the media that "DGKH is a promising target for new treatments that might be more effective and better tolerated" than the existing therapy, lithium. It's not looking so promising now: there's absolutely no trace of the DGKH association in either the WTCCC study or the other genome scan described below. The other, weaker signals seen in this study (with the possible exception of DFNB31, described below) haven't fared much better, receiving no convincing validation from either of the later GWAS.

The third and most recent genome-wide study used a similar approach to the WTCCC analysis, examining 1,461 bipolar patients and 2,008 controls. It tells a now-familiar story: while the authors identified a number of variants that were somewhat more common in bipolar patients than controls, not one of their top 20 regions overlaps with any of the suggestive signals in either of the other two GWAS.

The best the authors could find is an overlap between their results and the WTCCC for a variant found in the CACNA1C gene, but this is a bit of a stretch; this region is not strongly associated in either of the studies and it's entirely possible that the overlap is down to chance. The same is true for the DFNB31 gene: although markers in this gene are weakly associated with bipolar in all three of the GWAS conducted to date, the variants flagged by the WTCCC study are physically distant from those found in the other two studies. These two genes certainly warrant detailed follow-up studies, but they're not convincing risk genes yet.

To add insult to injury, the authors' attempts to replicate their own findings in independent samples bore little fruit: although a few of the findings were marginally statistically significant, the number was no higher than would be expected by chance alone.
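To put "no higher than would be expected by chance" in concrete terms, here is a minimal sketch of the arithmetic. The numbers are purely illustrative (they are not taken from the study), and the calculation assumes the replication tests are independent of one another, which is only roughly true in practice.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results if no variant has any real effect."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k


# Hypothetical replication attempt: 40 variants tested at a nominal 0.05 threshold.
n_tests, alpha = 40, 0.05
print("expected chance hits:", expected_false_positives(n_tests, alpha))
print("P(at least 2 chance hits):", round(prob_at_least(2, n_tests, alpha), 3))
```

Under these made-up numbers, a couple of marginally significant results is roughly what pure noise would produce, so a handful of nominal hits in a replication set is not evidence of anything by itself.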

A bitter harvest
In other words, despite valiant attempts, these three large genome-wide association studies have yielded very little new useful information about the specific genes underlying bipolar risk. They have certainly provided leads to be followed up in targeted studies, and it's worth bearing in mind that a failure to replicate doesn't necessarily mean that all of the variants identified in these studies are false leads - rather, the inconsistency could simply be the result of insufficient power in each individual study, leading each group to identify a random and non-overlapping set of risk genes. But this must seem a pretty distant consolation for the investigators in these studies, given that there's still no way to determine precisely which of the possible associations this applies to.

The near-complete failure of GWAS in this disease does tell us something about the genetic architecture of bipolar: it is not composed to any significant degree of the common, moderate-effect single-base variants that can be readily detected by current chip-based GWAS technologies.

So where, then, is that heritable 85% of bipolar risk hiding in the genome? In their discussion section, the authors of the most recent genome-wide study put forward several explanations, all of which I've discussed in my recent post on the reasons for failure in genome-wide scans: variants with modest effect sizes, population-specific variants, disease heterogeneity, epistatic interactions, copy number variation, and rare variants. Population-specific variants are unlikely to have played a role in the discrepancy between these three studies (all of which were conducted on subjects of predominantly western European ancestry), but there's a pretty good chance that all of the other factors play a role.

The researchers are surely hoping that small effect sizes are the major problem, since this is the easiest problem to remedy (simply increase sample sizes). Disease heterogeneity - in other words, multiple diseases with distinct causes that all converge on a bipolar end-point - also seems like a particularly plausible explanation given the complexities of mental illness. It's also likely that various types of genetic variants that are largely invisible to existing SNP chips, like rare variants and copy-number variation, are important. I'll be discussing these in more detail soon when I review a recent paper on rare copy-number variations in schizophrenia patients.

The next steps
So long as at least some of that heritable bipolar risk stems from common variants with weak effects on disease risk, it will eventually be captured by further genome-wide scans with much larger numbers of bipolar patients and controls. A relatively cheap way to start this will be to combine the results from existing studies: in the discussion section of the most recent paper discussed above, the authors mention plans to perform a combined analysis with the WTCCC investigators. This will be made easier by the fact that both studies were performed using the same (Affymetrix 500K) genotyping platform. Unfortunately, because the first scan (the DNA-pooling study) was performed using a different platform and a sub-optimal DNA pooling strategy it will be more difficult to incorporate its results into a three-way combined analysis.
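For readers curious about what "combining results" can look like, the sketch below shows one standard approach: a fixed-effect, inverse-variance-weighted meta-analysis of per-study effect estimates for a single variant. This is an assumption on my part about how such an analysis might be done, not a description of the investigators' plans (a combined analysis could equally pool the raw genotype data), and the effect sizes and standard errors are hypothetical.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, std_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    betas      : per-study log odds ratios
    std_errors : per-study standard errors of those log odds ratios
    Returns (combined_beta, combined_se, z, two_sided_p).
    """
    weights = [1.0 / se**2 for se in std_errors]
    total_weight = sum(weights)
    combined_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    combined_se = sqrt(1.0 / total_weight)
    z = combined_beta / combined_se
    # Two-sided p-value from the standard normal distribution.
    p = erfc(abs(z) / sqrt(2.0))
    return combined_beta, combined_se, z, p


# Hypothetical variant with the same modest effect estimated in two studies.
beta, se, z, p = fixed_effect_meta(betas=[0.10, 0.10], std_errors=[0.05, 0.05])
print(f"combined beta={beta:.3f}, se={se:.4f}, z={z:.2f}, p={p:.4f}")
```

The point of the exercise is that the combined standard error is smaller than either study's own, so a weak but consistent signal gains statistical support once the studies are pooled.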

Some time back, the National Institute of Mental Health announced that it was pledging $5 million for genomic approaches to bipolar disorder and schizophrenia, which will help to pay for the recruitment and genotyping of new patients and controls. With ten thousand or so patients and controls, the power of genome scans to detect low-risk variants will be dramatically higher than the studies discussed here - so if there are in fact common bipolar variants out there with an effect worth caring about, we should know about them within the next few years.
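To give a rough feel for how much sample size matters, here is a back-of-the-envelope power calculation for a simple allele-frequency comparison between cases and controls, using a normal approximation. Everything specific in it is an assumption chosen for illustration: the risk-allele frequency (0.3), the odds ratio (1.2), the significance threshold (5e-7), and the reading of "ten thousand or so" as 10,000 cases plus 10,000 controls. Real studies use more careful models.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-7):
    """Approximate power of a two-sided allele-frequency test (normal approximation)."""
    norm = NormalDist()
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = case_odds / (1 + case_odds)
    # Each person contributes two alleles to the comparison.
    se = sqrt(
        case_freq * (1 - case_freq) / (2 * n_cases)
        + control_freq * (1 - control_freq) / (2 * n_controls)
    )
    shift = abs(case_freq - control_freq) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands beyond either critical value.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical variant: 30% frequency in controls, odds ratio of 1.2.
wtccc_sized = allelic_power(2000, 3000, control_freq=0.3, odds_ratio=1.2)
larger_study = allelic_power(10000, 10000, control_freq=0.3, odds_ratio=1.2)
print(f"~2,000 cases / 3,000 controls:   power = {wtccc_sized:.2f}")
print(f"~10,000 cases / 10,000 controls: power = {larger_study:.2f}")
```

Under these assumptions a study of roughly WTCCC size would miss such a variant most of the time, while the larger study would almost always catch it, which is the intuition behind simply recruiting more patients and controls.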

Of course, it seems increasingly unlikely that common variants constitute more than a small proportion of the total genetic risk for bipolar, so future studies will need to dig deeply into the less well-mapped regions of human genetic variation. In the immediate future we will see more studies of copy-number variation using high-resolution arrays, but ultimately (once costs drop low enough) the real answers are likely to come from large-scale sequencing studies. Sequencing will detect both rare variants and copy-number variation; however, it will take large sample sizes and some clever analysis to make sense of the huge volumes of data that it generates. Improved diagnostic approaches - perhaps brain-imaging technologies - that allow bipolar patients to be divided into distinct clinical sub-groups (potentially with separate genetic etiologies) may also prove useful.

A final note: while bipolar serves as a rather extreme example of the failure of genome-wide association with common markers, it's also a reminder of the vast swaths of genetic risk that remain unexplained in nearly all other common diseases as well as complex non-disease traits. Bipolar researchers are thus not alone in their frustrations, and the lessons learned in identifying the genes underlying this condition will be highly relevant to other common diseases.



Burton, P.R., et al. (2007). Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 447(7145), 661-678. DOI: 10.1038/nature05911

Baum, A.E., et al. (2008). A genome-wide association study implicates diacylglycerol kinase eta (DGKH) and several other genes in the etiology of bipolar disorder. Molecular Psychiatry, 13(2), 197-207. DOI: 10.1038/sj.mp.4002012

Sklar, P., et al. (2008). Whole-genome association study of bipolar disorder. Molecular Psychiatry. DOI: 10.1038/sj.mp.4002151

4 comments:

G said...

It is also interesting to note that admixture mapping scans (Deo et al) have also failed to find much of a signal for hypertension.

Daniel said...

Thanks G,

I'm surprised that hypertension, of all common diseases, has proved so difficult to find genes for. As I understand it, predisposition to hypertension is ancestral (due to salt retention being beneficial in our hot, dry African homeland) but has been selected against in populations in cooler, wetter northern climates. Because this adaptation is relatively recent, I would have guessed that some of the large-effect alleles that typically characterise early adaptation would still be segregating, i.e. there should be some reasonably common large-effect protective variants.

But in hindsight, maybe there's been enough time for them to reach fixation (assuming strong selection), leaving only the small-effect nearly neutral alleles to drift around and drive phenotypic variation in modern non-African populations.

G said...

I too am slightly surprised that mapping of hypertension has not been easier. If the variants had been driven to fixation by selection outside of Africa, this would make it pretty ideal for admixture mapping even if not for association mapping.

I'm not very sure about this but I think there might be different forms of hypertension, some of which are more strongly sensitive to salt. If so, I wonder if hypertension is a disease where better phenotyping will really help reduce the noise.

Gerry said...

The genetics of manic-depression are elusive because the DSM-IV diagnostic criteria encompass a wide range of different psychiatric disorders, combining elements of major depressive disorder, hypomania, and severe mania without depression.

This is similar to 40 years ago when the bulk of patients with psychiatric disease were categorized as 'schizophrenic' or 'schizoid'.