If you're here looking for Daniel MacArthur's blog Genetic Future, click here.
Thursday, November 6, 2008
Genetic Future has moved to a new home
Wednesday, October 15, 2008
Navigenics to add gene sequencing to its personal genomic service
Navigenics has announced in the industry publication In Sequence (subscription only) that it plans to add gene sequencing to its personal genomics service. This would make it the first of the "Big Three" personal genomics companies (Navigenics, 23andMe and deCODEme) to offer analysis of rare as well as common genetic variants.
The move into sequencing has always been inevitable for the personal genomics industry. Currently all three of the major players in the affordable personal genomics field (as opposed to Knome's high-end service) use chip-based technology to analyse up to a million common sites of variation, known as SNPs, scattered throughout the genome. SNP chips provide remarkable insight into common variants (that is, variations with a frequency of 5% or greater in the general population), but they don't provide any real information about rarer variants - particularly those with a frequency of less than 1%.
It has become increasingly clear over the last few years that common variants play a disappointingly small role in most common diseases, as SNP chip approaches on ever-larger sample sizes have consistently failed to find the majority of disease-causing variation. Rather it appears likely that a substantial proportion of disease risk lurks in individually rare, large-effect polymorphisms - variants found in just a small fraction of the population, each contributing a substantial increase in disease risk. Most of these variants will never be picked up by SNP chip technology, so new approaches will be required to find them - and that's where sequencing comes in. By determining the complete DNA code within a set of target genes, sequencing identifies both common and rare variants alike.
Until now, the shift of the personal genomics industry into the sequencing market has been held back by two major barriers: cost, and the difficulty of interpreting rare variants. The first barrier is dropping with alarming speed, but the second is still a major challenge - and one that will pose some serious dilemmas for Navigenics and other companies as they launch their sequencing ventures.
Of course, these aren't new dilemmas: molecular diagnostics labs have for decades faced the challenge of determining whether or not a novel mutation is disease-causing, in the context of both rare Mendelian diseases (like muscular dystrophy) and complex diseases such as breast cancer (BRCA1 mutation analysis is a particularly subtle art that probably warrants its own post). Navigenics will thus be taking advantage of the experience and the databases of a company called Correlagen Diagnostics, which already offers sequencing-based tests for a range of known disease-causing genes. I don't know enough about Correlagen to comment on their expertise, but it certainly makes sense for personal genomics companies to team up with experienced molecular diagnostics teams as they face the challenges of the sequencing era.
Navigenics will initially restrict the complexity of the problem by focusing on a set of known disease genes, and will draw on Correlagen's database to see if any new variants they find in a client's genes are known to be associated with diseases in other patients. However, most of the possible disease-causing variants they find will be completely novel - such is the nature of rare variants - and their disease-causing status will thus need to be predicted de novo. Navigenics' solution is roughly laid out in the In Sequence article:
In many cases, though, a rare gene variant will never have been seen before and, thus, be more difficult to interpret. Based on the variant's properties, like its evolutionary conservation, or whether it results in an amino acid change, Navigenics will attempt to assign it a probability score that predicts its clinical relevance. "And that's a really, really hard problem," Stephan said.
What is needed, he said, is sequencing-based genome-wide association studies. "What you ideally would want to do is take thousands of people with a complex genetic disease and thousands of people without one, sequence their entire exomes, and look for hotspots of accumulation of rare variants in certain regions of the genome [of cases vs. controls, where] the specific variants look like they have some sort of functional consequences."
"Then, the next time a person comes through the door, you can start to informatically stratify the loci that you see variants in ... based on sequencing all these genomes."
I think Stephan is under-estimating the sample sizes required for these studies to be effective - we're talking hundreds of thousands of whole genomes, at least - but the overall message is on-target, and it's not good news for personal genomics customers expecting to find out what their genome means right now. It's going to take a long time and a tremendous amount of work before de novo functional prediction becomes a reliable proposition.
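To make the idea of looking for "hotspots of accumulation of rare variants" in cases versus controls a little more concrete, here is a minimal sketch of one simple way such a comparison could be run: a gene-level "burden" test that asks whether more cases than controls carry at least one rare variant in a region. This is my own illustration, not the method described in the In Sequence article; the function names and the example counts are hypothetical, and a one-sided Fisher's exact test is just one of several reasonable choices.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: cases carrying a rare variant      b: cases not carrying one
    c: controls carrying a rare variant   d: controls not carrying one
    Returns P(carriers among cases >= a) under the null of no enrichment.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_cases, n_carriers)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    numerator = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n_total, n_carriers)


def burden_test(case_carriers, n_cases, control_carriers, n_controls):
    """Hypothetical helper: p-value for rare-variant enrichment in cases."""
    return fisher_exact_one_sided(
        case_carriers,
        n_cases - case_carriers,
        control_carriers,
        n_controls - control_carriers,
    )


# Made-up counts for a single gene: 30 of 1,000 cases and 10 of 1,000
# controls carry at least one rare variant.
print(burden_test(30, 1000, 10, 1000))
```

In a real study this test would be repeated across thousands of genes, so any p-value threshold would need correcting for multiple testing, which is one reason the sample sizes required become so large.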
Of course, that's not going to stop personal genomics companies from staking out claims in the sequencing arena, and from offering risk predictions from rare variants - however provisional and imperfect - to customers. 23andMe has long expressed interest in a sequencing approach, although co-founder Linda Avey is coy about the company's ambitions in the In Sequence article:
"23andMe is closely following the next-generation sequencing field and will offer an expanded service when the data quality, balanced by the cost, of these offerings meets our criteria," said Linda Avey, co-founder of 23andMe, in an e-mail message. Once the company decides to include sequencing analysis in its service, "we will examine any and all sequencing companies in determining which would work best with our platform," she said.
Navigenics will apparently be offering whole-exome sequencing (analysis of the protein-coding regions of all genes in the genome) some time next year, and complete genome sequencing at some stage after that. You can bet that 23andMe's desire to remain at the lead of the personal genomics industry will ensure that Navigenics will not be alone; at the same time, the whole-genome sequencing services offered by industry pioneer Knome and other emerging players will be dropping to affordable levels. When you throw in the current obscene rate of change in the sequencing technology sphere, this is likely to turn into a chaotic and fascinating race.
Baldness genes: one old, one new
From a geneticist's point of view, male pattern baldness - also known as androgenic alopecia - is a tempting target. Baldness is common in the general population, with a prevalence that increases sharply with age (as a rule of thumb, a male's percentage risk of baldness is approximately equal to his age, e.g. 50% at age 50, and 90% at age 90), so there is no shortage of cases to study. It's also a strongly heritable trait, with about 80% of the variation in risk being due to genetic factors. Finally, baldness has been reported to be associated with a wide range of diseases such as prostate cancer, heart disease and diabetes, so learning about the genes that underlie this condition may help to dissect out the molecular pathways behind more serious disorders.
So it was only a matter of time before researchers targeted baldness with their favoured tool of the moment, the genome-wide association study. This week two separate groups published the results of genome-wide scans for baldness genes in the prestigious journal Nature Genetics. In both cases, their findings strongly support a known genetic association with the androgen receptor gene on the X chromosome, and also highlight a new region on chromosome 20 with a smaller (but still significant) effect on baldness risk.
I'm a little late to the party on this story - see posts by Razib, Hsien, Grace and Erin from 23andMe - but there are some interesting facets that warrant a little extra attention.
The candidate gene approach got there first
One of the most striking findings of these studies is the massive signal of association around the androgen receptor (AR) gene, which is located on the X chromosome - it's a clear outlier on the signal plot shown below (each dot is a single genetic variant, with each chromosome labelled in alternating colours, and the height on the Y axis is the strength of the association with baldness). In contrast, the novel association on chromosome 20 is fairly modest.
The unusual thing about this signal is that the association between the AR gene and male pattern baldness has been known since 2001, when it was reported by Justine Ellis from the University of Melbourne (as an aside, The Spittoon erroneously suggests that the first report was in 2005). This is unusual because the pre-genomic era of "candidate gene" association studies, in which only a few selected genes at a time were screened for associations with a disease or trait, was notoriously bad at finding the most important genes. In most cases, the top hits in recent genome-wide association studies are in genes that would never have been identified by the candidate gene approach (e.g. FTO and obesity, the 5p13.1 gene desert in Crohn's disease). Baldness thus represents a rare success story for the candidate gene approach.
The androgen receptor was originally selected for analysis by Ellis on the basis of biological plausibility - it's well-known that baldness is associated with the testosterone pathway, and the androgen receptor is the molecule that signals testosterone's presence to cells all over the body. This means the chromosome X result from these genome-wide studies comes with an immediate biological explanation; unfortunately, the same cannot be said for the chromosome 20 signal.
It's unclear which gene is the culprit on chromosome 20
Both papers highlight the same stretch of DNA on chromosome 20 as the second strongest signal of association (although the two studies highlight different markers as the top hit, both top markers fall within a region of high linkage disequilibrium - which is just a fancy way of saying that they're almost always inherited together, so they're almost certainly both tagging the same underlying causal variant). However, unlike the chromosome X story, there's no obvious candidate gene lurking in this region - the nearest gene (PAX1) is almost 200,000 base pairs away, and has no known role in the testosterone pathway.
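For readers who want to see what "high linkage disequilibrium" means numerically, here is a minimal sketch of the standard r² statistic between two markers, computed from counts of the four possible two-marker haplotypes. The haplotype counts below are invented for illustration and are not taken from either paper; the function name is my own.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic markers from haplotype counts.

    n_AB: haplotypes carrying allele A at marker 1 and allele B at marker 2,
    and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at marker 2
    d = p_AB - p_A * p_B          # D: departure from independent inheritance
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either marker is monomorphic")
    return d * d / denom


# Two markers that always travel together: r^2 is 1.
print(ld_r_squared(50, 0, 0, 50))
# Two markers inherited independently: r^2 is 0.
print(ld_r_squared(25, 25, 25, 25))
```

When r² is close to 1, genotyping either marker tells you almost everything about the other, which is why two studies can report different top markers while still pointing at the same underlying signal.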
One of the studies provides experimental data showing that PAX1 is expressed in the scalp - but it's also expressed (and at much higher levels) in muscle and thymus, so this isn't compelling evidence of a causal role in hair loss. It will take some serious experimental work to unravel the real genetic culprit in this region.
HairDX may be testing the right gene, but the wrong marker
The genetic testing company HairDX offers testing of androgen receptor variants to predict the risk of premature baldness in both males and females. For their male test they examine the marker rs6152, which is located close to the beginning of the androgen receptor gene, but more than 250,000 bases away from the best hit in either of the two genome-wide studies. This suggests that the predictions made by the HairDX test could well be substantially improved by shifting to different markers (and, of course, incorporating markers from the chromosome 20 region).
I'll be discussing the current HairDX tests in more detail over the next few weeks. For the moment, let's just say that they're not something I'll be rushing out to purchase any time soon.
Genes --> baldness cure?
There are very few things on the internet more depressing than a Google search for "baldness cure" - in a single click you are transported into a sordid world of shame, desperation and rampant greed; ad-riddled forums for lonely men looking for a way to restore their once-luxurious manes, and an army of clinicians and researchers willing to sacrifice their credibility for a share of the resulting cash. As in any medical arena fuelled by desperation, that cash is plentiful (one of the Nature Genetics studies notes that annual sales of a single anti-baldness treatment recently surpassed $405 million).
To pharmaceutical companies baldness must be almost as good a target as obesity: it's extremely common, afflicts the wealthy as well as the poor, and its sufferers will readily fork over cash for a potential cure. But to find effective treatments, big pharma needs to have a clear idea of how baldness occurs at the molecular level - and that, in theory, is where genetic studies can help. By finding new genetic variations that influence baldness risk, genome scans might highlight unexpected pathways that ultimately lead to new drug targets.
However, these two new studies haven't provided much to help feed the wallets of pharmaceutical executives: the androgen pathway has long been known to influence baldness risk, and is already targeted by a number of existing baldness drugs (e.g. finasteride, a.k.a. Propecia), while the chromosome 20 region doesn't yield any clear-cut targets or clues regarding baldness pathways.
Judging from the chromosome scan shown above there are no more low-hanging genes on the baldness tree; it's going to be extremely difficult (i.e. requiring much larger sample sizes and/or different research approaches, such as large-scale sequencing) to drill down to find the next tier of small-effect risk genes. However, that's precisely what will be required for effective molecular dissection of the genetic basis of baldness.
Perhaps if a fraction of the money from online sales of dubious baldness therapies went into actual hair loss research we'd have answers more quickly - but I won't be holding my breath.
References
J Brent Richards, Xin Yuan, Frank Geller, Dawn Waterworth, Veronique Bataille, Daniel Glass, Kijoung Song, Gerard Waeber, Peter Vollenweider, Katja K H Aben, Lambertus A Kiemeney, Bragi Walters, Nicole Soranzo, Unnur Thorsteinsdottir, Augustine Kong, Thorunn Rafnar, Panos Deloukas, Patrick Sulem, Hreinn Stefansson, Kari Stefansson, Tim D Spector, Vincent Mooser (2008). Male-pattern baldness susceptibility locus at 20p11 Nature Genetics DOI: 10.1038/ng.255
Axel M Hillmer, Felix F Brockschmidt, Sandra Hanneken, Sibylle Eigelshoven, Michael Steffens, Antonia Flaquer, Stefan Herms, Tim Becker, Anne-Katrin Kortüm, Dale R Nyholt, Zhen Zhen Zhao, Grant W Montgomery, Nicholas G Martin, Thomas W Mühleisen, Margrieta A Alblas, Susanne Moebus, Karl-Heinz Jöckel, Martina Bröcker-Preuss, Raimund Erbel, Roman Reinartz, Regina C Betz, Sven Cichon, Peter Propping, Max P Baur, Thomas F Wienker, Roland Kruse, Markus M Nöthen (2008). Susceptibility variants for male-pattern baldness on chromosome 20p11 Nature Genetics DOI: 10.1038/ng.228
Subscribe to Genetic Future.
From a geneticist's point of view, male pattern baldness - also known as androgenic alopecia - is a tempting target. Baldness is common in the general population, with a prevalence that increases sharply with age (as a rule of thumb, a male's percentage risk of baldness is approximately equal to his age, e.g. 50% at age 50, and 90% at age 90), so there are no shortage of cases to study. It's also a strongly heritable trait, with about 80% of the variation in risk being due to genetic factors. Finally, baldness has been reported to be associated with a wide range of diseases such as prostate cancer, heart disease and diabetes, so learning about the genes that underlie this condition may help to dissect out the molecular pathways behind more serious disorders.So it was only a matter of time before researchers targeted baldness with their favoured tool of the moment, the genome-wide association study. This week two separate groups published the results of genome-wide scans for baldness genes in the prestigious journal Nature Genetics. In both cases, their findings strongly support a known genetic association with the androgen receptor gene on the X chromosome, and also highlight a new region on chromosome 20 with a smaller (but still significant) effect on baldness risk.
I'm a little late to the party on this story - see posts by Razib, Hsien, Grace and Erin from 23andMe - but there are some interesting facets to this story that warrant a little extra attention.
The candidate gene approach got there first
One of the most striking findings of these studies is the massive signal of association around the androgen receptor (AR) gene, which is located on the X chromosome - it's a clear outlier on the signal plot shown below (each dot is a single genetic variant, with each chromosome labelled in alternating colours, and the height on the Y axis is the strength of the association with baldness). In contrast, the novel association on chromosome 20 is fairly modest.

The unusual thing about this signal is that the association between the AR gene and male pattern baldness has been known since 2001, when it was reported by Justine Ellis from the University of Melbourne (as an aside, The Spittoon erroneously suggests that the first report was in 2005). This is unusual because the pre-genomic era of "candidate gene" association studies, in which only a few selected genes at a time were screened for associations with a disease or trait, was notoriously bad at finding the most important genes. In most cases, the top hits in recent genome-wide association studies are in genes that would never have been identified by the candidate gene approach (e.g. FTO and obesity, the 5p13.1 gene desert in Crohn's disease). Baldness thus represents a rare success story for the candidate gene approach.
The androgen receptor was originally selected for analysis by Ellis on the basis of biological plausibility - it's well-known that baldness is associated with the testosterone pathway, and the androgen receptor is the molecule that signals testosterone's presence to cells all over the body. This means the chromosome X result from these genome-wide studies comes with an immediate biological explanation; unfortunately, the same cannot be said for the chromosome 20 signal.
It's unclear which gene is the culprit on chromosome 20
Both papers highlight the same stretch of DNA on chromosome 20 as the second strongest signal of association (although the two studies highlight different markers as the top hit, both top markers fall within a region of high linkage disequilibrium - which is just a fancy way of saying that they're almost always inherited together, so they're almost certainly both tagging the same underlying causal variant). However, unlike the chromosome X story, there's no obvious candidate gene lurking in this region - the nearest gene (PAX1) is almost 200,000 base pairs away, and has no known role in the testosterone pathway.
One of the studies provides experimental data showing that PAX1 is expressed in the scalp - but it's also expressed (and at much higher levels) in muscle and thymus, so this isn't compelling evidence of a causal role in hair loss. It will take some serious experimental work to unravel the real genetic culprit in this region.
HairDX may be testing the right gene, but the wrong marker
The genetic testing company HairDX offers testing of androgen receptor variants to predict the risk of premature baldness in both males and females. For their male test they examine the marker rs6152, which is located close to the beginning of the androgen receptor gene, but more than 250,000 bases away from the best hit in either of the two genome-wide studies. This suggests that the predictions made by the HairDX test could well be substantially improved by shifting to different markers (and, of course, incorporating markers from the chromosome 20 region).
I'll be discussing the current HairDX tests in more detail over the next few weeks. For the moment, let's just say that they're not something I'll be rushing out to purchase any time soon.
Genes --> baldness cure?
There are very few things on the internet more depressing than a Google search for "baldness cure" - in a single click you are transported into a sordid world of shame, desperation and rampant greed; ad-riddled forums for lonely men looking for a way to restore their once-luxurious manes, and an army of clinicians and researchers willing to sacrifice their credibility for a share of the resulting cash. As in any medical arena fuelled by desperation, that cash is plentiful (one of the Nature Genetics studies notes that annual sales of a single anti-baldness treatment recently surpassed $405 million).
To pharmaceutical companies baldness must be almost as good a target as obesity: it's extremely common, afflicts the wealthy as well as the poor, and its sufferers will readily fork over cash for a potential cure. But to find effective treatments, big pharma needs to have a clear idea of how baldness occurs at the molecular level - and that, in theory, is where genetic studies can help. By finding new genetic variations that influence baldness risk, genome scans might highlight unexpected pathways that ultimately lead to new drug targets.
However, these two new studies haven't provided much to help feed the wallets of pharmaceutical executives: the androgen pathway has long been known to influence baldness risk, and is already targeted by a number of existing baldness drugs (e.g. finasteride, a.k.a. Propecia), while the chromosome 20 region doesn't yield any clear-cut targets or clues regarding baldness pathways.
Judging from the chromosome scan shown above there are no more low-hanging genes on the baldness tree; it's going to be extremely difficult (i.e. requiring much larger sample sizes and/or different research approaches, such as large-scale sequencing) to drill down to find the next tier of small-effect risk genes. However, that's precisely what will be required for effective molecular dissection of the genetic basis of baldness.
Perhaps if a fraction of the money from online sales of dubious baldness therapies went into actual hair loss research we'd have answers more quickly - but I won't be holding my breath.
References
J Brent Richards, Xin Yuan, Frank Geller, Dawn Waterworth, Veronique Bataille, Daniel Glass, Kijoung Song, Gerard Waeber, Peter Vollenweider, Katja K H Aben, Lambertus A Kiemeney, Bragi Walters, Nicole Soranzo, Unnur Thorsteinsdottir, Augustine Kong, Thorunn Rafnar, Panos Deloukas, Patrick Sulem, Hreinn Stefansson, Kari Stefansson, Tim D Spector, Vincent Mooser (2008). Male-pattern baldness susceptibility locus at 20p11 Nature Genetics DOI: 10.1038/ng.255
Axel M Hillmer, Felix F Brockschmidt, Sandra Hanneken, Sibylle Eigelshoven, Michael Steffens, Antonia Flaquer, Stefan Herms, Tim Becker, Anne-Katrin Kortüm, Dale R Nyholt, Zhen Zhen Zhao, Grant W Montgomery, Nicholas G Martin, Thomas W Mühleisen, Margrieta A Alblas, Susanne Moebus, Karl-Heinz Jöckel, Martina Bröcker-Preuss, Raimund Erbel, Roman Reinartz, Regina C Betz, Sven Cichon, Peter Propping, Max P Baur, Thomas F Wienker, Roland Kruse, Markus M Nöthen (2008). Susceptibility variants for male-pattern baldness on chromosome 20p11 Nature Genetics DOI: 10.1038/ng.228
Thursday, August 21, 2008
The gene for Jamaican sprinting success? No, not really.
Anyone who has walked past a TV set over the last few days will have seen footage of the remarkable Jamaican sprinter Usain Bolt, who comfortably cruised to victory (and a world record) in the Olympic 100 metre sprint, and as I write this has just done precisely the same thing in the 200 metre sprint. The interest in Bolt stems not from the fact that he wins his races, but rather from the contemptuous ease with which he does so.
And Bolt is not the only Jamaican to impress in short distance events in Beijing: the country's women's sprint team took all three medals in their 100 metre dash.
Naturally, these performances have provoked widespread speculation about the basis of Jamaica's sprinting success, and the short-distance prowess of other populations of West African ancestry. One controversial suggestion has drawn the most headlines: that sprinting is in their genes, or rather in one gene in particular - variously referred to as "Actinen A" or "ACTN3".
This gene has been the subject of a recent rash of news stories sparked by Bolt's victories, all of which refer to comments by Jamaican academic Errol Morrison in the Jamaica Gleaner over a month ago. The Gleaner article summarised the (unpublished) results of a collaboration between Morrison and a group at the University of Glasgow:
At the base of sprint speed are the fast-twitch muscle fibres stocked with the speed protein Actinen A. And early data indicate that 70 per cent of Jamaican athletes have the gene for Actinen A. Only 30 per cent of Australian athletes studied had the gene.
(The Gleaner reporter, Martin Henry, astonishingly went on to speculate that this gene may help to explain why Jamaicans are "also disproportionately aggressive and violent".)
The Daily Mail followed up on the story two weeks later with a marginally more coherent account:
What they have found - and Morrison emphasises the findings are preliminary - is that fast men have a special component called Actinen A in their fast-twitch muscles, which determine whether humans are sprinters or plodders. It is found in 70 per cent of Jamaicans. In a control study of Australians, only 30 per cent were found with it.
The "preliminary" nature of the findings didn't stop the Daily Mail reporter from following this paragraph with the conclusion that this result "would seem to explain why Jamaicans punch above their weight among sprinters". Similarly definitive statements were made by other reporters continuing the story after Bolt's 100 metre victory; one rare exception was a fairly well-balanced piece in Slate.
The stories take advantage of a widespread perception - by no means totally unjustified, but nonetheless controversial - that Jamaicans and other groups of West African ancestry have a genetic advantage when it comes to raw muscle power. Having apparent scientific evidence to support this perception is a reporter's dream; the headlines write themselves.
So, how good is this scientific evidence? Does the "Actinen A" gene (whatever that is) actually influence sprinting performance? And if so, does it explain the difference in explosive power between Jamaicans and the rest of the world? The answers, as it turns out, are "probably" and "not really".
The ACTN3 gene and muscle performance
At this point I probably should confess to having a more than casual interest in this story: I was one of the authors on the first study showing an association between this gene and elite athlete status back in 2003, and this gene has been the central focus of my research for a good part of the last six years. (The opinions I express here are purely my own, by the way, and in no way are meant to represent the views of my research institute.)
The ACTN3 gene encodes a protein called α-actinin-3 ("Actinen A" is a misnomer of uncertain origin propagated by lazy reporters), which is found within the fast fibres of muscle - the cells that are required for generating rapid, forceful contraction in activities such as sprinting and weightlifting. Interestingly, the human ACTN3 gene comes in two forms in the general population: there's a normal, functional version called 577R, and a "defective" version called 577X, which contains a single base change that prevents the production of α-actinin-3. People who have two copies of the 577X version (I'll refer to them as X/X) produce absolutely no α-actinin-3 in their fast muscle fibres.
These people don't suffer from muscle disease as a result of this deficiency - in fact, there's a pretty good chance that you're one of them. The frequency of the 577X variant differs around the world, but overall somewhere between one-sixth and one-quarter of the world's population (at least a billion people worldwide) are X/X, and therefore completely deficient in α-actinin-3.
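As a rough illustration of what those X/X frequencies imply about the 577X variant itself, here is a small sketch that converts a homozygote frequency into an allele frequency and the corresponding R/R and R/X genotype frequencies. It assumes Hardy-Weinberg equilibrium (random mating, no selection), which is my simplifying assumption rather than anything measured here, and it treats the world's population as a single pool, which it certainly is not.

```python
from math import sqrt


def hardy_weinberg_from_homozygotes(xx_frequency):
    """Allele and genotype frequencies implied by an X/X frequency.

    Under Hardy-Weinberg equilibrium the X/X frequency equals q^2,
    where q is the frequency of the 577X allele.
    """
    if not 0.0 <= xx_frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    q = sqrt(xx_frequency)   # 577X allele frequency
    p = 1.0 - q              # 577R allele frequency
    return {"q_577X": q, "R/R": p * p, "R/X": 2 * p * q, "X/X": q * q}


for label, freq in (("one-sixth", 1 / 6), ("one-quarter", 1 / 4)):
    result = hardy_weinberg_from_homozygotes(freq)
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Under these assumptions the 577X allele frequency would sit somewhere between about 0.4 and 0.5, meaning that a large share of people carry at least one copy of the "defective" version.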
So lack of α-actinin-3 clearly doesn't destroy your muscle; however, over the last five years we and other groups have assembled evidence suggesting that it does influence how good your muscle is at generating explosive power. We first showed in 2003 that X/X individuals are significantly under-represented among elite Australian sprint/power athletes, suggesting that the absence of α-actinin-3 in X/X individuals is detrimental to optimal muscle power generation. This association has since been replicated in four separate athlete studies by groups in Europe and the US; there is also weaker but reasonably consistent evidence that α-actinin-3 deficiency results in slightly higher endurance capacity, both in human athletes and in a mouse model generated by our group. In addition, several groups have reported that X/X individuals in the general population display lower muscle strength and reduced sprint performance.
Importantly, the latter two studies suggest that the proportion of the variance in strength and sprint performance in the general population explained by the ACTN3 variant is around 2-3%. So for most of us lazy slobs this gene has a pretty trivial effect - almost completely drowned out by noise from the effects of diet, exercise levels and other genes. (Certainly there are dozens or even hundreds of other genes influencing physical performance, some of which - like the ACE gene - have been fairly well-studied, but most of which are completely unknown and uncharacterised; and environmental factors play about as large a role as genes do in traits like muscle strength and cardiorespiratory performance.)
However, even 2-3% can make a striking difference at the very elite level: of the 51 Olympic-level sprint/power athletes analysed in our original study and a follow-up analysis in Greek athletes, not a single individual was X/X (compared to about 10 expected). In fact, X/X Olympian sprint athletes are unusual enough that identifying a single Spanish Olympic short-distance hurdler with α-actinin-3 deficiency was enough to warrant its own publication.
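To get a feel for how unlikely that result would be by chance, here is a rough back-of-envelope sketch in Python. It assumes a background X/X frequency of 18% (borrowed from the European figure quoted later in this post, so not necessarily the control frequency used in either athlete study) and treats each athlete as an independent draw from that background; the function name is purely illustrative.

```python
# Back-of-envelope check: how surprising is zero X/X among 51 athletes?
# ASSUMPTION: a background X/X frequency of 18%, taken from the ~82% of
# Europeans who carry at least one 577R copy. The real control frequencies
# in the Australian and Greek studies may differ.

def expected_and_zero_probability(n_athletes, xx_frequency):
    """Return (expected number of X/X, probability of seeing none at all).

    Treats each athlete as an independent draw from the background
    population, which is a simplification.
    """
    expected = n_athletes * xx_frequency
    p_none = (1.0 - xx_frequency) ** n_athletes
    return expected, p_none


expected, p_none = expected_and_zero_probability(51, 0.18)
print(f"expected X/X athletes: {expected:.1f}")
print(f"chance of seeing none: {p_none:.1e}")
```

Under these assumptions the expected count lands in the same ballpark as the "about 10" quoted above, and the chance of seeing no X/X athletes at all is tiny. Nudging the assumed frequency up or down changes the details but not the overall picture.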
So the absence of α-actinin-3 means very little to most of us, but to a young athlete craving 100 metre Olympic superstardom it could make all the difference in the world. The same could be said of many other genetic variants, of course; Olympic sprinters, essentially, are those unlikely individuals at the vanishing edge of the probability distribution for whom nearly every genetic coin has come up heads.
Does the ACTN3 gene explain Jamaican sprinting prowess?
The underlying argument here is intuitively simple: (1) variation in the ACTN3 gene is strongly associated with elite sprint athlete status; (2) the "sprint" version of ACTN3 is more common in Jamaicans than in individuals of European ancestry; therefore (3) this variant may well play a role in the increased sprinting prowess of Jamaicans relative to Europeans. At first blush this sounds pretty convincing; however, while ACTN3 may play some role in the disproportionate success of Jamaican sprinters, I'd argue that it's likely to be a pretty small one. Here's why:
- The difference in frequency between Jamaicans and Europeans is not as great as it would appear. The articles quoted above describe the proportion of individuals who have two copies of the 577R ("sprint") version of the gene; a more appropriate comparison is the proportion of individuals who have at least one copy of 577R (that is, including both R/R and R/X individuals), since it's only the complete absence of α-actinin-3 that is reliably associated with reduced sprint performance. This starts to look less impressive: it's 98% in Jamaicans compared to about 82% in Europeans. In other words, in both populations a sizeable majority of individuals have an ACTN3 status compatible with elite sprint performance.
- The ACTN3 frequency reported for the Jamaicans by Morrison is not unique to Jamaicans, nor is it particularly surprising - our group has previously reported virtually identical frequencies in individuals from both West Africa (the ancestral source of the bulk of the Jamaican gene pool) and East Africa, in a collaboration with the same group at the University of Glasgow that Morrison has been working with on the Jamaican study. In fact, that study showed that an even higher frequency of α-actinin-3 expression (99%) is found in Kenya, among tribes whose members dominate international long-distance events but are notably scarce in track sprinting; we have more recently found similarly low frequencies of α-actinin-3 deficiency in populations across sub-Saharan Africa. There's simply no clear relationship between the frequency of this variant in a population and its capacity to produce sprinting superstars.
- Finally, when Usain Bolt was pacing restlessly at the starting line of the 100 metre sprint - even in the very first round of Olympic heats - the very low frequency of X/X individuals among Olympic sprinters means he was lined up against a group of athletes who almost certainly all express α-actinin-3! In other words, while the ACTN3 variant may have played a small role in getting Bolt to the Olympics, it can't possibly explain the astonishing advantage he has over his competitors.
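The first point in the list above can be made concrete with a little arithmetic. The sketch below takes the X/X proportions implied by the figures quoted there (2% in Jamaicans, about 18% in Europeans) and, assuming Hardy-Weinberg equilibrium, infers the rest of the genotype breakdown. The equilibrium assumption is mine rather than something reported in the studies, and the function name is just for illustration.

```python
from math import sqrt

# ASSUMPTION: genotypes are in Hardy-Weinberg equilibrium, so the X allele
# frequency is simply the square root of the X/X proportion.

def genotype_frequencies_from_xx(xx_frequency):
    """Infer R/R, R/X and X/X proportions from the X/X proportion alone."""
    x_allele = sqrt(xx_frequency)
    r_allele = 1.0 - x_allele
    return {
        "R/R": r_allele ** 2,
        "R/X": 2.0 * r_allele * x_allele,
        "X/X": x_allele ** 2,
    }


# X/X proportions implied by the "at least one copy of 577R" figures above.
populations = {"Jamaicans": 0.02, "Europeans": 0.18}

for name, xx in populations.items():
    freqs = genotype_frequencies_from_xx(xx)
    at_least_one_r = freqs["R/R"] + freqs["R/X"]
    print(f"{name}: R/R {freqs['R/R']:.0%}, R/X {freqs['R/X']:.0%}, "
          f"at least one R {at_least_one_r:.0%}")
```

Under this assumption the gap in R/R proportions comes out much wider than the gap in people carrying at least one 577R copy, which is why the choice of comparison matters so much for how impressive the headline numbers look.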
Beyond "the gene for speed"
I'm certainly not arguing here that genetics doesn't play any role in Bolt's success - or in the remarkable over-representation of West African descendants in Olympic short-distance track events, or the similarly impressive skew towards East Africans among marathon runners. In fact I think most geneticists would be staggered if genetics played no role, even though direct evidence for underlying genes is currently very thin on the ground.
Rather, my point is that an excessive emphasis on ACTN3 as a major explanation for Jamaican success does a grave disservice to the complex interplay of genetic and environmental factors required for top-level athletic performance. This suggestion goes against everything we've learnt about the genetics of complex traits from recent genome-wide association studies, which have revealed that quantitative traits (like height and body weight) are frequently influenced by dozens to hundreds of genes, each of small effect; if anything, it's likely that athletic performance will be even more genetically complex than these traits. The ACTN3-centred argument also dismisses the importance of Jamaica's impressive investment in the infrastructure and training system required to identify and nurture elite track athletes, the effects of a culture that idolises local track heroes, and the powerful desire of young Jamaicans to use athletic success to lift themselves and their families out of poverty.
It is almost certainly true that Usain Bolt carries at least one of the "sprint" variants of the ACTN3 gene, but then so do I (along with around five billion other humans worldwide). Indeed, I'm fortunate enough to be lugging around two "sprint" copies - but that doesn't mean you'll see me in the 100 metre final in London in 2012. Unfortunately for me, it takes a lot more than one lucky gene to create an Olympian.
(Image: Phil McElhinney.)
Tuesday, August 19, 2008
Misha Angrist reviews personal genomics
Blogger, Personal Genome Project participant and Assistant Professor Misha Angrist has a concise and extraordinarily readable article on the current state of personal genomics at Technology Review. Here's the penultimate paragraph:
"This is where we are in the era of personal genomics: some modest amusement, a few interesting tidbits, a bit of useful information, but mostly the promise of much better things to come. The more people are allowed--encouraged, even--to experiment, the sooner that promise can be realized."
I find myself in complete agreement. Anyone interested in the field should go read the rest.
Subscribe to Genetic Future.
Saturday, August 16, 2008
Venter's exome, and the challenge of rare variants for personal genomics
A team led by J. Craig Venter from the J. Craig Venter Institute has just published another paper on J. Craig Venter's favourite topic: J. Craig Venter.
This study follows up on last year's publication of the complete sequence of Venter's genome, this time reporting a detailed analysis of a small but quite informative fraction of the genome: the exome, which consists of all of the pieces of DNA (called exons) that directly code for protein molecules.
The exome is a favoured target of geneticists. There are two major reasons for this: firstly, the exome is enriched for functional sequence, whereas non-coding DNA has a much higher fraction of non-functional junk; and secondly, we understand protein-coding DNA much better than we do non-coding DNA. If a novel mutation alters a protein sequence, we have algorithms that can predict (with moderate accuracy) how likely it is to alter the function of the cell. In contrast, for most mutations in non-coding DNA we have almost no way to predict whether they are functional or not. So, like the drunkard looking for his keys under the lamp-post because the light is better there, geneticists are inclined to look hardest at the regions where they actually have some chance of finding something they can understand.
Venter's mutations
The article (which is open access, so you can read it yourself) has a number of interesting factoids about Venter's protein-coding genome that are highly relevant to personal genomics:
- The authors identified 10,389 variants predicted to alter protein sequences;
- Of these, most are common (they estimate that 80-85% are present at a frequency of over 5% in the general population);
- About 1,500 of these variants are likely to actually significantly alter protein function, based on the SIFT prediction algorithm - these are the variants most likely to play a role in shaping human variation and common disease risk;
- A variant is twice as likely to be functionally damaging if it is rare (frequency less than 5%) than if it is common (frequency over 5%);
- Several quite unambiguously protein-damaging mutations were also found (74 would introduce an abnormal "stop" signal, while others create "frame-shifts" that alter large regions of an encoded protein), but many of these fall in genes with poor annotation that may well be non-functional;
- Venter carries seven known disease-associated variants, all present in only one copy (i.e. heterozygous);
- The interpretation of all of these data in terms of making actual health predictions is remarkably problematic, an ominous sign for the ~20 wealthy folks getting their genome sequenced by Knome this year.
The authors spell out the problem themselves: "Even if a gene is known to be involved in disease, it is difficult to understand if a variant in the gene will have a phenotypic effect. We found that 99% of the [protein-altering variants] in disease genes could not be characterized by current literature. Different mutations in the same gene can cause different phenotypic effects [49], thus making it difficult to interpret possible phenotypes. Furthermore, some variants have phenotypic effects only under certain environments (see SOD2 and BDNF in Table 2 and [48]). Also, when looking at complex phenotypes, multiple variants in coding and non-coding regions are likely to be involved [63]–[66]. This genetic complexity, as well as exposure to various environmental factors, will need to be taken into account in assessing risk for various diseases."
In other words, it will be quite some time before we can use a genome sequence to make realistic predictions about overall health (except for the unlucky few who carry mutations unambiguously associated with disease, such as a CAG repeat expansion in the HTT gene - in which case the predictions will tend to be dire). The next few years will be interesting times indeed for personal genomics companies, as their ability to generate oodles of genetic data with cheap sequencing increases exponentially faster than their capacity to explain what the data actually mean.
The challenge of rare variants
I want to draw particular attention to the implications of point 4 above (the fact that rare mutations are the most likely to alter protein function, and thus to have an effect on disease risk). The evolutionary basis for this association is trivially clear: if a variant has a serious negative effect on health then in most cases natural selection will keep it at a low frequency in the population, since really sick people tend to have fewer kids. Disease-causing variants can reach high frequencies under certain conditions (if they also provide benefits under certain situations, or if the disease only hits its victims after they've already reproduced, for instance) but all else being equal, evolution's scythe means that you're far more likely to find disease-causing variants at the rare rather than the common end of the spectrum.
The reason this is so problematic is that rare disease-causing variants are also the hardest to find and characterise. I've mentioned a few times that the current crop of genome-wide association studies (GWAS), while reasonably well-powered to detect common disease-causing variants, have virtually no ability to find rare causal variants - even if these variants explain the majority of disease risk. This probably goes some way to explaining why even massive GWAS are capturing only a small proportion of the overall genetic risk for most common diseases.
This arises primarily because the chips used in current GWAS only efficiently "tag" common variants. However, even once this technological barrier is lifted it will still be fiendishly difficult to assign function to rare variants: because there will be many millions of these variants, each at a low frequency, the sample sizes required to find those few associated with disease risk will be mind-bogglingly large - we're talking cohorts of millions of people, all with large-scale sequence data and well-collected information on environment and health. I have no doubt such studies will eventually be done, but it will take many years before we see the results.
And of course, even with such massive cohorts, the rarest variants (those restricted to single families, or even just a few isolated individuals) will still slip through the statistical cracks - but such variants may well be the most important features in the genome sequence of any given individual, the ones disrupting that crucial tumour-suppressor gene or messing with neurotransmitter expression levels. If you have one of these nasty variants, you'll want to know about it, and you'll want to know what it does.
Beyond genetics
Ultimately, geneticists will have to deal with such variants using non-genetic methods. For instance, for many genes it may eventually be possible to create experimental assays that allow researchers to rapidly test whether a novel variant disrupts protein function; the mouse embryonic stem cell assays that can be used to test novel variants in the breast cancer gene BRCA2 are a proof of principle, as well as a demonstration of just how challenging this process will be.
More broadly and ambitiously, we need to build and refine models of how human beings operate at a molecular level, integrating data from many fields of biology. If we understand which proteins interact within which cells, how these interactions influence protein dynamics, and where the binding sites for each interaction lie, we will have a much better chance of inferring the effect of an isolated change in protein sequence on overall cellular function and thus human health. Moving beyond the exome into non-coding DNA will require even more subtle and complex models including protein-DNA binding, the regulation of DNA modification and conformation, and the effects of non-coding RNA.
In other words, ultimate personal genomics - the extraction of every byte of useful predictive information out of an individual's genome sequence - will require nothing less than an atomic-level understanding of the operation of the human machine. Now that is an effort I'd like to see Google throw its weight behind...
(Venter image from Wikimedia Commons.)
Ng, P.C., Levy, S., Huang, J., Stockwell, T.B., Walenz, B.P., Li, K., Axelrod, N., Busam, D.A., Strausberg, R.L., Venter, J.C., Schork, N.J. (2008). Genetic Variation in an Individual Human Exome. PLoS Genetics, 4(8), e1000160. DOI: 10.1371/journal.pgen.1000160
Tuesday, August 12, 2008
How well does your genome predict your postcode?
Well, it's far from GPS precision, but the concordance between this genetic map of Europe (below left) and the physical sampling locations of populations throughout Europe (below right) is pretty good for a first draft:
Dienekes has an excellent discussion of the technical details, while Razib has labelled a plot showing all of the individuals in the study to make it easier to assess the degree of scatter and overlap.
The take-home message: rather than being one homogeneous mass, Europeans in fact show considerable population substructure, such that genetic information can be used to roughly predict geographical ancestry. An analysis of just a few hundred thousand genetic markers (i.e. less than is currently offered by personal genomics companies 23andMe or deCODEme) would be more than adequate in most cases to distinguish a Pole from a Parisian, or a Swede from a Spaniard. (To be more precise, it would be sufficient to discriminate between individuals for whom most ancestors were natives of these regions; recent migrants will obviously be misclassified.)
What drove these genetic differences? Mostly it will have been chance - random increases or decreases in the frequency of markers throughout the genome accumulated over a few millennia of genetic isolation. But at least some of these differences have been driven by natural selection: for instance, the lactase gene LCT, which has been subject to strong selection to allow lactose digestion in adults in populations reliant on dairy agriculture, represents 9 out of the top 20 most differentiated markers; a marker in the gene HERC2, which is associated with eye colour variation and has been under selection in Europeans and Asians, comes in at number 19.
This indicates that at least some of the genetic - and thus physical and possibly behavioural - differences between the various European populations stem from evolutionary adaptation to their local environments.
I'll leave the technical commentary to Dienekes, but I do want to make one important point: the accuracy of the map will have been limited by the fact that the markers used in this study represent sites of common variation; data from large-scale genome sequencing will generate far, far better maps. The major reason for this is that sequencing will provide information on rare, highly spatially-restricted variants - many of which will be limited to single families and thus be extremely informative about geographical ancestry.
Basically, if you had complete genome sequences from enough Europeans you could reconstruct the genetic map of Europe with exquisite precision. In addition to empowering genetic genealogists, researchers could use deviations between the genetic and physical maps to make powerful inferences about historical migration events and recent episodes of natural selection. With any luck, this is the sort of data that will simply fall out from large-scale population genomic studies being conducted over the next decade or so.
Update: Kambiz at Anthropology.net puts these results in a broader scientific context.
Lao et al. (2008). Correlation between Genetic and Geographic Structure in Europe. Current Biology DOI: 10.1016/j.cub.2008.07.049
Image source: Figure 1 from Lao et al.
Thursday, August 7, 2008
BREAKING NEWS
Hopefully I now have the attention of at least a small proportion of my RSS subscribers; here's a friendly reminder:
Genetic Future has moved and you need to update your RSS feed by clicking HERE.
This feed will be inactivated shortly, and this domain will become an archive site.
Daniel.
The challenges of psychiatric genetics
Back in April I posted on the elusive genetics of bipolar disorder, a crippling psychiatric condition affecting over 2% of the population in any given year.
The major message from that article is that although bipolar disorder is massively influenced by genetic factors (around 85% of the variation in risk is thought to be due to genetics) we still don't really have the faintest idea exactly which genes are involved. This is despite three reasonably large genome-wide association studies involving over 4,000 bipolar patients in total, which generated weak and contradictory results and failed to provide a single compelling candidate for genetic variation underlying this disease.
This disappointing result has also held largely true for other psychiatric conditions with strong genetic components, such as schizophrenia, major depression and autism. Genetic studies of these conditions have had some success identifying rare mutations that underlie severe cases, but the vast majority of the genetic variants contributing to risk remain undiscovered.
There are several reasons why genome-wide association studies can fail to yield significant harvests of disease-associated genes. I summed these up with respect to bipolar disease as follows:
"The researchers are surely hoping that small effect sizes are the major problem, since this is the easiest problem to remedy (simply increase sample sizes). Disease heterogeneity - in other words, multiple diseases with distinct causes that all converge on a bipolar end-point - also seems like a particularly plausible explanation given the complexities of mental illness. It's also likely that various types of genetic variants that are largely invisible to existing SNP chips, like rare variants and copy-number variation, are important."
The same story probably holds largely true for other psychiatric conditions. In this week's issue of Nature, a news article and an editorial both tackle the challenges of psychiatric genetics, and lay out the ambitious strategies currently being pursued by researchers around the world to overcome them.
Small effect sizes
The first hurdle that I describe above is the fact that most of the variants underlying these conditions probably have very small effect sizes (only increasing risk by less than 20%). Such variants will only be identified by cranking up sample sizes immensely, an approach that has yielded some limited success for other genetically complex traits such as height and obesity. The Nature news feature has a table listing some of the major collaborative efforts currently collecting genetic information from the very large cohorts required to dissect out the basis of these conditions:

In most cases, these samples are being built up by pooling results from multiple different studies, often gathered by groups from around the world. As sample sizes increase the power of studies to detect small-effect variants grows. The effect of sample size on the power of genome-wide association studies is illustrated in the graph below from a recent review by Peter Visscher*:

Take a single genetic variant that explains just 0.5% of the variance in the risk of a psychiatric disorder. With a sample size of 5,000 individuals with that disorder you still have a mere 50% chance of detecting that variant. Double your sample size, and that probability jumps to a near-certainty of detection - and your power of detecting even smaller-effect variants (explaining, say, 0.2% of the risk) starts to climb to respectable levels.
By staring at those curves for a while, and bearing in mind that many of the variants found by recent genome-wide association studies explain well under 0.2% of the risk variance, you will quickly start to appreciate why researchers are pushing for ever-larger disease populations to work with. With truly enormous samples on the order of 50 to 100 thousand patients - not out of the question for international consortiums studying reasonably common diseases such as bipolar - the power to detect even very weak risk variants becomes reasonable.
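For readers who would rather play with the numbers than stare at curves, here is a minimal power calculation in Python. It is a sketch under assumptions of my own choosing, not a reconstruction of the graph: it uses a simple quantitative-trait approximation in which the association test statistic has a non-centrality of N x q2 / (1 - q2), where q2 is the fraction of variance explained and N is the total sample size, and it uses an illustrative significance threshold of 5e-7. A proper case-control calculation would be more involved.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gwas_power(n_samples, variance_explained, alpha=5e-7):
    """Approximate power of a 1-degree-of-freedom association test.

    ASSUMPTIONS (mine, not from the review): a quantitative-trait
    approximation with non-centrality n * q2 / (1 - q2), and an
    illustrative genome-wide significance threshold `alpha`.
    """
    ncp = n_samples * variance_explained / (1.0 - variance_explained)
    shift = ncp ** 0.5
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # The test statistic behaves like (Z + shift)^2 for standard normal Z;
    # it is significant when Z + shift lands beyond +z_crit or below -z_crit.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


for q2 in (0.005, 0.002, 0.0005):
    for n in (5_000, 10_000, 50_000, 100_000):
        print(f"variance explained {q2:.2%}, N = {n:>7,}: "
              f"power = {gwas_power(n, q2):.2f}")
```

With these settings a variant explaining 0.5% of the variance has roughly even odds of detection at N = 5,000 and is almost certain to be found at N = 10,000, which is at least in the spirit of the curves described above. Tightening the threshold or shrinking the effect size pushes the required sample size up quickly.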
If there are common genetic variants contributing to the risk of these diseases, such large collaborative studies will eventually find them; so long, of course, as they can tackle the next (and potentially far more serious) problem of disease heterogeneity.
Complex, heterogeneous diseases
The second major problem I mentioned with analysing the genetic basis of these diseases is that they are complex, multifactorial, and extremely difficult to diagnose and classify. Psychiatry is probably the most difficult area of medicine in which to draw hard boundaries: many symptoms are shared by multiple conditions, and many patients display a diffuse constellation of clinical signs that makes a clean diagnosis impossible.
This complexity and heterogeneity is the basis of considerable tension between geneticists and neuroscientists, which is explored in the Nature editorial. Basically, to build up those massive sample sizes shown above geneticists are forced to lump together patients with a variety of clinical symptoms, thus essentially ignoring the complexity inherent in these conditions - a failure that neuroscientists find inexcusable. In turn, geneticists (like myself) get seriously annoyed by the tendency of neuroscientists to make big, bold claims about disease mechanisms based on studies with tiny sample sizes.
Both sides make reasonable criticisms. As I said in the quote from my previous article above, it seems likely that disease heterogeneity - that is, multiple disease states with the same broad end point being simplistically lumped together - plays a major role in the failure of genome-wide association studies of psychiatric conditions; at the same time, the scientific value of much of the "sexy" neurobiology currently being published (e.g. functional MRI finds that conservatives have lower activity in "compassion" centres of the brain, or whatever) is sometimes highly questionable. Both sides of this scuffle have something to learn from their opponents.
The editorial argues, sensibly, that geneticists and neuroscientists just need to start getting along. The ideal situation is one in which rigorous clinical assessments are used to generate patient cohorts that are as homogeneous as possible, which can then be subjected to large-scale genetic analysis. One especially promising avenue is the use of "endophenotypes" - that is, simple and easily quantifiable traits that are sometimes but not always associated with a particular disease. Cleanly defined endophenotypes, such as very specific dysfunctions of brain activity, may prove much more amenable to genetic dissection than the larger, more complex diseases they are associated with.
Comprehensively tackling the genetics of psychiatric conditions will require a forceful and combined approach drawing on the clinical expertise of neuropsychiatrists and the experience of geneticists in unravelling the genetic mechanisms of complex traits. To some extent this is happening already (no large genetics consortium would be naive enough to embark on a multi-million dollar project without consulting clinical experts) - but obviously there is considerable room for improvement.
Moving beyond common SNPs
Genome-wide association studies currently rely largely on single-letter variations in DNA called single nucleotide polymorphisms (SNPs), mainly because these are easy to analyse and can be assayed in their hundreds of thousands simultaneously using chip-based methods. For various reasons almost all of the SNPs on current genome-wide association chips are common sites of variation, present at a frequency of 5% or more in the population. However, recent studies have made it look increasingly likely that a large proportion of the genetic risk of common diseases lies in types of genetic variation that cannot be detected using common SNPs: rare variants, and large-scale rearrangements of DNA known as structural variation.
The approaches required to capture these variants are already pretty well-known, although they remain expensive and technically challenging. In an ideal world, genome-wide association studies would be truly genome-wide - in other words, they would utilise the entire DNA sequence of all of the patients and controls in the sample to find every possible genetic variant that might contribute to disease. Unfortunately, such an approach is currently out of reach, for several reasons:
- The cost of DNA sequencing is still too high;
- The computing power required to analyse the unbelievable volumes of data generated by such a project would be astronomical;
- Statistical issues associated with examining so many data-points from each patient and control would greatly increase the required sample sizes, driving costs and computational requirements up even higher; and
- Our ability to predict the effects of most genetic variants on human biology - which would be important for understanding which of the millions of rare variants found in such a study are actually harmful - is still far too weak.
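To give a rough feel for the statistical point in the list above, here is a minimal sketch of how the significance threshold and the required sample size change as the number of variants tested grows. It assumes a simple Bonferroni correction and a standard normal-approximation power calculation; the function names and the default values for alpha and power are illustrative assumptions, not taken from any particular study.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def relative_sample_size(n_tests: int, alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate factor by which the sample size must grow, relative to a
    single test, to keep the same power for the same effect size.

    Assumes a two-sided z-test, where the required sample size is
    proportional to (z_{1 - alpha/2} + z_{power}) ** 2.
    """
    normal = NormalDist()
    z_power = normal.inv_cdf(power)

    def needed(per_test_alpha: float) -> float:
        return (normal.inv_cdf(1 - per_test_alpha / 2) + z_power) ** 2

    return needed(bonferroni_threshold(alpha, n_tests)) / needed(alpha)


if __name__ == "__main__":
    # Hypothetical test counts: one variant, a candidate-gene panel, a SNP chip.
    for n_tests in (1, 1_000, 1_000_000):
        threshold = bonferroni_threshold(0.05, n_tests)
        factor = relative_sample_size(n_tests)
        print(f"{n_tests:>9} tests: threshold {threshold:.1e}, sample size x{factor:.2f}")
```

Under these assumptions the penalty grows only slowly with the number of tests, but it applies on top of sample sizes that are already very large, and rare variants bring the additional problem that each one is carried by very few people.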
Both approaches have their limitations. The success of the candidate gene approach will be constrained by researchers' ability to identify the genes most likely to be involved in a particular disease - but in fact our currently severely limited understanding of disease genetics is precisely why we need to study this issue in the first place! (In the Nature news piece, Harvard's Steven Hyman memorably describes this approach as "like packing your own lunch box and then looking in the box to see what's in it.") And while chip-based detection of structural variation is rapidly increasing in resolution, it's extremely difficult to determine which of the variants identified in a study are disease-causing and which are harmless polymorphisms. This is currently done probabilistically, by showing that there is an enrichment of new variants in disease cases compared to controls, but this approach cannot tell you which of the identified variants are actually causative.
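The enrichment argument can be sketched as a one-sided Fisher's exact test on carrier counts. This is only an illustration of the general idea, assuming each person is scored simply as carrying or not carrying a new variant; the function name and the counts in the usage example are hypothetical, and real studies use more elaborate tests.

```python
from math import comb


def fisher_enrichment_p(case_carriers: int, n_cases: int,
                        control_carriers: int, n_controls: int) -> float:
    """One-sided Fisher's exact test p-value.

    Probability of seeing at least `case_carriers` carriers among the cases
    if carriers were distributed at random between cases and controls.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    # Number of ways to choose which individuals are carriers.
    all_arrangements = comb(total, carriers)
    # Sum the hypergeometric tail: k carriers in cases, the rest in controls.
    upper = min(carriers, n_cases)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / all_arrangements


if __name__ == "__main__":
    # Hypothetical counts: 15 of 1000 cases and 3 of 1000 controls carry a new variant.
    p_value = fisher_enrichment_p(15, 1000, 3, 1000)
    print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here says only that new variants as a group are over-represented in cases. It says nothing about which of the 15 variants seen in cases matter, which is exactly the limitation described above.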
From psychiatric genetics to genetic psychiatry?
There are several important reasons researchers are interested in the genetics of mental illness: identifying causal genes helps to dissect out the molecular pathways involved in disease, and may help to pull out otherwise invisible sub-types of a disease; studying "extreme" mental phenotypes may illuminate the genetic basis of variation in cognition and personality traits in "normal" people; and, perhaps most importantly, by identifying the genes underlying psychiatric diseases we may be able to target at-risk individuals for monitoring and intervention, potentially heading off severe disease before it takes hold.
In the headlong pursuit of these goals the field of psychiatric genetics has developed an unfortunate reputation built on bold claims made with limited evidence, and literally hundreds of reported associations that have completely failed to stand up to replication. Just a couple of years ago the shiny new tools of large-scale genomics promised an end to this ignoble period in the history of the field; unfortunately, the introduction of larger samples, higher genomic coverage and increased statistical rigour has not brought the desired clarity to the field, but rather seems to have increased the levels of confusion and uncertainty.
If anything, that crucial third goal - using genetics to predict the risk of mental illness - now appears further away than it did just a couple of years ago. Back in early 2007 we didn't have many convincing genetic predictors of mental illness, but at least it was possible to imagine that emerging genomic technologies might identify a small core set of large-effect variants that would help clinicians to predict disease risk. Right now we still don't have many useful genetic predictors, and that illusion of hope is gone.
In summary: while there's no doubt that these conditions do have a strong genetic basis, it's now abundantly clear that this basis is frighteningly complex, with common variants of moderate-to-large effect - the types of variants that would be most useful for risk prediction - being essentially absent. It's going to take many years, massive cohorts, the clever application of new genomic technologies, and a willingness from both neuroscientists and geneticists to listen to one another to move this field forward.
(Brain scan image from Science Photo Library.)
* Thanks to reader Chris for providing me with the citation, which I had carelessly misplaced!
Saturday, August 2, 2008
Genetic Future is moving, and so am I
Genetic Future is moving to a shiny new home at ScienceBlogs. This domain will remain as an archive site, but for fresh content you will need to update your links as follows:
New URL: http://scienceblogs.com/geneticfuture/
New RSS feed: http://feeds.feedburner.com/scienceblogs/geneticfuture
Some of you familiar with the ScienceBlogs network might be wondering if this move heralds a transition into left-wing political blogging, but don't worry: my articles will continue to be focused on reporting advances in human genomics and critiquing the genetic testing industry.
Just a few weeks after the transition I'll also be physically moving from Sydney to a new life in Cambridge, UK. Posting on the new site will be light during this move and regulars will notice a few recycled posts to fill in the awkward silences, but bear with me - in a couple of weeks there will be plenty of fresh human genetics goodness.
Hope to see you all at the new domain,
Daniel.