Showing newest 20 of 21 posts from March 2008. Show older posts

Monday, March 31, 2008

Will information on risk genes actually change behaviour?

A recent news focus in Science is well worth a read if you're lucky enough to have an institutional subscription. If you don't have a subscription, my post a couple of days ago on the problems with genome-wide association studies covers some of its major points: the article describes the disappointing results from even extremely well-powered genome-wide association studies (in type 2 diabetes, for instance) and lists copy number variation, rare variants, population differences and epistasis as potential explanations.

The article also tackles the fascinating question of whether information on the types of small-effect risk genes identified by recent genome-wide studies is likely to have much of an effect on patient behaviour and disease outcomes.

Two years ago, behavioral epidemiologist Colleen McBride and human geneticist Lawrence Brody [...] offered volunteers in Detroit, Michigan, the chance to learn whether they carried deleterious variants for eight health conditions, including diabetes, colon cancer, and osteoporosis. Because the variants are common, virtually everyone was expected to harbor at least a couple. Those monitoring the study's safety "were really worried, literally, that people were going to jump off bridges" when they learned that their risk of disease was increased, says McBride.

Among the 300 or so who have participated, that hasn't happened--quite the opposite. "They're not having big emotional responses," says McBride. The researchers are tracking the volunteers to see whether the information affects decisions to reduce disease risk, such as seeking out a smoking-cessation program or consulting with a nutritionist.

Behavioral specialists have shifted from worrying about the devastating effects of learning about these new genetic risks to wondering whether the information will make any impression at all.

Patient indifference to genetic risk factors wouldn't come as a huge surprise to any clinician who has ever tried to convince a high-risk patient to cut down on dangerous activities for the sake of their health: smokers keep smoking even though the increased risk of heart and respiratory diseases is well-publicised.

However, it remains an open possibility that obtaining a personalised genetic risk profile will serve as a more effective spur to behaviour changes than generic "smoking is bad" messages from clinicians and public health organisations. The article mentions a study currently in progress to test this possibility in a cohort of smokers with a family history of Crohn's disease (smoking doubles the risk of this disease).

It's amazing that these sorts of studies haven't been done yet, given how important they are: if it turns out that genetic information doesn't reduce risk behaviour in the real world, then the impact on public health of the hundreds of millions of dollars spent on complex disease genetics may actually be very small. I'm hopeful that this won't be the case - but it would be nice to have some actual evidence one way or the other...


Subscribe to Genetic Future.

Eye on DNA interviews Knome CEO

Hsien-Hsien Lei from Eye on DNA has an exclusive interview with Jorge Conde, CEO of Knome - the company that offers whole-genome sequencing to customers for a cool $350,000.

As I've said before, the first Knome customers will be getting a pretty rough deal: a vast sum of money forked out for a pretty minimal return in terms of useful information, given our currently dismal understanding of most of the genome. Conde does his best to make this prospect sound more attractive:
...these early adopters will also be pioneers in the personal genome revolution and will be amongst the first people in history to be fully sequenced. These participants will be on the cutting edge of science and medicine. They will have access to the latest information as it becomes available and those that are willing to learn as we learn (and can appreciate risk prediction and the changing nature of our scientific understanding) will be best positioned to benefit.

Certainly, the early adopters will experience the warm glow of the pioneer. And it's true that they'll have their genome sequence in hand to take advantage of each new research finding that pops up over the next five years. But by the time we have enough genetic information to make a genome sequence seriously useful - in, say, five to ten years - the cost of sequencing will be down by three orders of magnitude. That's when I'll be buying my sequence!

Of course, Dan Stoicescu and other Knome early adopters didn't decide to purchase their sequences through a cold, logical cost-benefit analysis. Stoicescu explained in a recent NY Times article that he views his purchase as "a kind of sponsorship" - in other words, his over-spending will pave the way for affordable genome sequencing for the rest of us.

In any case, as sequencing costs plummet the real money is going to lie in sequence interpretation - translating six billion DNA letters into useful medical information, and then conveying that complex information to a customer in terms they can understand. Conde's interview suggests that Knome has invested heavily in this process, which should put them in a good position to compete with the inevitable flotilla of genome sequencing companies that pop up over the next five years.



Saturday, March 29, 2008

Why do genome-wide scans fail?

The successes of genome-wide association studies (GWAS) in identifying genetic risk factors for common diseases have been heavily publicised in the mainstream media - barely a week goes by these days that we don't hear about another genome scan that has identified new risk genes for diabetes, lupus, cardiac disease, or any of the other common ailments of Western civilisation.

Some of this publicity is well-founded: for the first time in human history, we have the power to identify the precise genetic differences between human beings that contribute to variation in disease susceptibility. If we can document all of the factors, both genetic and environmental, that result in common disease we will be able to target early interventions to the individuals who are most susceptible. Every GWAS success brings us closer to the long-awaited era of personalised medicine.

But while the media trumpet the successes of genome scans, little attention is paid to their failures. The fact remains that despite the hundreds of millions of dollars spent on genome-wide association studies, most of the genetic variance in risk for most common diseases remains undiscovered. Indeed, some common diseases with a strong heritable component, such as bipolar disorder, have remained almost completely resistant to GWAS.

Where is this heritable risk hiding? It now seems likely that it's lurking in a number of different places, with the fraction of the risk in each category varying from disease to disease. This post serves as a generic list of the dark regions of the genome currently inaccessible to GWAS, with some discussion of the techniques that will likely prove useful in mapping risk variants in these areas. I'll be referring back here over the next few months as I discuss both successful and unsuccessful gene-hunting expeditions in a variety of diseases. I consider this list a work in progress and would welcome any suggestions from readers on expanding and refining it.

Alleles with small effect sizes
The problem: The ability to simultaneously examine hundreds of thousands of variants throughout the genome is both the strength and the weakness of the GWAS approach. The power of GWAS is that they provide a relatively unbiased examination of the entire genome for common risk variants; their weakness is that in doing so, they swamp the signal from true risk variants with statistical noise from the vast numbers of markers that aren't associated with disease. To separate true signals from noise, researchers have to set an exceptionally high threshold that a marker needs to exceed before it is accepted as a likely disease-causing candidate. That reduces the problem of false positives, but it also means that any true disease markers with small effects are lost in the background noise.

The solution: This seems to be one problem that will need to be solved, at least to some extent, with sheer brute force. By increasing the numbers of samples in their disease and control groups, researchers will steadily dial down the statistical noise from non-associated markers until even disease genes with small effects stand out above the crowd. As the cost of genotyping (and sequencing) tumbles ever downward, such an approach will become more and more feasible; however, the logistical challenge of collecting large numbers of carefully-ascertained patients will always be a serious obstacle.
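To put rough numbers on the trade-off between a strict threshold and sample size, here is a minimal sketch. Everything in it is an illustrative assumption rather than a figure from any particular study: a chip with 500,000 markers, a Bonferroni correction at 0.05, a risk allele at 30% frequency in controls with an odds ratio of 1.2, and a simple two-proportion z-test on allele counts with a normal approximation. The function names are made up for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the z-test approximation


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker p-value threshold after a Bonferroni correction."""
    return alpha / n_markers


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def power_allelic_test(n_cases, n_controls, control_freq, odds_ratio, p_threshold):
    """Approximate power of a two-sided allelic z-test (assumed model)."""
    p1 = case_allele_freq(control_freq, odds_ratio)
    p0 = control_freq
    # Each person contributes two alleles, hence the factor of 2.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = STD_NORMAL.inv_cdf(1 - p_threshold / 2)
    return STD_NORMAL.cdf(z_effect - z_crit) + STD_NORMAL.cdf(-z_effect - z_crit)


threshold = bonferroni_threshold(500_000)
for n in (1_000, 5_000, 20_000):
    power = power_allelic_test(n, n, control_freq=0.30, odds_ratio=1.2,
                               p_threshold=threshold)
    print(f"{n} cases + {n} controls: power ~ {power:.3f}")
```

Under these assumptions a variant of this size is almost never detected with a thousand cases and controls, and almost always detected with twenty thousand, which is the brute-force argument in miniature.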

Rare variants
The problem: Current genome scan technology relies heavily on the "common disease, common variant" (CDCV) assumption, which states that the genetic risk for common disease is mostly attributable to a relatively small number of common genetic variants. This is largely an assumption of convenience: firstly, our catalogue of human genetic variation (built up by efforts such as the HapMap project) is largely restricted to common variants, since rare variants are much harder to identify; and secondly, chip-makers have restrictions on how many different SNPs they can analyse on a single chip, so the natural tendency has been to cram in the high-frequency variants that capture the largest proportion of genetic variation per probe. There is also some theoretical justification for this assumption based on models of human demographic history, but these models are themselves based on numerous assumptions, and the argument may not apply equally to all common human diseases.

In any case, everyone agrees that some non-trivial fraction of the genetic risk of common diseases will be the result of rare variants, and the latest results from GWAS in a variety of diseases have failed to provide unambiguous support for the CDCV hypothesis. Whatever the proportion of variance that turns out to be explained by rare variants, current GWAS technologies are essentially powerless to unravel it.

The solution: Increasing sample sizes may help a little, but the fundamental problem is the inability of current chips to tag rare variation. Short-term, the solution will be higher-density SNP chips incorporating lower frequency variants identified by large-scale sequencing projects like the 1000 Genomes Project. However, such approaches will have diminishing returns: as chip-makers lower the frequency of the variants on their chips, the number of probes that will have to be added to capture a reasonable fraction of total genetic variation will increase exponentially, with each new probe adding only a minute increase in power.

Ultimately, the answer lies in large-scale sequencing, which will provide a complete catalogue of every variant in the genomes of both patients and controls. The problem here is not so much the sequencing itself - the costs of sequencing are currently plummeting due to massive investment in rapid sequencing technologies - but in the interpretation. Whole new analytical techniques will be required to convert these data into useful information.

Population differences
The problem: Over the last 50 to 100 thousand years modern humans have enthusiastically colonised much of the world's landmass. Each wave of expansion has carried with it a fraction of the genetic variation of its ancestral population, along with a few novel variants acquired through mutation. In each new habitat encountered, natural selection has acted to increase the frequency of variants that provided an advantage, and cull those that were harmful, while the rest of the genome passively gained and lost genetic variation. The end result is a set of human populations that, while extremely similar across the genome as a whole, can carry quite different sets of genetic variants relevant to disease. In addition, the correlation between markers close together in the genome (known as linkage disequilibrium) can also differ between populations, so that a marker that is tightly correlated with a disease variant in one population may be only weakly associated in other groups.

These differences have profound implications for disease gene mapping efforts. As a result of this variation, markers that are associated with disease in one population can never be assumed to show the same associations in other human groups (this will be especially true for rare variants, of course). Current GWAS have been dominated by subjects of Western European ancestry, and our understanding of genetic risk variants in non-European populations is almost non-existent. In addition, these differences mean that mixing people with different ancestries together in a disease cohort can seriously confound the identification of causative genes - in certain situations, such mixing can greatly increase the risk of false positive findings.
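The false-positive risk from mixing ancestries can be seen with a small deterministic calculation. This is a sketch with invented numbers: two hypothetical populations, "A" and "B", that differ in both marker frequency and disease prevalence, and a marker that has no effect on disease within either population.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio comparing allele frequency in cases against controls."""
    return (freq_cases / (1 - freq_cases)) / (freq_controls / (1 - freq_controls))


# Hypothetical populations: equal size, different allele frequency and prevalence.
# Within each population the marker frequency is identical in cases and controls.
POPULATIONS = {
    "A": {"share": 0.5, "allele_freq": 0.2, "prevalence": 0.05},
    "B": {"share": 0.5, "allele_freq": 0.6, "prevalence": 0.15},
}


def pooled_allele_freqs(populations):
    """Allele frequency among all cases and all controls once ancestries are mixed."""
    case_total = control_total = 0.0
    case_allele = control_allele = 0.0
    for pop in populations.values():
        cases = pop["share"] * pop["prevalence"]
        controls = pop["share"] * (1 - pop["prevalence"])
        case_total += cases
        control_total += controls
        case_allele += cases * pop["allele_freq"]
        control_allele += controls * pop["allele_freq"]
    return case_allele / case_total, control_allele / control_total


freq_cases, freq_controls = pooled_allele_freqs(POPULATIONS)
pooled_or = allelic_odds_ratio(freq_cases, freq_controls)
print(f"pooled cohort: cases {freq_cases:.3f}, controls {freq_controls:.3f}, "
      f"odds ratio {pooled_or:.2f}")
for name, pop in POPULATIONS.items():
    within_or = allelic_odds_ratio(pop["allele_freq"], pop["allele_freq"])
    print(f"within population {name}: odds ratio {within_or:.2f}")
```

The pooled cohort shows an apparent association purely because population B contributes more of the cases and also carries the marker more often; analysed separately, each population shows none.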

The solution: For GWAS results to be universally applicable, they will need to be performed in cohorts from a wide range of populations. Data-sets such as the HapMap project, the Human Genome Diversity Panel and the powerful new 1000 Genomes Project will provide information about the patterns of genetic variation in diverse populations that is needed to design the assays for GWAS. A greater challenge will be collecting the large numbers of ancestry-homogeneous samples - both well-validated disease patients and healthy controls - required for GWAS approaches to be successful. This problem is likely to be particularly acute for African populations, where linkage disequilibrium is lower and genetic diversity much higher than in other regions (thus requiring larger numbers of markers and individuals to identify disease variants); and of course, in Africa and much of the rest of the world, local governments typically have much more pressing issues than genome scans to spend their limited health budgets on.

Epistatic interactions
The problem: Most current genetic approaches assume that genetic risk is additive - in other words, that the presence of two risk factors in an individual will increase risk by the sum of the two factors by themselves. However, there's no reason to expect that this will always be the case. Epistatic interactions, in which combined risk is greater (or less) than the sum of the risk from individual genes, are difficult to identify with genome scans and even harder to untangle. If epistasis is strong, then just a few genes - each with a weak effect by itself, well below the threshold of a scan - could in concert explain a large chunk of genetic risk. Such a situation would be largely invisible to current approaches.
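As a toy illustration of the difference between additive and epistatic risk, consider two loci where carrying either variant alone barely shifts risk but carrying both shifts it a lot. The penetrance values and carrier frequency below are invented for the example, and the helper names are hypothetical.

```python
# Hypothetical penetrance table: P(disease | carrier at locus A, carrier at locus B).
PENETRANCE = {
    (0, 0): 0.05,  # neither variant
    (1, 0): 0.06,  # variant at locus A only
    (0, 1): 0.06,  # variant at locus B only
    (1, 1): 0.30,  # both variants
}
CARRIER_FREQ_B = 0.10  # assumed fraction of people carrying the locus B variant


def additive_expectation(penetrance):
    """Risk for double carriers if the two single-locus increases simply summed."""
    baseline = penetrance[(0, 0)]
    return baseline + (penetrance[(1, 0)] - baseline) + (penetrance[(0, 1)] - baseline)


def marginal_risk_locus_a(penetrance, carries_a, carrier_freq_b):
    """Risk by locus A status alone, averaging over locus B (assumed independent)."""
    return ((1 - carrier_freq_b) * penetrance[(carries_a, 0)]
            + carrier_freq_b * penetrance[(carries_a, 1)])


expected = additive_expectation(PENETRANCE)
observed = PENETRANCE[(1, 1)]
print(f"double carriers: additive model predicts {expected:.2f}, table says {observed:.2f}")

risk_carrier = marginal_risk_locus_a(PENETRANCE, 1, CARRIER_FREQ_B)
risk_noncarrier = marginal_risk_locus_a(PENETRANCE, 0, CARRIER_FREQ_B)
print(f"locus A on its own: carriers {risk_carrier:.3f}, non-carriers {risk_noncarrier:.3f}")
```

A single-marker scan only sees the marginal risks in the last line, which understate what happens in the people who carry both variants.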

The solution: Large sample sizes, and clever analytical techniques. I'm not going to attempt a more detailed answer as this area is well outside my knowledge zone - but fortunately, it's an active area of research (see, for instance, the Epistasis Blog). I'd welcome any comments from people who know more about epistasis than I do about the likely scope of this problem and the methods that will be used to resolve it.

Copy number variation
The problem: One of the great surprises of the last five years has been the discovery of widespread, large-scale insertions and deletions of DNA, known as copy number variations (CNVs), in even healthy genomes. CNVs are now known to account for a substantial fraction of human genetic variation, and have been shown to play a role in variation in human gene expression and in human evolution. It seems highly likely that CNVs will be responsible for a non-trivial proportion of common disease risk.

However, our understanding of these variants is still in its infancy. The chips currently used in GWAS, which interrogate single base-pair variations between individuals known as SNPs, can be used to detect a small proportion of CNVs indirectly (by looking for distortions of signal intensity or inheritance patterns), and may effectively "tag" a fraction of the remainder (by using SNPs that are very close to the CNV, and therefore tend to be inherited along with it). However, the vast majority of copy number variation remains invisible to current GWAS technology.

The solution: High-resolution tiling arrays - chips containing millions of probes, each of which binds to a small region of the genome - can be used to explore CNVs in some areas of the genome, but they break down for the large fraction of the genome containing repetitive elements. Ultimately, the complete detection of CNVs from patients and controls will require whole-genome sequencing, preferably using methods with much longer read lengths than the current crop of rapid sequencing technologies.

Epigenetic inheritance
The problem: Not all inherited information is carried in the DNA sequence of the genome; a child also receives "epigenetic" information from its parents in the form of chemical modifications of DNA that can alter the expression of genes - and thus physical traits - without changing the sequence. Although epigenetic inheritance is known to occur, the degree to which it influences human physical variation and disease risk is essentially totally unknown.

All existing technologies used in GWAS are based on DNA sequence, and thus don't detect epigenetic variation. It is even invisible to full-genome sequencing.

The solution: It first needs to be established that epigenetically inherited variations do actually contribute a non-trivial fraction of human disease risk. If so, techniques currently being developed to identify these variants in a high-throughput fashion could be used to perform EWAS (epigenome-wide association studies).

Disease heterogeneity
The problem: Some "diseases" are actually simply collections of symptoms, which may stem from multiple, distinct genetic causes. Lumping patients with fundamentally different conditions into a single patient cohort for a GWAS is a recipe for failure: even if there are strong genetic risk factors for each one of the separate conditions, each of these will be drowned out by the noise from the other, unrelated diseases. The problem is that for some diseases - particularly mental illnesses, where causation lurks deep within the complex and poorly-understood human brain - the knowledge and tools required to separate patients into distinct sub-categories simply may not exist yet.
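The dilution effect is easy to quantify in a simplified setting. The sketch below assumes, purely for illustration, that a variant with a 30% control frequency carries an odds ratio of 1.5 for one sub-type of a "disease" and has no effect on the other sub-types lumped into the same cohort; the function names are invented for this example.

```python
def allele_freq_from_odds_ratio(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def diluted_odds_ratio(fraction_true_subtype, control_freq=0.30, subtype_odds_ratio=1.5):
    """Odds ratio seen in a mixed case cohort where only some cases have
    the sub-type that the variant actually influences (assumed model)."""
    subtype_freq = allele_freq_from_odds_ratio(control_freq, subtype_odds_ratio)
    # Cases with unrelated conditions look like controls at this marker.
    mixed_freq = (fraction_true_subtype * subtype_freq
                  + (1 - fraction_true_subtype) * control_freq)
    return (mixed_freq / (1 - mixed_freq)) / (control_freq / (1 - control_freq))


for fraction in (1.0, 0.5, 0.2):
    print(f"{fraction:.0%} of cases have the relevant sub-type: "
          f"observed odds ratio ~ {diluted_odds_ratio(fraction):.2f}")
```

As the relevant sub-type becomes a smaller share of the cohort, the observed effect shrinks toward 1, and a smaller effect needs a much larger sample to clear a genome-wide threshold.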

The solution: The geneticists can't fix this one - it will take a combined effort from clinicians and medical researchers to break down complex diseases into useful diagnostic categories, which can then each be subjected to separate genetic analysis. In the cancer arena, conditions previously lumped together as one entity have now been separated using new technologies such as gene expression arrays; similar approaches will no doubt prove fruitful in a range of other diseases, although the inaccessibility of brain tissue will make it more difficult to apply such approaches to mental illness.

The future of genetic association studies
Current chip-based technologies for genome-wide analysis, while having some success in identifying the lowest-hanging genetic fruit for many common diseases, seem to have already started to run up against barriers that are unlikely to be overcome by simply increasing sample sizes. These technologies should really be regarded as little more than a place-holder for whole-genome sequencing, which should become affordable enough to use for large-scale association studies within 3-5 years.

The application of cheap, rapid sequencing technology is likely to generate a harvest of new disease genes that far exceeds the yield of current GWAS, by providing simultaneous access to both the rare variants and copy number variations that are inaccessible to current chip-based approaches. However, building a more complete catalogue of the heritable variants that drive common disease risk will require more than just cheap sequencing: it will also take advances in clinical diagnostics to better sub-categorise patients into homogeneous groups, as well as new and powerful analytical approaches to cope with the torrent of sequence data, and to efficiently identify epistatic interactions between disease variants. To have any chance of picking out variants of small effect from whole-genome sequencing data, sample sizes will have to be enormous - massive cohorts currently being assembled, such as the 500,000-person UK Biobank and a similar NIH-funded study currently in the works, will provide essential raw material for the selection of participants. Naturally, to be applicable to humanity as a whole, cohorts will need to be gathered separately from many different human populations.

Finally, epigenetic variation remains a wild-card of uncertain significance, which will need to be tackled with a different set of high-throughput technologies (although it's likely that many of these will feed on advances in high-throughput sequencing).

Although I probably sound pretty negative about GWAS, I want to emphasise that the current problems are the result of technological limitations that will soon disappear. Barring global catastrophe, within the lifetimes of most of those reading this post we will have a near-complete catalogue of the genetic variants influencing the risk of most of the common diseases that plague the industrialised world (and, hopefully, many of those that plague the rest of humanity). Together with parallel advances in medical science, this catalogue will provide an unprecedented ability to predict, treat and potentially completely eliminate a host of common diseases. It will also bring social and ethical challenges of unprecedented magnitude - but that's a topic for another post...



Friday, March 21, 2008

Another editorial on personal genomics

I only have time to quote one paragraph from this recent editorial on "recreational genomics" in the European Journal of Human Genetics; the full text is freely available, so check it out.
It looks like we are getting more and more confident about some genetic associations, and can estimate individual risks of complex diseases more precisely that a year ago. Although the average inhabitant of Netherlands now has a lifetime risk of developing type II diabetes of 13%, for some people this might be 10 or 17% after testing. Whether this makes any difference to people is not known. Effective interventions to reduce a risk of 17% to the population average of 13% are not yet available. Whether paying US$300 or even $1000 helps to motivate people to follow-up their individual lifestyle advice is not known either.


Thursday, March 20, 2008

Is GINA a good thing?

The DNA blogosphere has been sporadically abuzz with chatter about the Genetic Information Nondiscrimination Act (GINA), which appears to be almost universally regarded as a good thing. In fact, a Nature editorial last week proclaimed that "the entire scientific and medical community is adamantly supportive of this bill" and implied that a failure to pass GINA would have disastrous consequences for the field of genetics as a whole:

Otherwise, the enormous research and clinical progress being made in the nascent era of personalized medicine will come crashing to a halt because people — despite the efforts of George Church — will remain rightly wary of taking genetic tests.

A letter to Nature this week from Spanish scientist and Biopolitical blogger

I, for one, do not support this bill.

Better information allows better matching of people and jobs, and of people and insurance policies. The purpose of firms is to produce goods and services efficiently, and information helps to improve efficiency. The purpose of insurance is to manage risk, and information availability lowers risk.

You fear that the use of genetic information by employers and insurers will lead to social inequality — or, in other words, you trust that ignorance will preserve equity and fairness. There are better ways to deal with social inequality than to force ignorance upon workers, employers and insurers. And a better informed, more efficient, wealthier society creates better conditions for everyone to live decent and productive lives, whatever our genetic make-up.

I'm not necessarily supporting this view - although I'm certainly not as unambiguously positive about GINA as others appear to be - but rather throwing it out there for discussion.

Discussion:
The OpenHelix blog makes some excellent points.




Wednesday, March 19, 2008

Perlegen gains access to 4 million patient records

From a GenomeWeb Daily News story:
Perlegen Sciences said today it will work with an undisclosed electronic medical records company to search through data from four million patients for genetic markers that could help predict patient response to certain medical treatments.

Perlegen said it has reached an agreement under which it will have exclusive access to a database of US medical records, from which it will target patients who will be sought for DNA samples.

Sounds like the idea is to identify groups of patients suffering from specific diseases, ask them to volunteer a DNA sample, and then compare their genetic profiles to matched sets of healthy controls. The power of this approach will of course depend on a number of factors: how accurate and detailed the medical records are, how willing patients are to provide DNA samples, and so on.

Nonetheless, with four million patients to choose from, Perlegen will have a data-set of unprecedented size to use for chasing common disease variants. No doubt we'll be hearing more about this project over the next few years.


Saturday, March 15, 2008

Welcome GNXPers

Some recent general human genetics posts that might appeal:

The genetics of bone length.

Climate genes: positive or balancing selection?

How many harmful mutations do you carry?

Shared obesity variant in kids of European and African ancestry.

Of course, you could always just subscribe to Genetic Future (RSS).

Cheers to Razib for the link.

Gene expression in fat/skinny twin pairs

A couple of days ago I discussed the association of obesity with variants in the FTO gene in children of European and African ancestry. Continuing on the obesity theme is this intriguing study published this week in PLoS Medicine, which examines a small but potentially highly informative group of subjects: a set of 14 pairs of identical twins classed as "discordant for obesity", which means that one is substantially fatter than the other.

Discordant identical twin pairs are an endless source of fascination for researchers, who collect and categorise them like rare butterflies in vast twin repositories. In this study, the twins provided an opportunity to look at the factors that lead to obesity despite a shared childhood environment and nearly (see below) identical genomes. The researchers ran the twins through a panel of tests, including scans for body composition, screens for blood insulin levels, and extensive analyses of gene expression profiles in fat.

The two most interesting results: the fat twins had fewer mitochondria (the structures in our cells responsible for converting food into useful energy) in their fat, and also some clear differences in gene expression in their fat cells. The mitochondrial result is completely novel, and may indicate a derangement of energy metabolism in the fat cells of obese individuals - but it's unclear whether this represents a cause or an effect of obesity. The gene expression profiles show a range of differences, with the obese twins showing higher levels of expression for some genes involved in inflammation, and lower levels for other pathways (particularly for mitochondrial genes involved in amino acid metabolism). Overall, these findings are intriguing rather than transformative, and it's going to be tough to untangle the chains of causation.

So, where are these differences coming from? If you'd asked me that a few months ago I'd have guessed that this variation was largely environmental in origin, with perhaps some stochastic epigenetic differences (chemical modifications of DNA that alter gene expression) accounting for the rest, but now I'm not so sure: a recent article in the American Journal of Human Genetics shows the extent to which even identical twins can differ at the genetic level due to mutations occurring during development or adult life. These mutations are not necessarily subtle, with one of the twins analysed in the study showing a whopping 1.6 million base pair deletion on chromosome 2 in 70-80% of his blood cells. (For more on this study, I'd recommend John Hawks and the NY Times.)

It seems unlikely that these mutations are responsible for most of the observable differences between typical pairs of identical twins - many of which you can instead blame on the psychological scars left by childhoods dressed in matching outfits - but they may well be enriched in highly discordant pairs such as the ones analysed in the obesity study. Unfortunately for the twin collectors, it's going to take much bigger cohorts than this to demonstrate that with any degree of confidence.


Pietiläinen, K.H., Naukkarinen, J., Rissanen, A., Saharinen, J., Ellonen, P., Keränen, H., Suomalainen, A., Götz, A., Suortti, T., Yki-Järvinen, H., Orešič, M., Kaprio, J., Peltonen, L. (2008). Global Transcript Profiles of Fat in Monozygotic Twins Discordant for BMI: Pathways behind Acquired Obesity. PLoS Medicine, 5(3), e51. DOI: 10.1371/journal.pmed.0050051


Friday, March 14, 2008

Engineered virus creates bigger, stronger muscles

A new article in PNAS describes the generation of super-muscled mice through genetic engineering. Genes that act to inhibit the protein myostatin - a well-known inhibitor of muscle growth - were introduced into the mice using synthetic viruses. By inhibiting the inhibitor, the researchers were able to induce an increase in both muscle size and strength that lasted for up to two years (a mouse's lifespan is only about three years, so that's pretty impressive). The treatment bulked up muscle both in normal mice and in a mouse model of a common muscle disease, Duchenne muscular dystrophy.

The authors write in their conclusion:
The striking ability of [one of the inhibitors] to provide gross and functional long-term improvement to dystrophic muscles in aged animals warrants its consideration for clinical development to treat musculoskeletal diseases, including older [muscular dystrophy] patients.

This is a very early but potentially promising result for patients with muscular dystrophy, a particularly nasty disease. Of course, sprint athletes keen for a potentially undetectable boost to their muscle power will also be watching with interest...



23andMe laboratory delays

In an anonymous LiveJournal entry, "fdmts" quotes an email allegedly received from 23andMe:
Dear FDMTS,

We wanted to acknowledge that you've been waiting longer than anticipated to receive your data from 23andMe. Our sincerest apologies for the delay; the laboratory analysis process typically takes 4-6 weeks, but we are experiencing a backlog that is resulting in longer than predicted processing times - up to about 10 weeks. We have taken several steps to expand our laboratory capacity and the efforts are yielding results now. You will be notified via email in the next few weeks as soon as your data are ready.

We've recently added new content to our Personal Genome Service that we hope you'll enjoy. Thanks so much for your patience and for choosing to join the growing 23andMe community.

Sincerely,
The 23andMe Team

If this is a real email (and again, I emphasise that this is an anonymous and unsourced web comment, so it may well not be), this could mean one of two things:

  1. 23andMe has experienced much higher-than-expected demand for its personal genomics service; or
  2. 23andMe is having problems with its genotyping technology.

It's impossible to know which without further information, but both are intrinsically very interesting (at least to me!). Anyone else out there experiencing delays in getting their 23andMe results back? Anyone from 23andMe care to comment?


Thursday, March 13, 2008

Shared obesity variant in kids of European and African ancestry

I'll be talking more about obesity genetics later this week; for now, check out this paper published this week in PLoS ONE (open access). The FTO gene is the poster child for genetic variants underlying variable human traits: it has a reasonably strong effect on body weight and obesity risk, and has been replicated ad nauseam in a variety of cohorts, including in children. What's missing at this stage is an understanding of exactly which variant within this gene underlies the effect on body weight.

This study takes a clever approach to zoom in on the causative variant, using obese children and non-obese controls from two populations with different ancestral backgrounds (one of European and one of African ancestry). That's a powerful strategy: the pattern of genetic variation around the FTO gene is likely to be quite different between these two populations, and markers that stand out in both populations must thus be physically close to the true causative variant.

The study looked at eleven markers within the FTO gene and found two that - in Europeans - are very closely linked to the marker previously associated with obesity ("closely linked" means that these markers are almost always inherited together, so that knowing someone's sequence at one marker allows you to predict, with very high confidence, their sequence at the other marker).
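To make "closely linked" a bit more concrete, here's a minimal sketch of r², one standard measure of how tightly two markers travel together. The function name and the toy haplotypes are invented for illustration; they are not data from the paper, and the paper may have used a different linkage measure.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic markers.

    `haplotypes` is a list of (allele_at_marker_1, allele_at_marker_2)
    pairs, one per chromosome, with alleles coded 0 or 1. This coding is
    a hypothetical toy format. Both markers must be variable in the
    sample, otherwise the denominator is zero.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at marker 1
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at marker 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Made-up samples: markers that almost always travel together,
# and markers that are inherited independently of one another.
tightly_linked = [(1, 1)] * 48 + [(0, 0)] * 50 + [(1, 0)] + [(0, 1)]
unlinked = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

print(ld_r_squared(tightly_linked))
print(ld_r_squared(unlinked))
```

An r² near 1 means that knowing the sequence at one marker predicts the other almost perfectly; an r² near 0 means it tells you nothing.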

Reassuringly, both of these markers were significantly associated with obesity in their European cohort; importantly, only one of them (rs3751812) showed a significant association in the African-American cohort. None of the other markers - including the variant previously strongly associated with obesity in other studies - showed a signal in the African-American cohort. The implication is that rs3751812 is very close to the real genetic variant underlying the obesity association in this region.

If this result holds up - and it's best not to put too much weight on it (heh) until it's been replicated in a separate African-ancestry cohort - it's quite an exciting finding, because it gives us a much better idea of where the precise obesity-related variant in this region actually lies. It also illustrates the currently hugely under-utilised power of using individuals with African ancestry for fine-mapping: Africans typically have much lower associations between nearby markers, so if you find a marker associated with a given trait, it's likely to be pretty close to the real trait-affecting variation.

I've checked out the positions of these markers using the UCSC Genome Browser. All three of the markers - the two looked at in this study, and the original obesity marker - are found within a few thousand bases of each other in the first intron of the FTO gene. Both the original marker and the new marker (i.e. the one that is associated in Africans) are found in highly evolutionarily conserved regions, suggestive of regions that may regulate the expression of this gene. It seems likely that the causal variant rests in one of these conserved islands; no doubt these regions are being closely examined for clues as I write this.

You can bet we'll be hearing a lot more about this gene during 2008. In fact, I can guarantee you'll hear more about it from me later this week...


Grant, S.F., Li, M., Bradfield, J.P., Kim, C.E., Annaiah, K., Santa, E., Glessner, J.T., Casalunovo, T., Frackelton, E.C., Otieno, F.G., Shaner, J.L., Smith, R.M., Imielinski, M., Eckert, A.W., Chiavacci, R.M., Berkowitz, R.I., Hakonarson, H., Maedler, K. (2008). Association Analysis of the FTO Gene with Obesity in Children of Caucasian and African Ancestry Reveals a Common Tagging SNP. PLoS ONE, 3(3), e1746. DOI: 10.1371/journal.pone.0001746

The genetics of bone length

Height is one of the most genetically determined traits known in humans - around 80% of the variation in height in the general population (at least in Western countries, where nutrition is not limiting) is driven by genetic factors. This strong genetic component, along with the ease of measurement, has made height an attractive target for genome-wide association studies looking to identify the precise genes responsible.

I think it would not be an exaggeration to describe these studies as a resounding disappointment. The first genome-wide association study for height, published in September last year, examined 4,921 individuals and found a single significant and replicable variant associated with height (in the HMGA2 gene) that explained a meagre 0.3% of the variation in the general population. The second such scan, published in January this year, examined 6,669 individuals. This study confirmed the HMGA2 finding and identified one further significant signal, this time near the GDF5 and UQCC genes, which again explained less than 0.5% of the total variation. I saw an abstract at the American Society of Human Genetics meeting last year (as yet unpublished, as far as I can tell) reporting a genome scan of 10,737 individuals, which pulled out a total of 8 associated variants that together explain just 3% of the variance in height.

To summarise these results: to date genome-wide scans for height, even extremely well-powered ones with more than 10,000 participants, have identified variants responsible for less than 5% of the variation in this trait - despite height being a trait that is largely genetically determined and varies substantially between humans. What's going on?

I'm not sure if anyone really knows the answer to this question, but it's likely that there are a number of factors in play here. One possibility is that many of the genes regulating height may be acting exclusively on one of the separate body components that make up total height - say, leg length, or spine curvature. By looking at total height rather than each of these components separately researchers might be drowning out the signal from such genes, and thus missing important pieces of this curious genetic puzzle.

A new paper in PLoS ONE (open access) tests this possibility directly. The researchers used images from whole body dual energy X-ray absorptiometry (DEXA; see image on left, from here) to measure the lengths of the spine, femur (thigh), tibia (calf), humerus (upper arm) and radius (forearm) on two sets of twins: 1,157 identical and 2,594 non-identical, all of them female. For measuring genetic markers, the group used a low-density set of just 400 microsatellite markers arrayed across the genome rather than the high-density SNP chips used for most modern genome scans.

Their first finding was that the different bone lengths they examined in their study were significantly correlated with one another - in other words, if you have a long thigh, it's highly likely that you also have a long forearm. However, these correlations are by no means perfect. Even the tightest association - between calf and thigh length - only had a correlation coefficient of 0.78; since the proportion of variance explained is the square of the correlation, knowledge of thigh length only allows you to predict around 61% of the variance in calf length. For the other bone lengths the correlations were even weaker, particularly between spine length and all of the limb bones (where the tightest correlation was still less than 0.25). That's good, because it means that these measurements are at least partially independent of one another. This is pretty important to know: if all of these components were perfectly correlated with one another, then you wouldn't capture any more information by looking at each one separately than you would by just looking at height as a whole.

Also promising were the heritability measurements: all of the bone lengths were between 57 and 73% determined by genetic factors. That means that each of these individual components of height is strongly regulated by genetics, just as height itself is - an important prerequisite for a genome-wide scan to have any chance of success.
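As a rough illustration of how twin data can yield a heritability figure at all, here's a sketch of Falconer's classic formula, which doubles the gap between the identical-twin and non-identical-twin correlations. This is an assumption for illustration only: the paper may well have used a more sophisticated variance-components model, and the correlations below are made up rather than taken from the study.

```python
def falconer_heritability(r_identical, r_non_identical):
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ).

    Identical twins share all of their DNA and non-identical twins about
    half of it, so twice the difference in trait correlation between the
    two kinds of twin pair approximates the genetic share of variance.
    """
    return 2 * (r_identical - r_non_identical)


# Hypothetical correlations for one bone length (not from the paper).
example_h2 = falconer_heritability(r_identical=0.80, r_non_identical=0.50)
print(f"estimated heritability: {example_h2:.2f}")
```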

After all this promising groundwork, the genome scan itself is pretty perfunctory: only two of the 400 markers they analysed showed any sign of an association with bone lengths, with one "highly suggestive" signal for spine length and one "suggestive" signal for thigh bone length, both on chromosome 5 but in different regions. To be honest, the results of this scan are neither particularly convincing nor useful, so I'm not going to dissect them any further.

The major useful findings from this study are that the individual components of height are both partially independent and highly heritable (i.e. influenced by genetics). That sets the stage for a serious genome-wide analysis of these traits. Essentially, these detailed measurements would need to be repeated in a cohort of unrelated individuals, matched as closely as possible for ancestry, age (age accounted for 11% of the variation in spine length in this study, so it's an important variable to control), and other variables, and including both males and females (as opposed to this study, which analysed only females). Then a genome scan would need to be performed using a much denser marker set: the million or more markers found in modern SNP chips rather than the paltry 400 microsatellites analysed here. Such a study would have a good shot at capturing a more impressive proportion of variation than the less than 5% harvested so far from scans for total height.

Then of course there are the inevitable follow-up studies that I keep going on about in this blog: large-scale sequencing studies to identify rare variants, targeted scans for copy-number variation not picked up by SNP chips, and analyses of heritable epigenetic modifications (chemical changes to DNA that don't affect its sequence). Given that this is a trait where common variants appear to account for only a tiny proportion of the total variation (based on the genome-wide scans described above), such studies assume extra importance.

Of course, one could ask why we should even care about the genetics of height - surely resources would be much better spent on disease-related traits? There's some truth to that, but I think height serves as an important model for common human phenotypic variation in general. Variation in height is common and obvious, and we know that this trait is strongly genetic. In the process of unravelling its genetic architecture we will learn important lessons that will apply to other complex traits, including those associated with health and common disease.


Chinappen-Horsley, U., Blake, G.M., Fogelman, I., Kato, B., Ahmadi, K.R., Spector, T.D., Gibson, G. (2008). Quantitative Trait Loci for Bone Lengths on Chromosome 5 Using Dual Energy X-Ray Absorptiometry Imaging in the Twins UK Cohort. PLoS ONE, 3(3), e1752. DOI: 10.1371/journal.pone.0001752

Wednesday, March 12, 2008

Odd Google hits

Every now and then I randomly sample the searches visitors have used to stumble across this site. Although most are disappointingly predictable, every now and then there's one that makes me smile: something cute, like "identical twin butterflies"; slightly sad, like "one leg longer than the other is it genetic?" (I imagine a lop-sided Googler deciding whether or not to reproduce); or downright incomprehensible, like "what genetic gene is stronger?"

As of today I have a new favourite:
why cant science answer what will i do with my genetic power
This makes me imagine Wolverine or Hiro, rather plaintively searching the internet for answers. Sorry, guys - this site won't help you, at least until someone does a genome-wide association study for time distortion abilities...



Sunday, March 9, 2008

Next-generation sequencing: the movie

It's propaganda, but it's pretty - Pacific Biosciences has a neat promotional video demonstrating how their next-generation sequencing technology works. PacBio is making big claims about the power of their new sequencing machines, and indicating that they'll be on the market sequencing entire human genomes in minutes by 2010. We'll see.

HT: Aminopop.


Climate genes: positive or balancing selection?

A while back I posted about an elegant recent study of the correlation between climate and genetic variation in metabolic disease genes. I've been thinking quite a bit about this study over the last week, and it strikes me that the results aren't as easy to interpret as I initially thought. Basically, I think it's likely that the results of the study do indicate that natural selection has acted on these metabolic genes in response to climate (or some correlated variable, like resource availability); what isn't clear is exactly what type of natural selection.

In order to explain the complication I'll first give a background on the different types of natural selection that can act on a gene.

Types of natural selection
Natural selection (hereafter simply "selection") is broadly divided into three major categories: negative, positive and balancing selection.

Negative selection is the most common, and probably the least interesting. It acts on genetic variants that reduce the reproductive fitness of organisms, decreasing their frequency until they (usually) disappear completely from the population.

Positive selection is much less common, and much more interesting. It occurs in the rare case that a genetic variant (or allele) actually increases the fitness of organisms that carry it. This means that carriers tend to have more offspring, on average, than non-carriers - so positively selected alleles will tend to increase in frequency over time until they reach "fixation" (that is, they're carried by every single individual in the population). Almost all of the recent genome-wide scans for selection in humans (for instance here, here and here) have been looking for examples of positive selection.

Balancing selection is a special case of selection, which results in multiple versions of the same gene all being maintained at some frequency within the population, sometimes over a very long period of time. Classically, balancing selection occurs in two situations:
  1. heterozygote advantage, when the fitness of heterozygotes (individuals carrying two different alleles) is higher than that of either of the two homozygotes (individuals who carry two identical alleles); or
  2. frequency-dependent selection, when genetic variants are advantageous if they're at low frequency, but become less beneficial as they reach a higher frequency.
The classic example of heterozygote advantage is the sickle cell anaemia allele, which protects against malaria in heterozygotes but results in severe disease in homozygotes. Individuals who carry one copy of each of the two versions of the gene are thus fitter than homozygotes for the normal copy (who are susceptible to malaria) or homozygotes for the sickle cell copy (who die in childhood from anaemia).

Frequency-dependent selection appears to play a role in maintenance of extreme levels of genetic diversity in the human major histocompatibility complex (MHC) genes. I'm not going to discuss it much here since it's not really relevant to the rest of the post, but anyone interested in more details can check out good old Wikipedia.

The critical point I want to make is that while positive selection will usually tend to increase the frequency of an allele until it reaches 100% frequency, balancing selection can result in a situation where an allele reaches a stable frequency that is less than 100%. For a case of heterozygote advantage, the stable frequency will be the point at which the selective advantage of heterozygotes is cancelled out by the selective disadvantage of homozygotes.
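For readers who want the algebra, here is a sketch of that stable frequency under the standard textbook one-locus model. The symbols are assumptions introduced for illustration, not notation from the paper: $p$ is the frequency of the selected allele $A$, $s$ is the fitness cost to $AA$ homozygotes and $t$ is the fitness cost to $aa$ homozygotes, both measured relative to heterozygotes. The equilibrium is the point at which the two alleles have the same average fitness.

```latex
\begin{align*}
w_{AA} &= 1 - s, \qquad w_{Aa} = 1, \qquad w_{aa} = 1 - t \\
\bar{w}_A &= p\,(1 - s) + (1 - p) = 1 - ps \\
\bar{w}_a &= p + (1 - p)(1 - t) = 1 - (1 - p)\,t \\
\bar{w}_A = \bar{w}_a &\;\Longrightarrow\; \hat{p}\,s = (1 - \hat{p})\,t \;\Longrightarrow\; \hat{p} = \frac{t}{s + t}
\end{align*}
```

As long as both costs are greater than zero, $\hat{p}$ sits strictly between 0% and 100%, and it rises as the cost $t$ of lacking the allele altogether grows relative to the cost $s$ of carrying two copies.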

How does this apply to the climate genes paper?
The authors of the paper assessed whether selection for climate adaptation had acted on each genetic variant by asking a simple question: does the frequency of this variant correlate with the climate experienced by each population? In other words, if you drew a graph with allele frequency on one axis and some climate variable on the other, would there be a significant linear relationship between the two parameters?

An example of one of the correlations they saw in their study is shown below: the pie charts on the map show the frequency of a variant in the RAPTOR gene, while the map is coloured by winter maximum temperature; the graph shows allele frequency (on the y axis) plotted against a summary statistic of winter climate (on the x axis), with populations colour-coded by geographical region. You can see that there's a reasonable linear trend on the graph.



The authors, quite reasonably, interpret a significant linear relationship as evidence of climate-driven adaptation - in other words, the frequency of these variants has been altered by natural selection as they provided an advantage or disadvantage under particular climatic conditions.

But what type of selection are we talking about? The authors suggest that their data are "consistent with a signal of spatially varying positive selection" - in other words, positive selection for something like, say, cold resistance, which will obviously vary in intensity between different geographical regions. Under this model, a cold-resistance variant will tend to increase faster in populations living in colder areas than in those living in warmer areas, and if a cold-adapted population moves into warmer climates (like those Siberians moving down into South America) the variant will presumably tend to decrease in frequency again as the old, warm-adapted allele increases.

What this means is that once a population settles down in a particular region, cold-resistance alleles will either head steadily towards 100% (if the region is cold), move steadily down towards 0% (if the region is warm), or drift around slowly under no particular selection until they reach 0% or 100% by chance (if the region is somewhere in the middle).

Would this type of selection actually result in a neat linear trend, like that seen for the RAPTOR gene? Well, it might, if the timing was just right, but it's by no means a necessary outcome. There are at least three variables in play here, each of which will have some effect on the current frequency of a positively selected allele: the strength of selection, the starting frequency of the allele in that population, and the amount of time the population has existed in its current environment. For positive selection to result in a clean linear correlation between allele frequency and a climate variable, the latter two factors would have to have had a negligible impact, so that most of the variation is determined by selection intensity.

I think that's pretty unlikely given what we know about human population history: Native Americans, for instance, are the descendants of a cold-adapted population living in Siberia that only relatively recently moved down into the warmer climates of Central America; selection has not yet had much time to act in these populations. In contrast, humans in Southern Asia have been in their current climate much longer, giving selection more time to do its work. Thus for variants under positive selection, current frequency will be substantially affected by historical contingencies, and the correlation between allele frequency and selective strength will be rough at best.
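The dependence on selection strength, starting frequency and elapsed time can be sketched with the simplest deterministic model of positive selection, in which each copy of the favoured allele leaves (1 + s) times as many descendants as the alternative. This is a toy model with made-up parameter values, not an analysis from the paper, and it ignores drift and migration entirely.

```python
def positive_selection_frequency(p0, s, generations):
    """Frequency of a favoured allele after deterministic genic selection.

    p0          -- starting frequency of the allele
    s           -- selective advantage per copy (hypothetical value)
    generations -- time spent in the current environment
    """
    p = p0
    for _ in range(generations):
        # Favoured copies are weighted by (1 + s), then frequencies are renormalised.
        p = p * (1 + s) / (1 + p * s)
    return p


# Same selection strength, different histories (all numbers are illustrative).
recent_arrival = positive_selection_frequency(p0=0.05, s=0.01, generations=400)
long_resident = positive_selection_frequency(p0=0.05, s=0.01, generations=2000)
higher_start = positive_selection_frequency(p0=0.20, s=0.01, generations=400)

print(recent_arrival, long_resident, higher_start)
```

Three populations facing identical selection end up at three different frequencies, which is why a clean line between frequency and climate isn't guaranteed under this model.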

Another (less convincing) reason to be sceptical of positive selection is that the genes reported in this paper don't seem to show the classic genetic signatures of this type of selection. Positive selection can be detected using a measure known as the integrated haplotype score (iHS), as described in this study. iHS for most of the genes in the human genome in Europeans, East Asians and West Africans can be analysed using this handy online browser; of the sixteen genes that are most significant in the climate genes study, only one (LEPR) shows significant evidence of positive selection, and one other (MAPK1) is a marginal case. The others show no striking evidence of recent positive selection.

The iHS test is imperfect in a number of ways - it certainly misses some types of positive selection, and it may not be sensitive enough to pick up very local selection (e.g. selection acting on just a few smaller populations within East Asia, rather than East Asia as a whole). But the absence of clear signals of positive selection for these genes adds to the case that simple positive selection may not be the major player here [1].

If not positive selection, then what?
You've probably already guessed my hypothesis: at least some of the genes pulled out from this study (and probably the ones with the tightest correlations) have been the targets of balancing selection. Balancing selection could be acting on climate genes in different ways, but my first guess would be heterozygote advantage.

In this model, individuals who carry one copy of the (say) cold-resistance allele - that is, heterozygotes - are favoured in cold climates, but those who carry two copies (homozygotes) suffer from some type of deficit that makes them less fit than the heterozygotes. There are various plausible ways that this could happen: for instance, in creating its cold-resistance effects the variant might disrupt an important normal function of the gene, so that people need to have at least one copy of the "normal" version to be healthy.

Under this scenario the frequency of the cold-adapted allele will increase in cold climates, until it reaches a high enough frequency that the benefits of being cold-adapted are outweighed by the increased risk of having sick homozygote kids (since at high allele frequencies, you're more likely to mate with another heterozygote). The equilibrium frequency will depend on two factors: the fitness benefit of being a heterozygote, and the fitness cost of being a homozygote. Since heterozygotes will be fitter in colder climates, this equilibrium frequency will be higher in those places than in warm climates [2].

This model would result in each population reaching a stable allele frequency that is correlated with the local temperature, regardless of its starting frequency and how long the population had been subjected to that particular environment - so long as there has been enough time for the population to reach equilibrium. This scenario is much more likely to result in a linear correlation between allele frequency and climate variables than a simple positive selection model.
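Here's a minimal simulation of that claim under the standard one-locus heterozygote-advantage model. Everything in it is an assumption made for illustration: the function name, the fitness costs, and the idea that a colder climate simply means a larger cost for people with no copies of the cold allele. For this fitness scheme the textbook equilibrium frequency is t / (s + t), where s is the cost of carrying two copies and t is the cost of carrying none.

```python
def heterozygote_advantage_frequency(p0, s, t, generations):
    """Frequency of the cold allele after selection favouring heterozygotes.

    p0 -- starting frequency of the cold allele
    s  -- fitness cost to homozygotes for the cold allele (hypothetical)
    t  -- fitness cost to homozygotes for the normal allele (hypothetical;
          assumed here to be larger in colder climates)
    """
    p = p0
    for _ in range(generations):
        q = 1 - p
        w_cold_hom, w_het, w_normal_hom = 1 - s, 1.0, 1 - t
        mean_fitness = p * p * w_cold_hom + 2 * p * q * w_het + q * q * w_normal_hom
        # Cold alleles passed on by surviving homozygotes and heterozygotes.
        p = (p * p * w_cold_hom + p * q * w_het) / mean_fitness
    return p


# Cold climate (t = 0.10), two very different starting frequencies.
cold_from_rare = heterozygote_advantage_frequency(0.01, s=0.20, t=0.10, generations=2000)
cold_from_common = heterozygote_advantage_frequency(0.90, s=0.20, t=0.10, generations=2000)
# Warmer climate (t = 0.05), same cost to cold-allele homozygotes.
warm = heterozygote_advantage_frequency(0.50, s=0.20, t=0.05, generations=2000)

print(cold_from_rare, cold_from_common, warm)
```

Both cold-climate runs settle at the same frequency whatever their starting point, and the warm-climate run settles lower, which is exactly the behaviour that would produce a tidy frequency-versus-climate trend.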

So what type of "deficit" might homozygotes suffer from, for the genes explored in this paper? Well, remember what class of genes were specifically targeted in this analysis: genes known to be associated with metabolic disease. This gene set may well be enriched for variants which provide an adaptive boost to novel climates or food resources in the heterozygous state, but which increase the risk of metabolic disease (be it mild or severe) when homozygous.

So what?
Note that I'm not saying that positive selection hasn't played a role in climate adaptation - in fact, I'm sure it's played a much bigger role than balancing selection - but rather that the straightforward correlation of allele frequency with climate variables used in this paper has much more power to detect balancing selection, so many of the genes they pick up are likely to be due to balanced polymorphisms. Efficiently detecting genetic adaptation to climate driven by positive selection will require a more complex approach - but this is a topic for another day.

A heterozygote advantage situation should be seen as an imperfect and temporary solution to the problem of adaptation, one that evolution uses during the earliest phase of adaptation in the absence of anything better. Ultimately, human populations are likely to stumble across more elegant long-term solutions: fixation (by positive selection) of more subtle "tweaking" mutations that adapt the population to its local climate, without the troublesome handicap of also causing disease in homozygotes. Once this has happened, studies such as this one based on simple allele frequency would be completely unable to uncover climate adaptation genes.

Update: Razib at Gene Expression comments. Hi to new readers from ScienceBlogs - if you like what you see, you can always subscribe to Genetic Future.

[1] The lack of an iHS signal also suggests that these alleles have been hanging around in the human population for quite a while; the early phase of balancing selection on a newly arisen allele (a sharp increase in frequency before equilibrium is reached) will create a signal that is identical to that created by positive selection. So if this is balancing selection, either it acted recently on standing neutral variation, or it's been acting for a long time - longer than the 10,000-50,000 years required for an iHS signal to decay to background noise.

[2] This assumes that the fitness cost to homozygotes is independent of climate, which will often but not always be true.


Hancock, A.M., Witonsky, D.B., Gordon, A.S., Eshel, G., Pritchard, J.K., Coop, G., Di Rienzo, A. (2008). Adaptations to Climate in Candidate Genes for Common Metabolic Disorders. PLoS Genetics, 4(2), e32. DOI: 10.1371/journal.pgen.0040032

Blizzard image from here.


Saturday, March 8, 2008

deCODE's lay-offs: market saturation, or just one struggling company?

Eye on DNA's Hsien-Hsien Lei notes the recent lay-off of around 60 employees from deCODE genetics, the company responsible for personal genomics venture deCODEme. In a press release quoted by Hsien, deCODE CEO Kari Stefansson suggests that this is a broader omen of things to come within the personal genetics community:
It is natural for us to operate the company in such a way that we can make the money that we have last longer than what we had expected to begin with. These are very simple and clear operational standpoints and it would even be wise for other companies in our community to follow our example.

Hsien goes on to suggest that deCODE's troubles may stem from a premature expansion of the personal genetics market in the absence of real consumer demand:

Perhaps this is an indication that the market is starting to experience saturation in the number of companies and services being offered yet has not seen a concomitant rise in the number of consumers willing to pay for personal genomic services.

I'm sceptical that the personal genetics market is anywhere near saturation at this stage. Instead, deCODE's problems seem to stem from poor strategic decisions (or, more charitably, from long-term investments in R&D that are yet to bear fruit). Bear in mind that this is a company that has never actually made a quarterly profit, and has declared total losses of $600 million since it was founded in 1996. This quarterly report from September 2007 tells the story:

We incurred a net loss of $63.1 million and $62.2 million for the nine-months ended September 30, 2007 and 2006, respectively, and $85.5 million for the year ended December 31, 2006, and had an accumulated deficit of $598.8 million at September 30, 2007. We have never generated a profit and we have not generated revenues except for payments received in connection with our research and development collaborations with Roche, Merck and others, from contract services, Emerald BioSystems products and instruments, and under grants. [...] It may be several years before product revenues materialize, if they do at all. As a result, we expect to incur net losses for several years. If the time required to generate product revenues and achieve profitability is longer than we currently anticipate or the level of losses is greater than we currently anticipate, we may not be able to continue our operations.

I don't get the feeling that deCODE's fundamental problem is market saturation - it's just running out of money because it's failed to turn a huge investment in research into marketable items. But hey, I'm no market analyst, and I'd be very interested in hearing about this from experts (calling David Hamilton?).

Hsien finishes with an excellent point about the need for personal genetics companies to sell the wider public on the utility of their products:

In any case, while 23andMe and Knome focus on the rich, famous, and elite, there is a great need to show the general public how genetic testing of all types is relevant to their everyday lives. There aren’t enough millionaires like Dan Stoicescu to fund the entire personal genomics market. Until genetic testing is widely adopted for a variety of commercial uses by a greater segment of the consumer population, the pot of profits will not be big enough to share. In 2008, we will surely see companies drop out and others consolidate.


Friday, March 7, 2008

New genes for celiac disease

Celiac disease is a relatively common and unpleasant auto-immune disease of the gut in which the body's immune system attacks proteins in wheat, rye and barley, leading to inflammation and intestinal damage whenever these foods are eaten. Around 1% of individuals of European ancestry suffer from this disease, and there is a strong genetic contribution to risk.

In an advance online publication (full text for subscribers only) in Nature Genetics, a European consortium describes a fruitful approach to identifying novel genetic risk factors for this disease. Members of this group had previously performed a genome-wide scan for susceptibility genes, which confirmed a previously well-known effect of variation in the HLA gene cluster (one of the usual suspects in almost all auto-immune diseases) as well as identifying a novel genetic region containing the immune system genes IL2 and IL21. However, a large fraction of the genetic risk remained unaccounted for.

One of the problems with genome-wide scans is what is known as the multiple-testing problem: when you're looking at hundreds of thousands of genetic markers at once, a disease variant has to have an incredibly strong effect to stand out from the crowd - otherwise its signal is drowned out in the statistical noise from all the other markers. In the previous genome scan the authors found thousands of markers that looked as though they might have some association with celiac disease, but weren't strong enough to be statistically significant.

That set the stage for this paper. Basically, the authors took 1,020 of those likely candidates and looked at them in a brand new set of 1,643 celiac patients and 3,406 controls. Because they were now looking at a smaller number of markers, they could in theory identify disease-associated variants with much smaller effects - and indeed they did, reaping a harvest of no fewer than seven previously unknown regions that were clearly associated with disease risk.
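To put rough numbers on why testing fewer markers helps, here's a sketch using the Bonferroni correction, the simplest fix for multiple testing: divide the significance level by the number of tests. The 300,000 figure is an assumed stand-in for "hundreds of thousands" of markers, and the paper itself may have used a different correction.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-marker p-value cutoff that keeps the overall false-positive rate near alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


GENOME_SCAN_MARKERS = 300_000  # assumed figure for "hundreds of thousands"
FOLLOW_UP_MARKERS = 1_020      # number of candidate markers quoted in the post

genome_cutoff = bonferroni_threshold(0.05, GENOME_SCAN_MARKERS)
follow_up_cutoff = bonferroni_threshold(0.05, FOLLOW_UP_MARKERS)

print(f"genome-wide cutoff: {genome_cutoff:.2e}")
print(f"follow-up cutoff:   {follow_up_cutoff:.2e}")
```

Under these assumptions a marker in the follow-up set can clear the bar with a p-value roughly 300 times larger than it would have needed in the original scan, which is how variants of much smaller effect become detectable.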

Reassuringly, six of the regions identified contain genes known to be involved in immune function, and the authors have plausible mechanistic explanations for two of the associated markers: one alters the sequence of a protein in a way that is likely to alter its function, and another variant has a significant effect on the levels of expression of a nearby gene.

However, while the markers associated with celiac disease were common (with a frequency of more than 10%) they each had a very subtle effect on disease risk, raising the odds of suffering from the disease by only 19-34%. In total these variants only explain 3-4% of the total genetic risk for this disease; even when you add that to the ~35% which is explained by variation in the HLA region, the majority of the genetic disease risk remains unaccounted for.
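To put those odds in perspective, here is a rough sketch that converts an odds ratio into an absolute risk. It treats the ~1% prevalence quoted above as the baseline risk for someone without the risk variant, which is a simplification made purely for illustration, and the function name is hypothetical.

```python
def risk_with_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline risk and an odds ratio into the resulting absolute risk."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


BASELINE_RISK = 0.01  # ~1% prevalence, used here as the non-carrier risk (assumption)

# The range of effects quoted in the post: odds raised by 19-34%.
for odds_ratio in (1.19, 1.34):
    risk = risk_with_odds_ratio(BASELINE_RISK, odds_ratio)
    print(f"Odds ratio {odds_ratio:.2f}: risk {BASELINE_RISK:.2%} -> {risk:.2%}")
```

On these assumptions, carrying one of the new variants moves an individual's risk from 1% to somewhere between roughly 1.2% and 1.3%, which illustrates how little predictive weight any single marker carries.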

It's clear from this study just how far we still have to go to define the genetic basis of complex conditions like celiac disease. Simply adding more and more samples to these traditional types of association studies is likely to have diminishing returns - much of the heritable risk comes from areas that simply aren't explored by standard genome-scan approaches:
  1. rare variants of moderate to large effect, which are completely invisible to traditional genome scans that rely on common markers;
  2. other types of genetic variation that can't be seen clearly by existing chips, such as copy number differences; and
  3. non-genetic heritable factors, such as epigenetic modifications.

Identification of these components of disease risk will require new approaches: large-scale sequencing for rare variants, targeted scans for copy number variation, and brand new applications to identify epigenetic changes. We'll see these approaches becoming more and more common as the technology evolves over the next few years.


Subscribe to Genetic Future.

Hunt, K.A., Zhernakova, A., Turner, G., Heap, G.A., Franke, L., Bruinenberg, M., Romanos, J., Dinesen, L.C., Ryan, A.W., Panesar, D., Gwilliam, R., Takeuchi, F., McLaren, W.M., Holmes, G.K., Howdle, P.D., Walters, J.R., Sanders, D.S., Playford, R.J., Trynka, G., Mulder, C.J., Mearin, M.L., Verbeek, W.H., Trimble, V., Stevens, F.M., O'Morain, C., Kennedy, N.P., Kelleher, D., Pennington, D.J., Strachan, D.P., McArdle, W.L., Mein, C.A., Wapenaar, M.C., Deloukas, P., McGinnis, R., McManus, R., Wijmenga, C., van Heel, D.A. (2008). Newly identified genetic risk variants for celiac disease related to the immune response. Nature Genetics. DOI: 10.1038/ng.102

Thursday, March 6, 2008

The moral manipulation of Gattaca

A man is given strong medical evidence that he will die from a heart attack if exposed to the exertions of space travel, thus risking his own life and the lives of his crewmates. Ignoring this evidence, he fakes his way into astronaut training - and inexplicably, we cheer him on. How did the makers of Gattaca steer us towards this bizarre response? Philosopher Neven Sesardic explains, in a fascinating essay (PDF) that touches on issues highly relevant to personal genomics.

One quote to chew over:
Contrary to what the movie is trying to tell us, a more detached analysis leads us back to the common sense belief that a more detailed knowledge of our genetic predispositions would indeed severely narrow our choices. Were this kind of information to become massively available, it would be rational for many people to abandon their previous career plans and reconsider what they want to do with their lives.

Obviously we don't know enough yet about genetics to be advising people on future careers based on a genome scan - but at some point, it's likely that we'll be able to make at least some probabilistic inferences about predispositions and skills very early in life based on genetic information. So long as we ensure that we rely only on accurate information, this isn't a bad thing. It simply means that people will have more information with which to make important decisions.

HT: Black Belt Bayesian.




Tuesday, March 4, 2008

Knome customer featured in NY Times

An article in the NY Times entitled "Gene Map Becomes a Luxury Item" introduces us to Dan Stoicescu, one of the first two customers to fork out $350,000 to get their genome sequenced by the personal genomics company Knome (pronounced "know-me" - get it?).

The article is well worth a read for anyone interested in the future of personal genomics, but one theme stands out for me:
Biologists have mixed feelings about the emergence of the genome as a luxury item. Some worry that what they have dubbed “genomic elitism” could sour the public on genetic research that has long promised better, individualized health care for all. But others see the boutique genome as something like a $20 million tourist voyage to space — a necessary rite of passage for technology that may soon be within the grasp of the rest of us.

I'm firmly in the second camp. In a previous post about Knome I noted that, "The willingness of wealthy early adopters to pay excessive amounts for untested technology is a big driver of progress" - in other words, Stoicescu and his fellow Knome customer are subsidising the costs of technology development that will eventually make genome sequencing cheaper for you and me.

Many new technologies start off as expensive yuppie toys, and rapidly tumble in price until they become accessible to the rest of the world. Worrying about "genomic elitism" is like someone back in 1981 worrying about "portable computer elitism". If a technology has broad appeal and utility - and in a few years' time, personal genomics will have both those things in spades - the price will come down quickly. You'll be chatting to your next-door neighbour about your kids' DRD4 genotypes before you know it.

Unfortunately for the early adopters, we currently know so little about the function of most genetic variants that their full sequence won't give them much more information than they could get from 23andMe or deCODEme, for 0.3% of the price. It's true that they're likely to find a few severe recessive disease variants, but these will have little or no effect on their own health, and are unlikely to affect their children (unless they're unfortunate enough to mate with someone who also carries a mutation in the same gene). There's also a low probability that they'll find something really nasty like a Huntington's disease mutation. But overall, the expected utility of this information is low - certainly not worth the $350,000 price tag, unless you're wealthy enough to not have to worry about that kind of money.

The true value of a genome sequence - identifying and deciphering the thousands of small changes that influence our risk of both rare and common diseases during our lifetime - won't come until we have complete sequences from hundreds of thousands of people, along with thorough medical information to find associations between variants and diseases (cue the Personal Genome Project). By that time, costs for full genome sequencing will be dramatically lower - hell, even poor scientists like me will be able to afford it!



Sunday, March 2, 2008

Genetic associations with adult obesity not replicated in kids

This short article in BMC Medical Genetics provides a useful cautionary tale. The authors took 10 different genetic variants that had previously been associated with obesity in adults and determined whether these variants were associated with body mass index (BMI) differences in a large sample of children (5,000 in total).

The short answer: nope. Not one of the variants showed any significant association with BMI after the appropriate corrections had been made.

There are a couple of possible explanations here. The first is that some of these variants don't really alter BMI at all, even in adults (in other words, that previous studies suggesting this were wrong) - given how many genetic associations fail to stand up to large-scale replication studies, this may be true for a substantial fraction of these genes.

Secondly, it's possible that these variants affect the risk of adult obesity but have little or no effect in kids. Childhood obesity is certainly substantially genetic - the most recent estimate is that almost 80% of the variation in BMI and waist circumference in 8-11 year olds is determined by genetic factors - but it may well be that different genes play a role in obesity risk at different stages in the life cycle. Perhaps some people genetically doomed to be fat little kids will find it easier to lose weight later in life, and vice versa of course!

Unfortunately, this study didn't look at the best-supported recent genetic variant associated with BMI and obesity risk, which lies in the FTO gene. This variant has previously been shown to be associated with obesity in both children and adults, indicating (not unexpectedly) that there is at least some overlap in genetic risk factors for obesity between the two age groups.


Haworth, C.M., Butcher, L.M., Docherty, S.J., Wardle, J., Plomin, R. (2008). No evidence for association between BMI and 10 candidate genes at ages 4, 7 and 10 in a large UK sample of twins. BMC Medical Genetics, 9(1), 12. DOI: 10.1186/1471-2350-9-12