Gene of the Week Redux?

Biopolitical Times
Participants in All of Us

At Biopolitical Times, we used to enjoy skewering what we called the Gene of the Week: the all-too-common concept that some specific gene “caused” some particular emotion or behavior. We quit once the “candidate gene era” of biological science had flunked the reality test, which it did by the early 2010s. Unfortunately, as in the old horror movie, the idea, suitably mutated, may be coming back.

The idea of finding the connection — if any — between a particular genetic variation in the human genome and a complex behavior or trait had been deeply questioned for years, and finally seemed to fade. But a replacement was waiting in the wings: the analysis of genome-wide association studies (GWAS). This approach involves comparing the genomes of a large number of people who have a particular trait or disease with the genomes of a large number of people who don’t; if there are consistent differences between the two groups, they may suggest genetic markers that can theoretically be used to predict the presence of that trait or disease.

There are a number of difficulties with this approach, starting with the problem it fails to solve: in all but the rarest cases, no single gene causes any particular disease or condition. The markers may, however, indicate that a cluster of genes, sometimes hundreds of them, is collectively correlated with a particular outcome. That still does not explain how they are connected, let alone how the proteins they specify lead to the result in any one individual.

However, some scientists suggest that if enough data is collected, it may theoretically be possible to estimate the probability of developing a particular trait or disease based on the aggregate of genomic variations in that individual. The result is called (rather hopefully) a “polygenic risk score” (PRS): an estimate of one person’s chances of developing a particular outcome, which could be a deadly disease or something as mundane as a child’s future height or future educational attainment. Note that these apparently scientific assessments say nothing about causation; they just suggest correlation. It’s a kind of black-box theory involving mysterious algorithms. But if enough data is gathered, proponents suggest, they will be able to fully explain the variation among individuals and make sense of the complexity.
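To make "the aggregate of genomic variations" concrete, here is a minimal sketch of how such a score is commonly built: a weighted sum of how many copies of each flagged variant a person carries. The variant names and weights below are hypothetical, invented purely for illustration; real scores typically combine far more variants, with weights estimated from GWAS data.

```python
# Hypothetical per-variant weights (effect sizes). These values are
# made up for illustration and do not come from any real study.
WEIGHTS = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}


def polygenic_score(allele_counts, weights=WEIGHTS):
    """Sum of weight * allele count (0, 1, or 2 copies) over scored variants."""
    return sum(w * allele_counts.get(name, 0) for name, w in weights.items())


# A hypothetical person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_score(person)
print(f"score = {score:.2f}")
```

Nothing in the sum says why a variant matters or how it acts in the body. The score is a tally of correlations, which is exactly the limitation described above.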

All this would of course be impossible without increased computing power and especially the dramatic decline in the cost of sequencing a human genome: from about $100 million in 2001 to about $500 now. Also needed is a lot of data: not just DNA but medical and lifestyle data from a very large number of people. Governments, particularly in the UK and US, have invested substantial sums in gathering this data from hundreds of thousands of volunteers.

In the US, “All of Us” (announced in 2015) is a program of the National Institutes of Health (NIH) that aims to gather data from a million people and recently released 100,000 whole genome sequences from a diverse population. About half of these participants self-identify with a racial or ethnic minority group, which represents a modest but significant step in the right direction, in that previously over 90 percent of those included in large genomic studies were of European descent. The platform also links to the U.S. Census Bureau’s American Community Survey, which may suggest possible environmental influences, and offers participants access to their own data, including genetic ancestry results. A tsunami of analytical papers from the 1500 scientists who have registered to access the data seems inevitable.

We have already seen a flood of papers based on data from the UK Biobank, which has been functioning since 2006 and is funded in part by governments and in part by charities and multinational companies. It has both genetic data and health information from half a million people, and 200,000 sequenced genomes were released for research in November 2021. The UK data, some of which was previously available to researchers, has already been mined for about 2700 scientific papers. Some of those are not, strictly speaking, GWAS analyses but rely on the physical and medical data supplied by participants, for instance comparing reported diet with incidence of cardiovascular disease. Others relate genomic analysis to apparently social actions, such as the age at which people first wore spectacles, or the relationship of coffee drinking to dementia (studies seem to offer contradictory results).

There are reasons to be skeptical. To take one specific example, the risk of developing schizophrenia is somewhat greater for people with a family history of the disease, implying that there is some genetic component. But if one identical twin develops schizophrenia, that does not necessarily mean that the other will; there is a 1 in 2 chance they won’t. Hundreds of genes may be involved, and “most people with a close relative who has schizophrenia will not develop the disorder themselves.” A recent evaluation of polygenic risk scores used to predict outcomes in patients already diagnosed with schizophrenia found that the scores are no more useful than the notes in a patient’s medical file.

Meanwhile, polygenic risk scores are already being commercialized. Shockingly, they are being marketed to people who intend to become parents and understandably want their future child to have the “best” genes. Unscrupulous or over-confident salespeople make that promise, knowing full well that the “choice” they sell can never be checked: no one will ever know the abilities of any embryo that is not implanted and brought to term. This is provoking a widespread backlash, for example in a Nature editorial last week that called the development “alarming” and accompanied an article, written by employees of two testing companies and one fertility clinic, that suggested the technology may eventually be of some use. At the same time, in Nature Medicine, Josephine Johnston and Lucas J. Matthews made an urgent call for “a frank assessment of their profound ethical implications” in an article titled:

Polygenic embryo testing: understated ethics, unclear utility

Analyzing complex genomic data requires not just care but also common sense. In February, Dorit Barlevy et al. also discussed the limitations of polygenic risk scores in the context of reproductive decision making:

Individuals with a PRS in the top 1% for T1D [Type 1 diabetes] have a 30-fold increased chance of developing the condition compared to the general population; however, given that the population prevalence of T1D is approximately 0.3%, these “high-risk” individuals still have only approximately a 9% chance of developing T1D in their lifetimes (Sharp et al. 2019). In other words, such offspring would still have more than a 90% chance of not developing T1D.

The math here is more generally applicable. There is a serious danger that such analyses, whether of embryos or adults, will generate both false negatives and false positives, missing many people who are seriously at risk and scaring many people who may not get sick.
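For readers who want to check the arithmetic, here is a back-of-the-envelope sketch in Python. It takes the figures in the quoted passage at face value (0.3% prevalence, a 30-fold increase for the top 1% of scores) and applies them to an imagined population of 100,000 people. The function name and the population size are our own assumptions, chosen only for illustration.

```python
def screen_outcomes(population, prevalence, top_fraction, relative_risk):
    """Count who is flagged as "high risk" and who actually develops the condition."""
    flagged = round(population * top_fraction)        # people labeled "high risk"
    absolute_risk = prevalence * relative_risk        # risk within the flagged group
    flagged_who_develop = round(flagged * absolute_risk)
    total_cases = round(population * prevalence)      # all cases in the population
    return {
        "flagged": flagged,
        "absolute_risk": absolute_risk,
        "flagged_who_develop": flagged_who_develop,
        "flagged_who_do_not": flagged - flagged_who_develop,
        "total_cases": total_cases,
        "cases_missed": total_cases - flagged_who_develop,
    }


# Figures from the quoted passage; the population of 100,000 is hypothetical.
result = screen_outcomes(
    population=100_000, prevalence=0.003, top_fraction=0.01, relative_risk=30
)
for name, value in result.items():
    print(name, value)
```

On these numbers, 910 of the 1,000 people flagged as "high risk" never develop the condition, while 210 of the 300 people who do develop it were never flagged at all.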

Genome-wide association studies and polygenic risk scores are being offered up as sophisticated approaches to genomic analysis. But taking a step back, are they really more meaningful than the “gene-for” claims of yesteryear? And taking one more step back, are they the best use of scientific expertise to improve individual and public health? It may be that governments and scientists alike are pointing their resources in the wrong direction. Healthcare, housing, and education may be better investments. And we really do not want to revive the Gene of the Week.