Embryo selection has been in the news lately. The phrase tends to conjure sci-fi images of wealthy parents designing immaculate children, but I find that media discourse often glosses over the practical limitations of these technologies, which make them much less dramatic than they appear. I wanted to unpack what we actually can and can’t do with embryo selection today, which I hope will provide a more grounded foundation for discussing its ethics.
Firstly, a few definitions. Embryo selection is, broadly, the practice of screening and implanting an embryo with favorable traits. There are a few ways to determine whether it has such traits:
Monogenic screening (or PGT-M) checks your embryos for a single, clearly defined genetic mutation. This type of testing has been used since the 1990s to screen for serious, life-altering genetic diseases, such as cystic fibrosis or Huntington's disease.
Polygenic screening (or PGT-P) is a newer technology that only became commercially available in 2019. It can be used to assess an embryo's risk for conditions influenced by multiple genes – such as diabetes or schizophrenia – rather than just one gene. But, because we don't always know which genes to look at, researchers use statistical models, based on population studies, to estimate which genetic variants are likely to influence a certain trait. These variants are then compiled into a polygenic risk score (or PRS).
For example, we know schizophrenia has a strong genetic component, but there isn't a single "schizophrenia gene" that's responsible for it. Huntington's disease is caused by a mutation of the HTT gene. For schizophrenia, by contrast, researchers have to approximate which combination of genetic variants might influence someone's propensity to develop it, and assign a weight to each one.
Polygenic screening – and the underlying risk scores, which researchers have developed since the late 2000s – has attracted controversy because it could be applied to a wider set of traits like IQ.[1] But, on a technical level, monogenic and polygenic screening are just two different methods for predicting an embryo's traits. Both methods can be used to screen for genetic diseases that could severely impact a child's life. Most parents probably wouldn't want their child to have Huntington's disease (monogenic) or schizophrenia (polygenic).
Both methods also have practical constraints that aren’t often discussed, and those constraints are what I want to cover here.
Problem #1: Parents can’t make enough embryos to screen beyond the essentials
The first issue is that parents can't currently make enough embryos for elective trait selection to matter very much. A single embryo retrieval – in which eggs are retrieved, fertilized, and matured to the blastocyst stage – plus genetic testing can cost around $30,000 and might yield 3-5 embryos. For each embryo transfer, the live birth (i.e. "success") rate is perhaps in the 50-60% range, so multiple transfers are often needed. And these numbers reflect the more successful end of outcomes.
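To make the "multiple transfers are often needed" arithmetic concrete, here is a rough Python sketch. It assumes every transfer succeeds independently with the same live birth rate, which is a simplification for illustration and not a clinical model. The function name is made up for this example.

```python
def chance_of_live_birth(per_transfer_rate: float, transfers: int) -> float:
    """Chance of at least one live birth after a number of transfers.

    Assumption (illustrative only): each transfer succeeds independently
    with the same probability. Real outcomes vary by embryo and patient.
    """
    return 1.0 - (1.0 - per_transfer_rate) ** transfers


# Rates taken from the 50-60% range quoted above.
for rate in (0.50, 0.60):
    for transfers in (1, 2, 3):
        chance = chance_of_live_birth(rate, transfers)
        print(f"rate={rate:.0%}, transfers={transfers}: {chance:.1%}")
```

Under these assumptions, a 50% per-transfer rate means the first transfer fails half the time, so a second embryo gets used up before any elective preferences come into play.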
Fertility clinics also grade each embryo on its quality, which is correlated with the likelihood of a successful implantation. Most use the Gardner grading system, which looks at a blastocyst's size and cell quality. A 6BC, for example, is less likely to result in a successful transfer than a 4AA.
Parents may also want to consider the sex of the embryo, not just for elective reasons (wanting to have a boy or a girl), but because some genetic diseases affect the sexes differently. A male with a BRCA1 mutation, which causes a high predisposition for breast cancer, still has a much lower lifetime risk (1-5%) than a female (60-80%).
These two filters alone shrink the pool considerably. Now, try adding just a single genetic trait to screen for, and the list of viable embryos gets even shorter. Imagine you are screening for Huntington's disease, which, when one parent carries the mutation, is passed on to about 50% of embryos:
EMBRYO #1: M, 4AA, mutation yes
EMBRYO #2: M, 5BB, mutation no
EMBRYO #3: F, 3AA, mutation yes
EMBRYO #4: F, 6BC, mutation yes
EMBRYO #5: F, 4AA, mutation no
Three of your five embryos have the mutation, so you wouldn't want to implant them. But of the embryos that don't have the mutation, only one (#5) is considered high grade, which a fertility clinic will prioritize because it’s more likely to result in a live birth. In this example, we didn't even have the luxury of considering the embryo's sex.
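For readers who think in code, the same filtering can be sketched in a few lines of Python. The `Embryo` class and the `is_high_grade` rule are made up for this illustration. In particular, treating only "AA" embryos as high grade is my assumption, not a clinical standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Embryo:
    label: str
    sex: str
    grade: str          # Gardner grade, e.g. "4AA"
    has_mutation: bool


# The five example embryos listed above.
EMBRYOS = [
    Embryo("#1", "M", "4AA", True),
    Embryo("#2", "M", "5BB", False),
    Embryo("#3", "F", "3AA", True),
    Embryo("#4", "F", "6BC", True),
    Embryo("#5", "F", "4AA", False),
]


def is_high_grade(grade: str) -> bool:
    # Assumption for illustration: "high grade" means both letter grades are A.
    return grade[1:] == "AA"


# Filter 1: drop embryos that carry the mutation.
unaffected = [e for e in EMBRYOS if not e.has_mutation]
# Filter 2: keep only the high-grade embryos among those.
top_choices = [e for e in unaffected if is_high_grade(e.grade)]

print("Unaffected:", [e.label for e in unaffected])
print("Unaffected and high grade:", [e.label for e in top_choices])
```

Two filters leave a single embryo, before sex or any elective trait has been considered.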
A world in which parents are "shopping" among lots of different traits in their embryos simply doesn't exist today. You would need to create dozens or hundreds of embryos to make this possible, which current IVF methods don't produce. Researchers are exploring new possibilities in this realm, such as generating embryos from stem cells, but these are still highly experimental, and inevitably raise a different set of ethical questions. For now, embryo selection is tightly constrained by biology and numbers.
Problem #2: With polygenic screening, you're unlikely to find a genetic outlier
The “not enough embryos” problem also impacts the usefulness of polygenic screening. Polygenic risk scores are composed of lots of genetic variants, which interact with each other in unpredictable ways. Statistically, on a bell curve of all possible genetic combinations between two parents, most of your embryos will end up somewhere near the middle (meaning, the middle of what you would have produced anyway through natural conception) for a given trait, with only small variations.
When you only have, say, 5 or 10 embryos to choose from, you only get a few “shots” along this bell curve, and it’s unlikely you’ll uncover anything interesting. Finding a genetic outlier is like sticking your hand in a jar of green jelly beans and rummaging blindly around, hoping to grab the one red jelly bean: it’s mostly wishful thinking.
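A rough way to put numbers on the "few shots along the bell curve" idea is the Python sketch below. It assumes each embryo's score is an independent draw from a standard normal distribution centered on the family's own average, measured in standard deviations of that family's spread. That is a deliberate simplification: it ignores the interactions and measurement error discussed in this post, and the function names are made up for this example.

```python
import random
from statistics import NormalDist, mean


def chance_of_outlier(n_embryos: int, threshold_sd: float = 2.0) -> float:
    """Chance that at least one of n embryos scores above the threshold."""
    p_single = 1.0 - NormalDist().cdf(threshold_sd)
    return 1.0 - (1.0 - p_single) ** n_embryos


def expected_best_score(n_embryos: int, trials: int = 20_000, seed: int = 0) -> float:
    """Simulated average of the highest score among n embryos, in SD units."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    return mean(
        max(rng.gauss(0.0, 1.0) for _ in range(n_embryos))
        for _ in range(trials)
    )


for n in (5, 10, 50):
    print(
        f"{n} embryos: chance of a +2 SD outlier = {chance_of_outlier(n):.1%}, "
        f"typical best score = {expected_best_score(n):.2f} SD"
    )
```

Under these assumptions, 5 embryos give roughly an 11% chance of including one that sits two standard deviations above the family average, and 10 embryos roughly 21%. The odds only become favorable with far more embryos than a retrieval typically yields.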
Problem #3: Polygenic risk scores are not set in stone
We tend to think of our DNA as something that is hard-coded into our bodies. Therefore, any information we receive about our genes must be irrefutably true, right? Wrong. Some genetic information is clear, especially once we know what to look for. But for the most part, our understanding of how genetic variants map to traits is complex and incomplete, and these insights are always changing. Genomics researchers can’t just look at our DNA, like some soothsayer blowing the dust off an ancient text, and tell us what they see. Instead, they use statistics to make educated guesses about which genes seem likely to influence a certain trait, and to what degree.
You can think of this as similar to how people use the English alphabet. Sure, it only has 26 letters, and everyone knows what these letters are, but people still find new ways of rearranging them – words, sentences, essays, books – every day, and we’re not going to run out of new combinations anytime soon. Developing a polygenic score is kind of like that, too.
Warren Weaver, a director at the Rockefeller Foundation in the 1950s, once described the difference between problems of simplicity, disorganized complexity, and organized complexity:
Problems of simplicity involve just two variables. Monogenic screening falls under this category: “Does this embryo have X mutation?” has a direct and straightforward answer.
Problems of disorganized complexity deal with a large number of variables, but ones that act in predictable ways, independent of their relationship to each other. Population studies used in genomics research are an example of this: “What percent of this population carries Y variant?” is a complex question, but can be answered with statistical analysis.
Problems of organized complexity involve multiple variables, each of whose significance can change depending on its relationship to other variables. Polygenic screening is this type of problem. A genetic variant might be more or less important depending on which other variants are present (which could amplify or mitigate its impact), as well as other factors like a person’s sex, lifestyle, or environment. And we don’t always know what these dependencies even are.
A polygenic risk score is just a model that captures our best-guess thinking at the time. Like any model, it can have wide error bars. With time, we can probably reduce these errors quite a bit, to the point that we're comfortable making consequential decisions. But we’re nowhere close to that world for every polygenic trait in question.
Many polygenic scores are trained on European ancestry and don’t always generalize to other populations. Standards are still emerging in this field: researchers don’t all use the same data sets or methods to create these scores. And companies don’t have to disclose how their proprietary scores were developed, nor how accurate or predictive they are.
High uncertainty might be fine if we were dealing with a pure research problem. But when genetic testing companies offer their customers a neatly-wrapped polygenic risk score, it can seem more certain than it actually is. I've had several services give me conflicting insights about my risk for various polygenic traits, based on limited or differing interpretations of the data. If one service tells me I have a high risk for, say, Type 2 diabetes, and another tells me I have a low risk, which one should I believe? How should I change my behavior based on this information? If these services can't help customers understand what they should actually do differently, having access to these data insights might be worse than not having them at all.
For technologies like embryo selection, which utilize this research, I think we should hold ourselves to a high standard of confidence before making decisions about elective traits. We are, of course, always updating our collective scientific body of knowledge[2], and sometimes we have to act even when we're not very confident about what we know. I can understand why someone might pursue a treatment with 55% odds of success, for example, if they’re facing a rare and aggressive type of cancer and have no alternatives. But it’s another story to make life-altering choices for ourselves, or our children, based on highly speculative data, when the default choice might have been just fine. At the very least, companies should be transparent about the uncertainty we have around these scores today, so consumers can make informed decisions.
Do these issues mean that none of this technology matters?
Not at all. Firstly, I do think it is useful for parents to get genetic testing themselves before having kids, which can surface any major genetic issues. This is a simple saliva test that you can do at your doctor's office. It costs a couple hundred dollars, and is often covered by insurance (especially if you have a family history of disease) or even subsidized by test providers.
Embryo selection, too, has clear value when families have known genetic risks that could significantly impact the quality of a child's life, such as hemophilia or hereditary cancer. In these cases, monogenic screening can mean the difference between having children versus not. Despite the panic about designer babies, most embryo screening today is being used for exactly this: avoiding passing severe, life-altering diseases on to one’s children.
Genomics researchers are now trying to extend this research to more common, but still consequential, conditions like diabetes and heart disease. This pursuit also seems valuable to me – such conditions are among the leading causes of death in the United States – even if I'm not yet convinced they’re ready for consumer primetime.
Finally, we shouldn't get too hung up on embryo selection itself, because it represents just one way of solving for a bigger ambition. Gene editing, for example, could someday enable us to correct harmful mutations directly, which would bypass the need to create and choose from a handful of embryos. However we get there, though, I think the underlying imperative is still worth pursuing: finding ways to prevent genetic disease and help more people live healthier lives.
[1] If anyone is interested in diving further into this topic, I enjoyed The Genetic Lottery: Why DNA Matters for Social Equality, by Kathryn Paige Harden, which I found both balanced and informative.
[2] I'm reminded of the recommendation to avoid giving babies peanut-based foods, which was common advice for my parents’ generation when they were raising me. Years later, researchers realized that avoiding early exposure worsened, rather than reduced, peanut allergies. Now, parents like me are explicitly told to give their babies peanut-based foods as early as they can: the exact opposite recommendation that my parents received.