Long stretches of DNA previously dismissed as “junk” are in fact crucial to the way our genome works, an international team of scientists said on Wednesday night.
It is the most significant shift in scientists’ understanding of the way our DNA operates since the sequencing of the human genome in 2000, when it was discovered that our bodies are built and controlled by far fewer genes than scientists had expected. Now the next generation of geneticists has updated that picture.
The results of the international Encode project will have a huge impact on geneticists trying to work out how genes operate. The findings will also provide new leads for scientists studying conditions such as heart disease, diabetes and Crohn’s disease that have their roots partly in glitches in the DNA. Until now, the focus had largely been on looking for errors within genes themselves, but the Encode research will help guide the hunt for problem areas that lie elsewhere in our DNA sequence.
Ewan Birney of the European Bioinformatics Institute near Cambridge, one of the principal investigators in the Encode project, said: “In 2000, we published the draft human genome and, in 2003, we published the finished human genome and we always knew that was going to be a starting point. We always knew that protein-coding genes were not the whole story.”
For years, the stretches of DNA between our 20,000 or so protein-coding genes (more than 98 percent of the genetic sequence in each of our cells) were written off as “junk” DNA. Already falling out of favor in recent years, this concept will now be consigned to the history books.
Encode is the largest single update to the data from the human genome since its final draft was published in 2003 and the first systematic attempt to work out what the DNA outside protein-coding genes does. The researchers found that it is far from useless: Within these regions, they have identified more than 10,000 new “genes” that code for components that control how the more familiar protein-coding genes work. Up to 18 percent of our DNA sequence is involved in regulating the less than 2 percent of the DNA that codes for proteins. In total, Encode scientists say, about 80 percent of the DNA sequence can be assigned some sort of biochemical function.
Scientists know that while most cells in our body contain our entire genetic code, not all of the protein-coding genes are active. A liver cell contains enzymes used to metabolize alcohol and other toxins, whereas hair cells make the protein keratin. Through some mechanism that regulates its genes, the hair cell knows it should make keratin rather than liver enzymes, and the liver cell knows it should make the liver enzymes and not the hair proteins.
“That control must have been somewhere in the genome, and we always knew that — for some individual genes — it was an element sometimes quite far away from the gene,” Birney said. “But we didn’t have a genome-wide view to this. So we set about working out how we could discover those elements.”
The results of the five-year Encode project are published today across 30 papers in the journals Nature, Science, Genome Biology and Genome Research. The researchers have mapped 4 million switches in what was once thought to be junk DNA, many of which will help them better understand a range of common human diseases, from diabetes to heart disease, that depend on the complex interaction of hundreds of genes and their associated regulatory elements.
“Regulatory elements are the things that turn genes on and off,” said Mike Snyder of Stanford University, who was a principal investigator in the Encode consortium. “Much of the difference between people is due to the differences in the efficiency of these regulatory elements. There are more variants, we think, in the regulatory elements than in the genes themselves.”
Genes cannot function without these regulatory elements. If regulation goes wrong, malfunctioning genes can cause diseases including cancer, atherosclerosis, type 2 diabetes, psoriasis and Crohn’s disease. Errors in the regulation of a gene known as Sonic Hedgehog, for example, are thought to underlie some cases of human polydactyly in which individuals have extra toes or fingers.
Anne Ferguson-Smith, of Cambridge University, said: “They also have important implications for the growth and development of embryos and fetuses during pregnancy. These are the kinds of elements that make your tissues and organs grow properly, at the right time and place, and containing the right kinds of cells.”
Encode scientists found that 9 percent of human DNA is involved in the coding for the regulatory switches, although Birney thinks the true figure may turn out to be about 20 percent.
“One of the big surprises is that we see way more [regulatory] elements than I was expecting,” he said.
The project has identified about 10,000 stretches of DNA, which Encode scientists have called non-coding genes, that do not make proteins but, instead, a type of RNA — the single-stranded equivalent of DNA. There are many types of RNA molecules in cells, each with a specific role, such as carrying messages or transcribing the DNA code in the first step of making a protein. However, the 10,000 non-coding genes carry instructions to build the large and small RNA molecules that regulate the actions of the 20,000 protein-coding genes.
The results have already shed light on previous, massive studies of genetic data. In recent years, scientists have compared the genetic code of thousands of people with a specific disease (such as diabetes, bipolar disorder, Crohn’s disease or heart disease) with the DNA code of thousands of healthy people, in an attempt to locate mutations that could account for some of the risk of developing that disease.
These so-called genome-wide association studies (GWAS) have identified scores of locations in the DNA that seem to raise a person’s risk of developing a disease — but the vast majority are nowhere near protein-coding genes. That makes sense if regions previously thought of as “junk” are actually vital for controlling the expression of protein-encoding genes.
Indeed, there is a big overlap between the locations identified by GWAS and the regulation switches identified in Encode.
“When I first saw that result, I thought it was too good to be true — we’ve done the analysis five different ways now and it still holds up,” Birney said.
Understanding some of these regulatory elements could help explain some of the environmental triggers for diseases.
Crohn’s disease, for example, is a long-term condition that causes inflammation of the lining of the digestive system and affects up to 60,000 people in the UK. Scientists cannot fully explain why some people suffer from it and others do not, even when they all have the genetic mutations associated with an elevated risk. One hypothesis is that the disease could be triggered by a bacterial infection.
“Maybe there’s a place in the middle of nowhere [in the DNA], not close to a protein-coding gene, that if you have one variant, you’re more sensitive to this bacterium; if you have another variant, you’re less sensitive,” Birney said. “So you get Crohn’s disease probably because you have the more sensitive type and that particular bacterial infection occurred at a time when you were vulnerable.”
The Encode consortium’s 442 researchers, in 32 institutes around the world, used 300 years of computer time and five years in the lab to get their results. They examined a total of 147 types of tissue — including cancer cells, liver extracts, endothelial cells from umbilical cords and stem cells derived from embryos — and subjected them to about 100 different experiments, recording which parts of DNA code were activated in which cells at which times.
Encode will prove useful not only for scientists, but also for those who want a more personalized approach to medicine.
Snyder said: “We’re in an era where people are starting to get their genomes sequenced — with Encode data we could start mapping regulatory information.”
This means that the individual differences in people’s diseases can be more effectively targeted for treatment.
Tim Hubbard of the Wellcome Trust Sanger Institute in Cambridge said: “Diseases have been defined by the medical profession observing symptoms. [But] we know, for example, that breast cancer is not one disease, but there’s multiple types of breast cancer with all sorts of different mechanistic processes going wrong.”
“A given drug only works in about a third of the people you give it to, but you don’t know which third,” he said. “If you knew the relationship between a person’s genome and which drugs work for them and which ones they shouldn’t take because it gives them side-effects, that would improve medicine.”
Understanding exactly how each type of cell in the body works — in other words which genes are switched on or off at different stages of its function — will also be useful in future stem-cell therapies. If doctors want to grow replacement liver tissue, for example, they will be able to check that it is safe by comparing the DNA functions of their manufactured cells with data from normal liver cells.
Birney said that the decade since the publication of the first draft of the human genome has shown that genetics is much more complex than anyone could have predicted.
“We felt that maybe life was easier beforehand and more comfortable because we were just more ignorant. The major thing that’s happening is that we’re losing some of our ignorance and, indeed, it’s very complicated,” he said. “You’ve got to remember that these genomes make one of the most complicated things we know: ourselves. The idea that the recipe book would be easy to understand is hubris. I still think we’re at the start of this journey, we’re still in the warm-up, the first couple of miles of this marathon.”