UNIVERSITY PARK – Scientists at Penn State are leaders of a team that is the first to report the genome-wide sequence of an extinct animal, according to Webb Miller, professor of biology and of computer science and engineering and one of the project’s two leaders. The scientists sequenced the genome of the woolly mammoth, an extinct species of elephant that was adapted to living in the cold environment of the northern hemisphere.
They sequenced 4 billion DNA bases using next-generation DNA-sequencing instruments and a novel approach that reads ancient DNA highly efficiently.
“Previous studies on extinct organisms have generated only small amounts of data,” said Stephan C. Schuster, Penn State professor of biochemistry and molecular biology and the project’s other leader. “Our dataset is 100 times more extensive than any other published dataset for an extinct species, demonstrating that ancient DNA studies can be brought up to the same level as modern genome projects.”
The researchers suspect that the full woolly-mammoth genome is more than 4 billion DNA bases, which they believe is the size of the modern-day African elephant’s genome. Although their dataset consists of more than 4 billion DNA bases, only 3.3 billion of them — a little over the size of the human genome — currently can be assigned to the mammoth genome. Some of the remaining DNA bases may belong to the mammoth, but others could belong to other organisms, like bacteria and fungi, from the surrounding environment that had contaminated the sample. The team used a draft version of the African elephant’s genome, which currently is being generated by scientists at the Broad Institute of MIT and Harvard, to distinguish those sequences that truly belong to the mammoth from possible contaminants.
“Only after the genome of the African elephant has been completed will we be able to make a final assessment about how much of the full woolly-mammoth genome we have sequenced,” said Miller. The team plans to finish sequencing the woolly mammoth’s genome when the project receives additional funding.
The team sequenced the mammoth’s nuclear genome using DNA extracted from the hairs of a mammoth mummy that had been buried in the Siberian permafrost for 20,000 years and a second mammoth mummy that is at least 60,000 years old. By using hair, the scientists avoided problems that have bedeviled the sequencing of ancient DNA from bones because DNA from bacteria and fungi, which are always associated with ancient DNA, can more easily be removed from hair than from bones. Another advantage of using hair is that less damage occurs to ancient DNA in hair because the hair shaft encases the remnant DNA like a biological plastic, thus protecting it from degradation and exposure to the elements.
The researchers had previously sequenced the woolly mammoth’s entire mitochondrial genome, which codes for only 13 of the mammoth’s roughly 20,000 genes but is relatively easy to sequence because each of the mammoth’s cells has many copies. In their most recent project, the team sequenced the mammoth’s nuclear genome, which codes for all the genetic factors that are responsible for the appearance of an organism. The two methods combined have yielded information about the evolution of the three known elephant species: the modern-day African and Indian elephants and the woolly mammoth. The team found that woolly mammoths separated into two groups around 2 million years ago, and that these groups eventually became genetically distinct sub-populations. They also found that one of these sub-populations went extinct approximately 45,000 years ago, while another lived until after the last ice age, about 10,000 years ago. In addition, the team showed that woolly mammoths are more closely related to modern-day elephants than was previously believed.
“Our data suggest that mammoths and modern-day elephants separated around 6 million years ago, about the same time that humans and chimpanzees separated,” said Miller. “However, unlike humans and chimpanzees, which relatively rapidly evolved into two distinct species, mammoths and elephants evolved at a more gradual pace,” added Schuster, who believes that the data will help to shed light on the rate at which mammalian genomes, in general, can evolve.
The team’s new data also provide additional evidence that woolly mammoths had low genetic diversity. “We discovered that individual woolly mammoths were so genetically similar to one another that they may have been especially susceptible to being wiped out by a disease, by a change in the climate, or by humans,” said Schuster. While members of the team previously ruled out humans as a cause of extinction for at least one of the Siberian sub-populations — the group appears to have gone extinct at least 45,000 years ago at a time when there were no humans living in Siberia — much debate still remains regarding the causes of extinction for the other group and for those populations that lived in other places, such as North America.
Currently, the team is searching the mammoth’s genome for clues about its extinction. “For example,” said Miller, “most animal genomes contain integrated viral sequences and, though these are not directly associated with disease, evidence of multiple recent integration events could indicate a perturbation of virus-host interaction that might be responsible for disease. Alternatively, it might turn out that long generation times and limited outbreeding result in accumulation of deleterious genetic mutations. We are considering a number of possible causes of extinction.”
The new data are allowing the Penn State team to begin looking for genetic causes of some of the mammoth’s unique characteristics, such as its adaptation to extremely cold environments. For instance, the team already has identified a number of cases in which all previously sequenced mammals, except mammoths, have the same protein segment. “One has to wonder whether a particular protein that has remained the same in animals for several billion years of combined evolution and then became different in mammoths could result in a mammoth-specific trait,” said Miller.
Investigating the unique characteristics of woolly mammoths and the reasons they went extinct is just one of the many tasks that the research team plans to pursue now that it has access to such a large quantity of sequence data. “This really is the first time that we have been able to study an extinct animal in the same detail as the ones living in our own time,” said Schuster.
Another significant aspect of the study is that it was completed by a small group of scientists at a relatively low cost and over a short period of time, whereas previous reports of modern mammalian genome sequences — including human sequences — have taken millions of dollars and several years of analysis by large groups of scientists to complete. Miller hopes that after he completes a few additional genome projects he can produce computer software that will enable others to perform low-cost mammalian genome analysis, and Schuster already is preparing to decode extinct genomes at an even faster pace.
Schuster hopes that lessons learned from the mammoth genome about why some animals go extinct while others do not will be useful in protecting other species from extinction, such as the Tasmanian devil, whose survival is threatened by a deadly facial cancer. “In addition,” added Schuster, “by deciphering this genome we could, in theory, generate data that one day may help other researchers to bring the woolly mammoth back to life by inserting the uniquely mammoth DNA sequences into the genome of the modern-day elephant. This would allow scientists to retrieve the genetic information that was believed to have been lost when the mammoth died out, as well as to bring back an extinct species that modern humans have missed meeting by only a few thousand years.”
In addition to being members of the faculty of Penn State’s Department of Biochemistry and Molecular Biology, Miller and Schuster are researchers associated with Penn State’s Center for Comparative Genomics and Bioinformatics. The study also involved researchers from the Severtsov Institute of Ecology and Evolution and the Zoological Institute in Russia, the University of California, the Broad Institute, the Roche Diagnostics Corp. and the Sperling Foundation in the United States. This research was funded by Penn State, Roche Applied Sciences, a private sponsor, the National Human Genome Research Institute and the Pennsylvania Department of Health.