Genome Evolution | First, a Bang Then, a Shuffle

Picture an imperfect hall of mirrors, with gene sequences reflecting wildly: That's the human genome. The duplications that riddle the genome range greatly in size, clustered in some areas yet absent in others, residing in gene jungles as well as within vast expanses of seemingly genetic gibberish. And in their organization lie clues to genome origins. "We've known for some time that duplications are the primary force for genes and genomes to evolve over time," says Evan Eichler, director of the bioinformatics core facility at the Center for Computational Genomics, Case Western Reserve University, Cleveland.

For three decades, based largely on extrapolations from known gene families in humans, researchers have hypothesized two complete genome doublings--technically, polyploidization--modified by gene loss, chromosome rearrangements, and additional limited duplications. But that view is changing as more complete evidence from genomics reveals a larger role for recent small-scale changes, superimposed on a probable earlier single doubling. Ken Wolfe, a professor of genetics at the University of Dublin, calls the new view of human genome evolution "the big bang" followed by "the slow shuffle."

"There has been a lot of debate about whether there were two complete polyploid events at the base of the vertebrate lineages. The main problem is that vertebrate genomes are so scrambled after 500 million years, that it is very difficult to find the signature of such an event," explains Michael Lynch, a professor of biology at Indiana University, Bloomington. With accumulating sequence data from gene families, a picture is emerging of a lone, complete one-time doubling at the dawn of vertebrate life, followed by a continual and ongoing turnover of about 5-10% of the genome that began in earnest an estimated 30-50 million years ago. Short DNA sequences reinvent themselves, duplicating and sometimes diverging in function and dispersing among the chromosomes, so that the genome is a dynamic, ever-changing entity.

Duplication in the human genome is more extensive than it is in other primates, says Eichler. About 5% of the human genome consists of copies longer than 1,000 bases. Some doublings are vast. Half of chromosome 20 recurs, rearranged, on chromosome 18. A large block of chromosome 2's short arm appears again as nearly three-quarters of chromosome 14, and a section of its long arm is also on chromosome 12. The gene-packed yet diminutive chromosome 22 sports eight huge duplications. "Ten percent of the chromosome is duplicated, and more than 90% of that is the same extremely large duplication. You don't have to be a statistician to realize that the distribution of duplications is highly nonrandom," says Eichler.

The idea that duplications provide a mechanism for evolution is hardly new. Geneticists have long regarded a gene copy as an opportunity to try out a new function while the original sequence carries on. More often, though, the gene twin mutates into a nonfunctional pseudogene or is lost, unconstrained by natural selection because the old function persists. Or, a gene pair might diverge so that they split a function.

Some duplications cause disease. A type of Charcot-Marie-Tooth disease, for example, arises from a duplication of a 1.5-million-base region of chromosome 17. The disorder causes numb hands and feet.

INFERRING DUPLICATION ORIGINS A duplication's size and location may hold clues to its origin. A single repeated gene is often the result of a tandem duplication, which arises when chromosomes misalign during meiosis, and crossing over distributes two copies of the gene (instead of one) onto one chromosome. This is how the globin gene clusters evolved, for example. "Tandem duplicates are tandemly arranged, and there may be a cluster of related genes located contiguously on the chromosome, with a variable number of copies of different genes," says John Postlethwait, professor of biology in the Institute of Neuroscience at the University of Oregon, who works on the zebrafish genome.

In contrast to a tandem duplication, a copy of a gene may appear on a different chromosome when messenger RNA is reverse-transcribed into DNA that inserts at a new genomic address. This is the case for two genes on human chromosome 12, called PMCHL1 and PMCHL2, that were copied from a gene on chromosome 5 that encodes a neuropeptide precursor. The absence of introns in the chromosome 12 copies betrays their origin by reverse transcription of spliced messenger RNA, from which the introns have already been removed.¹ (Tandem duplicates retain introns.)

The hallmarks of polyploidy are clear too: Most or all of the gene sequences on one chromosome appear on another. "You can often still see the signature of a polyploidization event by comparing the genes on the two duplicated chromosomes," Postlethwait says.

Muddying the waters are the segmental duplications, which may include tandem duplications, yet also resemble polyploidy. "Instead of a single gene doubling to make two adjacent copies as in a tandem duplication, in a segmental duplication, you could have tens or hundreds of genes duplicating either tandemly, or going elsewhere on the same chromosome, or elsewhere on a different chromosome. If the two segments were on different chromosomes, it would look like polyploidization for this segment," says Postlethwait. Compounding the challenge of interpreting such genomic fossils is that genetic material, by definition, changes. "As time passes, the situation decays. Tandem duplicates may become separated by inversions, transpositions, or translocations, making them either distant on the same chromosome or on different chromosomes," he adds.

QUADRUPLED GENES Many vertebrate genomes appear to be degenerate tetraploids, survivors of a quadrupling--a double doubling from haploid to diploid to tetraploid--that left behind scattered clues in the form of genes present in four copies. This phenomenon is called the one-to-four rule. Wolfe compares the scenario to having four decks of cards, throwing them up in the air, discarding some, selecting 20, and then trying to deduce what you started with. Without quadruples in the sample, it is difficult to infer the multideck origin. So it is for genes and genomes.
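Wolfe's card analogy can be tried out as a small simulation. The sketch below is only an illustration of the analogy: the function name, the fraction of cards discarded, and the random seed are all hypothetical choices, and nothing here models real gene loss.

```python
import random
from collections import Counter

def copy_number_profile(n_decks=4, n_cards=52, keep_fraction=0.5,
                        sample_size=20, seed=0):
    """Shuffle several identical decks, discard part of the pile, draw a
    sample, and report how many distinct cards appear 1, 2, 3, or 4 times.
    All parameters are illustrative assumptions."""
    rng = random.Random(seed)
    # Each card value occurs once per deck: the "fully quadrupled" start.
    pool = [card for card in range(n_cards) for _ in range(n_decks)]
    rng.shuffle(pool)
    # Discarding part of the shuffled pile stands in for gene loss.
    survivors = pool[: int(len(pool) * keep_fraction)]
    # Only a limited sample is available to the observer.
    sample = rng.sample(survivors, sample_size)
    copies_per_card = Counter(sample)
    # Map copy number -> number of distinct cards seen with that copy number.
    return Counter(copies_per_card.values())

profile = copy_number_profile()
print(dict(sorted(profile.items())))
```

Rerunning with different seeds or a smaller keep_fraction shows how often a sample of 20 contains any card in four copies, which is the clue the analogy says is needed to infer the four-deck origin.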

"How can you tell whether large duplications built up, or polyploidy broke down? People are saying that they can identify blocks of matching DNA that are evidence for past polyploidization, which have been broken up and overlain by later duplications. But at what point do blocks just become simple duplications?" asks Susan Hoffman, associate professor of zoology at Miami University, Oxford, Ohio.

The idea that the human genome has weathered two rounds of polyploidy, called the 2R hypothesis, is attributed to Susumu Ohno, a professor emeritus of biology at City of Hope Medical Center in Duarte, Calif.² The first whole genome doubling is postulated to have occurred just after the vertebrates diverged from their immediate ancestors, such as the lancelet (Amphioxus). A second full doubling possibly just preceded the divergence of amphibians, reptiles, birds, and mammals from the bony fishes.

Evidence for the 2R hypothesis comes from several sources. First, polyploidy happens. The genome of flowering plants doubled twice, an estimated 180 and 112 million years ago, and rice did it again 45 million years ago.³ "Plants have lots of large blocks of chromosomal duplications, and the piecemeal ones originated at the same time," indicating polyploidization, says Lynch. The yeast Saccharomyces cerevisiae is also a degenerate tetraploid, today bearing the remnants of a double sweeping duplication.⁴

Polyploidy is rarer in animals, which must sort out unmatched sex chromosomes, than in plants, which reproduce asexually as well as sexually. "But polyploidization is maintained over evolutionary time in vertebrates quite readily, although rarely. Recent examples, from the last 50 million years ago or so, include salmonids, goldfish, Xenopus [frogs], and a South American mouse," says Postlethwait. On a chromosomal level, polyploidy may disrupt chromosome compatibility, but on a gene level, it is an efficient way to make copies. "Polyploidy solves the dosage problem. Every gene is duplicated at the same time, so if the genes need to be in the right stoichiometric relationship to interact, they are. With segmental duplications, gene dosages might not be in the same balance. This might be a penalty and one reason why segmental genes don't survive as long as polyploidy," Lynch says.

Traditional chromosome staining also suggests a double doubling in the human genome's past, because eight chromosome pairs have near-doppelgängers, in size and band pattern.⁵ A flurry of papers in the late 1990s found another line of evidence for quadrupling: Gene counts for the human, then thought to be about 70,000, were approximately four times those predicted for the fly, worm, and sea squirt. The human gene count has since been considerably downsized.

Finally, many gene families occur in what Jurg Spring, a professor at the University of Basel's Institute of Zoology in Switzerland, dubs "tetrapacks."⁶ The HOX genes, for example, occupy one chromosome in Drosophila melanogaster but are dispersed onto four chromosomes in vertebrate genomes.⁷ Tetrapacks are found on every human chromosome, and include zinc-finger genes, aldolase genes, and the major histocompatibility complex genes.

"In the 1990s, the four HOX clusters formulated the modern version of the 2R model, that two rounds of genome duplication occurred, after Amphioxus and before bony fishes," explains Xun Gu, an associate professor of zoology and genetics at Iowa State University in Ames. "Unfortunately, because of the rapid evolution of chromosomes as well as gene losses, other gene families generated in genome projects did not always support the classic 2R model. So in the later 1990s, some researchers became skeptical of the model and argued the possibility of no genome duplication at all."

THE BIG BANG/SLOW SHUFFLE EVOLVES Human genome sequence information has enabled Gu and others to test the 2R hypothesis more globally, reinstating one R. His group used molecular-clock analyses to date the origins of 1,739 duplications from 749 gene families.⁸ If these duplications sprang from two rounds of polyploidization, the dates should fall into two clusters. This isn't exactly what happened. Instead, the dates point to a whole genome doubling about 550 million years ago and a more recent round of tandem and segmental duplications since 80 million years ago, when mammals radiated.
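The logic of the dating test can be sketched in a few lines. This is a minimal illustration, assuming a strict molecular clock in which the age of a duplication is T = K / (2r), where K is the sequence divergence between the two copies (substitutions per site) and r is the substitution rate per site per year. The rate, the divergence values, and the function names are hypothetical and are not taken from Gu's paper.

```python
from collections import Counter

def duplication_age_myr(divergence, rate_per_site_per_year=1e-9):
    """Age in millions of years under a strict clock: T = K / (2r).
    The default rate is an assumed, illustrative value."""
    return divergence / (2.0 * rate_per_site_per_year) / 1e6

def age_histogram(ages_myr, bin_width_myr=100):
    """Count duplication ages per bin, keyed by each bin's lower edge."""
    counts = Counter(int(age // bin_width_myr) * bin_width_myr
                     for age in ages_myr)
    return dict(sorted(counts.items()))

# Hypothetical divergences for six duplicate gene pairs.
divergences = [0.05, 0.08, 0.12, 1.08, 1.10, 1.12]
ages = [duplication_age_myr(k) for k in divergences]
print(age_histogram(ages))  # {0: 3, 500: 3}
```

With these made-up inputs the ages pile up in one old bin and one recent bin. Two rounds of polyploidization would instead be expected to leave two distinct ancient peaks.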

Ironically, sequencing of the human genome may have underestimated the number of duplications. The genome sequencing required that several copies be cut, the fragments overlapped, and the order of bases derived. The algorithm could not distinguish whether a particular sequence counted twice was a real duplication, present at two sites in the genome, or independent single genes obtained from two of the cut genomes.

Eichler and his group developed a way around this methodological limitation. They compare sequences at least 15,000 bases long against a random sample of shotgunned whole genome pieces. Those fragments that are overrepresented are inferred to be duplicated.⁹ The technique identified 169 regions flanked by large duplications in the human genome.
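The overrepresentation idea can be shown with a toy example. This is a rough sketch, not the published method: it assumes each segment has already been assigned a count of matching shotgun reads, and it flags any segment whose count exceeds a chosen multiple of the median. The segment names, counts, threshold, and function name are all hypothetical.

```python
from statistics import median

def overrepresented_segments(read_counts, fold_threshold=1.5):
    """Return segments whose read count exceeds fold_threshold times the
    median count. A unique segment should sit near the median; a segment
    present twice in the genome should attract roughly twice the reads.
    The threshold is an illustrative assumption."""
    typical = median(read_counts.values())
    return [name for name, n in read_counts.items()
            if n > fold_threshold * typical]

# Hypothetical read counts for five segments of equal length.
counts = {"segA": 98, "segB": 102, "segC": 205, "segD": 100, "segE": 97}
print(overrepresented_segments(counts))  # ['segC']
```

The median is used as the baseline so that a few duplicated segments do not inflate the estimate of what a single-copy segment looks like.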

Although parts of the human genome retain a legacy of a long-ago total doubling, the more recent, smaller duplications provide a continual source of raw material for evolution. "My view is that both happen. A genome can undergo polyploidy, duplicating all genes at once, but the rate of segmental duplications turns out to be so high that every gene will have had the opportunity to duplicate" by this method also, concludes Lynch. It will be interesting to see how the ongoing analyses of the human and other genome sequences further illuminate the origins and roles of duplications.

References
1. A. Courseaux, J.-L. Nahon, "Birth of 2 chimeric genes in the Hominidae lineage," Science, 291:1293-7, 2001.

2. S. Ohno, Evolution by Gene Duplication, Heidelberg, Germany: Springer-Verlag, 1970.

3. J. Bennetzen, "Opening the door to comparative plant biology," Science, 296:60-3, 2002.

4. A. Wagner, "Asymmetric functional divergence of duplicated genes in yeast," Molec Biol Evol, 19:1760-8, October 2002.

5. D.E. Comings, "Evidence for ancient tetraploidy and conservation of linkage groups in mammalian chromosomes," Nature, 238:455-7, 1972.

6. J. Spring, "Genome duplication strikes back," Nat Genet, 31:128-9, 2002.

7. D. Larhammar et al., "The human hox-bearing chromosome regions did arise by block or chromosome (or even genome) duplications," Genome Res, 12:1910-20, December 2002.

8. X. Gu et al., "Age distribution of human gene families shows significant roles of both large-and small-scale duplications in vertebrate evolution," Nat Genet, 31:205-9, 2002.

9. J.A. Bailey et al., "Recent segmental duplications in the human genome," Science, 297:1003-7, Aug. 9, 2002.

"We've known for some time that duplications are the primary force for genes and genomes to evolve over time," says Evan Eichler, director of the bioinformatics core facility at the Center for Computational Genomics, Case Western Reserve University, Cleveland.

Except, as you say, some of us are making a point of being clueless. Some of us have most definitely not "known for some time" how evolution can account for increases in complexity. You couldn't get this information into a creationist skull if you put it into a notched bullet and shot it in.

Dumb question on my part: In general, duplication/polyploidization confers no particular advantage to sexually reproducing organisms, but may do so (or not confer disadvantage) in certain instances?

Creationists insist that evolution cannot increase biological complexity. Evolutionists point to gene duplication and the subsequent hijacking of function & refinement thru selection. This article points out just how rampant such duplication-driven complexity increase has been.

Then they won't show up on this thread.

Dumb question on my part: In general, duplication/polyploidization confers no particular advantage to sexually reproducing organisms, but may do so (or not confer disadvantage) in certain instances?

I think it's mostly that the polyploid offspring, if they survived, would never find a compatible mate to produce offspring with. (Unlike in plants, which are more likely to find a compatible mate. See here for a good explanation of plant polyploidy.)

Ah yes we can all see how this will eventually get to the Gettysburg Address. Just a matter of time and natural selection.

Nah. If you replayed life's tape, or however that Gould quote goes. But with time & natural selection you would get something meaningful. But it would probably be in a language we've never even heard of!

Actually, the vast amount of redundant and non-functional content in the human genome, despite these duplications, is a good indication that all this new available space in the genome has not given rise to increased complexity, just imperfect copies of pre-existing information.

Sometime, read "Science and Information Theory" by Nobelist Leon Brillouin. Information (i.e., design specifications) does not arise from nothing, only by transfer from other, equivalent information or intelligence. It is called "negative entropy" and is governed by the second law. Raw energy input increases entropy in a system, not negative entropy (information).

Never fails to amuse me what faith evolutionists have. For every mutation, duplication, what have you that can "find favor" so to speak with the environment there got to be how many millions of don't cares and "turkeys" produced in exactly the same manner? (Remember we got to rule out intelligent design or it ain't "science." Can't have God at the controls of the CAD system twiddling this thing ya know! So we would have to get a staggering amount of this dilapidation and noise.) Once we get to something that is significantly bigger and less prolific than insects, this whole scenario breaks down. But it's all the evolutionists have to play with.

His group used molecular-clock analyses to date the origins of 1,739 duplications from 749 gene families.

The problem with the above, and with the whole article, is that there is no molecular clock. There are several reasons for this; the most essential one is that we do not have any examples of half-billion-year-old DNA, 100-million-year-old DNA or even million-year-old DNA to make comparisons to. Therefore all the samples we have (with a few exceptions that can be counted on the fingers of one hand) are of current DNA. So how can one tell how far current DNA is from DNA that is millions of years old if one does not have something to compare it to? The answer is one cannot. The second problem is that SUPPOSEDLY all organisms now living are equally far apart from the first life as all others, so to take one as an example of 'what is older' is totally fallacious. It is using the assumptions of the theory of evolution as to how species supposedly descended from each other to prove how species supposedly descended from each other. This is circular reasoning and utter nonsense. There are more problems with the molecular clock also. Since some creatures have much shorter generations than others, and mutations supposedly occur at each reproduction (how else could they happen!), the 'mutational clock' (for that is what is really being talked about here) should be going at a completely different speed for elephants than for flies, yet evolutionists moronically claim that it goes at the same speed.

But with time & natural selection you would get something meaningful.

No you would not. There are supposedly some 10 million years of mutations separating man from chimps. Chimps and men differ by some 5% of their DNA (the evolutionist 1% has been proven wrong by the same man who originally made the statement). Since chimps and men have about 3 billion DNA base pairs that 5% represents some 150,000,000 favorable mutations in those ten million years. Since with all our science, all our billions in research on DNA for decades have not shown a single favorable mutation has ever happened, I think that your statement is absolutely wrong scientifically - just as evolution is completely wrong scientifically.

There are supposedly some 10 million years of mutations separating man from chimps. Chimps and men differ by some 5% of their DNA (the evolutionist 1% has been proven wrong by the same man who originally made the statement). Since chimps and men have about 3 billion DNA base pairs that 5% represents some 150,000,000 favorable mutations in those ten million years. Since with all our science, all our billions in research on DNA for decades have not shown a single favorable mutation has ever happened, I think that your statement is absolutely wrong scientifically

Stop trying to use numbers to test their ideas. It makes them furious. All other branches of science use mathematical analysis to support their conclusions, but the rules are different for evolution. For an evolutionist the standard of proof is:

"If I can imagine a way it MIGHT have happened, then you must believe that it DID happen that way, or you are a willfully ignorant bible-thumping idiot. PS- Once the way I imagined that it might have happened gets nullified by further observations, you must then believe that the next thing I imagine is the way it did happen, or else you once again are a willfully ignorant, bible-thumping idiot."

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.