 |
Introduction
Validity of Tree Model
Mutations

| Introduction |
|
A phylogenetic tree is a diagram showing the evolutionary interrelations of a group of organisms that usually originated from a shared ancestral form. The ancestor is placed at the trunk of the tree; organisms that have arisen from it are placed at the ends of the branches. The distance between groups indicates their degree of relationship: closely related groups are located on branches close to one another. Though phylogenetic trees are speculative, they provide a convenient method for studying phylogenetic relationships and evolution.
So, what is phylogeny? It's the study of the evolution of life forms. One phylogenetic tree, also called a cladogram or a dendrogram, is displayed below. It is a tree of several life forms and their relations. In this display, time is the vertical dimension, with the current time at the bottom and earlier times above it. There are ten extant species (species currently living), named 1 through 10. The lines above the extant species represent the same species, just in the past. When two lines converge to a point, that point represents the common ancestral species, and its height marks the time at which the two species diverged from it. And so it goes until eventually, at some time in the past, all the species derive from just one species, the one displayed as the top point.
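The reading of the diagram described above can be sketched in code. This is a minimal illustration, not the representation used by any particular tool: the `Node` class, the `age` field (time before the present), and the three-species example tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    age: float  # time before the present; 0 for an extant species (assumed convention)
    children: List["Node"] = field(default_factory=list)


def extant_names(node: Node) -> Set[str]:
    """Names of all extant species (leaves) at or below this node."""
    if not node.children:
        return {node.name}
    names: Set[str] = set()
    for child in node.children:
        names |= extant_names(child)
    return names


def common_ancestor(root: Node, a: str, b: str) -> Optional[Node]:
    """Most recent node from which both a and b descend, or None if either is absent."""
    if not {a, b} <= extant_names(root):
        return None
    node = root
    while True:
        for child in node.children:
            if {a, b} <= extant_names(child):
                node = child  # both species lie below this child, so descend
                break
        else:
            return node  # no single child contains both: this is the divergence point


# Hypothetical tree: species 1 and 2 diverged 2 time units ago,
# and their ancestor diverged from species 3 five time units ago.
pair = Node("ancestor of 1 and 2", 2.0, [Node("1", 0.0), Node("2", 0.0)])
root = Node("root", 5.0, [pair, Node("3", 0.0)])

print(common_ancestor(root, "1", "2").age)
print(common_ancestor(root, "1", "3").age)
```

The age of the returned node is the point where the two lines converge in the diagram.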

|

| Validity of Tree Model |
|
• For individuals within a species. The genetic material of an individual doesn't derive from a single earlier individual. Animals and plants that multiply by sexual reproduction receive half their genetic material from each of two parents, so a tree like this is inappropriate. For species that multiply strictly asexually, a tree is appropriate. But for many species that usually multiply asexually, such as many one-celled creatures, the occasional exchange of genetic material through conjugation is important enough that a tree is still inappropriate.
• For closely related species. Individuals of closely related species do occasionally mate, and their progeny survive to contribute to the gene pool of one or both of the parent species. As the species diverge, such intermixing of genetic material becomes rarer. One solution is to treat closely related species as one larger variable species. Another is simply not to consider closely related species.
• Hybrid species. In the plant world it occasionally happens that a new tetraploid species arises from two diploid species. The two parent species need to be somewhat related for this to happen.
• Distant interaction. There are at least two ways that genetic material from one species can find its way into a distantly related or unrelated species. Among bacteria, a bacterium of one species can sometimes ingest the genetic material of a bacterium of another species and incorporate part of it into its own genetic material. Rare as this may be, the effects are significant. Viruses can also inadvertently transport genetic material from one species to another: when some viruses break out of cells of one species, they may infect other species and carry that material to them.
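Each case in the list above amounts to some individual or species receiving genetic material from more than one source, which a tree cannot represent. A minimal sketch of that check follows; the edge lists and species labels are hypothetical.

```python
from collections import defaultdict


def nodes_with_multiple_parents(edges):
    """Return {child: sorted parents} for every child that has more than one parent.

    `edges` is an iterable of (parent, child) pairs. An empty result means the
    ancestry relation is compatible with a tree (or forest); a non-empty result
    marks hybridization or gene transfer that a tree cannot show.
    """
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# Hypothetical tree-like ancestry: every species has one parent species.
tree_edges = [("root", "A"), ("root", "B"), ("A", "A1"), ("A", "A2")]

# Hypothetical hybrid case: tetraploid species T arises from diploid species A and B.
hybrid_edges = tree_edges + [("A", "T"), ("B", "T")]

print(nodes_with_multiple_parents(tree_edges))
print(nodes_with_multiple_parents(hybrid_edges))
```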
|

| Mutations |
|
Differences among species are the key to reconstructing the phylogenetic tree.
Species differ in their characteristics, also called characters. The characters may be observable and measurable properties of the individuals. For instance, among mammals, the number of teeth of each kind that individuals of a species have has been a successful character for classification. This character has been especially important among extinct species, since fossilized teeth are commonly found.
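Comparing two species then comes down to counting the characters on which they differ. A minimal sketch, in which each species is a tuple of character values; the two tuples are hypothetical placeholders, not real dental data.

```python
def count_differences(chars_a, chars_b):
    """Number of positions at which two equally long character tuples differ."""
    if len(chars_a) != len(chars_b):
        raise ValueError("both species must be scored on the same characters")
    return sum(1 for a, b in zip(chars_a, chars_b) if a != b)


# Hypothetical character values, e.g. counts of four kinds of teeth.
species_a = (3, 1, 4, 2)
species_b = (3, 1, 3, 1)

print(count_differences(species_a, species_b))  # prints 2
```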
Any characters can be used to classify species and reconstruct a phylogenetic tree of species, but some are more useful than others. If a species depends on a character for its continued survival, that character will not change, as any mutations of it will be eliminated. Call such characters essential. And most visible characters are essential for the species. This means that if we choose essential characters, any differences should count as very significant. There are, however, some difficulties with considering essential characters. If one species evolves by changing an essential characteristic, whatever ecological forces supported that change may also apply to other species, and that could lead to parallel evolution. Thus, differences or similarities in essential characters are very relevant to the reconstruction of the general shape of the phylogenetic tree, but they really can't be used to determine the relative lengths of the lines within the tree. Some species have been stable for millions of years. Others evolve very fast.
Mutations as a measure of time.
Let's concentrate on one character to begin with. Our first questions are: What is the probability p(t) that the character has the same value at the end of a time interval of length t as it had at the beginning? What is the probability q(t) that the character has one value at the beginning of a time interval of length t but a particular different value at the end of the interval?
Suppose that there are m different possible alternate values, and suppose that the mutation rate is r mutations per unit time interval.
Some statistical analysis (which we'll skip) gives us the answers to these questions.

Note that initially, when t = 0, p(0) is 1, while q(0) is 0, since no mutations occur in zero time. Also, as t approaches infinity, p(t) and q(t) both approach 1/m, which means that in the long run, each of the m alternative values is equally probable.
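One way to obtain formulas with exactly these properties is sketched below. It rests on an assumption the text does not state: the character changes as a continuous-time Markov process that leaves its current value at total rate r and lands on each of the m - 1 other values with equal probability. Here q(t) is read as the probability of ending at one particular different value, as the limit 1/m requires, and k is a symbol introduced only for this sketch.

```latex
\begin{align*}
p(t) &= \frac{1}{m} + \frac{m-1}{m}\, e^{-kt}, \\
q(t) &= \frac{1}{m} - \frac{1}{m}\, e^{-kt}, \\
k &= \frac{m r}{m-1} \quad \text{(under the assumption above)}.
\end{align*}
```

These satisfy p(t) + (m - 1) q(t) = 1, p(0) = 1, and q(0) = 0, and both tend to 1/m as t grows. A different convention for r would change only the constant k, not the shape of the curves.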
Now let's assume that there are n different characters, not just one. Then E(t), the expected number of characters that are not the same at the end of a time interval of length t as they were at the beginning, is n(m - 1) q(t), since each character can end at any of its m - 1 other values with probability q(t) each.

Here's the graph of that function when there are m = 4 alternate values for each character, there are n = 40 characters, and the mutation rate is r = 0.1. Time t is shown on the horizontal axis, while the vertical axis gives y, the expected number of character differences.
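The curve can be tabulated with a short script. This is a sketch under an assumption not stated in the text: q(t) = (1 - e^(-kt))/m with decay constant k = mr/(m - 1), which corresponds to r being the total rate at which a character leaves its current value. The helper names are hypothetical.

```python
import math


def decay_constant(m, r):
    # Assumption: r is the total rate of change away from the current value,
    # split equally among the m - 1 alternatives.
    return m * r / (m - 1)


def expected_differences(t, n=40, m=4, r=0.1):
    """E(t) = n (m - 1) q(t), with q(t) = (1 - exp(-k t)) / m (assumed form)."""
    k = decay_constant(m, r)
    q = (1.0 - math.exp(-k * t)) / m
    return n * (m - 1) * q


for t in (0, 5, 10, 20, 50, 100):
    print(f"t = {t:>3}   E(t) = {expected_differences(t):6.2f}")
```

Whatever the exact value of k, the curve starts at 0 and levels off at n(m - 1)/m, which is 30 for these parameters.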

Note that as t gets large, the expected number of character differences approaches n(m - 1)/m = 30. We can take the inverse function of y = E(t), that is, turn this graph around, to give us an estimate for time t in terms of the observed number of character differences. Let g denote the inverse function, so that t = g(y).

The base of the logarithm function here is e.
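The inversion can be written out as follows. As a hedged sketch, it assumes E(t) has the exponential form shown on the first line, where k is a decay constant proportional to the mutation rate (k = mr/(m - 1) if r is the total rate at which a character leaves its current value). The symbol k is introduced here for illustration only.

```latex
\begin{align*}
y = E(t) &= \frac{n(m-1)}{m}\left(1 - e^{-kt}\right), \\
t = g(y) &= -\frac{1}{k}\,\log\!\left(1 - \frac{m\,y}{n(m-1)}\right),
\qquad 0 \le y < \frac{n(m-1)}{m}.
\end{align*}
```

The argument of the logarithm reaches 0 when y = n(m - 1)/m, so g(y) grows without bound as y approaches that value.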
The graph of t = g(y) is shown to the right with the same parameter values m = 4, n = 40, and r = 0.1. Note that as the number of expected differences approaches 30, the corresponding time approaches infinity. The observed number of differences will be near the expected number, but it will usually be somewhat above or below it. So the observed number of differences could easily be greater than 30. Should that happen, the best conclusion to make is that the time is very great but can't be estimated. It would be prudent not to estimate the time when the number of differences is only slightly less than 30, either.
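The caution above can be built into an estimator. The sketch below assumes E(t) = (n(m - 1)/m)(1 - e^(-kt)) with k = mr/(m - 1), which the text does not state. The `max_fraction` cutoff is a hypothetical parameter: the text says only that counts slightly below the limit should not be trusted, not where to draw the line.

```python
import math


def estimate_time(y, n=40, m=4, r=0.1, max_fraction=0.9):
    """Estimate elapsed time from y observed character differences.

    Returns None when y is at or beyond max_fraction of the saturation value
    n (m - 1) / m, where the estimate is unbounded or unreliable.
    """
    if y < 0:
        raise ValueError("the number of differences cannot be negative")
    saturation = n * (m - 1) / m        # 30 for the default parameters
    if y >= max_fraction * saturation:  # hypothetical cutoff, see lead-in
        return None
    k = m * r / (m - 1)                 # assumed decay constant
    return -math.log(1.0 - y / saturation) / k


for y in (0, 10, 20, 26, 29, 33):
    print(y, estimate_time(y))
```

Returning None rather than a very large number keeps "time is very great" distinct from an actual estimate.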

|

 |
|
|