An algebraic metric for phylogenetic trees
(By R. Alberich, G. Cardona, F. Rosselló, G. Valiente)
Supplementary Material
- Phylogenetic trees with up to 7 taxa:
The i-th line in the file corresponding to n
holds a Newick string identifying the i-th (starting with i=0) tree with n taxa.
- Distances between all pairs of trees with up to 6 taxa:
Each line is of the form "i j dRF dTR dNS",
where dRF, dTR, dNS are, respectively, the
Robinson-Foulds, transposition and nodal splitted distances between the i-th and the j-th tree with n taxa.
- Distributions of distances between all pairs of trees with up to 7 taxa:
For each n considered, the corresponding file
contains:
- Global statistics of distances: These lines are of the form
"(dRF,dTR,dNS) k", meaning that there are k
unordered pairs of trees whose Robinson-Foulds, transposition
and nodal splitted distances are, respectively, dRF, dTR,
dNS.
- Three separate statistics, one for each of the Robinson-Foulds, transposition
and nodal splitted distances: These lines are of the form "d k", where k is the
number of (unordered) pairs of trees at distance d.
Some statistical parameters of these distributions are also available
here.
- Distributions of distances between all pairs of trees
within a random sample of trees with 8 to 14 taxa:
These files use the same syntax as above and also record the
size of the random sample of trees.
Some statistical parameters of these distributions are also available
here.
- Distributions of the transposition distances (with the usual ordering of
the leaves and the reversed one) between all pairs of
trees with 3 to 7 taxa:
Each line is of the form "dTR1 dTR2 k",
meaning that there are k unordered pairs whose distance with
respect to the usual ordering is dTR1 and with respect to the
reversed one is dTR2.
- Python package: A pre-release version of the Python package used to
perform these computations can be downloaded here.
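
The relation between the pairwise distance files and the distribution files described above can be illustrated with a short Python sketch. It is not part of the authors' package. It assumes that the fields of each "i j dRF dTR dNS" line are separated by whitespace and that each unordered pair of trees is listed exactly once. The function names and the sample lines are hypothetical and chosen only for illustration. Distance values are kept as text tokens, since the exact number format used in the files is not stated here.

```python
from collections import Counter


def parse_pair_line(line):
    """Split one 'i j dRF dTR dNS' line into tree indices and distance tokens."""
    i, j, d_rf, d_tr, d_ns = line.split()
    # Distances stay as strings: the number format in the files is an unknown here.
    return int(i), int(j), (d_rf, d_tr, d_ns)


def distributions(lines):
    """Count pairs per (dRF, dTR, dNS) triple and per individual distance value."""
    joint = Counter()                              # "(dRF,dTR,dNS) k" statistics
    marginals = [Counter(), Counter(), Counter()]  # "d k" statistics, one per distance
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, _, triple = parse_pair_line(line)
        joint[triple] += 1
        for counter, d in zip(marginals, triple):
            counter[d] += 1
    return joint, marginals


# Made-up sample lines for illustration only (not taken from the data files).
sample = ["0 1 2 1 3", "0 2 2 1 3", "1 2 4 2 5"]
joint, (rf, tr, ns) = distributions(sample)

# Print the joint statistics in the "(dRF,dTR,dNS) k" form.
for (d_rf, d_tr, d_ns), k in sorted(joint.items()):
    print(f"({d_rf},{d_tr},{d_ns}) {k}")
```

To apply the sketch to a real file, pass an open file object instead of the sample list, for example `distributions(open(path))`, where `path` is a placeholder for the location of a downloaded distance file.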