Phylogenetic networks and modelling reticulate evolution

How can we best model reticulate evolution? For example, can we determine how much gene transfer has occurred in organisms such as bacteria by comparing the genomes of extant species? Various techniques for building networks that visualise complex evolutionary histories have been proposed over the years. In addition to extending and applying such techniques together with evolutionary biologists, we develop new theories and constructions for phylogenetic networks.

For example, our recent work includes methods for reconstructing phylogenetic networks directly from molecular data, such as the NeighborNet algorithm [1], which is available as part of the software package SplitsTree4, and the QNet algorithm [2]. We have also developed methods [3] for computing consensus networks, which are phylogenetic networks used to summarise, for example, collections of gene trees. When the collection of gene trees is built from patchy data, more sophisticated tools are needed to summarise it. One approach [4] developed by members of our group is implemented in the Q-imputation algorithm (PDF, 28Kb).
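To illustrate the idea behind consensus networks, the following minimal Python sketch collects the splits (bipartitions of the taxa) induced by each gene tree and keeps those that occur in more than a chosen proportion of the trees. It is not the implementation from [3] or from any of our software packages: the nested-tuple tree encoding and the function names leaves, nontrivial_splits and consensus_splits are assumptions made for this example, and all trees are assumed to share the same set of taxa. The sketch stops at the split selection step and does not draw the network.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree (hypothetical encoding)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def nontrivial_splits(tree):
    """Collect the non-trivial splits (bipartitions of the taxa) induced by a tree."""
    taxa = leaves(tree)
    anchor = min(taxa)  # fixed taxon used to pick one canonical side per split
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Represent each split by the side that does not contain the anchor.
        side = clade if anchor not in clade else taxa - clade
        # Skip trivial splits (an empty side or one taxon against all others).
        if 2 <= len(side) <= len(taxa) - 2:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def consensus_splits(trees, threshold):
    """Keep the splits found in more than `threshold` (a proportion) of the trees."""
    counts = Counter()
    for tree in trees:
        counts.update(nontrivial_splits(tree))
    return {s for s, c in counts.items() if c / len(trees) > threshold}


# Three toy gene trees on the taxa a..e; the second one disagrees with the others.
gene_trees = [
    ((("a", "b"), "c"), ("d", "e")),
    ((("a", "c"), "b"), ("d", "e")),
    ((("a", "b"), "c"), ("d", "e")),
]

majority = consensus_splits(gene_trees, 0.5)
relaxed = consensus_splits(gene_trees, 0.2)
```

With a threshold of 0.5 the retained splits are pairwise compatible and correspond to a tree, whereas lowering the threshold to 0.2 also admits the split supported only by the second tree. That split is incompatible with one of the others, so the resulting split system can only be displayed by a network.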

Another software package developed by our group, called PADRE, aims to provide tools for constructing an explicit representation of the evolutionary history of polyploid organisms such as plants. PADRE is based on our work on modelling reticulate evolution [5, 6], which includes recent theoretical results [7] concerning the complexity of computing multi-labelled trees. More recently, we have investigated the problem of how much information is needed to infer reticulate evolution [8, 9, 10] and have also investigated ways to search through the space of networks. The latter might be useful for developing, for example, Bayesian methods for network construction, as such methods tend to be useful for large data sets.

Viruses are known to undergo rapid evolution, which sometimes makes it difficult to apply standard phylogenetic tools to understand their past. To tackle this problem, we developed the ViralNet software (www.uea.ac.uk/computing/software), which uses longitudinal mutation sequence depth data to identify potential reassortment and recombination events. In a related project, we have also developed new techniques to evaluate the size of the evolution space arising from processes such as tandem duplication, as well as techniques to reverse engineer the evolutionary process from observed data [11].

References

[1] Bryant, D., Moulton, V., NeighborNet: An agglomerative method for the construction of phylogenetic networks, Molecular Biology and Evolution, 21, 2004, 255-265.
[2] Gruenewald, S., Forslund, K., Dress, A., Moulton, V., QNet: An agglomerative method for the construction of phylogenetic networks from weighted quartets, Molecular Biology and Evolution, 24, 2007, 532-538.
[3] Holland, B., Moulton, V., Consensus networks: A method for visualising incompatibilities in collections of trees, WABI 2003, Lecture Notes in Bioinformatics 2812, 165-176.
[4] Holland, B., Conner, G., Huber, K., Moulton, V., Imputing supertrees and supernetworks from quartets, Systematic Biology, 56, 2007, 57-67.
[5] Huber, K.T., Moulton, V., Phylogenetic networks from multi-labeled trees, Journal of Mathematical Biology, 52(5), 2006, 613-632.
[6] Huber, K.T., Oxelman, B., Lott, M., Moulton, V., Reconstructing the evolutionary history of polyploids from multi-labelled trees, Molecular Biology and Evolution, 23, 2006, 1784-1791.
[7] Huber, K.T., Lott, M., Moulton, V., Spillner, A., The complexity of deriving multi-labeled trees from bipartitions, Journal of Computational Biology.
[8] Gambette, P., Huber, K.T., On encodings of phylogenetic networks of bounded level, Journal of Mathematical Biology, 65(1), 2012, 157-180.
[9] Huber, K.T., Moulton, V., Encoding and constructing 1-nested phylogenetic networks with trinets, Algorithmica, 66(3), 2013, 714-738.
[10] Huber, K.T., van Iersel, L., Moulton, V., Wu, T., How much information is needed to infer reticulate evolutionary histories?, Systematic Biology, 64(1), 2015, 102-111.
[11] Fotso-Chedom, D., Murcia, P.R., Greenman, C.D., Inferring the clonal structure of viral populations from time series sequencing (www.arxiv.org/abs/1407.7997).