
Phylogenetics

Phylogenetics is the reconstruction and analysis of trees and networks to describe and understand the evolution of species, populations and individuals. It is widely used in molecular biology and other areas of classification (such as linguistics), and has both led to and benefited from the development of new mathematical, statistical and computational techniques. Although the foundations of phylogenetics were laid down many decades ago, it is currently experiencing an exciting renaissance due to the volume and variety of biological data that are now becoming available.


Research

Recently, several members of our group helped to organize, and took part in, a four-month programme in phylogenetics at the Isaac Newton Institute, Cambridge (http://www.newton.ac.uk/programmes/PLG/). Over 200 researchers from around the world took part in this programme, and several open problems in phylogenetics that were developed in Cambridge are collected at http://www.newton.ac.uk/programmes/PLG/conj.pdf.

Research in our group currently focuses mainly on the following topics:

(A) Phylogenetic networks and modeling reticulate evolution

How can we best model reticulate evolution? For example, can we determine how much gene transfer has occurred in organisms such as bacteria by comparing the genomes of extant species? Various techniques for building networks that visualize complex evolutionary histories have been proposed over the years. In addition to extending and applying such techniques together with evolutionary biologists, we develop new theories and constructions for phylogenetic networks.

For example, our recent work includes the development of methods for reconstructing phylogenetic networks directly from molecular data, such as the NeighborNet algorithm [1] (which is available as part of the software package SplitsTree4, http://www.splitstree.org/) and the QNet algorithm [2], which we provide at http://www2.cmp.uea.ac.uk/~vlm/qnet/. We have also developed methods [3] for computing consensus networks, which are phylogenetic networks used to summarize, for example, collections of gene trees. When the collection of gene trees is built from patchy data, more sophisticated tools are needed to summarize them. One approach [4] developed by members of our group is implemented in the Q-imputation algorithm (available from http://awcmee.massey.ac.nz/downloads_software.htm).
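To give a flavour of what summarizing a collection of gene trees can involve, the following Python sketch collects the splits (bipartitions of the species set) that occur in more than a chosen proportion of the input trees. It is a simplified illustration only, not the implementation described in [3]: the representation of a tree as a list of its non-trivial splits, the toy data and the names `split`, `consensus_splits`, `TAXA` and `TREES` are all assumptions made for this example.

```python
from collections import Counter

def split(side, taxa):
    """Represent a split as an unordered pair of complementary taxon sets."""
    side = frozenset(side)
    return frozenset((side, frozenset(taxa) - side))

def consensus_splits(trees, threshold):
    """Return the splits found in more than `threshold` (a proportion) of the trees."""
    counts = Counter(s for tree in trees for s in set(tree))
    return {s for s, n in counts.items() if n / len(trees) > threshold}

# Hypothetical input: three gene trees on five taxa, each given by its non-trivial splits.
TAXA = "abcde"
TREES = [
    [split("ab", TAXA), split("abc", TAXA)],
    [split("ab", TAXA), split("abd", TAXA)],
    [split("ac", TAXA), split("abc", TAXA)],
]

majority = consensus_splits(TREES, 0.5)   # splits in more than half of the trees
relaxed = consensus_splits(TREES, 0.3)    # a lower threshold keeps conflicting splits
```

With the lower threshold the result contains both ab|cde and ac|bde. These two splits cannot be displayed on a single tree, which is the kind of incompatibility a network is used to visualize.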

Another software package developed by our group, called PADRE, aims to provide tools for constructing an explicit representation of the evolutionary history of polyploid organisms such as plants. PADRE (available from http://www2.cmp.uea.ac.uk/~vlm/padre/) is based on our work on modeling reticulate evolution [5, 6], which includes recent theoretical results [7] concerning the complexity of computing multi-labelled trees.

Selected publications:


  1. NeighborNet: An agglomerative method for the construction of phylogenetic networks, D.Bryant, V.Moulton, Molecular Biology and Evolution, 21, 2004, 255-265.
  2. QNet: An agglomerative method for the construction of phylogenetic networks from weighted quartets, S.Gruenewald, K.Forslund, A.Dress, V.Moulton, Molecular Biology and Evolution, 24, 2007, 532-538.
  3. Consensus networks: A method for visualising incompatibilities in collections of trees, B.Holland, V.Moulton, WABI 2003, Lecture Notes in Bioinformatics 2812, 165-176.
  4. Imputing supertrees and supernetworks from quartets, B.Holland, G.Conner, K.Huber, V.Moulton, Systematic Biology, 56, 2007, 57-67.
  5. Phylogenetic networks from multi-labeled trees, K.T.Huber, V.Moulton, Journal of Mathematical Biology, 52(5), 2006, 613-632.
  6. Reconstructing the evolutionary history of polyploids from multi-labelled trees, K.T.Huber, B.Oxelman, M.Lott, V.Moulton, Molecular Biology and Evolution, 23, 2006, 1784-1791.
  7. The complexity of deriving multi-labeled trees from bipartitions, K.T.Huber, M.Lott, V.Moulton, A.Spillner, Journal of Computational Biology. 

(B) Biodiversity

In conservation biology it is a central problem to measure, predict, and preserve biodiversity as species face extinction. One popular approach to this is to measure the diversity of a collection of species in terms of the evolutionary diversity spanned by those species on the 'tree of life', a measure that is commonly referred to as phylogenetic diversity. Building on this approach, recently there has been much interest in setting up a framework that will enable conservation biologists to also take other important aspects into account, such as dependencies between certain species or budgetary constraints.
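As a rough illustration of the basic measure, the Python sketch below computes phylogenetic diversity on a small rooted tree as the total length of all edges lying on a path from a chosen species to the root, and then finds the most diverse pair of species by exhaustive search. It assumes the rooted-tree version of the measure with no constraints. The toy tree, its edge lengths and the names `TREE`, `phylogenetic_diversity` and `best_subset` are hypothetical.

```python
from itertools import combinations

# Hypothetical rooted tree: each node maps to (parent, length of the edge to its parent).
TREE = {
    "a": ("x", 1.0),
    "b": ("x", 2.0),
    "x": ("root", 3.0),
    "c": ("root", 4.0),
}

def phylogenetic_diversity(tree, species):
    """Total length of the edges on paths from the chosen species to the root."""
    used = set()  # nodes whose parent edge has already been counted
    for node in species:
        # Walk towards the root, stopping at the root or at an edge counted earlier.
        while node in tree and node not in used:
            used.add(node)
            node = tree[node][0]
    return sum(tree[n][1] for n in used)

def best_subset(tree, leaves, k):
    """Exhaustively find a k-element subset of the leaves with maximum diversity."""
    return max(combinations(sorted(leaves), k),
               key=lambda s: phylogenetic_diversity(tree, s))

pd_ab = phylogenetic_diversity(TREE, ["a", "b"])   # edges a-x, b-x and x-root
top_pair = best_subset(TREE, ["a", "b", "c"], 2)
```

Exhaustive search is used here only to keep the example short. It is not practical for large numbers of species.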

Recently, we have developed algorithms that efficiently identify collections of species with high diversity in the presence of reticulate evolution [8] or relative to a collection of gene trees [9]. In ongoing work, we are extending these approaches to take into account geographical constraints, work that commenced in [10].

Selected publications:


  8. Computing phylogenetic diversity for split systems, A.Spillner, B.Nguyen, V.Moulton, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 5, 2008, 235-244.
  9. Optimizing phylogenetic diversity across two trees, M.Bordewich, C.Semple, A.Spillner, Isaac Newton Institute Preprint Series, NI07068-PLG, 2007.
  10. Optimizing phylogenetic diversity under constraints, V.Moulton, C.Semple, M.Steel, Journal of Theoretical Biology, 246, 2007, 186-194.

(C) Phylogenetic combinatorics

Phylogenetic combinatorics is a branch of discrete applied mathematics concerned with the mathematical structures related to phylogenetic trees and networks such as graphs, split systems, metrics and tight spans. The goal is to develop the theory that forms the basis of phylogenetic reconstruction methods.

Our recent work includes a characterization of edge weighted phylogenetic trees in terms of conditions on a set of weighted quartets (i.e. fully resolved phylogenetic trees on just 4 leaves) [11], and the development of closure rules for generating split systems on some set from splits on subsets of that set [12].
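To illustrate how a weighted quartet arises from a tree, the Python sketch below uses the classical four-point condition: for four leaves of an edge-weighted tree, the pairing with the smallest sum of within-pair distances gives the quartet, and half the difference between the largest and smallest sums gives the length of its interior edge. The sketch assumes that the quartet weight is taken to be this interior edge length, and it does not reproduce the characterization in [11]. The toy distances and the names `D`, `dist` and `induced_quartet` are hypothetical.

```python
def induced_quartet(dist, a, b, c, d):
    """Return (pairing, weight) for the quartet on a, b, c, d induced by a tree metric."""
    sums = {
        ((a, b), (c, d)): dist(a, b) + dist(c, d),
        ((a, c), (b, d)): dist(a, c) + dist(b, d),
        ((a, d), (b, c)): dist(a, d) + dist(b, c),
    }
    pairing = min(sums, key=sums.get)           # the pairing separated by the interior edge
    weight = (max(sums.values()) - sums[pairing]) / 2
    return pairing, weight

# Hypothetical tree: leaves a, b attached to one interior node and c, d to another,
# every pendant edge of length 1 and the interior edge of length 2.
D = {
    frozenset("ab"): 2.0, frozenset("cd"): 2.0,
    frozenset("ac"): 4.0, frozenset("ad"): 4.0,
    frozenset("bc"): 4.0, frozenset("bd"): 4.0,
}

def dist(x, y):
    """Look up the distance between two leaves, regardless of order."""
    return D[frozenset((x, y))]

pairing, weight = induced_quartet(dist, "a", "b", "c", "d")
```

For distances that do come from a tree, the two larger sums are equal. With noisy data they generally differ, and the computed weight is then only an estimate.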

We have also studied phylogenetic trees in terms of their edge-product space [13], and made significant steps towards developing a decomposition scheme for distances using the theory of tight spans [14], which allows one, for example, to express genetic distances in terms of simpler ones [15]. The basis of this scheme is provided in part by the concept of cell-decomposability, which captures a natural way to decompose tight spans [16].

Selected publications:


  11. Encoding phylogenetic trees in terms of weighted quartets, S.Gruenewald, K.T.Huber, V.Moulton, C.Semple, Journal of Mathematical Biology, 56(4), 2008, 465-477.
  12. Two new closure rules for constructing phylogenetic super-networks, S.Gruenewald, K.T.Huber, Q.Wu, Bulletin of Mathematical Biology.
  13. A regular decomposition of the edge-product space of phylogenetic trees, J.Gill, S.Linusson, V.Moulton, M.Steel, Advances in Applied Mathematics, 41, 2008, 158-176.
  14. T-Theory, A.Dress, V.Moulton, W.Terhalle, The European Journal of Combinatorics, 17, 1996, 161-175.
  15. Compatible decompositions and block realizations of finite metrics, A.Dress, K.T.Huber, J.Koolen, V.Moulton, European Journal of Combinatorics, in press.
  16. Characterizing cell-decomposable metrics, K.T.Huber, J.Koolen, V.Moulton, A.Spillner, The Electronic Journal of Combinatorics, 15(1), 2008.

(D) Phylogenetics in Practice

Insight into the evolutionary past of organisms has profound consequences for many areas affecting our daily life, including drug development, food production, agriculture, and biodiversity conservation to name but a few.

Examples of applications where our new algorithms and methodologies have shed light include the evolutionary past of yeast [17] (which is of use in classifying yeasts that cause food spoilage), the origin of the evolutionary phenomenon of polyploidy [18] (which is very common in plants, including crops such as wheat), the genetic diversity and dispersal of plants [19], and the evolution of viruses such as Hepatitis and SARS [20].

Selected publications:


  17. Exploring contradictory phylogenetic relationships in yeast, Q.Wu, S.James, I.Roberts, V.Moulton, K.Huber, FEMS Yeast Research.
  18. Untangling complex histories of genome mergings in high polyploids, A.Brysting, B.Oxelman, K.T.Huber, V.Moulton, C.Brochmann, Systematic Biology, 56, 2007, 467-476.
  19. Biogeographic interpretation of split graphs: Least squares optimization of edge lengths, R.Winkworth, D.Bryant, P.Lockhart, D.Havell, V.Moulton, Systematic Biology, 54, 2005, 56-65.
  20. Phylogenetic analysis of the full-length SARS-CoV sequences: Evidence for phylogenetic discordance in three genomic regions, G.Magiorkinis, E.Magiorkinis, D.Paraskevis, A.M.Vandamme, M.Van Ranst, V.Moulton, A.Hatzakis, Journal of Medical Virology, 74(3), 2004, 369-372.

Collaborations

We collaborate with several researchers and groups from around the world, including the following:

  • Magnus Bordewich, University of Durham, UK
  • Jo Dicks, John Innes Centre, Norwich, UK
  • Andreas Dress and Stefan Gruenewald, PICB, Shanghai, China
  • Brent Emerson, School for Biological Sciences, UEA, UK
  • Olivier Gascuel, LIRMM, Montpellier, France
  • Barbara Holland and Pete Lockhart, Allan Wilson Center for Molecular Ecology and Evolution, Massey University, New Zealand
  • Daniel Huson, Tuebingen University, Germany
  • Steven Kelk, CWI, Amsterdam, The Netherlands
  • Jack Koolen, POSTECH Mathematics, Pohang, Korea
  • Bengt Oxelman, University of Gothenburg, Sweden
  • Ian Roberts, Institute for Food Research, Norwich, UK
  • Charles Semple and Mike Steel, Biomathematics Research Centre, University of Canterbury, New Zealand
  • Kevin Tyler, School of Medicine, UEA, UK

Our collaboration with Andreas Dress, Jack Koolen and Mike Steel is currently funded in part through the EPSRC grant "Phylogenetic combinatorics: a mathematical theory for the analysis of phylogenetic trees and networks" (PI V.Moulton, CI K.Huber).

Research Team: Dr. Katharina Huber, Prof. Vincent Moulton, Dr. Andreas Spillner
