With the aim of improving the classification and identification of yeast species, this project concentrated on two areas of research: finding unique defining DNA sequences and minimising the costs of physiological tests for identification.

Unique defining DNA sequences

An effective technique was developed to search 702 26S rRNA genes for unique defining DNA sequences, and such sequences were found for nearly every yeast species known to date. Surprisingly, most species could be identified by a sequence of length 8.
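
The idea of a unique defining sequence can be illustrated with a minimal sketch. It is not the technique developed in the project, which is not described here. It simply indexes every substring of length k by the species that contain it and keeps those found in exactly one species. The function names and the toy sequences are assumptions made for illustration.

```python
from collections import defaultdict

def unique_kmers(sequences, k):
    """Map each species to the length-k substrings that occur in no other species."""
    owners = defaultdict(set)  # substring -> species whose sequence contains it
    for species, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(species)
    unique = {species: set() for species in sequences}
    for kmer, found_in in owners.items():
        if len(found_in) == 1:
            (species,) = found_in
            unique[species].add(kmer)
    return unique

def shortest_unique_length(sequences, max_k):
    """For each species, the smallest k <= max_k giving a unique substring, else None."""
    result = {species: None for species in sequences}
    for k in range(1, max_k + 1):
        for species, kmers in unique_kmers(sequences, k).items():
            if result[species] is None and kmers:
                result[species] = k
    return result

# Hypothetical toy data, far shorter than real 26S rRNA gene sequences.
toy = {"sp_a": "ACGTAC", "sp_b": "ACGGAC", "sp_c": "TTGTAC"}
lengths = shortest_unique_length(toy, max_k=4)
```

On the toy data, `sp_b` and `sp_c` each have a unique substring of length 2, while `sp_a` needs length 3. Real gene sequences are far longer, so the index of substrings grows quickly and a practical search would need more care with memory.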

Minimising testing costs

A number of methods were designed to minimise the costs of performing physiological tests for yeast identification.

  • Initially, it was assumed that a set of tests would be performed in parallel. In this case, simulated annealing was used to find a minimal-cost test set.
  • When tests are expensive and time is not an important consideration, it makes sense to perform tests one at a time, deciding on the next test only once the result of the previous test is in. Both a greedy algorithm and a simplified GRASP (greedy randomised adaptive search procedure) were designed to construct diagnostic keys for use in yeast identification.
  • When both test costs and time are important, tests may be performed in batches. The greedy algorithm was modified to create diagnostic keys in this new situation, minimising a linear combination of total time taken and total test cost. This algorithm automatically adjusted the batch sizes depending upon the relative importance of test cost and time.
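
The one-test-at-a-time case above can be sketched as a greedy construction of a diagnostic key. This is a minimal illustration under stated assumptions, not the project's algorithm: every test is assumed to give a definite positive or negative result for every species, all species are treated as equally likely, and the greedy score (test cost divided by the number of species pairs the test separates) is a hypothetical choice. The species names, tests and costs are invented.

```python
def build_key(species, results, costs):
    """Greedily build a key as nested dicts.

    Internal node: {"test": name, "pos": subtree, "neg": subtree}.
    Leaf: {"species": [...]}, holding several names if no test separates them.
    """
    if len(species) > 1:
        best = None
        for test in sorted(costs):  # sorted so that ties break deterministically
            pos = [s for s in species if results[s][test]]
            neg = [s for s in species if not results[s][test]]
            if pos and neg:
                # Assumed greedy score: cost per separated pair, lower is better.
                score = costs[test] / (len(pos) * len(neg))
                if best is None or score < best[0]:
                    best = (score, test, pos, neg)
        if best is not None:
            _, test, pos, neg = best
            return {"test": test,
                    "pos": build_key(pos, results, costs),
                    "neg": build_key(neg, results, costs)}
    return {"species": list(species)}

def _total_cost(key, costs):
    """Return (summed path cost over all species in the key, number of species)."""
    if "species" in key:
        return 0.0, len(key["species"])
    pos_cost, pos_n = _total_cost(key["pos"], costs)
    neg_cost, neg_n = _total_cost(key["neg"], costs)
    n = pos_n + neg_n
    return pos_cost + neg_cost + costs[key["test"]] * n, n

def expected_cost(key, costs):
    """Average cost of identifying a species, weighting all species equally."""
    total, n = _total_cost(key, costs)
    return total / n

# Hypothetical data: four species, three tests with differing costs.
results = {
    "A": {"t1": True,  "t2": True,  "t3": False},
    "B": {"t1": True,  "t2": False, "t3": False},
    "C": {"t1": False, "t2": True,  "t3": True},
    "D": {"t1": False, "t2": False, "t3": False},
}
costs = {"t1": 1.0, "t2": 1.0, "t3": 5.0}
key = build_key(sorted(results), results, costs)
average = expected_cost(key, costs)
```

On this data the key applies `t1` first and `t2` second on both branches, so every species is identified with two cheap tests and the expensive `t3` is never used. Extending the sketch to batches would mean choosing several tests per node and adding a time term to the score, which is not attempted here.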

References

  1. Reynolds, A.P., Dicks, J.L., Roberts, I.N. et al., Algorithms for Identification Key Generation and ..., Proceedings of EvoBIO-2003, LNCS 2611, edited by Raidl, G. et al., Springer-Verlag, Berlin Heidelberg, pp. 107-118, 2003
  2. de la Iglesia, B., Wesselink, J.J. and Rayward-Smith, V.J., Determining a unique defining DNA sequence for yeast, Bioinformatics, volume 18(7), pp. 1004-1010, 2002
  3. de la Iglesia, B., Rayward-Smith, V.J. and Wesselink, J.J., Classification/Identification on Biological Databases, MIC2001: 4th Metaheuristics International Conference, edited by de Souza, J.P., Porto, Portugal, pp. 267-271, 2001

Research Team

Dr Beatriz de la Iglesia, Prof. Vic Rayward-Smith, Dr Alan Reynolds