Computational biology of RNA

Many genes code not for proteins but for RNAs. Indeed, given recent biochemical work describing large numbers of completely novel RNAs, including two families of snoRNAs, tmRNA, micro (mi)RNAs, small interfering (si)RNAs, and RNA-dependent editing mechanisms, it is likely that many more RNAs carry out a broad range of functions in the cell than was previously thought. Thus, a comprehensive understanding of the biology of a cell will ultimately require knowledge of the identity of all encoded RNAs, the molecules with which they interact, and the molecular structures of these complexes.

For these reasons, the computational biology of RNA is playing an increasingly important role within functional genomics. Here at UEA we develop computational tools and algorithms for the identification of RNA genes and structural elements within genomes, the prediction of RNA structure using evolutionary and physical principles, and the analysis of RNA structure and its application to topical problems in molecular and cell biology (computational biology software).

Figure 1. Computational biology of RNA

Non-coding RNA biology has received growing attention in recent years due to the discovery of small (s)RNAs such as short-interfering (si) and micro (mi)RNAs in plants, animals and fungi. These classes of short (20-30 nt) non-coding RNAs are involved in a broad spectrum of biological pathways, including regulation of gene expression, genome maintenance and defence against pathogens. Originally, individual sRNAs were identified by traditional cloning and Sanger sequencing, but recent developments in high-throughput short-read sequencing technologies now make it possible to obtain millions of sRNA sequences in a single experiment. We collaborate with the laboratory of Prof Tamas Dalmay (UEA) to develop tools for the analysis of high-throughput sRNA sequencing data from plant and animal species. The main goals of our current research include: prediction of novel miRNAs, comparison of relative expression levels of known miRNAs across different samples, prediction of generic sRNA-producing regions of the genome, development of databases for high-throughput sRNA sequencing experiments, identification of miRNA targets from whole-genome gene expression data, and various techniques to visualise such data in a meaningful manner.
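As an illustration of what comparing relative expression levels across samples involves, the sketch below collapses reads to unique sequences, keeps those in the 20-30 nt size range, and scales the counts to reads per million so that libraries of different depth can be compared. This is a minimal, generic example and not the method used by our tools; the function names, the length defaults and the choice of reads-per-million scaling are assumptions made for illustration.

```python
from collections import Counter


def collapse_reads(reads, min_len=20, max_len=30):
    """Count identical sequences, keeping only reads in the sRNA size range.

    min_len/max_len are illustrative defaults taken from the 20-30 nt range.
    """
    return Counter(r for r in reads if min_len <= len(r) <= max_len)


def reads_per_million(counts):
    """Scale raw counts by the total number of retained reads.

    Assumption: the library size is the sum of the counts passed in,
    i.e. reads removed by the length filter are not included.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def compare_samples(sample_a, sample_b):
    """Return {sequence: (rpm_a, rpm_b)} for sequences seen in either sample."""
    rpm_a = reads_per_million(collapse_reads(sample_a))
    rpm_b = reads_per_million(collapse_reads(sample_b))
    sequences = sorted(set(rpm_a) | set(rpm_b))
    return {s: (rpm_a.get(s, 0.0), rpm_b.get(s, 0.0)) for s in sequences}
```

Given two lists of read sequences, `compare_samples(reads_a, reads_b)` returns one pair of normalised abundances per sequence. Real analyses typically also require adapter removal, genome mapping and more robust normalisation, all of which are omitted here.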

Since lack of access to bioinformatics support is a major bottleneck for many laboratories working on sRNA biology, we give high priority to making our tools available to the research community in an easy-to-use form that caters for all levels of user. To this end, with BBSRC funding, we produced our latest work, the UEA small RNA Workbench [1]. This is a downloadable Java-based suite of tools for processing, analysing and visualising small RNA data. The suite wraps all of the functionality of the original toolkit in a graphical interface that allows real-time user interaction with the resulting datasets, and it also provides command-line access for insertion into existing bioinformatics pipelines. In addition to the original set of web-based tools, the workbench adds new tools. PAREsnip [3,13] is designed for working with experimental data generated by the PARE (parallel analysis of RNA ends) technique. CoLIde (Co-expression based Loci Identification) [14] uses new methods to predict sRNA-producing regions of the genome, exploiting the extra information provided by multiple-sample experiments to enhance existing methods for small RNA clustering.

References

M. B. Stocks, S. Moxon, D. Mapleson, H. C. Woolfenden, I. Mohorianu, L. Folkes, F. Schwach, T. Dalmay, and V. Moulton, "The UEA sRNA workbench: a suite of tools for analysing and visualizing next generation sequencing microRNA and small RNA datasets," Bioinformatics, vol. 28, no. 15, pp. 2059–2061, Aug. 2012.
S. Moxon, F. Schwach, T. Dalmay, D. Maclean, D. J. Studholme, and V. Moulton, "A toolkit for analysing large-scale plant small RNA datasets," Bioinformatics, vol. 24, no. 19, pp. 2252–2253, Oct. 2008.
L. Folkes, S. Moxon, H. C. Woolfenden, M. B. Stocks, G. Szittya, T. Dalmay, and V. Moulton, "PAREsnip: a tool for rapid genome-wide discovery of small RNA/target interactions evidenced through degradome sequencing," Nucleic Acids Res, vol. 40, no. 13, p. e103, Jul. 2012.
I. Mohorianu, M. B. Stocks, J. Wood, T. Dalmay, and V. Moulton, "CoLIde: A bioinformatics tool for CO-expression based small RNA Loci Identification using high-throughput sequencing data," RNA Biol., vol. 10, no. 7, Jun. 2013.
Molnár, A., Schwach, F., Studholme, D., Thuenemann, E., Baulcombe, D., miRNAs control gene expression in the single-cell alga Chlamydomonas reinhardtii, Nature, 447(7148), 2007, 1126-9.
Mosher, R., Schwach, F., Studholme, D., Baulcombe, D., PolIVb influences RNA-directed DNA methylation independently of its role in siRNA biogenesis, Proc Natl Acad Sci U S A, 105(8), 2008, 3145-50.
Rusholme, R., Moxon, S., Pakseresht, N., Moulton, V., Mannington, K., Seymour, G., Dalmay, T., Identification of novel short RNAs in tomato (Solanum lycopersicum), Planta, 3, 2007, 709-717.
Freyhult, E., Moulton, V., Clote, P., Boltzmann probability of RNA structural neighbors and riboswitch detection, Bioinformatics, 23, 2007, 2054-2062.
Edvardsson, S., Gardner, P.P., Poole, A.M., Hendy, M.D., Penny, D., Moulton, V., A search for noncoding RNAs using predicted MFE secondary structures, Bioinformatics, 19, 2003, 865-873.
Moulton, V., Zuker, M., Steel, M., Pointon, R., Penny, D., Metrics on RNA secondary structures, Journal of Computational Biology, 7, 2000, 277-292.
Collins, L., Moulton, V., Penny, D., Use of RNA secondary structure for studying the evolution of RNase P and RNase MRP, Journal of Molecular Evolution, 51, 2000, 194-204.
Funding for the sRNA workbench and related tools is provided by the Biotechnology and Biological Sciences Research Council (grant codes BB/E004091/1, BB/100016x/1 and BB/H023895/1).
Thody J, Folkes L, Medina-Calzada Z, Xu P, Dalmay T, Moulton V. PAREsnip2: a tool for high-throughput prediction of small RNA targets from degradome sequencing data using configurable targeting rules. Nucleic acids research. 2018 Sep 28;46(17):8730-9.
Mohorianu I, Stocks MB, Wood J, Dalmay T, Moulton V. CoLIde: a bioinformatics tool for CO-expression based small RNA Loci Identification using high-throughput sequencing data. RNA biology. 2013 Jul 1;10(7):1221-30.

Research Team

Prof. Vincent Moulton, Dr. Simon Moxon (TGAC), Dr. Matt Stocks