Project description

Primary Supervisor: Dr Ke Li

Foundation models trained on DNA and RNA sequences are beginning to transform genomics, but their predictions often remain difficult to interpret. A model may identify a regulatory effect, sequence feature or functional element, yet it is rarely clear which evidence supports the prediction, whether the explanation is stable, and how it should guide new biological discovery.

This PhD will develop agentic explainable AI methods for nucleotide foundation models. Building on emerging preliminary work in the AI for Bioscience group, the project will create computational methods that select, apply and evaluate explainability techniques for DNA/RNA sequence models. The central research question is how to move from model explanations to biologically grounded, testable hypotheses.

The student will investigate explanation quality across complementary criteria, including faithfulness to model behaviour, robustness to sequence perturbation, and biological plausibility. They will then develop knowledge-grounded reasoning methods that connect salient nucleotides, motifs, genes and regulatory elements to biological annotations, ontologies, knowledge graphs and literature-derived evidence. The longer-term ambition is to create an AI research assistant that can help scientists ask why a model made a prediction, what biological mechanisms may underlie it, and which follow-up experiments or sequence perturbations should be prioritised.

The project is designed to be self-contained but will benefit from existing code, datasets and expertise in foundation models, agentic AI frameworks, explainable AI (XAI), optimisation, and knowledge graphs. Possible applications include interpreting regulatory sequence features, RNA motifs, gene expression models, and other sequence-to-function prediction tasks relevant to bioscience.

The successful candidate will receive interdisciplinary training in frontier AI model training, explainable AI, knowledge graphs, genomics and open-source scientific software. This project is suitable for applicants with strong AI and/or computational skills and an interest in developing transparent, trustworthy AI systems that can accelerate biological discovery. No prior biological training is required, but enthusiasm for interdisciplinary research is essential.

Entry requirements

At least a UK-equivalent Bachelor's (Honours) degree at 2:1, or a UK-equivalent Master's degree. English language requirement: Faculty of Science equivalent (IELTS 6.5 overall and 6.0 in each category).

Funding

This PhD project is funded by the Earlham Institute for four years. The successful candidate will receive tuition fees at the Home-fee or International-fee rate, an annual tax-free maintenance stipend at the UKRI rate (£21,805 in 2026/7) and £5,000 per annum to support research training.

From Explanation to Experiment: Agentic AI for Interpreting Nucleotide Foundation Models (LIKE_E26EI)
