Editable and Traceable Language Models for Accountable Human-AI Interaction (LIZAF_U26CMP)
Key Details
- Application deadline: 18 June 2026 (midnight UK time)
- Location: UEA
- Funding type: Competition Funded Project (Students Worldwide)
- Start date: 1 October 2026
- Mode of study: Full-time
- Programme type: PhD
Welcome to Norwich
According to the Sunday Times, Norwich is one of the best places to live in the UK.
Project description
Language processing in humans and deep language models shares underlying computational principles. Mechanisms for updating large language models, such as editing memory in a transformer to replace harmful information or inject specialised knowledge, offer new promise for designing safe, secure, and accountable artificial intelligence. However, most current high-capacity language models are accessed only via pretrained checkpoints, API calls, or web interfaces (e.g., ChatGPT). While convenient, this limits researchers' ability to inspect or modify a model's internal behaviour and prevents the deployment of accountable models in sensitive domains. Consequently, critical questions about how language models acquire knowledge, store memory, exhibit bias, or fail (e.g., hallucination, misaligned content generation) remain scientifically unanswered.
This PhD project will address one of these critical questions. You will develop, train, and evaluate language models, including transformer-based and retrieval-augmented generative models, from the ground up using high-performance computing (e.g., NVIDIA RTX 6000 Ada 48GB GPUs) and specialised datasets (e.g., parent-child interaction language). You will then evaluate your models along one or more dimensions of responsible AI, such as safety (harmful outputs, unintended behaviours, and jailbreaks), security (robustness to adversarial inputs and data poisoning), and accountability (tracing outputs back to training data or internal representations). Finally, you will deploy your models on an embodied AI system or social robot (e.g., Furhat Robots) and conduct human-AI interaction experiments to identify where, why, and how these models succeed or fail in real-time, face-to-face conversations.
This study will deliver a grounded architecture for reliable and trustworthy language models suitable for deployment in sensitive domains such as education and healthcare.
The School of Computing Sciences (https://www.uea.ac.uk/about/school-of-computing-sciences) provides a vibrant environment for research and training in Computing and allied disciplines. We collaborate with multi-national companies such as Apple, BT, the National Trust and Aviva, with research institutes in the Norwich Research Park (https://www.norwichresearchpark.com), and with other universities and industry partners in the UK and overseas. We are also a member of the Turing University Network, a group of 65 UK universities working together to advance world-class research and build skills for the future.
The successful candidate will also be expected to contribute to tutoring activities, providing laboratory support on our BSc and MSc courses in Artificial Intelligence, Data Science, and Computing Sciences commensurate with their core expertise, within the working hours permitted for full-time Postgraduate Researchers.
Entry requirements
The standard minimum entry requirement is a 2:1 in Computer Science or a related subject area.
Funding
This PhD project is offered in competition for a funded studentship. Funding comprises 'Home' tuition fees, an annual tax-free maintenance stipend (2026/27 rate £20,408) for a maximum of 3 years, and £2,000 per annum to support research training activities.