Applications are invited for a non-clinical PhD studentship (starting October 2026 or before) based within the Department of Public Health and Primary Care, University of Cambridge.
Academic supervisor: Professor Angela Wood, Professor of Biostatistics and Health Data Science
Academic supervisor department: Public Health and Primary Care
Background: Cancer Data Driven Detection (CD3) is a new, multidisciplinary and multi-institutional strategic national research programme dedicated to using data to transform our understanding of cancer risk and enable early interception of cancers. It represents a major, multi-million-pound flagship investment funded through a strategic programme award by Cancer Research UK, the National Institute for Health and Care Research (NIHR), the Engineering and Physical Sciences Research Council (EPSRC), and the Peter Sowerby Foundation, in partnership with Health Data Research UK (HDR UK) and the Economic and Social Research Council's Administrative Data Research UK programme (ADR UK).
Project description: Early cancer diagnosis is often challenging for patients presenting with vague, non-specific symptoms that may be linked to multiple cancer sites. This project aims to improve diagnostic decision-making in such patients by developing advanced, equitable cancer risk prediction models that effectively handle missing and incomplete symptom data recorded in electronic health records (EHRs). A missing symptom code does not reliably indicate that the symptom did not occur, as symptom data are often incompletely captured. Symptom recording depends on multiple stages, from patient recognition and communication to clinician coding, each introducing opportunities for information loss. This missingness is not random: it is influenced by clinical factors, varies across demographic and geographic groups, and may reflect broader inequalities in healthcare engagement and recording practices.
This project will systematically investigate how patterns of data completeness differ by patient and practice characteristics (e.g., age, sex, ethnicity, deprivation, and geography), how these patterns evolve over time, and how they influence cancer risk estimates. Understanding and addressing these biases is crucial to avoid exacerbating health inequalities through prediction models that disproportionately benefit advantaged groups.
Using large-scale linked electronic health record data, suitable models will be employed to identify determinants of missing data in symptoms, blood test results, and other key variables, accounting for clustering by practice. These models will quantify the extent of variation and identify systematic differences in coding practices between providers and patient subgroups. Temporal analyses will assess how these patterns change over time.
Building on these findings, the project will quantify how different patterns of missingness may impact risk prediction model performance and calibration. Novel methods will be developed to incorporate incomplete or uncertain information, including delta-adjustment imputation and other approaches that explicitly model symptom recording probabilities. Emphasis will be placed on ensuring reproducibility, interpretability, and adaptability as data completeness evolves with changing healthcare practices.
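As a purely illustrative sketch of the delta-adjustment idea mentioned above (not the project's method): a symptom probability estimated under a missing-at-random assumption is shifted by a sensitivity parameter delta on the log-odds scale, and unrecorded symptoms are then drawn from the shifted probability. The function names, the MAR probability of 0.2, and the delta values are hypothetical choices made only for this example.

```python
import math
import random


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_adjusted_probability(p_mar, delta):
    """Shift a MAR-based symptom probability by delta on the log-odds scale.

    delta = 0 reproduces the MAR probability; delta > 0 assumes unrecorded
    symptoms were more likely to be present than MAR would suggest.
    """
    return expit(logit(p_mar) + delta)


def impute_symptom(recorded, p_mar, delta, rng):
    """Keep recorded values; draw missing ones (None) from the adjusted probability."""
    p = delta_adjusted_probability(p_mar, delta)
    return [r if r is not None else int(rng.random() < p) for r in recorded]


# Hypothetical toy data: 1 = symptom coded, 0 = coded absent, None = not recorded.
rng = random.Random(0)
recorded = [1, None, 0, None, None, 1]
for delta in (0.0, 0.5, 1.0):
    p = delta_adjusted_probability(0.2, delta)
    imputed = impute_symptom(recorded, 0.2, delta, rng)
    print(f"delta={delta:.1f} adjusted p={p:.3f} imputed={imputed}")
```

In practice, such an analysis would typically be repeated over a range of plausible delta values, with the prediction model refitted on each imputed dataset to see how sensitive performance and calibration are to the assumed recording mechanism.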
The ultimate goal is to produce robust, fair, and clinically useful cancer risk prediction models that account for systematic biases in symptom data recording, ensuring that such models will benefit all patient groups equitably. The work will also contribute to the methodological literature on missing data, with wider applications in predictive modelling across healthcare. The student will gain expertise in statistical modelling, simulation, electronic health record data science, and fairness evaluation: skills directly aligned with modern data-driven cancer research and clinical translation.
Research environment
The student will be part of the multi-institutional CD3 project, a £10m programme funded by Cancer Research UK and led by Prof Antonis Antoniou in the Department of Public Health and Primary Care, Cambridge. Professor Angela Wood, an expert in missing data methods, risk prediction and the use of electronic health records, will be the primary PhD supervisor. The student will also be embedded in, and interact with, world-leading groups across the UK. In particular, they will work closely with a multidisciplinary team including Matthew Sperrin (Manchester), Gary Abel (Exeter), and Yoryos Lyratzopoulos (University College London), bringing together expertise in biostatistics, health data science and cancer epidemiology.
The student will have the opportunity to attend the structured Early Detection Training Programme, run in partnership with the Alliance for Cancer Early Detection (ACED), which provides PhD students with a comprehensive foundation in cancer early detection.
Outcomes
- An understanding of the impact of missing symptom data on the development and validation of cancer risk prediction models
- An open-source repository of analytical tools and code to identify and correct for missing symptom data and model biases resulting from missing data
- Contribution to fairness-adjusted models as outputs from the broader CD3 programme.
- Publications in the area of defining, measuring and overcoming issues with missing symptom data
Requirements
Applicants are expected to hold at least a 2:1 undergraduate degree (or equivalent) in a relevant subject such as statistics, mathematics, computer science, engineering, or a related biomedical or population health discipline, and may also have a Master's degree in a quantitative or health data field. Applicants should be able to demonstrate excellent analytical and programming skills (for example in R or Python), experience working with health data, and an enthusiasm for interdisciplinary research that bridges data science, healthcare, and population health. Strong communication and teamwork skills are essential, and international applicants may need to provide evidence of English language proficiency.
We invite applications from UK and non-UK students who meet the UK residency requirements (home fees). International students who are able to cover the additional costs of overseas tuition fees through scholarships or funding schemes will also be considered.
The studentship provides the UKRI 2026 stipend rate, currently £20,780 annually.
Further information on possible sources of support for non-UK applicants can be found at https://www.student-funding.cam.ac.uk/ as well as through external funding opportunities.
Applicants must meet the University of Cambridge entrance requirements: see https://www.postgraduate.study.cam.ac.uk/application-process/entry-requirements.
How To Apply
To apply, please visit https://www.postgraduate.study.cam.ac.uk/courses/directory/cvphpdhpc and click "Apply Now".
Course Details: PhD in Public Health & Primary Care (Full-time)
Start Date: October 2026, Michaelmas Term (or before)
Academic Supervisor(s): Professor Angela Wood, Department of Public Health and Primary Care
Research Title: Cancer Data Driven Detection: Handling missing data in cancer risk prediction models
Please quote reference RH48155 on your application form and in any correspondence about this vacancy.
In order to apply for this opportunity, you will need:
- Details of two academic referees (references will be taken up immediately).
- Transcript(s)
- CV/resume
- Evidence of competence in English
- Statement of Interest outlining your suitability, why you are interested in a PhD in this area, your background and research interests
Interview and Selection process
The deadline for applications is Monday 5th January 2026.
Applicants will be notified of the outcome of their application by 12th January 2026.
Shortlisted candidates will be invited to interview in the week commencing 19th January 2026.
Applicants will be notified of the outcome of their interview soon after.
The University actively supports equality, diversity and inclusion and encourages applications from all sections of society.
The University has a responsibility to ensure that all employees are eligible to live and work in the UK.