Large, complex datasets
Reliable and reproducible analysis of large, complex datasets
In this workstream, we use machine learning and other state-of-the-art methods to analyse large-scale linked electronic health records and so inform translational research. Translational research takes the results of early-stage research and applies them to humans.
Complex, high-dimensional data, such as digital images, can't be analysed efficiently using standard methods. Machine learning is being rapidly adopted to analyse medical imaging, but the field lacks suitably labelled data for this purpose, leading to poor accuracy and results that can't be reproduced.
Automated, scalable methods can overcome issues such as missing data, misclassification and confounding, all of which can bias an analysis and give misleading results.
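To make the missing-data point concrete, here is a minimal sketch in Python of one automated, scalable approach: multiple imputation by chained equations, using scikit-learn's IterativeImputer. The synthetic variables (age, systolic blood pressure, HbA1c) and all numbers are illustrative assumptions, not data or methods from this workstream.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Hypothetical EHR-style variables: age, systolic BP, HbA1c (illustrative only).
X_true = rng.normal(loc=[60.0, 130.0, 42.0], scale=[10.0, 15.0, 8.0], size=(1000, 3))

# Remove 20% of values at random to mimic gaps in routine records.
mask = rng.random(X_true.shape) < 0.2
X_obs = X_true.copy()
X_obs[mask] = np.nan

# Each variable with missing values is modelled from the others,
# iterating until the imputed values stabilise.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X_obs)

print("Mean absolute imputation error:",
      np.abs(X_imputed[mask] - X_true[mask]).mean())
```

In a real analysis, the imputation would typically be repeated several times and estimates combined across the imputed datasets (Rubin's rules), so that the uncertainty introduced by the missing values carries through to the final results.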
We are developing state-of-the-art methods to address bias in machine learning, alongside a large, labelled dataset of images for evaluating machine learning models.
We are also developing training in machine learning, including consideration of ethical issues. This work will benefit all the Bristol BRC themes.
Related projects (all within the Translational data science theme)

Improving decisions on what to focus on in research using large datasets
Workstream: Large, complex datasets

Combining data and AI to predict heart problems following Covid
Workstreams: Clinical informatics platforms; Large, complex datasets

Do ethnicity and coexisting health conditions impact high-risk diabetes?
Workstreams: Clinical informatics platforms; Large, complex datasets

Handling missing data in large electronic healthcare record datasets
Workstream: Large, complex datasets