Bayesian Uncertainty Quantification for Clustering Problems

Project Description

The School of Mathematical Sciences at Queen Mary University of London invites applications for a PhD project commencing either in September 2020 for students seeking funding, or in January 2020 or April 2020 for self-funded students. The deadline for funded applications is 31st January 2020. The deadline for China Scholarship Council Scheme applications is 12th January 2020.

This project will be supervised by Dr. William Yoo and Dr. Silvia Liverani.

Clustering is widely used in statistics and machine learning. In clustering, we group data points that are similar or close to each other into groups called clusters. Dividing the data into clusters reveals a lot about its structure, and it has many applications, such as detecting galaxy clusters in astronomy, identifying communities in a social network, and dividing pixels into distinct regions for border detection and object recognition.

Many clustering methods and algorithms have been proposed in the literature. Canonical examples include k-means clustering and the hierarchical Dirichlet process. Most of these methods produce a point estimate of the clusters, where a single arrangement of the clusters is deemed the best under some loss criterion. However, methods to assess the quality and the associated uncertainty of this estimate are far less explored.

This project will therefore investigate Bayesian uncertainty quantification for clustering and, in particular, develop the theory and methodology needed to build credible sets for clusters with good properties. We use the Bayesian approach because, in addition to point estimates, it automatically provides estimates of uncertainty once the posterior distribution is available. However, Bayesian computation is very demanding and does not scale well with the dimension of the data. Another important component of this project is therefore to develop new clustering algorithms or techniques for high-dimensional data.

The methods developed during this PhD will be applied to suitable datasets, such as data from environmental epidemiology and biology.

Funding Information

This project is eligible for full funding, including support for 3.5 years’ study, additional funds for conference and research visits and funding for relevant IT needs.

This project can be undertaken as a self-funded project. Self-funded applications are accepted year-round for a January, April or September start.

We welcome applicants through the China Scholarship Council Scheme (deadline for applications 12th January 2020).

Application Process

The application procedure is described on the School website. For further inquiries, please contact Dr. William Yoo [email protected] or Dr. Silvia Liverani [email protected]. Applicants interested in full funding will have to participate in a highly competitive selection process.

Supplementary Information

The School of Mathematical Sciences is committed to equality of opportunity and to advancing women’s careers. As holders of a Bronze Athena SWAN award, we offer family-friendly benefits and support part-time study.

To apply for this PhD, please use the following application link: