Back to home

Multiobjective clustering

In many optimization problems, the desired solution is best described using a number of objectives, which are often conflicting. The term multiobjective optimization refers to the repertoire of optimization techniques that attempt to account for several objectives during the optimization.

Data clustering is an example of a problem where good solutions are best described using a set of different criteria that account for conflicting properties such as the compactness of clusters and the separation between clusters. Our multiobjective clustering algorithm MOCK optimizes two such objectives and finds an approximation to the Pareto front (the set of optimal trade-off), instead of a single solution. It also support the use in identifying the most promising solutions from the Pareto front. MOCK compares favourably to ensemble techniques and established clustering methods that use a single criterion during the optimization.

A snapshot of a Pareto front generated by MOCK is shown below. The algorithm (as well as supporting material for our papers) can be downloaded to the right.

MOCK has been further extended to deal with clustering of dissimilarity data and to deal with semi-superved clustering (clustering in the presence of sparse labelled data). In related work, we have also developed a multiobjective approach to the problem of unsupervised feature selection. See publications below.

Links and downloads:

  • C++ source code and Java interface of MOCK, as described in Handl and Knowles (2007).

  • Table providing Median and Interquartile Range values for the comparison of MOCKagainst k-means, average link, single link and Strehl's ensemble method over 34 different data sets.

  • Supporting material for VIENNA

  • Generators for synthetic data sets.

Publications:

  • Julia Handl and Joshua Knowles. (2007) An evolutionary approach to multiobjective clustering. IEEE Transactions on Evolutionary Computation 11(1):56-7

  • Julia Handl and Joshua Knowles. (2005) Multiobjective clustering around medoids.Proceedings of the Congress on Evolutionary Computation (CEC 2005). Volume 1, pages 632--639.

  • Julia Handl and Joshua Knowles. (2005) Improving the scalability of multiobjective clustering. Proceedings of the Congress on Evolutionary Computation (CEC 2005). Volume 3, pages 2372--2379.

  • Julia Handl and Joshua Knowles. (2005) Exploiting the trade-off -- the benefits of multiple objectives in data clustering. Proceedings of the Third International Conference on Evolutionary Multi-Criterion Optimization (EMO 2005). Pages 547-560. LNCS 3410.

  • Julia Handl and Joshua Knowles. (2004) Evolutionary multiobjective clustering. Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature (PPSN VIII). Pages 1081-1091. LNCS 3242.

eXTReMe Tracker