Tori Ellison, North Carolina State University
Consensus Clustering via Linear Programming
Abstract:
Clustering is the problem of assigning elements to groups so that the similarity of elements in the same group is maximized, while the similarity of elements in different groups is minimized. Many clustering algorithms exist, and they may yield different results on the same data. In addition, some algorithms produce varying results depending on their initial conditions. Since there is no general agreement on which clustering results are best, it is useful to be able to combine the results of several "input" clusterings into one consensus clustering. We propose and evaluate two consensus clustering algorithms that seek to maximize the similarity between the consensus clustering and each of the input clusterings, where similarity is measured by the average normalized mutual information. The results of the input clusterings are combined to create a weighted graph. The two proposed algorithms then recluster the data represented by this weighted graph using linear programming formulations.
Advisor: Amy Langville (College of Charleston)
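
As an illustration of the two ingredients named in the abstract, the sketch below builds a weighted graph from several input clusterings and scores a candidate consensus clustering by its average normalized mutual information. It is only a minimal sketch under stated assumptions: the abstract does not say how edge weights are defined, so the co-association weight used here (the fraction of input clusterings that place two elements in the same group) is an assumption, as is the choice to normalize mutual information by the geometric mean of the two entropies. The function names are hypothetical, and the linear programming step itself is not shown.

```python
from collections import Counter
from math import log, sqrt


def coassociation(clusterings):
    """Assumed edge weights: w[i][j] is the fraction of input clusterings
    that place elements i and j in the same group (diagonal left at 0)."""
    n = len(clusterings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in clusterings:
        for i in range(n):
            for j in range(n):
                if i != j and labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(clusterings)
    return [[c / m for c in row] for row in counts]


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Mutual information of two labelings, normalized by sqrt(H(a) * H(b)).
    This normalization is one common convention, assumed here."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha, hb = entropy(a), entropy(b)
    if ha == 0.0 or hb == 0.0:
        # Degenerate case: at least one labeling has a single group.
        return 1.0 if ha == hb else 0.0
    return mi / sqrt(ha * hb)


def average_nmi(candidate, clusterings):
    """Average NMI between a candidate consensus and each input clustering."""
    return sum(nmi(candidate, c) for c in clusterings) / len(clusterings)


# Three input clusterings of four elements (made-up toy data).
inputs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
weights = coassociation(inputs)
print(weights[0][1], weights[2][3])
print(average_nmi([0, 0, 1, 1], inputs), average_nmi([0, 1, 0, 1], inputs))
```

In this toy data, a candidate that agrees with the majority of the inputs receives a higher average score than one that cuts across them, which is the quantity the abstract says the consensus clustering should maximize.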