Hierarchical Clustering with Structural Constraints


Author
Vaggos Chatziafratis, Rad Niazadeh, Moses Charikar
Published Year
2017
Publisher
Carnegie Mellon University
Abstract
Hierarchical clustering is a popular unsupervised data analysis method. Formany real-world applications, we would like to exploit prior information aboutthe data that imposes constraints on the clustering hierarchy, and is notcaptured by the set of features available to the algorithm. This gives rise tothe problem of "hierarchical clustering with structural constraints".Structural constraints pose major challenges for bottom-up approaches likeaverage/single linkage and even though they can be naturally incorporated intotop-down divisive algorithms, no formal guarantees exist on the quality oftheir output. In this paper, we provide provable approximation guarantees fortwo simple top-down algorithms, using a recently introduced optimizationviewpoint of hierarchical clustering with pairwise similarity information[Dasgupta, 2016]. We show how to find good solutions even in the presence ofconflicting prior information, by formulating a "constraint-basedregularization" of the objective. Finally, we explore a variation of thisobjective for dissimilarity information [Cohen-Addad et al., 2018] and improveupon current techniques.