Hierarchical vs. Partitional Clustering

Clustering is a machine learning technique for analyzing data by dividing it into groups. The groups are called clusters, and the process of forming them is called clustering. A clustering algorithm identifies the clusters automatically. The two main classes of clustering algorithms are hierarchical and partitional. Hierarchical algorithms organize the data into a hierarchy of nested clusters. Partitional algorithms, on the other hand, divide the data set into mutually disjoint partitions.

Hierarchical Clustering

Hierarchical clustering repeatedly either merges smaller clusters into larger ones or divides larger clusters into smaller ones. The result is a hierarchy of clusters, which is represented as a dendrogram. When small clusters are merged into larger ones, the method is called agglomerative clustering; it is a bottom-up approach. When larger clusters are divided into smaller ones, the method is called divisive clustering; it is a top-down approach. A greedy approach is typically used to decide which clusters to merge or which cluster to split at each step. For numeric data, the most commonly used measures are Euclidean distance, Manhattan distance and cosine similarity. Hamming distance is used for non-numeric data. Hierarchical clustering does not need the actual observations, because a matrix of pairwise distances is sufficient. The dendrogram displays the resulting hierarchy visually.
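The greedy bottom-up merging described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function name agglomerative, the num_clusters stopping parameter, the sample points and the choice of single linkage (cluster distance = smallest distance between any two members) are all assumptions made for the example. Note that the function receives only the distance matrix, never the observations themselves.

```python
from itertools import combinations


def manhattan(p, q):
    """Manhattan distance between two numeric points."""
    return sum(abs(x - y) for x, y in zip(p, q))


def agglomerative(dist, num_clusters=1):
    """Greedy bottom-up clustering from a symmetric distance matrix.

    Assumption: single linkage, i.e. the distance between two clusters is
    the smallest distance between any pair of their members.
    Returns the final clusters and the merges in the order they happened.
    """
    clusters = [[i] for i in range(len(dist))]  # every point starts alone
    merges = []
    while len(clusters) > num_clusters:
        # Greedy step: pick the closest pair of clusters.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters], merges


# Hypothetical sample data; only the distance matrix is passed on.
points = [(0, 0), (0, 1), (5, 5), (6, 5), (20, 20)]
dist = [[manhattan(p, q) for q in points] for p in points]

clusters, merges = agglomerative(dist, num_clusters=3)
print(clusters)  # [[0, 1], [2, 3], [4]]
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The recorded merges, read in order, are the information a dendrogram draws: each merge is one join in the tree, placed at the height of its distance. Running the loop down to num_clusters=1 would produce the full hierarchy.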

Partitional Clustering

Partitional clustering algorithms generate various partitions of the data and evaluate them by some criterion. They are also called nonhierarchical, because each instance is placed in exactly one of the mutually exclusive clusters. A typical partitional clustering algorithm produces only one set of clusters, so the user has to supply the desired number of clusters, called k, before starting. The k-means algorithm is the most commonly used partitional clustering algorithm. It starts by initializing the centers of the k partitions. It then assigns each member to the nearest of the current centers, re-estimates each center from its assigned members, and repeats these two steps until the assignments no longer change.
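The k-means steps described above (initialize centers, assign members, re-estimate centers) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names kmeans and sq_dist, the max_iter cap, the sample points and the choice of the first k points as initial centers are all made up for the example; practical implementations usually pick the initial centers randomly or with a smarter seeding scheme.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two numeric points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Plain k-means: alternate assignment and center re-estimation."""
    # Initialization (assumption): the first k points are the starting centers.
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest current center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centers


# Hypothetical sample data with two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
print(centers)
```

Each point ends up with exactly one label, which is the "one set of mutually exclusive clusters" property of partitional clustering. Changing k changes the result entirely, which is why the number of clusters has to be chosen before the algorithm runs.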

Difference between Hierarchical and Partitional Clustering

There are major differences between hierarchical and partitional clustering, relating to assumptions, running time, input and output. Partitional clustering is generally faster than hierarchical clustering. Partitional clustering makes stronger assumptions, whereas hierarchical clustering needs only a similarity measure. Hierarchical clustering does not require the number of clusters as input, whereas partitional clustering algorithms need it in order to start. Hierarchical clustering returns a whole hierarchy, which gives a more meaningful but also more subjective division into clusters; partitional clustering returns exactly k clusters.

 
