Classic Hierarchical Clustering Algorithm: DIANA Implementation in MATLAB

Resource Overview

A MATLAB implementation of the classic divisive hierarchical clustering algorithm DIANA, with a clear code structure, runnable demonstrations, and parameter configuration examples.

Detailed Documentation

This resource studies the classic hierarchical clustering algorithm DIANA (DIvisive ANAlysis) and implements it in MATLAB. DIANA is a divisive, top-down method: it starts with all observations in a single cluster and repeatedly splits clusters until a termination condition is met. It is widely used in data mining and machine learning. The implementation itself is straightforward to read, although the underlying principles are more involved.

The implementation exposes several parameters, including the distance metric (Euclidean or Manhattan), the cluster splitting criterion, and the termination condition. The MATLAB code uses pdist for distance computation, linkage (MATLAB's agglomerative tree builder) for hierarchical tree construction, and cluster for final group assignments. The code went through multiple testing and optimization cycles, including sensitivity analysis on threshold parameters and validation across diverse datasets. The documented handling of the divisive clustering methodology, and its integration with MATLAB's clustering functions, is intended as a reference for related work.
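To make the divisive procedure concrete, here is a minimal sketch of a DIANA-style split loop. It is not the code shipped with this resource: the names diana_sketch and splitCluster are hypothetical, and the splitting rule (largest-diameter cluster, splinter group seeded by the member with the largest average dissimilarity) and the termination condition (stop at k clusters) are assumptions that follow the usual textbook description. It relies on pdist and squareform from the Statistics and Machine Learning Toolbox.

```matlab
function labels = diana_sketch(X, k, metric)
% DIANA_SKETCH  Divisive clustering of the rows of X into at most k groups.
% Hypothetical helper for illustration only. metric is a pdist distance
% name, e.g. 'euclidean' or 'cityblock' (Manhattan).
    if nargin < 3, metric = 'euclidean'; end
    n = size(X, 1);
    D = squareform(pdist(X, metric));   % full n-by-n dissimilarity matrix
    labels = ones(n, 1);                % start with one all-inclusive cluster

    for next = 2:min(k, n)
        % Pick the cluster with the largest diameter (max pairwise distance).
        bestDiam = -Inf; target = 0;
        for c = 1:next-1
            idx = find(labels == c);
            diam = max(max(D(idx, idx)));
            if numel(idx) > 1 && diam > bestDiam
                bestDiam = diam; target = c;
            end
        end
        if target == 0, break; end      % only singletons remain

        idx = find(labels == target);
        splinter = splitCluster(D(idx, idx));
        labels(idx(splinter)) = next;   % splinter group becomes a new cluster
    end
end

function inSplinter = splitCluster(Dc)
% Returns a logical mask marking the members that leave the cluster.
    m = size(Dc, 1);
    inSplinter = false(m, 1);
    % Seed: the member with the largest average distance to the others.
    [~, seed] = max(sum(Dc, 2) / (m - 1));
    inSplinter(seed) = true;

    while nnz(~inSplinter) > 1
        rest = find(~inSplinter);
        % Average distance to the other remaining members ...
        dRest = sum(Dc(rest, rest), 2) / (numel(rest) - 1);
        % ... minus average distance to the current splinter group.
        dSpl = mean(Dc(rest, inSplinter), 2);
        [gain, j] = max(dRest - dSpl);
        if gain <= 0, break; end        % no one is closer to the splinter group
        inSplinter(rest(j)) = true;
    end
end
```

Saved as diana_sketch.m, it could be called as labels = diana_sketch([0 0; 0 1; 1 0; 10 10; 10 11; 11 10], 2), which should place the first three points and the last three points in different clusters. Passing 'cityblock' as the third argument switches to the Manhattan metric.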