MATLAB Implementation of Fuzzy C-Means Clustering (FCM) with Code Examples
Resource Overview
Application Context:
Many undergraduate mathematics theses involve applications of fuzzy mathematics. My research explores the effectiveness of fuzzy clustering analysis, in which the FCM algorithm is an essential component. This resource provides MATLAB code for two iterative forms of the FCM algorithm, which may be useful to fellow students.
Key Technology:
Fuzzy C-Means clustering (FCM), also known as fuzzy ISODATA, is a clustering algorithm that uses membership values to express the degree to which each data point belongs to each cluster. Proposed by Bezdek in 1973 as an improvement over hard C-means clustering (HCM), FCM partitions n vectors xi (i=1,2,...,n) into c fuzzy groups and computes the cluster center of each group so that an objective function measuring dissimilarity is minimized.
Detailed Documentation
Application Background
Many undergraduate thesis topics in mathematics relate to fuzzy mathematics. My research is a preliminary study of the effectiveness of fuzzy clustering analysis. The study centers on two iterative forms of the FCM algorithm, which are an indispensable part of my thesis. To help fellow students understand the algorithm, this resource provides MATLAB implementations of both iterative forms.
Key Technology
Fuzzy C-Means clustering (FCM), also known as fuzzy ISODATA, is a clustering algorithm that uses membership degrees to express how strongly each data point belongs to each cluster. It was proposed by Bezdek in 1973 as an improvement over the earlier hard C-means clustering (HCM) method.
FCM partitions n vectors xi (i=1,2,…,n) into c fuzzy groups and computes a cluster center for each group so that an objective function measuring dissimilarity is minimized. The main difference from HCM is that FCM uses a fuzzy partition: each data point belongs to every group with a membership value between 0 and 1, rather than to exactly one group. Accordingly, the elements of the membership matrix U may take any value between 0 and 1, subject to a normalization constraint: the membership degrees of each data point must sum to 1 across the c groups.
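Written out, the normalization constraint described above takes the following standard form. The index convention is an assumption made here for illustration (i for clusters, j for data points, so that u_ij is the membership of point j in cluster i); the original text does not fix one.

```latex
\sum_{i=1}^{c} u_{ij} = 1, \qquad u_{ij} \in [0, 1], \qquad j = 1, \dots, n
```

With c = 2, for example, a point with membership 0.7 in the first cluster must have membership 0.3 in the second.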
FCM's objective function is a weighted sum of squared Euclidean distances between data points and cluster centers, where the weight of each point-center pair is the corresponding membership value raised to a weighting exponent m (m > 1).
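For reference, the standard textbook form of this objective and of the update rules obtained by minimizing it under the normalization constraint is sketched below. The notation is an assumption made here: i indexes clusters, j indexes data points (so the data vectors appear as x_j), c_i is the i-th center, d_ij is the Euclidean distance between c_i and x_j, and m > 1 is the weighting exponent. These are presumably the expressions that the numbered equations cited in the steps refer to, but that numbering comes from a source not reproduced here and cannot be confirmed.

```latex
% Objective function
J(U, c_1, \dots, c_c) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, d_{ij}^{2},
\qquad d_{ij} = \lVert c_i - x_j \rVert

% Cluster centers: membership-weighted means of the data
c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} \, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}

% Membership update
u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}}
```

As m approaches 1 the memberships approach hard 0/1 assignments; larger m gives a fuzzier partition.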
When operating in batch mode, FCM determines cluster centers ci and membership matrix U through the following steps:
Step 1: Initialize the membership matrix U with random numbers between 0 and 1 so that the constraints in equation (6.9) are satisfied. In a MATLAB implementation, this can be done with rand() followed by normalizing each data point's memberships so that they sum to 1.
Step 2: Calculate the c cluster centers ci (i=1,…,c) using equation (6.12). Each center is a weighted average of the data points, with weights derived from the membership values.
Step 3: Compute the objective function using equation (6.10). Stop if its value falls below a predetermined threshold, or if its change relative to the previous iteration is below a chosen tolerance. In code, this check is typically a break condition inside a for or while loop.
Step 4: Calculate the new U matrix using equation (6.13), then return to Step 2. The membership update is based on the distances from each data point to all cluster centers.
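The four steps above can be sketched in MATLAB roughly as follows. This is a minimal illustration, not the code shipped with this resource: the variable names (X, c, m, maxIter, tol), the toy data, and the parameter values are all assumptions, and the formulas used are the standard FCM center and membership updates. It relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Sketch of batch FCM starting from a random membership matrix (Steps 1-4).
% All names and values below are illustrative assumptions.
rng(0);                                  % fixed seed so the run is repeatable
X = [randn(50, 2); randn(50, 2) + 5];    % toy data: n = 100 points in 2-D
c = 2;                                   % number of clusters
m = 2;                                   % weighting exponent, must be > 1
maxIter = 100;                           % cap on the number of iterations
tol = 1e-5;                              % tolerance on the change in objective

n = size(X, 1);

% Step 1: random memberships (c-by-n); each column is normalized to sum to 1.
U = rand(c, n);
U = U ./ sum(U, 1);

J = zeros(maxIter, 1);                   % objective value per iteration
for k = 1:maxIter
    % Step 2: centers as membership-weighted means of the data (c-by-d).
    Um = U .^ m;
    centers = (Um * X) ./ sum(Um, 2);

    % Squared Euclidean distance from every center to every point (c-by-n).
    D2 = zeros(c, n);
    for i = 1:c
        D2(i, :) = sum((X - centers(i, :)) .^ 2, 2)';
    end

    % Step 3: objective value and stopping test on its change.
    J(k) = sum(sum(Um .* D2));
    if k > 1 && abs(J(k) - J(k - 1)) < tol
        break;
    end

    % Step 4: membership update from relative distances to all centers.
    D2 = max(D2, eps);                   % avoid division by zero
    W = D2 .^ (-1 / (m - 1));
    U = W ./ sum(W, 1);
end
J = J(1:k);                              % keep only the iterations actually run
```

After the loop, `[~, labels] = max(U, [], 1)` gives a hard assignment if one is needed, and plotting J shows how quickly the objective settles.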
The algorithm can also start by initializing the cluster centers and then proceed with the same iterative process. Note that FCM is not guaranteed to converge to a globally optimal solution, so the result depends on the choice of initial cluster centers. Two common remedies are to use a separate fast algorithm to choose the initial centers, or to run FCM several times from different initial centers. In MATLAB, the latter can be implemented as an outer loop over initializations that keeps the run with the smallest final objective function value.
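A rough sketch of the center-first variant combined with multiple restarts is shown below. As before, this is an illustration under stated assumptions rather than the resource's own code: the names (nRestarts, bestJ, allJ and so on), the toy data, and the choice of drawing initial centers from random data points are all hypothetical. It assumes MATLAB R2016b or later for implicit expansion.

```matlab
% Sketch: FCM started from initial centers, repeated over several random
% starts, keeping the run with the smallest final objective value.
% All names and values below are illustrative assumptions.
rng(1);                                  % fixed seed so the run is repeatable
X = [randn(40, 2); randn(40, 2) + 4; randn(40, 2) + [0 8]];  % three toy blobs
n = size(X, 1);
c = 3; m = 2; maxIter = 100; tol = 1e-5; nRestarts = 5;

bestJ = Inf;
allJ = zeros(nRestarts, 1);              % final objective of each restart
for r = 1:nRestarts
    % Initialize centers with c distinct, randomly chosen data points.
    centers = X(randperm(n, c), :);
    Jprev = Inf;
    for k = 1:maxIter
        % Squared distances from the current centers to all points (c-by-n).
        D2 = zeros(c, n);
        for i = 1:c
            D2(i, :) = sum((X - centers(i, :)) .^ 2, 2)';
        end
        D2 = max(D2, eps);               % a center may coincide with a point

        % Membership update, then objective for these centers and memberships.
        W = D2 .^ (-1 / (m - 1));
        U = W ./ sum(W, 1);
        Um = U .^ m;
        J = sum(sum(Um .* D2));

        % Stop when the objective has stopped changing.
        if abs(Jprev - J) < tol
            break;
        end
        Jprev = J;

        % Center update: membership-weighted means of the data.
        centers = (Um * X) ./ sum(Um, 2);
    end

    % Compare final objective values across restarts and keep the best run.
    allJ(r) = J;
    if J < bestJ
        bestJ = J;
        bestCenters = centers;
        bestU = U;
    end
end
```

Inspecting allJ after the loop shows whether the restarts agree; widely differing values suggest that some runs ended in poorer local minima.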