CIG@IET - Research

Research

Fuzzy relational clustering

In the framework of fuzzy clustering, to classify N objects into C types means to define C fuzzy sets, also called fuzzy clusters, by specifying the membership degree of each object to each cluster. In fuzzy relational clustering, the problem of classifying data is solved by expressing a relation that quantifies the similarity, or dissimilarity, degree between pairs of objects. Based on such relation, objects very similar to each other, i.e., objects of the same type, will belong with high membership values to the same cluster.

In fuzzy object data clustering, on the other hand, the problem of classifying N objects into C types is typically solved by, first, finding C prototypes, which best represent the characteristics of as many groups of objects, and then building a cluster around each such prototype, by assigning each object a membership degree that is as much higher as greater its similarity degree with the prototype is. A prototype may be either a cluster centre, or the most centrally located object in a cluster, or a probability distribution, etc., depending on the type of available data and the specific algorithm adopted. It should be noted that the knowledge of prototypes, which are a condensed representation of the key characteristics of the corresponding clusters, is also an important means to drastically reduce the dimension of the set of objects.

One of the most popular object data clustering algorithms is the fuzzy C-means (FCM) algorithm, which can be applied if the objects of interest are represented as points in a multi-dimensional space. FCM relates the concept of object similarity to spatial closeness and finds cluster centres as prototypes. Several examples of application of FCM to real clustering problems have proved the good characteristics of this algorithm with respect to stability and partition quality. Further, its convergence has been formally demonstrated.

As regards fuzzy relational clustering, several algorithms have been proposed (for instance, Roubens' fuzzy nonmetric model (FNM), Windham's assignment prototype model (AP), and Hathaway, Davenport and Bezdek's relational fuzzy C-means (RFCM)). All these algorithms determine a fuzzy partition of the data set using an alternating optimization scheme to iteratively minimize appropriate objective functions based on a dissimilarity relation R.

In this framework, we have introduced a new fuzzy relational algorithm, called ARCA (Any Relation Clustering Algorithm) which has resulted to be stable without requiring any particular restrictions on the square binary relation. Further, ARCA represents a cluster in terms of a representative of the mutual relationships of the objects which belong to the cluster with a high membership value.

In ARCA each object is represented by the vector of its relation strengths with the other objects in the data set, and a prototype is an object (possibly not included in the original data set) whose relationship with all the objects in the data set is representative of the mutual relationships of a group of similar objects. Like FCM, ARCA partitions the data set minimizing the Euclidean distance between each object (strongly) belonging to a cluster and the prototype of the cluster.