Abstract
Real-world data sets with multi-typed objects and multi-typed relations can be structured as heterogeneous information networks (HINs). Clustering is one of the most significant processes in HINs because it provides useful insight into the hidden patterns of objects and their complex relation structure. However, grouping multi-relational target objects without losing their rich semantics, and without knowing the number of clusters in advance, is a challenging task. Hence, we use meta-path concepts to compute the similarity matrix between each pair of objects, exploring the different relations to preserve their semantics. Subsequently, we employ the Affinity Propagation (AP) clustering approach, which automatically generates clusters and a corresponding exemplar (cluster center) for each object based on the similarity matrix. The basic motivation for using the AP algorithm is its effectiveness, scalability, and speed in detecting communities/clusters in networked data; yet it has not been applied to HINs. However, the performance of the AP algorithm depends on two parameters, i) the preference p and ii) the damping factor λ, and unsuitable values can prevent the algorithm from converging and produce unsatisfactory clustering results. Although some existing methods have been developed to handle this issue, they still face two challenges: i) slow convergence and ii) high computational cost of finding the optimal clustering. In this paper, we present an enhanced AP (EAP) clustering approach that overcomes this issue by updating the parameter values based on different strategies, improving AP performance on an HIN data set. The experimental results show that the proposed method accelerates the algorithm's convergence to the optimal clustering compared to the other methods.
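The pipeline described in the abstract (a meta-path-based similarity matrix fed into AP with a preference p and a damping factor λ) can be illustrated with a minimal sketch. Several things here are assumptions rather than the paper's method: the abstract does not name the similarity measure, so a PathSim-style measure is used; the toy author-by-venue path-count matrix `W` is invented for illustration; and the EAP parameter-update strategies are not described in the abstract, so plain AP with fixed `preference` and `damping` values is shown instead.

```python
def pathsim(W):
    """PathSim-style similarity (assumed measure, not named in the abstract).

    W[x] holds path counts from object x to the endpoints of a meta-path,
    e.g. author -> paper -> venue. M = W * W^T counts symmetric path instances.
    """
    n = len(W)
    M = [[sum(a * b for a, b in zip(W[x], W[y])) for y in range(n)] for x in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            denom = M[x][x] + M[y][y]
            S[x][y] = 2.0 * M[x][y] / denom if denom else 0.0
    return S


def affinity_propagation(S, preference, damping=0.5, n_iter=200):
    """Plain AP with fixed parameters; runs a fixed number of iterations."""
    n = len(S)
    S = [row[:] for row in S]
    for k in range(n):
        S[k][k] = preference  # the preference p sits on the diagonal
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(n_iter):
        # Responsibility update, damped by lambda.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(v for j, v in enumerate(scores) if j != k)
                R[i][k] = damping * R[i][k] + (1.0 - damping) * (S[i][k] - rival)
        # Availability update, damped by lambda.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from all i != k
            for i in range(n):
                target = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1.0 - damping) * target
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Hypothetical toy data: 6 authors x 4 venues (author-paper-venue path counts).
W = [
    [4, 0, 0, 0], [3, 2, 0, 0], [1, 4, 0, 0],
    [0, 0, 4, 0], [0, 0, 3, 2], [0, 0, 1, 4],
]
S = pathsim(W)
exemplars, labels = affinity_propagation(S, preference=0.0, damping=0.5)
print(exemplars, labels)
```

In this sketch the number of clusters is not passed in; it follows from the preference value, and the damping factor only smooths the message updates. A practical version would also stop early once the exemplar set stays unchanged for several iterations, which is the convergence behaviour the abstract is concerned with.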
Original language | English |
---|---|
Publication status | Accepted/In press - 15 Aug 2022 |
Event | 21st UK Workshop on Computational Intelligence: UKCI 2022, Dept of Electronic & Electrical Engineering, University of Sheffield, Sheffield, United Kingdom. Duration: 7 Sep 2022 → 9 Sep 2022. Conference number: 21. http://www.sheffield.ac.uk/ukci2022 |
Workshop
Workshop | 21st UK Workshop on Computational Intelligence |
---|---|
Abbreviated title | UKCI 2022 |
Country/Territory | United Kingdom |
City | Sheffield |
Period | 7/09/22 → 9/09/22 |
Internet address | http://www.sheffield.ac.uk/ukci2022 |
Keywords
- heterogeneous information network
- similarity matrix
- affinity propagation clustering