Web data clustering using FCM and proximity hints from text as well as hyperlink-structure
Abstract
In this study, we cluster web pages using fuzzy c-means (FCM) clustering with proximity hints (P-FCM). We derive the proximity hints with a new approach that combines textual information, hyperlink structure and co-citation relations into a single similarity metric. We report the results of web-based experiments that show the significance of proximity hints in the functioning of P-FCM. These observations suggest that combining textual and hyperlink-structure information improves the clustering produced by FCM. We also show that the correlation between human clustering and our approach is very high, which indicates an improvement over the existing FCM algorithm. © 2008 IEEE.
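The abstract does not state how the three sources are merged, so the following is only a minimal sketch of one way such a combined proximity hint could be computed. The weighted sum, the weight values, the use of cosine similarity for text, the binary direct-link score and the Jaccard overlap for co-citation are all assumptions made for illustration, as are the function names and the toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_similarity(p, q, out_links):
    """1.0 if either page links directly to the other, else 0.0 (assumed scoring)."""
    linked = q in out_links.get(p, set()) or p in out_links.get(q, set())
    return 1.0 if linked else 0.0


def cocitation_similarity(p, q, in_links):
    """Jaccard overlap of the sets of pages citing p and q (assumed scoring)."""
    citing_p = in_links.get(p, set())
    citing_q = in_links.get(q, set())
    union = citing_p | citing_q
    return len(citing_p & citing_q) / len(union) if union else 0.0


def proximity_hint(p, q, terms, out_links, in_links, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three similarities; the weights are hypothetical."""
    w_text, w_link, w_cocite = weights
    return (
        w_text * cosine_similarity(terms[p], terms[q])
        + w_link * link_similarity(p, q, out_links)
        + w_cocite * cocitation_similarity(p, q, in_links)
    )


# Toy data (hypothetical): term counts per page and the hyperlink graph.
terms = {
    "a": {"fuzzy": 2, "cluster": 1},
    "b": {"fuzzy": 1, "cluster": 2},
    "c": {"sports": 3},
}
out_links = {"a": {"b"}, "d": {"a", "b"}, "e": {"a"}}

# Invert the out-link graph to get, for each page, the pages that cite it.
in_links = {}
for source, targets in out_links.items():
    for target in targets:
        in_links.setdefault(target, set()).add(source)

hint_ab = proximity_hint("a", "b", terms, out_links, in_links)
hint_ac = proximity_hint("a", "c", terms, out_links, in_links)
```

With weights that sum to 1, each hint lies in [0, 1], so pairwise values of this kind could be passed to P-FCM as proximity hints. In this toy data, pages `a` and `b` share vocabulary, a direct link and a common citing page, so they score higher than the unrelated pair `a` and `c`.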