On the Use of Fuzzy Clustering in Name Disambiguation

Tasleem Arif

Abstract

Resolving name ambiguity has become one of the most demanding problems in this era of information overload, and it directly affects literature management services such as digital libraries. Ambiguous publications and authors need to be told apart because uncertainty about the real authors of a publication sometimes leads to credit being assigned to the wrong author. Previous studies have tried to solve this problem using traditional computational techniques. Soft computing techniques such as rough sets, genetic algorithms, and fuzzy clustering promise to be a good option for dealing with problems of uncertainty. In this paper, we present the results of our ongoing work on resolving the name ambiguity problem in digital citations. We propose a name disambiguation model that uses a mix of hard and fuzzy clustering in a two-stage framework. The results obtained on DBLP data are very encouraging: the approach achieves very good disambiguation performance in comparison to other baseline methods. Although the results before fuzzy clustering were already very good, the fuzzy clustering stage improved them further. On average, Precision, Recall, and F1 were 96.35, 94.01, and 94.72 percent, respectively.

Keywords: name disambiguation; ambiguous authors; two stage clustering; digital libraries
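
The abstract does not specify the features, similarity measures, or exact algorithms used in the two stages. As a rough illustration of how a hard clustering stage can feed a fuzzy one, the following minimal sketch applies the standard fuzzy c-means update rules in plain Python. Everything in it is an assumption made for illustration: the function names, the 2-D feature vectors standing in for citation records, and the choice to seed the fuzzy stage with centroids from a given hard assignment. It is not the authors' method.

```python
import math

def centroids_from_hard_labels(points, labels, n_clusters):
    """Stage 1 stand-in (hypothetical): turn a hard assignment into cluster centroids."""
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centers.append(tuple(sum(p[d] for p in members) / len(members) for d in range(dim)))
    return centers

def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership update: row i holds point i's degree of membership in each cluster."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        )
    return memberships

def update_centers(points, memberships, m=2.0):
    """Fuzzy c-means center update: membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            tuple(sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim))
        )
    return centers

def two_stage_clustering(points, hard_labels, n_clusters, m=2.0, iterations=50):
    """Stage 2 (fuzzy) refines the partition produced by stage 1 (hard)."""
    centers = centroids_from_hard_labels(points, hard_labels, n_clusters)
    for _ in range(iterations):
        memberships = fuzzy_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, fuzzy_memberships(points, centers, m)

# Toy data: two tight groups plus one record lying between them.
records = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (5.0, 5.5)]
initial_labels = [0, 0, 1, 1, 0]  # hypothetical output of a hard first stage
centers, memberships = two_stage_clustering(records, initial_labels, n_clusters=2)
for record, row in zip(records, memberships):
    print(record, [round(u, 3) for u in row])
```

In this toy setting the hard stage forces the in-between record into one cluster, while the fuzzy stage assigns it partial membership in both, which is the kind of uncertainty a soft assignment can expose.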
