Effective Analysis of Nearest Duplicate Text Document using Fuzzy Clustering Method

Nancy Jasmine Goldena
Dr. S.P. Victor

Abstract

This research article describes and validates the most popular fuzzy clustering methods for detecting near-duplicate text documents. The fuzzy clustering algorithm analyzes the resemblance of information available online through a two-stage protocol. In the first stage, the algorithm analyzes duplicate image content through RGB values, based on the Euclidean distance metric. In the second stage, the syntactic similarity of the text is verified through a parametric view based on a time series sampled for each document. The article proposes developing methods for assessing the adequacy of criteria changes in ontologies for the development of a fuzzy state space. This involves analyzing and identifying duplicate web resources through a fuzzy-based method. The theoretical results are confirmed, and recommendations for practical use are given. The algorithm for filtering near-duplicate documents discussed here has been successfully implemented, and it shows better accuracy than other existing algorithms.
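As an illustration of the general idea only, the following is a minimal sketch of how fuzzy clustering with a Euclidean distance metric could flag near-duplicate text documents. It is not the authors' implementation: it assumes plain term-frequency vectors, the standard fuzzy c-means update rules, and a deterministic farthest-point initialisation. The function names (tf_vector, fuzzy_c_means, near_duplicate_pairs), the sample documents, and the 0.8 membership threshold are all hypothetical choices made for this example.

```python
import math
from collections import Counter

def tf_vector(text, vocab):
    """Length-normalised term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-9):
    """Standard fuzzy c-means; returns (memberships, centers).

    Assumption: centers are seeded with the first point and then the
    points farthest from the centers chosen so far.
    """
    dim = len(points[0])
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(euclidean(p, ctr) for ctr in centers))
        centers.append(list(far))
    exp = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        u = []
        for p in points:
            dists = [euclidean(p, ctr) for ctr in centers]
            if min(dists) == 0.0:
                # Point sits exactly on a center: give it full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                u.append([h / sum(hits) for h in hits])
            else:
                u.append([1.0 / sum((dj / dk) ** exp for dk in dists) for dj in dists])
        # Center update: mean of the points weighted by u_ij^m
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return u, centers

def near_duplicate_pairs(u, threshold=0.8):
    """Pairs of documents sharing a dominant cluster with high membership."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in u]
    pairs = []
    for i in range(len(u)):
        for j in range(i + 1, len(u)):
            same = labels[i] == labels[j]
            if same and min(u[i][labels[i]], u[j][labels[j]]) >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical sample documents: two near-duplicate pairs.
docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumped over the lazy dog",
    "stock markets fell sharply on monday morning",
    "stock markets fell sharply on monday afternoon",
]
vocab = sorted({w for d in docs for w in d.lower().split()})
points = [tf_vector(d, vocab) for d in docs]
memberships, _ = fuzzy_c_means(points, c=2)
print(near_duplicate_pairs(memberships))
```

Unlike hard clustering, each document receives a degree of membership in every cluster, so a borderline document can be left unflagged by raising the threshold rather than being forced into a group.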


Keywords: Fuzzy sets, Soft computing, Web mining, Information retrieval, Duplicate document.
