Study of Recent Advancement in Document Clustering

Durgesh Nandan Dixit
R. K. Gupta


Data on the internet is growing at an exponential rate. Internet users act as data producers and flood the internet with documents. Information retrieval (IR) is used to retrieve more preferred information ahead of less preferred information. Document clustering is a subset of the larger field of data clustering, which inherits concepts from the fields of information retrieval (IR) and machine learning (ML). In this paper we analyze the current state of document clustering research. We study the algorithms and discuss directions for future research.

Keywords: data mining; text mining; text clustering; k-means; non-negative factorization; PCA

