A Survey on Text Mining in Clustering

S. Logeswari, K. Premalatha, D. Sasikala

Abstract

Text mining has important applications in data mining and information retrieval, and document clustering is one of its central tasks.
Many existing document clustering techniques represent the content of a document with the bag-of-words model. This representation groups
related documents effectively only when they share a large proportion of lexically equivalent terms. Synonymy between related documents is
ignored, which reduces the effectiveness of applications that rely on a standard full-text document representation. This paper surveys the
techniques used to cluster text documents based on keywords, phrases, and concepts. It also covers the performance measures used to
evaluate the quality of the resulting clusters.

 

Keywords: Document Clustering, Latent Semantic Indexing, Vector Space Model, tf-idf, precision, recall, F-measure.
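
The bag-of-words limitation described in the abstract can be illustrated with a minimal sketch. The three toy documents, the whitespace tokenizer, and the helper names `bag_of_words` and `cosine_similarity` are assumptions made for illustration and are not taken from the paper. Raw term counts are used rather than tf-idf weights to keep the example short.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a document to raw term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: doc_a and doc_b are related in meaning but share
# no terms; doc_a and doc_c share two of their three terms.
doc_a = bag_of_words("physician treats illness")
doc_b = bag_of_words("doctor cures disease")
doc_c = bag_of_words("physician treats patients")

sim_synonyms = cosine_similarity(doc_a, doc_b)
sim_shared = cosine_similarity(doc_a, doc_c)

print(f"synonymous wording: {sim_synonyms:.3f}")
print(f"shared wording:     {sim_shared:.3f}")
```

Under these assumptions the synonymous pair scores zero because no term overlaps, while the pair with shared wording scores well above zero. A clustering method built on this representation would therefore separate the first pair even though the two documents describe the same thing.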
