Frequent Term Based Clustering of Stories with Semantic Analysis for Searching and Retrieval


Amrut Nagasunder, Bharath Boregowda, Madhu Venkatesha, Ananthanarayana V. S.

Abstract

An effective document organization provides a concise representation of the text content of a large collection of
documents. We consider the task of clustering stories (documents) as a means of arranging documents effectively for searching
and retrieval. We propose a novel representation for a story, based on the essential parts of speech: nouns, verbs and adjectives. We then
cluster these story representations, producing a graph structure in which the representations are joined at nodes that share
the same or a synonymous noun. Such a structure can be queried for stories with a search string. We use a knowledge bank
throughout the system to realize semantic analysis of the text. To test the goodness of the clusters, we carry out a classification test on
two data sets. We achieve high clustering quality, with promising results with regard to memory compaction.
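
The idea of joining story representations at shared or synonymous nouns can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for the knowledge bank, the part-of-speech tags are assumed to be supplied already, and the function names (`represent`, `build_graph`, `query`) are invented for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge bank: maps a word to a
# canonical synonym. A real system would consult a lexical resource.
SYNONYMS = {"automobile": "car", "vehicle": "car", "physician": "doctor"}


def canonical(word):
    """Lower-case a word and collapse it onto its canonical synonym."""
    word = word.lower()
    return SYNONYMS.get(word, word)


def represent(tagged_story):
    """Keep only the nouns, verbs and adjectives of a (word, tag) sequence."""
    rep = {"noun": set(), "verb": set(), "adj": set()}
    for word, tag in tagged_story:
        if tag in rep:
            rep[tag].add(canonical(word))
    return rep


def build_graph(stories):
    """Map each canonical noun (a node) to the ids of the stories joined there."""
    graph = defaultdict(set)
    for story_id, tagged in stories.items():
        for noun in represent(tagged)["noun"]:
            graph[noun].add(story_id)
    return graph


def query(graph, search_string):
    """Return the sorted ids of stories attached to any term of the search string."""
    hits = set()
    for term in search_string.split():
        hits |= graph.get(canonical(term), set())
    return sorted(hits)


# Toy, hand-tagged stories (assumed input format).
stories = {
    "s1": [("The", "det"), ("doctor", "noun"), ("drove", "verb"),
           ("red", "adj"), ("car", "noun")],
    "s2": [("An", "det"), ("automobile", "noun"), ("crashed", "verb")],
    "s3": [("The", "det"), ("physician", "noun"), ("rested", "verb")],
}
graph = build_graph(stories)
```

In this toy setup, stories `s1` and `s2` meet at the node `car` because "automobile" is mapped to it, so a query for "vehicle" reaches both. Only nouns act as join points here; how the verbs and adjectives are used in the actual system is not specified in the abstract.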

 

Keywords: Document Clustering, Semantic Analysis, Text Mining, Natural Language Processing

