Ontology Based Information Retrieval Using Vector Space Model

Ankita K. Kolhe
Nayana S. Zope, Prashant Bharambe

Abstract

Information retrieval (IR) is the science of searching for documents, for information within documents, and for metadata about documents, as well as searching relational databases and the World Wide Web. The growing volume of information on the Internet creates an increasing need for new (semi-)automatic methods that retrieve documents and rank them according to their relevance to the user query. In this paper, after a brief review of ranking models, we propose a new ontology-based approach for ranking HTML/TXT documents and evaluate it under various circumstances. Our approach applies the vector space model and uses natural language processing techniques to extract phrases and stem words. This combination preserves ranking precision without sacrificing speed. The annotated documents and the expanded query are then processed with statistical methods to compute the degree of relevance. The notable features of our approach are (1) combining HTML, TXT, and PDF documents, (2) computing the frequency of every word, (3) removing stop words, (4) applying the Porter stemming algorithm to strip the suffix of every word, and (5) allowing variable input documents using vector dimensions. A ranking system called Information Retrieval using Vector Space Model (IRVSM) is developed to implement and test the proposed model.

Keywords: Ontology, Parsing, Indexing, Stemming, Vector Space Model, Document Ranking.
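
The ranking steps named in the abstract (word frequency, stop word removal, stemming, vector comparison) can be illustrated with a minimal sketch. This is not the IRVSM implementation. It assumes raw term frequency as the vector weight and cosine similarity as the relevance score, it leaves out the ontology-based query expansion, and it substitutes a crude suffix stripper for the Porter algorithm. All names (STOP_WORDS, crude_stem, term_vector, cosine, rank) are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption; a real system would use a fuller one).
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to"}

# Suffixes tried in order by the crude stemmer below.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common suffix. A stand-in for Porter stemming, NOT the real algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_vector(text):
    """Map text to a sparse vector: stemmed term -> raw frequency."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, documents):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    q = term_vector(query)
    scored = [(name, cosine(q, term_vector(text))) for name, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "a.txt": "Ranking documents with the vector space model",
        "b.txt": "Cooking pasta in salted water",
    }
    for name, score in rank("ranked document", docs):
        print(name, round(score, 3))
```

In this sketch the query "ranked document" and the text of a.txt share the stems "rank" and "document", so a.txt ranks above b.txt, which shares no terms with the query. A fuller system would typically replace raw counts with a weighting such as TF-IDF and use a complete Porter stemmer.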
