Survey on Inverted Index Compression over Structured Data

B. Usharani, M. Tanooj Kumar


A user can retrieve information by entering a few keywords into a search engine. In keyword search engines, the query is specified in textual form, which allows casual users to access database information. To answer such queries, the system has to search over billions of documents stored on millions of computers. An index stores a summary of the information and guides the user toward more detailed information. The central concept in information retrieval (IR) is the inverted index, one of the design choices for index data structures, which is used to locate the documents that match the search keywords. Because an inverted index is normally large, many compression techniques have been proposed to reduce its storage space. In this paper we propose the Huffman coding technique to compress the inverted index. Experiments on inverted index compression using Huffman coding show that this technique requires less storage space, improves keyword search performance, and reduces query evaluation time.

Keywords: Huffman algorithm, inverted index, index compression, keyword search, lossless compression, structured data, variable length encoding.
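
To illustrate the ideas summarized in the abstract, the following Python sketch builds a small inverted index, converts each postings list into gaps between consecutive document IDs, and Huffman-codes those gaps. It is not the authors' implementation: the gap transformation, the sample documents, and all function names (`build_inverted_index`, `to_gaps`, `from_gaps`, `huffman_code`, `encode`, `decode`) are assumptions made for this example.

```python
import heapq
from collections import Counter, defaultdict
from itertools import accumulate


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def to_gaps(postings):
    """Keep the first ID, then store differences between neighbouring IDs."""
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def from_gaps(gaps):
    """Invert to_gaps: running sums restore the original document IDs."""
    return list(accumulate(gaps))


def huffman_code(frequencies):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    if len(frequencies) == 1:
        return {symbol: "0" for symbol in frequencies}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker ensures the dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(gaps, code):
    """Concatenate the code words of all gaps into one bit string."""
    return "".join(code[g] for g in gaps)


def decode(bits, code):
    """Read bits until a code word matches; valid because the code is prefix-free."""
    reverse = {c: s for s, c in code.items()}
    gaps, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            gaps.append(reverse[current])
            current = ""
    return gaps


# Hypothetical sample collection, used only for this illustration.
docs = {
    1: "huffman coding compresses data",
    2: "inverted index maps terms to documents",
    3: "huffman coding of an inverted index",
    4: "keyword search uses an inverted index",
}

index = build_inverted_index(docs)
gap_lists = {term: to_gaps(postings) for term, postings in index.items()}

# One shared code table, built from gap frequencies across the whole index.
frequencies = Counter(g for gaps in gap_lists.values() for g in gaps)
code = huffman_code(frequencies)
compressed = {term: encode(gaps, code) for term, gaps in gap_lists.items()}

# Answer a keyword query by decompressing only the requested postings list.
result = from_gaps(decode(compressed["inverted"], code))
```

Frequent gap values receive shorter code words, which is where the space saving would come from in this sketch. A practical system would pack the bit strings into bytes and store the code table alongside the index; both steps are omitted here.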

Copyright (c) 2016 International Journal of Advanced Research in Computer Science