Enhanced Speed Processing of Data using in-Memory Analytics

T. Selva Divya
M. Jayanthi, V. Vinoth Kumar

Abstract

Databases are used to store large amounts of data. As data volumes grow, they become difficult to process with traditional data processing applications and tools. Many organizations now hold rapidly growing information and need big data tools to store it. Big data tools emerged to address these storage and processing limitations. Hadoop is open-source software that supports many applications, including petabyte-scale analytics. This paper deals with the working of Hadoop. HDFS is used for data storage in Hadoop. It distributes work across nodes and communicates with a single name server node; if the name server goes offline, HDFS must restart from where it left off, which introduces latency into the system. Spark solves this problem of HDFS. It is a column-oriented distributed database and is more fault tolerant than HDFS. Spark is an in-memory database: queries retrieve data from RAM instead of physical disk, so the processing speed of Spark is much faster than that of the Hadoop system. MapReduce is used for data processing; it splits the work into parts and reduces the results into a single subset.

Keywords: Big Data, Hadoop System, HDFS, MapReduce, Spark, Cassandra, Databases.
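
The split-then-reduce pattern that the abstract attributes to MapReduce can be illustrated with a minimal single-process sketch. This is not Hadoop code and not the authors' implementation; the function names (map_phase, shuffle, reduce_phase, word_count) and the word-count task are assumptions chosen only for illustration, and everything here runs in memory on one machine.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical input: each string stands in for one split of a larger file.
    print(word_count(["big data tools", "big data storage"]))
```

In a real cluster each chunk would be mapped on a different node and the grouped values would be moved over the network before reduction; the sketch only shows the data flow.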
