An Optimal Solution for small file problem in Hadoop