Hadoop-HBase for large-scale data

2011 · 259 citations · Journal Article

Authors

Mehul Nalin Vora · Tata Consultancy Services (India)

Abstract

Today we are inundated with digital data, yet we remain poor at managing and processing it. Storing and analyzing data efficiently and economically with conventional database management tools is becoming increasingly difficult. Moreover, the types of data appearing in databases are changing: binary large objects are now a standard, integral part of any database. Researchers all over the globe are grappling with the analysis of these ultra-large databases, and Apache HBase is one attempt to address the problem. HBase is a NoSQL distributed database built on top of the Hadoop Distributed File System (HDFS). In this paper, we present an evaluation of a hybrid architecture in which HDFS holds the non-textual data, such as images, and the location of that data is stored in HBase. This hybrid architecture enables faster search and retrieval of data, a growing need in any organization that is flooded with data. The paper evaluates the performance of random reads and random writes of storage location information in HBase, paired with retrieving data from and storing data in HDFS, respectively. We also present a comparative study of the HBase-HDFS architecture and a MySQL-HDFS architecture.
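The split described in the abstract (payload in one store, its location in another) can be illustrated with a minimal Python sketch. This is not the paper's implementation: a plain dictionary stands in for the HBase location table, a temporary local directory stands in for HDFS, and the names `HybridStore`, `location_index`, and `blob_root` are hypothetical.

```python
import os
import tempfile
import uuid


class HybridStore:
    """Toy stand-in for the hybrid layout: a dict plays the role of the
    HBase location table, a local directory plays the role of HDFS."""

    def __init__(self, blob_root):
        self.blob_root = blob_root   # assumption: stand-in for an HDFS directory
        self.location_index = {}     # assumption: stand-in for an HBase table (row key -> path)

    def put(self, key, data):
        # Write the binary payload to the blob store first ...
        path = os.path.join(self.blob_root, uuid.uuid4().hex)
        with open(path, "wb") as f:
            f.write(data)
        # ... then record only its location under the row key.
        self.location_index[key] = path
        return path

    def get(self, key):
        # Random read: look up the location, then fetch the payload.
        path = self.location_index[key]
        with open(path, "rb") as f:
            return f.read()


blob_root = tempfile.mkdtemp()
store = HybridStore(blob_root)
store.put("image-001", b"\x89PNG fake image bytes")
payload = store.get("image-001")
```

The point of the layout is that the index holds only small location records, so a lookup never has to scan the large binary objects themselves.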

Topics & Keywords

Cloud Computing and Resource Management · Data Stream Mining Techniques · Advanced Data Storage Technologies

Publication Details

DOI: 10.1109/iccsnt.2011.6182030

Field-Weighted Citation Impact: 19.26
