Abstract: HBase is a column-oriented NoSQL database.

This article is shared from the Huawei Cloud Community article "HBase Architecture: HBase Data Model & HBase Read/Write Mechanism", author: Donglian Lin.

HBase architecture: HBase data model

As we all know, HBase is a column-oriented NoSQL database. Although it looks similar to a relational database containing rows and columns, it is not a relational database. Relational databases are row-oriented, while HBase is column-oriented. So, let us first understand the difference between column-oriented and row-oriented databases:

Row-oriented and column-oriented databases:

• Row-oriented databases store table records in a sequence of rows. Column-oriented databases store table records as a sequence of columns, i.e., the entries of each column are stored in consecutive locations on disk.

To understand it better, let's take an example and consider the following table.
[Figure: example table of records]

If this table is stored in a row-oriented database, it will store the records as follows:

1, Paul Walker, United States, 231, Gallardo

2, Vin Diesel, Brazil, 520, Mustang

As shown above, in a row-oriented database, data is stored based on rows or tuples.

Whereas a column-oriented database stores this data as:

1, 2, Paul Walker, Vin Diesel, USA, Brazil, 231, 520, Gallardo, Mustang

In a column-oriented database, all the values of a column are stored together: the values of the first column are stored together, then the values of the second column, and so on for the remaining columns.

• When the amount of data is very large, such as PB-level or EB-level, we use a column-oriented approach, because single-column data is stored together and can be accessed faster.
• The row-oriented approach, by contrast, handles a comparatively small number of rows and columns efficiently, since row-oriented databases store data in a structured format.
• When we need to process and analyze large amounts of semi-structured or unstructured data, we use a column-oriented approach, for example in online analytical processing applications such as data mining, data warehousing, and other analytics workloads.
• Online transaction processing, for example in banking and finance, which deals with structured data and requires transactional guarantees (ACID properties), uses a row-oriented approach.

The HBase table has the following components, as shown in the following figure:
[Figure: components of an HBase table]

Table: data in HBase is stored in tables, but the tables here are organized in a column-oriented format.
Row key: row keys are used to look up records and make searches fast. Curious how? I will explain that in the architecture part of this blog.
Column family: related columns are grouped into a column family. The columns of a family are stored together, which makes searching faster because data belonging to the same column family can be accessed in a single seek.
Column qualifier: the name of each column is called its column qualifier.
Cell: data is stored in cells. Each cell is uniquely identified by its row key, column family, and column qualifier.
Timestamp: a combination of date and time. Whenever data is written, it is stored together with its timestamp, which makes it easy to retrieve a specific version of the data.

Put more simply, we can say that HBase includes the following (a short code sketch follows this list):

• A set of tables
• Each table has column families and rows
• The row key acts as the primary key in HBase.
• Any access to an HBase table uses this primary key.
• Each column qualifier in HBase represents an attribute of the object that resides in the cell.
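To make these terms concrete, here is a minimal sketch using the HBase Java client API. The table name drivers, the column family personal, and the values are purely illustrative assumptions; the point is how the row key, column family, column qualifier, timestamp, and cell value appear in a write.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DataModelExample {
    public static void main(String[] args) throws IOException {
        // Table name, column family, and values below are illustrative only.
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("drivers"))) {
            // Row key "1" identifies the row; "personal" is the column family,
            // "name" is the column qualifier; the value lands in one cell.
            Put put = new Put(Bytes.toBytes("1"));
            put.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"),
                          Bytes.toBytes("Paul Walker"));
            // An explicit timestamp can also be supplied to write a specific cell version.
            put.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("country"),
                          System.currentTimeMillis(), Bytes.toBytes("United States"));
            table.put(put);
        }
    }
}
```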

HBase Architecture: Components of HBase Architecture

HBase has three main components: the HMaster Server, the HBase Region Servers (which host the regions), and ZooKeeper.

The following figure explains the hierarchical structure of the HBase architecture. We will discuss each of them individually.
[Figure: hierarchical structure of the HBase architecture]

Before getting to the HMaster, we will first look at regions, because all of these servers (HMaster, Region Server, ZooKeeper) exist to coordinate and manage regions and to perform various operations within them. So, what are regions and why are they so important?

HBase architecture: regions

A region contains all the rows between the start key and the end key assigned to it. An HBase table can be divided into multiple regions, and all the columns of a column family are stored together within a region. Each region holds its rows in sorted order.

Many regions are assigned to a Region Server, which is responsible for handling, managing, and executing read and write operations on that group of regions.

So, to sum up more simply:

• A table can be divided into multiple regions. A region is a sorted range of rows stored between a start key and an end key.
• The maximum size of a region is configurable (256 MB in very old releases, 10 GB by default in recent versions); see the configuration sketch after this list.
• A Region Server serves a set of regions to clients.
• A Region Server can serve roughly 1,000 regions.
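As a hedged illustration of the point about region size, the sketch below shows the relevant setting, hbase.hregion.max.filesize, being overridden through the client Configuration API. In practice this property is normally set cluster-wide in hbase-site.xml (or per table); the 10 GB value is only an example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionSizeConfig {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // hbase.hregion.max.filesize controls how large a region may grow
        // before HBase splits it; 10 GB here is only an illustrative value.
        conf.setLong("hbase.hregion.max.filesize", 10L * 1024 * 1024 * 1024);
        System.out.println("Region split threshold: " + conf.get("hbase.hregion.max.filesize"));
    }
}
```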

Now starting from the top of the hierarchy, I first want to explain to you HMaster Server, which functions like the NameNode in HDFS. Then, moving down the hierarchy, I will take you through ZooKeeper and Region Server.

HBase architecture: HMaster

As shown in the figure below, HMaster manages the collection of Region Servers that reside on the DataNodes. Let us understand how HMaster does this.
[Figure: HMaster assigning regions to Region Servers]

• HBase HMaster performs DDL operations (creates and deletes tables) and assigns regions to region servers, as shown in the figure above.
• It coordinates and manages the Region Servers (similar to how the NameNode manages the DataNodes in HDFS).
• It assigns regions to Region Servers at startup and reassigns regions during recovery and load balancing.
• It monitors all Region Server instances in the cluster (with the help of Zookeeper) and performs recovery activities when any Region Server is shut down.
• It provides an interface for creating, deleting, and updating tables, as in the DDL sketch below.
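The sketch below shows what such DDL calls look like through the HBase Java Admin API; the table and column-family names are made up for illustration. Creating, disabling, and deleting a table are all operations carried out by the HMaster.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class DdlExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            TableName name = TableName.valueOf("drivers");   // illustrative table name
            // createTable is a DDL call: it is handled by the HMaster, which also
            // decides which Region Servers the new table's regions are assigned to.
            admin.createTable(TableDescriptorBuilder.newBuilder(name)
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("personal"))
                    .build());
            // Deleting a table is likewise a master-driven DDL operation.
            admin.disableTable(name);
            admin.deleteTable(name);
        }
    }
}
```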

HBase runs in a large distributed environment, and HMaster alone is not enough to manage everything. So what helps HMaster manage this huge environment? This is where ZooKeeper comes in. Having seen how HMaster manages the HBase environment, let us now understand how ZooKeeper helps it do so.

HBase architecture: ZooKeeper, the coordinator

The following figure explains the coordination mechanism of ZooKeeper.
[Figure: ZooKeeper coordinating HMaster and the Region Servers]

• ZooKeeper acts as the coordinator in HBase's distributed environment. It helps maintain the state of the servers in the cluster by communicating with them through sessions.
• Every Region Server and the HMaster Server periodically send heartbeats to ZooKeeper, which uses them to determine which servers are alive and available, as shown in the figure above. It also delivers server-failure notifications so that recovery measures can be taken.
• As the figure shows, there is also an inactive HMaster, which acts as a backup for the active one and comes in handy if the active server fails.
• The active HMaster sends heartbeats to ZooKeeper, while the inactive HMaster listens for the notifications this produces. If the active HMaster fails to send a heartbeat, its session is deleted and the inactive HMaster becomes active.
• If the Region Server cannot send a heartbeat, the session will expire and all listeners will be notified. Then HMaster performs appropriate recovery operations, which we will discuss later in this blog.
• ZooKeeper also maintains the location of the META table (i.e., of the Region Server hosting it), which is what allows any client to find any region. Before reaching the data, a client first asks the META table which Region Server a region belongs to and obtains that server's address. (A client-side configuration sketch follows this list.)
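Because clients discover everything through ZooKeeper, the only cluster addresses a client application needs are those of the ZooKeeper quorum. The sketch below shows this client-side configuration; the host names and port are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ZookeeperClientConfig {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // The client does not contact HMaster to locate data; it only needs the
        // ZooKeeper quorum, through which it discovers the META table's location.
        // Host names and port below are placeholders.
        conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
    }
}
```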

Speaking of the META table, let me first explain what it is, so that you can easily connect the roles of ZooKeeper and the META table. Later in this blog, when I explain the HBase search mechanism, I will show how the two work together.

HBase architecture: Meta table

[Figure: structure of the META table]

• The META table is a special HBase catalog table. It maintains a list of all the regions in the HBase storage system and the Region Servers that host them, as shown in the figure above.
• As the figure shows, the META table keeps this information as key-value pairs: the key consists of a region's start key and its id, and the value contains the address of the Region Server hosting that region. (See the sketch below, which scans the META table.)
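Since the META table is itself a regular HBase table (named hbase:meta), it can be scanned with the ordinary client API. The sketch below prints each region's key together with the Region Server recorded in its info:server column; it assumes a reachable cluster and is for illustration only.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MetaTableExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table meta = connection.getTable(TableName.META_TABLE_NAME);
             ResultScanner scanner = meta.getScanner(new Scan())) {
            for (Result row : scanner) {
                // Each row key encodes a region (table name, start key, region id);
                // the info:server column holds the Region Server that hosts it.
                String regionKey = Bytes.toString(row.getRow());
                String server = Bytes.toString(
                        row.getValue(Bytes.toBytes("info"), Bytes.toBytes("server")));
                System.out.println(regionKey + " -> " + server);
            }
        }
    }
}
```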

Since I already touched on the Region Server and its role while explaining regions, let us now move down the hierarchy and focus on the components and functions of the Region Server. Later I will discuss the search, read, and write mechanisms and how all these components work together.

HBase Architecture: Components of Region Server

The following figure shows the components of a Region Server. I will now discuss them one by one.
[Figure: components of a Region Server]

A Region Server maintains various regions running on top of HDFS. Its components are:

WAL: as the figure above shows, the Write-Ahead Log (WAL) is a file attached to every Region Server in the distributed environment. The WAL stores new data that has not yet been persisted or committed to permanent storage, and it is used to recover data in case of failure.
Block Cache: as the figure shows, the Block Cache sits in the Region Server and keeps frequently read data in memory. The least recently used data is evicted from the BlockCache when space is needed.
MemStore: the write cache. It buffers all incoming data before it is committed to disk (permanent storage). Each column family in a region has its own MemStore; as the figure shows, a region therefore contains multiple MemStores when it has multiple column families. The data is sorted lexicographically before being flushed to disk (see the configuration sketch after this list).
HFile: as the figure shows, HFiles are stored on HDFS; they hold the actual data cells on disk. When the MemStore exceeds its threshold, it commits its data to an HFile.
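The sizes of these components are tunable. The sketch below shows, through the Configuration API, a few of the relevant properties; the values are only examples, and in practice they are usually set in hbase-site.xml on the Region Servers.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionServerTuning {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Fraction of the Region Server heap reserved for the read-side BlockCache.
        conf.setFloat("hfile.block.cache.size", 0.4f);
        // MemStore size at which a flush to a new HFile is triggered (128 MB here).
        conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
        // Total fraction of the heap that all MemStores on the server may occupy.
        conf.setFloat("hbase.regionserver.global.memstore.size", 0.4f);
    }
}
```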

Now that we know the major and minor components of the HBase architecture, I will explain the mechanisms and how these components work together. Whether we are reading or writing, we first have to find where to read from or write to. So let us understand this search process, because it is one of the mechanisms that makes HBase so popular.

HBase architecture: How is search initialized in HBase?

As you know, Zookeeper stores the location of the META table. Whenever a client sends a read or write request to HBase, the following actions occur:

  1. The client retrieves the location of the META table from ZooKeeper.
  2. The client then requests the location of the Region Server of the corresponding row key from the META table to access it. The client caches this information along with the location of the META table.
  3. It then fetches the row from the corresponding Region Server.

For future requests, the client uses its cache to retrieve the location of the META table and the Region Server of previously read row keys. The client will not consult the META table again unless there is a miss because a region has moved or been reassigned; in that case it queries the META table once more and updates its cache. A client-side sketch of this behavior follows.
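From the application's point of view all of this lookup and caching is handled by the client library. In the sketch below (table, column family, and row keys are illustrative), the first Get pays the cost of the ZooKeeper and META lookups, while the second can reuse the cached region location.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class LookupExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("drivers"))) {
            // The first Get triggers the full lookup: ZooKeeper -> META table -> Region Server.
            Result first = table.get(new Get(Bytes.toBytes("1")));
            // A later request for a row key in the same region reuses the cached
            // region location, so it goes straight to the Region Server.
            Result second = table.get(new Get(Bytes.toBytes("2")));
            System.out.println(Bytes.toString(
                    first.getValue(Bytes.toBytes("personal"), Bytes.toBytes("name"))));
            System.out.println(Bytes.toString(
                    second.getValue(Bytes.toBytes("personal"), Bytes.toBytes("name"))));
        }
    }
}
```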

Because the client does not have to retrieve the Region Server's location from the META table on every request, this saves time and makes the search process faster. Now, let me tell you how writes happen in HBase: which components are involved, and how do they participate?

HBase architecture: HBase writing mechanism

The following figure explains the writing mechanism in HBase.
[Figure: HBase write mechanism]

The writing mechanism goes through the following processes in sequence (refer to the above figure):

Step 1: Whenever the client issues a write request, the data is first written to the WAL (Write-Ahead Log).
• The edit is appended to the end of the WAL file.
• The WAL file is kept on every Region Server, and the Region Server uses it to recover data that has not yet been committed to disk.

Step 2: After the data is written to the WAL, it is copied into the MemStore.

Step 3: Once the data is placed in the MemStore, the client receives an acknowledgement.

Step 4: When the MemStore reaches its threshold, it dumps (commits) the data to an HFile. The sketch below walks through these steps from the client's point of view.
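Here is a minimal client-side sketch of these steps, again with an illustrative table and column family. The explicit flush at the end is only for demonstration: normally the MemStore is flushed automatically when it reaches its threshold.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WritePathExample {
    public static void main(String[] args) throws IOException {
        TableName name = TableName.valueOf("drivers");   // illustrative
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(name);
             Admin admin = connection.getAdmin()) {
            Put put = new Put(Bytes.toBytes("3"));
            put.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"),
                          Bytes.toBytes("Michelle"));
            // SYNC_WAL asks the Region Server to sync the edit to the WAL (step 1)
            // before it lands in the MemStore (step 2) and is acknowledged (step 3).
            put.setDurability(Durability.SYNC_WAL);
            table.put(put);
            // Normally the MemStore is flushed when it reaches its threshold (step 4);
            // flush() forces that dump to an HFile for demonstration purposes.
            admin.flush(name);
        }
    }
}
```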

Now let us take a deeper look at the role the MemStore plays in the write process.

HBase writing mechanism: MemStore

• The MemStore always keeps the data stored in it as KeyValues sorted in lexicographical (dictionary) order. Each column family has its own MemStore, so the updates of each column family are stored in sorted order separately.
• When the MemStore reaches its threshold, it dumps all its data into a new HFile in sorted order. This HFile is stored in HDFS, and HBase keeps multiple HFiles for each column family.
• Over time, the number of HFiles grows as the MemStore dumps its data.
• The MemStore also saves the last written sequence number, so the Master Server and the MemStore both know what has been committed so far and where to resume. When a region starts up, the last sequence number is read, and new edits start from that number.

As I have mentioned several times, the HFile is the main persistent storage in the HBase architecture: ultimately all data is committed to HFiles, HBase's permanent storage. Let us therefore look at the properties of HFiles that make reads and writes fast.

HBase architecture: HBase writing mechanism - HFile

• Writes are placed on disk sequentially, so the movement of the disk read/write head is very small. This makes both writing and searching very fast.
• Whenever an HFile is opened, its index is loaded into memory, which helps locate a record in a single seek.
• The trailer is a pointer to the meta blocks of the HFile. It is written at the end of the committed file and contains information such as timestamps and Bloom filters.
• The Bloom filter helps when searching for key-value pairs: it skips files that cannot contain the desired row key. Timestamps likewise help find the right version of the data and skip files that are out of range. (See the sketch below for enabling a Bloom filter on a column family.)
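Bloom filters are configured per column family. The sketch below creates an illustrative table whose column family uses a row-level Bloom filter (ROW is the default in recent HBase versions; ROWCOL is a finer-grained alternative).

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.regionserver.BloomType;
import org.apache.hadoop.hbase.util.Bytes;

public class BloomFilterExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            // ROW-level Bloom filters let reads skip HFiles that cannot contain the row key.
            admin.createTable(TableDescriptorBuilder.newBuilder(TableName.valueOf("drivers"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder
                            .newBuilder(Bytes.toBytes("personal"))
                            .setBloomFilterType(BloomType.ROW)
                            .build())
                    .build());
        }
    }
}
```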

Having covered the write mechanism and the role the various components play in making writes and searches faster, I will now explain how the read mechanism works in the HBase architecture. Then we will turn to the mechanisms that improve HBase performance, such as compaction, region splitting, and recovery.

HBase architecture: read mechanism

As we discussed in the search mechanism, if the location is not already in its cache, the client first retrieves the Region Server's location from the META table. The read then proceeds through the following steps in order:

• To read the data, the scanner first looks for the row's cells in the Block Cache, where all the most recently read key-value pairs are kept.
• If the scanner does not find the desired result there, it moves on to the MemStore, which, as we know, is the write cache. There it looks for recently written data that has not yet been dumped into an HFile.
• Finally, it loads the data from the HFiles, using Bloom filters and the block cache to avoid unnecessary disk reads. (A read sketch follows this list.)
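The sketch below issues a scan over an illustrative row-key range; on the server side, each row is resolved through exactly this Block Cache, then MemStore, then HFile sequence, transparently to the client.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("drivers"))) {
            // Scan a row-key range; for each row the Region Server consults the
            // BlockCache, then the MemStore, and finally the HFiles on HDFS.
            Scan scan = new Scan()
                    .withStartRow(Bytes.toBytes("1"))
                    .withStopRow(Bytes.toBytes("3"))
                    .addFamily(Bytes.toBytes("personal"));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result row : results) {
                    System.out.println(Bytes.toString(row.getRow()) + " -> " +
                            Bytes.toString(row.getValue(Bytes.toBytes("personal"),
                                                        Bytes.toBytes("name"))));
                }
            }
        }
    }
}
```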

So far, I have discussed HBase's search, read, and write mechanisms. Now let's look at the mechanisms that keep searching, reading, and writing fast in HBase, starting with compaction.

HBase architecture: compaction

[Figure: minor and major compaction]

HBase merges HFiles to reduce storage and to cut the number of disk seeks required for a read. This process is called compaction. A compaction selects some HFiles from a region and combines them. As shown in the figure above, there are two types of compaction.

  1. Minor compaction: HBase automatically picks some of the smaller HFiles and rewrites them into larger HFiles, as shown in the figure above. This is called minor compaction. It uses a merge sort to combine the smaller HFiles into larger ones, which helps optimize storage space.
  2. Major compaction: as shown in the figure above, a major compaction merges and rewrites all the HFiles of a region into one new HFile per column family. Deleted and expired cells are dropped in the process, which improves read performance.

During this process, however, disk I/O and network traffic can become congested; this is called write amplification. Major compaction is therefore usually scheduled during off-peak hours, and it can also be triggered manually, as in the sketch below.
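Compactions run automatically, but they can also be requested through the Admin API, for example from a script scheduled at off-peak hours. The table name below is illustrative.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CompactionExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            TableName name = TableName.valueOf("drivers");   // illustrative
            // Queue a minor compaction: merges a subset of the smaller HFiles.
            admin.compact(name);
            // Queue a major compaction: rewrites all HFiles of each column family
            // into one, dropping deleted and expired cells. Best run off-peak.
            admin.majorCompact(name);
        }
    }
}
```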

Another performance optimization process that I will discuss now is Region Split. This is very important for load balancing.

HBase architecture: region split

The following figure illustrates the mechanism of Region Split.
[Figure: region split mechanism]

Whenever a region grows too large, it is divided into two child regions, as shown in the figure above, each representing half of the parent region. The split is then reported to HMaster. Both child regions are served by the same Region Server until HMaster reassigns them to other Region Servers for load balancing. Splits can also be requested manually, as in the sketch below.
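Splits happen automatically once a region outgrows hbase.hregion.max.filesize, but the Admin API also allows requesting one explicitly, optionally with a split point. The table name and split key below are illustrative.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class SplitExample {
    public static void main(String[] args) throws IOException {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {
            TableName name = TableName.valueOf("drivers");   // illustrative
            // Ask HBase to split the table's regions at automatically chosen midpoints.
            admin.split(name);
            // A split point can also be supplied explicitly.
            admin.split(name, Bytes.toBytes("5"));
        }
    }
}
```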

Last but not least, I will explain how HBase recovers data after a failure. Failure recovery is a very important feature of HBase, so let us understand how it works.

HBase architecture: HBase crash and data recovery

• Whenever the Region Server fails, ZooKeeper will notify HMaster of the failure.
• HMaster then redistributes the regions of the crashed Region Server across the active Region Servers. To recover the MemStore data of the failed Region Server, HMaster also distributes its WAL to those Region Servers.
• Each Region Server replays the WAL to rebuild the MemStores for the column families of the failed regions.
• Data is written to the WAL in chronological order, so replaying the WAL re-applies all the changes and stores them in the MemStores.
• Once all the Region Servers have replayed the WAL, the MemStore data of all the column families is restored.

I hope this blog can help you understand the HBase data model and HBase architecture. Hope you like it.


