Apache HBase

Monday March 27, 2017

HBase on Azure: Import/Export snapshots from/to ADLS

by Apekshit Sharma, HBase Committer.


Azure Data Lake Store (ADLS) is Microsoft’s cloud alternative to Apache HDFS. In this blog, we’ll see how to use it as a backup store for snapshots of Apache HBase tables. You can export snapshots to ADLS for backup; for recovery, import the snapshot back to HDFS and use it to clone/restore the table. In this post, we’ll go over the configuration changes needed to make the HDFS client talk to ADLS, and the commands to copy HBase table snapshots from HDFS to ADLS and vice versa.


“The Azure Data Lake store is an Apache Hadoop file system compatible with Hadoop Distributed File System (HDFS) and works with the Hadoop ecosystem.”

ADLS can be treated like any other HDFS service, except that it lives in the cloud. But then how do applications talk to it? That’s where the hadoop-azure-datalake module comes into the picture. It enables an HDFS client to talk to ADLS whenever the following access-path syntax is used:

adl://<Account Name>.azuredatalakestore.net/

For example:
hdfs dfs -mkdir adl://<Account Name>.azuredatalakestore.net/test_dir

However, before it can access any data in ADLS, the module needs to be able to authenticate to Azure. That requires a few configuration changes, which we describe in the next section.

Configuration changes

ADLS requires an OAuth2 bearer token to be present in each request’s HTTPS headers. Users who have access to an ADLS account can obtain this token from the Azure Active Directory (Azure AD) service. To allow an HDFS client to authenticate to ADLS and access data, you’ll need to specify the token-related settings in core-site.xml using the following four configurations:
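The original configuration snippet did not survive in this copy of the post; the following is a sketch of the four dfs.adls.oauth2.* properties as documented for the hadoop-azure-datalake module, with placeholder values (the tenant ID, client ID, and client secret come from your Azure AD application registration):

```xml
<!-- core-site.xml: OAuth2 credentials for the hadoop-azure-datalake module -->
<property>
  <name>dfs.adls.oauth2.access.token.provider.type</name>
  <value>ClientCredential</value>
</property>
<property>
  <!-- Token endpoint of your Azure AD tenant -->
  <name>dfs.adls.oauth2.refresh.url</name>
  <value>https://login.microsoftonline.com/<Tenant ID>/oauth2/token</value>
</property>
<property>
  <name>dfs.adls.oauth2.client.id</name>
  <value><Client ID></value>
</property>
<property>
  <name>dfs.adls.oauth2.credential</name>
  <value><Client Secret></value>
</property>
```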



To find the values for dfs.adls.oauth2.* configurations, refer to this document.

Since all files/folders in ADLS are owned by the account owner, its ACL model doesn’t mesh well with that of HDFS, which can have multiple users. Because the user issuing commands through the HDFS client will differ from the identity in Azure AD, any operation that checks ACLs will fail. To work around this issue, use the following configuration, which tells the HDFS client to assume, for ADLS requests, that the current user owns all files.
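The workaround snippet is also missing from this copy; a sketch, assuming the adl.debug.override.localuserasfileowner property documented for the hadoop-azure-datalake module:

```xml
<!-- core-site.xml: report the local user as the owner of all ADLS files/folders -->
<property>
  <name>adl.debug.override.localuserasfileowner</name>
  <value>true</value>
</property>
```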


Make sure to deploy the above configuration changes to the cluster.

Export snapshot to ADLS

Here are the steps to export a snapshot from HDFS to ADLS.

  1. Create a new directory in ADLS to store snapshots.

$ hdfs dfs -mkdir adl://appy.azuredatalakestore.net/hbase

$ hdfs dfs -ls adl://appy.azuredatalakestore.net/

Found 1 items

drwxr-xr-x   - systest hdfs          0 2017-03-21 23:43 adl://appy.azuredatalakestore.net/hbase

  2. Create the snapshot. To learn more about this feature and how to create/list/restore snapshots, refer to the HBase Snapshots section in the HBase reference guide.
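For example, from the HBase shell (table ‘t’ and snapshot name ‘snapshot_1’ match the snapshotinfo output shown below):

```shell
$ hbase shell
hbase> snapshot 't', 'snapshot_1'
hbase> list_snapshots
```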

  3. Export the snapshot to ADLS.

$ sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot <snapshot_name> -copy-to adl://appy.azuredatalakestore.net/hbase


17/03/21 23:50:24 INFO snapshot.ExportSnapshot: Copy Snapshot Manifest

17/03/21 23:50:48 INFO snapshot.ExportSnapshot: Export Completed: snapshot_1

  4. Verify that the snapshot was copied to ADLS.

$ hbase snapshotinfo -snapshot <snapshot_name> -remote-dir adl://appy.azuredatalakestore.net/hbase

Snapshot Info


  Name: snapshot_1

  Type: FLUSH

 Table: t

Format: 2

Created: 2017-03-21T23:42:56

  5. It’s now safe to delete the local snapshot (the one in HDFS).

Restore/Clone table from a snapshot in ADLS

If you have a snapshot in ADLS which you want to use either to restore an original table to a previous state, or create a new table by cloning, follow the steps below.

  1. Copy the snapshot back from ADLS to HDFS. Make sure to copy it to the ‘hbase’ directory on HDFS, because that’s where the HBase service will look for snapshots.

$ sudo -u hbase hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot <snapshot_name> -copy-from adl://appy.azuredatalakestore.net/hbase -copy-to hdfs:///hbase

  2. Verify that the snapshot exists in HDFS. (Note that there is no -remote-dir parameter.)

$ hbase snapshotinfo -snapshot snapshot_1

Snapshot Info


  Name: snapshot_1

  Type: FLUSH

 Table: t

Format: 2

Created: 2017-03-21T23:42:56

  3. Follow the instructions in the HBase Snapshots section of the HBase reference guide to restore/clone from the snapshot.
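For reference, the corresponding HBase shell commands look like this (restoring requires disabling the table first; cloning creates a new table, shown here with a hypothetical name ‘t_clone’):

```shell
$ hbase shell
hbase> disable 't'
hbase> restore_snapshot 'snapshot_1'
hbase> enable 't'
# or create a new table from the snapshot:
hbase> clone_snapshot 'snapshot_1', 't_clone'
```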


The Azure module in HDFS makes it easy to interact with ADLS. We can keep using the commands we already know, and applications that use the HDFS client need only a few configuration changes. What a seamless integration! In this blog, we got a glimpse of HBase integration with Azure: using ADLS as a backup store for snapshots. Let’s see what the future has in store for us. Maybe an HBase cluster fully backed by ADLS!

