Apache Sentry
Getting Started with Sentry in Hive
Apache Sentry (incubating) is a highly modular system for providing fine-grained, role-based authorization to both data and metadata stored on an Apache Hadoop cluster. It currently works out of the box with Apache Hive and Cloudera Impala. In this blog post, you will learn how to use Sentry with Hive.
Sentry uses a policy provider to define access control for Hive. Sentry currently ships with a file-based policy provider; an example follows below. A single global policy file can control access to an entire HiveServer2 instance, and multiple dependent per-database policy files can be linked to the global one. Let's look at the structure of a policy file with an example.
Global policy file:
[groups]
admin_group = admin_role
dep1_admin = uri_role

[roles]
admin_role = server=server1
uri_role = hdfs:///ha-nn-uri/data

[databases]
db1 = hdfs://ha-nn-uri/user/hive/sentry/db1.ini
Per-database policy file (at hdfs://ha-nn-uri/user/hive/sentry/db1.ini):
[groups]
dep1_admin = db1_admin_role
dep1_analyst = db1_read_role

[roles]
db1_admin_role = server=server1->db=db1
db1_read_role = server=server1->db=db1->table=*->action=select
As you can see above, there are usually three sections in the global policy file:
- A [groups] section that provides group-to-role mapping
- A [roles] section that provides role-to-privileges mapping
- A [databases] (optional) section that provides database-to-per-database policy file mapping. This allows for maintaining per-database privileges separately.
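To make the section layout concrete, here is a minimal Python sketch that reads the global policy file from this post into a nested dictionary. It is only an illustration of the file structure, not how Sentry itself loads policies. The function name `parse_policy` is hypothetical, and the sketch assumes that multiple roles or privileges on one line are comma-separated.

```python
import configparser

# Global policy file contents, copied from the example in this post.
GLOBAL_POLICY = """\
[groups]
admin_group = admin_role
dep1_admin = uri_role

[roles]
admin_role = server=server1
uri_role = hdfs:///ha-nn-uri/data

[databases]
db1 = hdfs://ha-nn-uri/user/hive/sentry/db1.ini
"""


def parse_policy(text):
    """Return {section: {key: [values]}} for a policy file (hypothetical helper).

    Assumption: several roles or privileges on one line are comma-separated.
    """
    # Split only on '=' so that ':' inside HDFS URIs is never treated as a delimiter.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(text)
    policy = {}
    for section in parser.sections():
        policy[section] = {
            key: [v.strip() for v in value.split(",") if v.strip()]
            for key, value in parser.items(section)
        }
    return policy


policy = parse_policy(GLOBAL_POLICY)
print(policy["groups"])                 # group-to-role mapping
print(policy["roles"])                  # role-to-privileges mapping
print(policy.get("databases", {}))      # optional per-database policy files
```

Because only the first `=` on each line separates key from value, privileges such as `server=server1` survive intact as values.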
Sentry provides authorization through a hook in HiveServer2. When a user connects to HiveServer2, HiveServer2 authenticates the connecting user and persists the user information for the session. For each subsequent operation that user performs, Sentry authorizes the operation by mapping the user to the groups the user belongs to and determining whether those groups have the necessary privileges on the relevant objects.
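The group-to-role-to-privilege lookup described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Sentry's actual matching logic: the helper names (`parse_privilege`, `implies`, `is_authorized`) are hypothetical, a shorter granted privilege is assumed to cover everything beneath it, and `*` is assumed to match any value. The group and role data come from the per-database policy file in this post.

```python
# Mappings taken from the per-database policy file in this post.
GROUP_ROLES = {
    "dep1_admin": ["db1_admin_role"],
    "dep1_analyst": ["db1_read_role"],
}
ROLE_PRIVILEGES = {
    "db1_admin_role": ["server=server1->db=db1"],
    "db1_read_role": ["server=server1->db=db1->table=*->action=select"],
}


def parse_privilege(privilege):
    """Split 'server=server1->db=db1' into [('server', 'server1'), ('db', 'db1')]."""
    return [tuple(part.split("=", 1)) for part in privilege.split("->")]


def implies(granted, requested):
    """Simplified check (an assumption, not Sentry's real algorithm).

    A granted privilege covers a request if every one of its parts matches the
    corresponding part of the request; '*' matches any value.
    """
    granted_parts = parse_privilege(granted)
    requested_parts = parse_privilege(requested)
    if len(granted_parts) > len(requested_parts):
        return False  # the grant is more specific than the request
    for (g_key, g_val), (r_key, r_val) in zip(granted_parts, requested_parts):
        if g_key != r_key:
            return False
        if g_val != "*" and g_val != r_val:
            return False
    return True


def is_authorized(user_groups, requested):
    """Walk groups -> roles -> privileges and stop at the first match."""
    for group in user_groups:
        for role in GROUP_ROLES.get(group, []):
            for granted in ROLE_PRIVILEGES.get(role, []):
                if implies(granted, requested):
                    return True
    return False


select_t1 = "server=server1->db=db1->table=t1->action=select"
insert_t1 = "server=server1->db=db1->table=t1->action=insert"
print(is_authorized(["dep1_analyst"], select_t1))
print(is_authorized(["dep1_analyst"], insert_t1))
print(is_authorized(["dep1_admin"], insert_t1))
```

In this sketch the analyst group can only read tables in db1, while the admin group's database-level privilege covers any action on any table in db1. The table name `t1` is a made-up example.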
Hive security landscape with Sentry
Next, let's look at how Sentry fits into the security landscape of Hive. The infographic below shows how the different authentication and authorization pieces fit together.
Here are the main points to take away:
- Sentry requires that HiveServer2 be configured to use strong authentication. HiveServer2 supports Kerberos as well as LDAP (and AD) authentication mechanisms.
- At the Sentry authorization level, there are two supported forms of user-group mapping:
  - HadoopGroup mapping, which uses the underlying Hadoop groups
    - Hadoop groups in turn support Shell-based mapping as well as LDAP group mapping. Please note that in the case of Sentry with Hive, the mapping of users to groups is performed on the HiveServer2 host.
  - LocalGroups, where the users and groups can be defined locally in the policy file using a [users] section (for testing purposes only)
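The two mapping styles above can be contrasted with a small Python sketch. It is an illustration only and does not use any Sentry or Hadoop API. Both function names are hypothetical, the users `alice` and `bob` are made up, and the `[users]` format of `user = group1, group2` is an assumption. The OS-based lookup only approximates what a shell-based mapping would see, since it reads the local account database of whichever host runs it.

```python
# Hypothetical parsed [users] section: user -> list of groups (testing only).
USERS_SECTION = {
    "alice": ["dep1_admin"],
    "bob": ["dep1_analyst", "dep1_admin"],
}


def local_groups(users_section, user):
    """LocalGroups-style lookup: groups come straight from the policy file."""
    return set(users_section.get(user, []))


def os_groups(user):
    """Shell-style lookup: groups come from the local OS account database.

    Unix only. Raises KeyError if the user or one of its group ids is unknown
    on this host, which is why the host performing the lookup matters.
    """
    import grp
    import os
    import pwd

    primary_gid = pwd.getpwnam(user).pw_gid
    return {grp.getgrgid(gid).gr_name for gid in os.getgrouplist(user, primary_gid)}


print(sorted(local_groups(USERS_SECTION, "bob")))
print(sorted(local_groups(USERS_SECTION, "carol")))  # unknown user: no groups
```

Calling `os_groups` with an account name that exists on the machine returns that account's groups there, which mirrors the point that group resolution happens on the HiveServer2 host rather than on the client.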
Demo
In this demo, we will be using Kerberos authentication for HiveServer2 with HadoopGroups as the Sentry group provider, which by default uses Shell mapping. We briefly go over Sentry and see how to configure and use it in this configuration. (Note: Cloudera Manager 4.7 and CDH 4.4 are shown here; for future versions, the steps will be similar.)
Conclusion
Sentry brings in fine-grained authorization support for both data and metadata in a Hadoop cluster. It is already being used in production systems to secure the data and provide fine-grained access to its users. It is also integrated with the version of Hive shipping in CDH (upstream contribution is pending), Cloudera Impala, and Cloudera Search. Also, here is a short demo if you are interested in using it with Hue.
Posted at 09:00AM Dec 05, 2013 by sravya in General