On a secure Hadoop cluster, services often need to authenticate as themselves and then run applications on behalf of an end user.
Suppose a service user named service_runner wants to submit a YARN job and access HDFS as user bob. service_runner has Kerberos credentials but bob does not, and service_runner does not have the level of access that bob has.
The request therefore has to reach the NameNode or the job tracker (the ResourceManager on YARN) as user bob, over a connection authenticated with service_runner's Kerberos credentials. In other words, service_runner needs to impersonate bob.
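To make the idea concrete, here is a minimal sketch of how an impersonated request can be expressed over WebHDFS, which takes the impersonated user in a `doas` query parameter. The host name, port, and the helper `webhdfs_url` are assumptions made for illustration. The sketch only builds the URL; on a secure cluster, the request itself would still have to be authenticated with service_runner's Kerberos credentials (for example via SPNEGO).

```python
from urllib.parse import urlencode, quote


def webhdfs_url(namenode, path, op, doas):
    """Build a WebHDFS URL that asks the NameNode to act as `doas`.

    `namenode` is a hypothetical "host:port" string; adjust it to your cluster.
    """
    query = urlencode({"op": op, "doas": doas})
    # quote() leaves "/" intact, so the HDFS path keeps its structure.
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{query}"


# service_runner authenticates; bob is the user being impersonated.
print(webhdfs_url("namenode.example.com:9870", "/user/bob", "LISTSTATUS", "bob"))
```

Whether the NameNode honours the `doas` value depends on the hadoop.proxyuser.service_runner.* settings configured for the authenticated service user.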
Host-level control
<property>
<name>hadoop.proxyuser.service_runner.hosts</name>
<value>10.222.0.0/16,10.113.221.221</value>
</property>
<property>
<name>hadoop.proxyuser.service_runner.users</name>
<value>bob</value>
</property>
service_runner can impersonate bob, but only from hosts in 10.222.0.0/16 (10.222.0.0 through 10.222.255.255) or from 10.113.221.221.
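To see which addresses a hosts value like this actually covers, a quick check with Python's standard `ipaddress` module can help. This is only a sketch for reasoning about the CIDR range, not Hadoop's own matching code. It handles IP addresses and CIDR blocks only, and the helper name `host_allowed` is made up for illustration.

```python
import ipaddress

# Entries copied from the hadoop.proxyuser.service_runner.hosts value.
ALLOWED_HOSTS = ["10.222.0.0/16", "10.113.221.221"]


def host_allowed(client_ip, allowed=ALLOWED_HOSTS):
    """Return True if client_ip falls inside any configured entry."""
    addr = ipaddress.ip_address(client_ip)
    # ip_network() accepts CIDR blocks and single addresses (treated as /32).
    return any(addr in ipaddress.ip_network(entry) for entry in allowed)


network = ipaddress.ip_network("10.222.0.0/16")
# First address, last address, and size of the /16 block.
print(network[0], network[-1], network.num_addresses)

print(host_allowed("10.222.200.7"))    # inside the /16
print(host_allowed("10.113.221.222"))  # one past the single allowed host
```

A /16 covers 65,536 addresses, so it is worth confirming that the whole range really belongs to hosts you trust.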
Group-level control: too open
<property>
<name>hadoop.proxyuser.service_runner.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.service_runner.users</name>
<value>*</value>
</property>
service_runner can impersonate any user on the cluster, from any host.
Group-level control: restricted
<property>
<name>hadoop.proxyuser.service_runner.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.service_runner.groups</name>
<value>service_runner_execute</value>
</property>
service_runner can impersonate any user who is a member of the service_runner_execute group, from any host.
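The difference between the open and the restricted configuration can be sketched as a small decision function. This is a simplified model written for illustration, not Hadoop's implementation: `ProxyRule` and `may_impersonate` are hypothetical names, group membership is passed in by hand rather than resolved through Hadoop's group mapping, and the users "hdfs" and group "supergroup" are just example values.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class ProxyRule:
    """Models the hadoop.proxyuser.<service>.hosts/users/groups values."""
    hosts: list = field(default_factory=list)
    users: list = field(default_factory=list)
    groups: list = field(default_factory=list)


def _host_ok(rule, client_ip):
    if "*" in rule.hosts:
        return True
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(h) for h in rule.hosts)


def _user_ok(rule, user, user_groups):
    if "*" in rule.users or "*" in rule.groups:
        return True
    return user in rule.users or any(g in rule.groups for g in user_groups)


def may_impersonate(rule, user, user_groups, client_ip):
    # Both the host check and the user/group check have to pass.
    return _host_ok(rule, client_ip) and _user_ok(rule, user, user_groups)


open_rule = ProxyRule(hosts=["*"], users=["*"])
closed_rule = ProxyRule(hosts=["*"], groups=["service_runner_execute"])

# A privileged account that is NOT in the service_runner_execute group.
print(may_impersonate(open_rule, "hdfs", ["supergroup"], "192.0.2.10"))
print(may_impersonate(closed_rule, "hdfs", ["supergroup"], "192.0.2.10"))
# bob, who has been added to the group explicitly.
print(may_impersonate(closed_rule, "bob", ["service_runner_execute"], "192.0.2.10"))
```

Under the open rule even a privileged account can be impersonated, while the restricted rule only admits users who were deliberately placed in the group.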
So when you are onboarding a new product to your Hadoop cluster, blindly adding the vendor's recommended proxy-user settings might open your cluster to security issues, especially from malicious insiders. Consider limiting impersonation to a known group of users, so that access is granted by explicit inclusion rather than open to everyone by default.