Configuring Hadoop settings for an HDFS connection
Use the HDFS settings in the Hadoop data instance to configure connection details for the HDFS data sets.
- In the Connection tab of a Hadoop data instance, select the Use HDFS configuration check box.
- In the User name field, enter the user name to authenticate in HDFS.
- In the Port field, enter the port of the HDFS NameNode. The default port is 8020.
- Optional: To specify a custom HDFS NameNode, select the Advanced configuration check box.
In the Namenode field, specify a custom HDFS NameNode that is different from the one defined in the common configuration.
In the Response timeout field, enter the number of milliseconds to wait for the server response. Enter zero or leave it empty to wait indefinitely. The default timeout is 3000.
- In the KMS URI field, specify an instance of Hadoop Key Management Server to access encrypted files from the Hadoop server. For example, for a KMS server running on http://localhost:16000/kms, the KMS URI is kms://http@localhost:16000/kms.
- Optional: To enable secure connections, select the Use authentication check box.
In the Master kerberos principal field, enter the Kerberos principal name of the HDFS NameNode as defined and authenticated in the Kerberos Key Distribution Center, typically following the nn/<hostname>@<REALM> pattern.
In the Client kerberos principal field, enter the Kerberos principal name of a user as defined in Kerberos, typically in the following format: <username>/<hostname>@<REALM>.
In the Keystore field, enter the name of a keystore that contains a keytab file with the keys for the user who is defined in the Client kerberos principal field.
- Test the connection to the HDFS NameNode by clicking Test connectivity.
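The values entered in the steps above can be prepared and sanity-checked outside the product. The following Python sketch is illustrative only and is not part of the product. It assumes the KMS URI always follows the kms://<scheme>@<host>:<port>/<path> shape shown in the example, and that a plain TCP connection is an acceptable rough stand-in for a reachability check. All function names and the host namenode.example.com are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def kms_uri_from_url(url):
    """Rewrite an HTTP(S) KMS endpoint into the kms://<scheme>@<host>:<port>/<path> form."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http or https URL with a host, got {url!r}")
    return f"kms://{parts.scheme}@{parts.netloc}{parts.path}"


def kerberos_principal(primary, instance, realm):
    """Assemble a principal in the <primary>/<instance>@<REALM> pattern."""
    return f"{primary}/{instance}@{realm}"


def socket_timeout(timeout_ms):
    """Convert a Response timeout value in milliseconds to seconds.

    Zero or an empty value means "wait indefinitely", which the socket
    module expresses as None.
    """
    if timeout_ms in (None, "", 0, "0"):
        return None
    return int(timeout_ms) / 1000.0


def namenode_reachable(host, port=8020, timeout_ms=3000):
    """Return True if a TCP connection to host:port can be opened.

    This checks network reachability only. It does not authenticate and
    does not speak the HDFS protocol, so it is weaker than the
    Test connectivity button.
    """
    try:
        with socket.create_connection((host, port), timeout=socket_timeout(timeout_ms)):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Values taken from the examples in the steps above.
    print(kms_uri_from_url("http://localhost:16000/kms"))
    # Hypothetical host and realm, for illustration only.
    print(kerberos_principal("nn", "namenode.example.com", "EXAMPLE.COM"))
    print(namenode_reachable("namenode.example.com"))
```

A failed TCP check usually points to a wrong host, port, or firewall rule. A successful check does not guarantee that the user name or Kerberos settings are valid, so Test connectivity remains the authoritative check.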