
Configuring Hadoop settings for an HBase connection

Updated on May 17, 2024

Use the HBase settings in the Hadoop data instance to configure connection details for the HBase data sets.

By using the Hadoop infrastructure, you can process large amounts of data directly on the Hadoop cluster and reduce the data transfer between the Hadoop cluster and the Pega Platform. Hadoop configuration instances are records in the SysAdmin category and belong to the Data-Admin-Hadoop class.

Before you begin: Before you connect to an Apache HBase or HDFS data store, upload the relevant client JAR files into the application container that runs Pega Platform. For more information, see HDFS and HBase client and server versions supported by Pega Platform.
  1. In the header of Dev Studio, click Create > SysAdmin > Hadoop.
  2. On the Create Hadoop form, enter a description and a name for the Hadoop data instance.
  3. Click Create and open.
  4. On the Connection tab of the Hadoop data instance, select the Use HBase configuration check box.
  5. In the Client list, select one of the HBase client implementations.
    The selection of this setting depends on the server configuration.
    Choices and actions:
    REST
    Note: To use the REST implementation, an HBase server must be running.
    1. In the Port field, provide the port on which the REST gateway is set up.
      The default port is 20550.
    2. Optional: To use custom settings, select the Advanced configuration check box.
    3. Optional: In the REST host field, specify a custom REST host that is different from the one defined in the common configuration.
    4. Optional: In the Response timeout field, enter the number of milliseconds to wait for the server response. Enter zero to remove the timeout.
      The default timeout is 5000 milliseconds.
    Java
    1. In the Port field, provide the port for the Zookeeper service.
      The default port is 2181.
    2. Optional: To use custom settings, select the Advanced configuration check box.
    3. Optional: In the Zookeeper host field, specify a custom HBase Zookeeper host that is different from the one defined in the common configuration.
    4. Optional: In the Response timeout field, enter the number of milliseconds to wait for the server response. Enter zero to remove the timeout.
      The default timeout is 5000 milliseconds.
    5. Optional: To enable secure connections, select the Use authentication check box, and then go to step 6.
  6. Optional: To configure secure connections for Java, perform the following actions:
    Note: To authenticate with Kerberos, you must configure your environment. For more information, see the Kerberos documentation about the Network Authentication Protocol and the Apache HBase documentation on security.
    1. In the Master kerberos principal field, enter the Kerberos principal name of the HBase master node as defined and authenticated in the Kerberos Key Distribution Center, typically in the following format: hbase/<hostname>@<REALM>
    2. In the Client kerberos principal field, enter the Kerberos principal name of a user as defined in Kerberos, typically in the following format: <username>/<hostname>@<REALM>
    3. In the Keystore field, enter the name of a keystore that contains a keytab file with the keys for the user that you defined in the Client kerberos principal field.
      Note: The keytab file must be in a readable location on the Pega Platform server, for example: /etc/hbase/conf/thisUser.keytab or c:\authentication\hbase\conf\thisUser.keytab
  7. Test the connection to the HBase master node by clicking Test connectivity.
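
Before you fill in the form, it can help to confirm that the values you plan to enter are plausible. The following Python sketch is not part of Pega Platform and does not reproduce what Test connectivity does. It only checks that a TCP connection to the REST gateway or Zookeeper port opens within the response timeout, and that a Kerberos principal has the usual primary/instance@REALM shape. The function names can_connect and parse_principal are hypothetical, and the ports and timeout are the defaults quoted in the steps above.

```python
import re
import socket

# Defaults quoted in the procedure above.
DEFAULT_PORTS = {"REST": 20550, "Java": 2181}
DEFAULT_TIMEOUT_MS = 5000

# Usual Kerberos principal layout: primary[/instance]@REALM.
_PRINCIPAL = re.compile(
    r"^(?P<primary>[^/@\s]+)(?:/(?P<instance>[^/@\s]+))?@(?P<realm>[^/@\s]+)$"
)


def can_connect(host, port, timeout_ms=DEFAULT_TIMEOUT_MS):
    """Return True if a TCP connection to host:port opens within the timeout.

    Mirrors the Response timeout field: 0 means wait without a timeout.
    This checks reachability only, not HBase or Kerberos behavior.
    """
    timeout = None if timeout_ms == 0 else timeout_ms / 1000.0
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def parse_principal(principal):
    """Split a principal into (primary, instance, realm).

    instance is None when the principal has no "/<hostname>" part.
    Raises ValueError if the text does not look like a principal.
    """
    match = _PRINCIPAL.match(principal)
    if match is None:
        raise ValueError(f"not a Kerberos principal: {principal!r}")
    return match.group("primary"), match.group("instance"), match.group("realm")
```

For example, with a hypothetical host name, can_connect("hbase-rest.example.com", DEFAULT_PORTS["REST"]) reports whether the REST gateway port is reachable from the machine that runs the script, and parse_principal("hbase/master.example.com@EXAMPLE.COM") returns the three parts of a master principal. Run the check from the Pega Platform server itself, because reachability from another machine says nothing about the server's network path.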
