Connecting Kafka and Pega Platform

Updated on July 5, 2022

Apache Kafka is a fault-tolerant and scalable platform that you can use as a data source for real-time analysis of customer events as they occur. Create Kafka data sets to read data from and write data to Kafka topics, and use this data as a source of events, such as customer calls or messages. Your application can use these events as input for rules that process data in real time and then trigger actions.
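To make the event-stream concept concrete, the following minimal sketch reads records with the standard Apache Kafka Java consumer, outside of Pega Platform. The broker address (localhost:9092), the consumer group, and the customer-events topic are assumptions for illustration only; within Pega Platform, the Kafka data set manages this connection for you.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CustomerEventReader {
    public static void main(String[] args) {
        // Broker address, group ID, and topic name are assumptions for this sketch.
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "customer-event-reader");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("customer-events"));
            while (true) {
                // Each record is one event, for example a customer call or message,
                // delivered as a JSON string in this sketch.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("key=%s value=%s partition=%d%n",
                            record.key(), record.value(), record.partition());
                }
            }
        }
    }
}
```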

Before you begin this task, ensure that you understand the following points:

Kafka configuration instances
A Kafka configuration instance is a data instance that you create in the Data-Admin-Kafka class of your application. It defines the client connection between Pega Platform and an external Kafka server or server cluster. You reference this configuration instance when you create a Kafka data set.
Application Settings
An application settings rule allows your Kafka data set to use different topics in different environments (for example, development, staging, or production) without modifying and saving the data set rule in each environment.
Message format
Kafka data sets currently support two message formats: JSON and Avro. If you choose Avro as the message format, you must preconfigure an Avro schema; an example schema appears after this list.
Kafka data sets
Every Kafka server or server cluster to which you connect stores records in streams that are categorized as topics. To access a topic from Pega Platform, you create a Kafka data set rule. When you configure a Kafka data set, you can choose an existing topic from the target Kafka configuration instance, or create a topic if the Kafka cluster is configured for automatic topic creation. You can also specify partition keys that are applied to the data during distributed data flow runs, and choose to read historical Kafka records, that is, records that were created before the real-time data flow run that references this Kafka data set was started.
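If you choose Avro as the message format, the schema that you preconfigure is a JSON document that describes the structure of each message. The following minimal example is illustrative only; the record name and fields are assumptions, not a Pega Platform requirement:

```json
{
  "type": "record",
  "name": "CustomerCall",
  "namespace": "com.example.events",
  "fields": [
    { "name": "customerId", "type": "string" },
    { "name": "callTimestamp", "type": "long" },
    { "name": "durationSeconds", "type": "int" },
    { "name": "outcome", "type": ["null", "string"], "default": null }
  ]
}
```

Records that do not match the configured schema cannot be decoded, so the schema must stay in step with the producers that write to the topic.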
  1. Ensure that a Kafka configuration instance is available in your system.
    For more information, see Creating a Kafka configuration instance.
  2. Configure the Application Settings rule.
  3. Optional: Preconfigure an Avro schema.
  4. Create a Kafka data set.
    For more information, see Creating a Kafka data set.
What to do next: Use Kafka data sets as a source or a destination in a data flow rule; they are supported only in real-time mode. Because Kafka servers support partitioning, you can increase throughput and data flow processing resiliency by distributing data flow runs across all nodes that are configured as part of the data flow service, as sketched below. For more information on data flows, see Processing data with data flows.
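To illustrate how partition keys drive that distribution, the following sketch writes keyed records with the standard Apache Kafka Java producer. Kafka hashes the record key to select a partition, so records that share a key always land in the same partition and are processed in order. The broker address, topic name, and customer key are assumptions for illustration:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CustomerEventWriter {
    public static void main(String[] args) {
        // Broker address is an assumption for this sketch.
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The record key ("CUST-1001") acts as the partition key:
            // all events for the same customer land in the same partition.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "customer-events", "CUST-1001",
                    "{\"event\":\"call\",\"durationSeconds\":312}");
            producer.send(record, (metadata, exception) -> {
                if (exception == null) {
                    System.out.printf("wrote to partition %d at offset %d%n",
                            metadata.partition(), metadata.offset());
                }
            });
            producer.flush();
        }
    }
}
```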
  • Creating a Kafka configuration instance

    To manage connections to your Apache Kafka server or cluster of servers that is the source of your application stream data, configure a Kafka configuration instance in the Pega Platform Data-Admin-Kafka class.

  • Configuring application settings for Kafka data set topics

    Select the Use application settings with topic values option when you create a Kafka data set to use different topics in different environments (for example, development, staging, or production) without modifying and saving the data set rule in each environment.

  • Configuring Avro schema for Kafka data set

    When you configure a Kafka data set, you can choose Apache Avro as your data format for the Kafka message values and message keys. Avro is a lightweight binary message encoding, which relies on schemas to structure the encoded data.
