Connecting Kafka and Pega Platform
Apache Kafka is a fault-tolerant, scalable platform that you can use as a data source for real-time analysis of customer records as they arrive. Create Kafka data sets to read data from and write data to Kafka topics, and use this data as a source of events, such as customer calls or messages. Your application can use these events as input for rules that process data in real time and then trigger actions.
Before you begin this task, ensure that you understand the following points:
- Kafka configuration instances
- A Kafka configuration is a data instance that you create in the Data-Admin-Kafka class of your application. This instance establishes a client connection between Pega Platform and an external Kafka server or server cluster. You reference a Kafka configuration instance when you create a Kafka data set.
- Application Settings
- The Application Settings rule allows your Kafka data set to use different topics in different environments (for example, development, staging, or production) without modifying and saving the data set rule in each environment.
- Message format
- Kafka data sets currently support two message formats: JSON and Avro. If you choose Avro as the message format, you must preconfigure an Avro schema.
- Kafka data sets
- Every Kafka server or server cluster that you connect to stores streams of records in categories called topics. To access a topic from Pega Platform, create a Kafka data set rule. When you configure a Kafka data set, you can select an existing topic from the target Kafka configuration instance, or create a new topic if the Kafka cluster is configured for automatic topic creation. You can also specify partition keys to apply to the data during distributed data flow runs, and choose to read historical Kafka records, that is, records that were created before the real-time data flow run that references this Kafka data set was started.
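As a rough illustration of how a partition key routes records, Kafka's default partitioner hashes the key bytes and takes the result modulo the partition count. The sketch below substitutes a standard-library hash for Kafka's murmur2, so the exact partition numbers differ from a real broker; it only demonstrates the routing principle:

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Simplified stand-in for Kafka's default partitioner:
    hash the key bytes, then take the result modulo the partition
    count. A real Kafka client uses murmur2, so actual partition
    numbers on a broker will differ from this sketch."""
    key_hash = zlib.crc32(key.encode("utf-8"))  # stdlib hash for illustration
    return key_hash % num_partitions

# Records that share a key always land in the same partition,
# which preserves per-key ordering within that partition.
assert partition_for("customer-42", 6) == partition_for("customer-42", 6)
assert 0 <= partition_for("customer-7", 6) < 6
```

This per-key stickiness is why choosing a partition key such as a customer identifier keeps all events for that customer in order during distributed data flow runs.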
- Ensure that a Kafka configuration instance is available in your system. For more information, see Creating a Kafka configuration instance.
- Configure the Application Settings rule. For more information, see Configuring application settings for Kafka data set topics.
- Optional: Preconfigure an Avro schema. For more information, see Configuring Avro schema for Kafka data set.
- Create a Kafka data set. For more information, see Creating a Kafka data set.
- Creating a Kafka configuration instance
To manage connections to your Apache Kafka server or cluster of servers that is the source of your application stream data, configure a Kafka configuration instance in the Pega Platform Data-Admin-Kafka class.
- Configuring application settings for Kafka data set topics
Use the Use application settings with topic values option when you create a Kafka data set to use different topics in different environments (for example, development, staging, or production) without modifying and saving the data set rule in each environment.
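Conceptually, the setting resolves the topic name through a per-environment lookup. The sketch below illustrates the idea only; the environment and topic names are hypothetical, and in Pega Platform the mapping lives in an Application Settings rule rather than in code:

```python
# Hypothetical mapping; in Pega Platform this lives in an
# Application Settings rule, not in application code.
TOPIC_BY_ENV = {
    "development": "customer-calls-dev",
    "staging": "customer-calls-stg",
    "production": "customer-calls",
}

def topic_for(environment: str) -> str:
    """Resolve the Kafka topic for the current environment, so the
    data set rule itself never changes between environments."""
    return TOPIC_BY_ENV[environment]

assert topic_for("staging") == "customer-calls-stg"
```

Because only the setting value differs per environment, the same data set rule can be promoted from development to production unchanged.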
- Configuring Avro schema for Kafka data set
When you configure a Kafka data set, you can choose Apache Avro as your data format for the Kafka message values and message keys. Avro is a lightweight binary message encoding, which relies on schemas to structure the encoded data.
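For reference, an Avro schema is itself a JSON document. A minimal schema for a hypothetical customer-call record might look like the following; the record name, namespace, and field names are illustrative, not values that Pega Platform requires:

```json
{
  "type": "record",
  "name": "CustomerCall",
  "namespace": "com.example.events",
  "fields": [
    {"name": "customerId", "type": "string"},
    {"name": "callTimestamp", "type": "long"},
    {"name": "durationSeconds", "type": ["null", "int"], "default": null}
  ]
}
```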