SR-D60121 · Issue 525491
All interactions visible in "Latest Responses" for ADM
Resolved in Pega Version 8.3.2
Interactions were not visible in the "Latest Responses" section of the Model Management landing page for Adaptive models if the requests were stored on multi-node systems. This occurred because the system fetched the latest responses using a list of server nodes built with a version of deployment.getClusterState(tools) that returned only the ADM nodes list rather than all of the ADM nodes, both client and server. To resolve this, the system has been updated to use ServiceRegistry to get all of the data flow nodes and to fetch the latest responses from each of them.
SR-D60268 · Issue 521464
Performance and thread-handling improvements for SSA
Resolved in Pega Version 8.3.2
The SecureRandom class was used indirectly by SSAExecutionContext through UUID generation. Because this caused performance issues on some Linux environments, UUID has been replaced with a static AtomicLong. In addition, a memory leak was observed when the strategy (SSA) execution resulted in an exception, and the strategy template has been modified to shut down the VM gracefully under all circumstances. Thread-safety measures have also been made more fine-grained to reduce the thread contention that was seen while borrowing the SSAInterpreter object from SSAInterpreterPool.
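The following is a minimal sketch of the kind of change described above, not the actual SSAExecutionContext code. The class and method names (ExecutionIdGenerator, nextId) are hypothetical. It shows a static AtomicLong counter handing out identifiers without touching SecureRandom, which UUID.randomUUID() relies on.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; the names here do not come from the Pega code base.
final class ExecutionIdGenerator {
    // One shared counter per JVM; incrementAndGet() is atomic and lock-free.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private ExecutionIdGenerator() {
    }

    // Returns the next identifier without using SecureRandom.
    static long nextId() {
        return NEXT_ID.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println(nextId());
        System.out.println(nextId());
    }
}
```

Identifiers produced this way are unique only within a single JVM, which is a weaker guarantee than a random UUID, so this approach only suits identifiers that never need to be unique across nodes.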
SR-D66223 · Issue 529994
Update Handler will not run during migration
Resolved in Pega Version 8.3.2
Rolling restart of DataFlow, ADM, VBD, and Util Tiers failed with a PENDING_JOINING error after an in-place upgrade. This was traced to the logic for the update timing: when nodes start after an upgrade from 7.x to 8.x they will migrate data flow runs. Migration happens on only one node, and while it's in progress the other nodes will wait until migration finishes before they come up. At this point the state of the data flow services will be 'PENDING JOINING'. The issue was that while migrating runs, the Data Flow Update Handler was triggered to validate whether there were nodes available on the service the run belongs to. This call can cause the corresponding data flow service to be initialized, but the call will be blocked since all services wait for the migration to end. This resulted in a deadlock which prevented all nodes from coming up successfully. To resolve this, the process has been updated to skip the update handler during migration to avoid triggering the initialization of client services that are waiting on the migration lock.
SR-D66397 · Issue 530331
ADM out-of-sync corrected for multi-datacenter Cassandra cluster
Resolved in Pega Version 8.3.2
After setting up the multi-datacenter configuration for a Cassandra cluster that consisted of six nodes in datacenter 1 and three nodes in datacenter 2, failover testing revealed a mismatch in the number of ADM models stored in each datacenter. The mismatch was observed mostly in the number of records present in the "adm_scoringmodel" and "adm_response_commit_log_date_tiered" tables. When Cassandra nodes are down, the other nodes in the cluster will store hints (records to be written) for the down nodes. When these nodes come back online the hints are replayed to those nodes and the data is written. Hints are written for 3 hours, so if a node comes back up within 3 hours, the data is recovered and repairs are not required. For the tables that were getting out of sync across the two datacenters, gc_grace_seconds was set to zero seconds. The "gc_grace_seconds" attribute is not just used as the time for removal of tombstones; it is also used to set the TTL for records written to the system.hints table. That meant that when the hints were written for the ADM tables for the nodes that were down, they expired immediately because the TTL was 0, and they were not played back when the terminated nodes restarted and joined the cluster. This has been resolved with this fix for all customers new to this release. Existing customers already on v7.3 or higher will need to complete the local change detailed below:
Connect to the Cassandra cluster using cqlsh in the Pega Cassandra distribution and then run ALTER TABLE adm_commitlog.adm_response_commit_log_date_tiered WITH gc_grace_seconds = 86400; to change the relevant setting from zero to the equivalent of one day, the same length of time that the data in the table lives for. This means that any hints written can still be used to replay data to another node while the data itself is alive.
It also means, however, that given a constant load, a day's worth of expired ADM event data in the table will always be present on the disk, because the tombstones cannot be cleaned up for a day.
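The effect of the setting on hint replay can be illustrated with a simplified model. This is only a sketch of the behavior described in this note, not Cassandra code: the HintModel class and its methods are hypothetical, and it assumes the hint TTL is taken directly from the table's gc_grace_seconds value.

```java
import java.time.Duration;
import java.time.Instant;

// Simplified, hypothetical model of hint expiry; not part of Cassandra or Pega.
final class HintModel {
    private final Instant writtenAt;
    private final Duration ttl; // assumed to equal the table's gc_grace_seconds

    HintModel(Instant writtenAt, long gcGraceSeconds) {
        this.writtenAt = writtenAt;
        this.ttl = Duration.ofSeconds(gcGraceSeconds);
    }

    // A hint can be replayed only if it has not expired when the node returns.
    boolean isReplayableAt(Instant nodeBackOnline) {
        return nodeBackOnline.isBefore(writtenAt.plus(ttl));
    }

    public static void main(String[] args) {
        Instant written = Instant.parse("2020-01-01T00:00:00Z");
        Instant back = written.plus(Duration.ofHours(2));
        // gc_grace_seconds = 0: the hint has already expired when the node returns.
        System.out.println(new HintModel(written, 0).isReplayableAt(back));
        // gc_grace_seconds = 86400: the hint is still alive two hours later.
        System.out.println(new HintModel(written, 86_400).isReplayableAt(back));
    }
}
```

In this model a node that returns two hours after the hint was written receives nothing when the value is 0, but receives the replayed data when the value is 86400.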
SR-D68707 · Issue 529869
Update Handler will not run during migration
Resolved in Pega Version 8.3.2
Rolling restart of DataFlow, ADM, VBD, and Util Tiers failed with a PENDING_JOINING error after an in-place upgrade. This was traced to the logic for the update timing: when nodes start after an upgrade from 7.x to 8.x they will migrate data flow runs. Migration happens on only one node, and while it's in progress the other nodes will wait until migration finishes before they come up. At this point the state of the data flow services will be 'PENDING JOINING'. The issue was that while migrating runs, the Data Flow Update Handler was triggered to validate whether there were nodes available on the service the run belongs to. This call can cause the corresponding data flow service to be initialized, but the call will be blocked since all services wait for the migration to end. This resulted in a deadlock which prevented all nodes from coming up successfully. To resolve this, the process has been updated to skip the update handler during migration to avoid triggering the initialization of client services that are waiting on the migration lock.
SR-D69028 · Issue 528972
Deadlock in static Initialization of IntList resolved
Resolved in Pega Version 8.3.2
A JVM deadlock was seen related to the static initialization of a subclass field in the class com.pega.decision.strategy.ssa.runtime.collections.api.IntList. Thread dumps showed threads in the RUNNABLE state that were parked waiting for class initialization, and this was traced to code flagged by a Sonar alert that had been missed and that failed under multi-threading. To resolve this, the handling has been updated to prevent the potential deadlock.
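For readers unfamiliar with this class of problem, the sketch below shows one common way to avoid a class-initialization deadlock when a base class exposes an instance of its own subclass. It is an assumption about the general pattern, not the actual IntList fix, and all names (IntSequence, EmptyIntSequence, StaticInitDemo) are hypothetical.

```java
// Hypothetical names; the source does not show the actual IntList code.
abstract class IntSequence {
    // Risky pattern (kept as a comment only): a static field in the base class
    // that instantiates a subclass makes base-class initialization depend on
    // subclass initialization, while the subclass always depends on the base.
    // Two threads entering from opposite ends can then block each other:
    //   static final IntSequence EMPTY = new EmptyIntSequence();

    // Safer pattern: defer the subclass instance to a nested holder class,
    // so initializing IntSequence itself never touches the subclass.
    private static final class Holder {
        static final IntSequence EMPTY = new EmptyIntSequence();
    }

    static IntSequence empty() {
        return Holder.EMPTY;
    }

    abstract int size();
}

final class EmptyIntSequence extends IntSequence {
    @Override
    int size() {
        return 0;
    }
}

final class StaticInitDemo {
    public static void main(String[] args) throws InterruptedException {
        // Touch the base class and the subclass from different threads.
        Thread a = new Thread(() -> IntSequence.empty());
        Thread b = new Thread(() -> new EmptyIntSequence());
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(IntSequence.empty().size());
    }
}
```

With the holder in place, the initialization dependencies run in one direction only (holder, then subclass, then base class), so no cycle can form between threads.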
SR-D71621 · Issue 533294
Real time processing picks up correct datetime for Capture Response records
Resolved in Pega Version 8.3.2
A real-time data flow for the Capture Response flow was configured with a strategy shape set to load previous decisions within the past 7 days. Once this real-time data flow was started, attempting to Capture Response for decisions made after that startup time did not work. This was traced to the InteractionID being written with global properties for the datetimes, and has been resolved by making those datetime properties local so the start and end time are not cached and the time range is calculated based on "now".
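The difference between a cached time range and one calculated from "now" can be sketched as follows. This is an illustrative model only, not the Pega implementation. The LookbackWindow class is hypothetical, and the Clock parameter is an assumption added so the behavior can be demonstrated with fixed times.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical illustration of computing the window on every call.
final class LookbackWindow {
    private final Duration lookback;
    private final Clock clock;

    LookbackWindow(Duration lookback, Clock clock) {
        this.lookback = lookback;
        this.clock = clock;
    }

    // The start and end are derived from the current time on each call,
    // so decisions made after the flow started still fall inside the window.
    boolean contains(Instant decisionTime) {
        Instant end = clock.instant();
        Instant start = end.minus(lookback);
        return !decisionTime.isBefore(start) && !decisionTime.isAfter(end);
    }

    public static void main(String[] args) {
        Instant startup = Instant.parse("2020-01-01T00:00:00Z");
        Instant decision = startup.plus(Duration.ofHours(1));
        // A window frozen at startup misses the later decision.
        Clock frozen = Clock.fixed(startup, ZoneOffset.UTC);
        System.out.println(new LookbackWindow(Duration.ofDays(7), frozen).contains(decision));
        // A window evaluated two hours after startup includes it.
        Clock later = Clock.fixed(startup.plus(Duration.ofHours(2)), ZoneOffset.UTC);
        System.out.println(new LookbackWindow(Duration.ofDays(7), later).contains(decision));
    }
}
```

The first check stands in for the cached behavior, where the end of the range stays at the startup time; the second stands in for the corrected behavior, where the range moves forward with the current time.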
SR-D74117 · Issue 539460
DDS service will not run Hazelcast check if external Cassandra is configured
Resolved in Pega Version 8.3.2
Services were not responding, and thread dumps seen in the logs indicated that a large number of threads were waiting for one thread to return from getting the cluster state for a DSM process. Investigation showed that the threads were waiting for a Hazelcast response about the cluster state. However, since a Hazelcast call is not needed when Pega is configured with external Cassandra, the DDS Service code has been changed to not check for candidate nodes if it is configured with an external Cassandra cluster.
SR-D74247 · Issue 542915
Resolved errors when using Build Model from the Preview Console
Resolved in Pega Version 8.3.2
Using the Web Chatbot interface and trying to perform the Build Model action from the Preview Console failed with multiple errors, either "This action is not allowed as it is outside the current transaction" or "class <blank> doesn't exist". This was traced to issues with the transaction during model update, and has been resolved by conditionally disabling the show page step of pzGetModelProcessStatus. This step creates a difference in the context of the current transaction and is now disabled when called from the Update API.
SR-D75519 · Issue 536717
Corrected calculating propensities
Resolved in Pega Version 8.3.2
Several PMML models designed to compare the outcomes to a control dataset experienced an issue where the probability scores in Pega did not match the original control dataset. The PMML model was also tested using KNIME; those results matched Pega but not the original control dataset. Investigation showed that the JPMML evaluator contained outdated code, and the incorrect calculations have been resolved.