Planning your case archiving process

Updated on July 8, 2022

Plan your case archiving process so that it is optimally efficient and meets the needs of your organization's database size.

Before you begin: Before you run the archive process, complete the following tasks:
  • If you use an on-premises or client-managed cloud deployment, configure a secondary storage repository to which to copy your archive files. For more information, see Secondary storage repository for archived data.
  • For resolved cases that might be reopened in the near future, configure the data retention policy with extra time so that archiving does not lead to permanently deleting a case that you still need. To configure a data retention policy, see Configuring your data retention policy. Alternatively, exclude these case types from archiving so that these resolved cases cannot be permanently deleted. For more information about excluding case types, see Excluding cases from an archiving policy.
Running a successful archiving process means archiving cases faster than the rate at which the cases become eligible for archiving. When you start the case archiving process, you might have a backlog of cases that are eligible for archiving. The following figure shows how this backlog can grow as more cases become eligible for case archiving every day:
Figure: Case archiving plan. A case archiving policy ensures that the rate of archiving and purging meets or exceeds the rate of growth of eligible cases.
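
The relationship between the eligibility rate and the archiving rate can be sketched with a small simulation. This is only an illustration, not part of Pega Platform: the function name is hypothetical, the backlog of 10,000 cases and the 500 newly eligible cases per day come from the example later on this page, and the daily archiving capacities of 400 and 1,500 cases are assumed values.

```python
def simulate_backlog(initial_backlog, eligible_per_day, archived_per_day, days):
    """Return the backlog size at the end of each simulated day."""
    backlog = initial_backlog
    history = []
    for _ in range(days):
        backlog += eligible_per_day                # new cases become eligible
        backlog -= min(backlog, archived_per_day)  # archive up to the daily capacity
        history.append(backlog)
    return history


# Assumed capacity below the eligibility rate: the backlog keeps growing.
growing = simulate_backlog(10_000, 500, 400, days=5)

# Assumed capacity above the eligibility rate: the backlog shrinks.
shrinking = simulate_backlog(10_000, 500, 1_500, days=5)

print(growing)
print(shrinking)
```

In this sketch the backlog shrinks only when the daily archiving capacity is larger than the number of cases that become eligible each day.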

After you implement an archive process, you can monitor its performance and determine if it needs further changes by reviewing the following details and maintenance suggestions:

  • Run a report on the archived cases resulting from your archive process. This report is the best way to verify that the process meets your organization's archiving needs.
  • To improve searches of archived case data, optimize the properties on which you can configure a search index. After you archive case data, configure your archive search indexing based on these optimized properties. For details about property optimization, see Optimizing database properties; for details about marking a case type to be available in a search index, see Including indexed data during case search.
  • If your case archival policy includes case types with a parent/child hierarchy, both the parent and child cases must be resolved before the policy archives any of them. For more information, see Cases in a hierarchy.
  • If you imported a Rule-Admin-Product (RAP) rule with the Data-Retention-Policy class instances, review the case archiving settings for your cases, because this import overrides the case archiving policies.
  • To check the performance of your case archiving process (see substeps 3 and 4 of step 4), add the following custom indexes on the case type work table and on the pr_metadata table:
    • For the index on the work table for the case type, enter
      (pxobjclass, pyresolvedtimestamp) include (pystatuswork, pzinskey) where pxcoverinskey is null
    • For the indexes on the pr_metadata table, enter
      (pyisparent, pyarchivestatus) include (pzinskey)
      (pyobjclass, pyarchivestatus) include (pyInsKey, pzInsKey)
      (pyobjclass, pyarchivestatus) include (pyfilelocation, pzinskey, pyinskey)
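
The index definitions above list only the columns. As a rough sketch, they could be expanded into full statements as shown below, assuming a PostgreSQL database that supports INCLUDE columns and partial-index WHERE clauses. The helper function, the index names, and the work table name pc_my_case_work are hypothetical; substitute the names used in your own schema and confirm the syntax for your database vendor.

```python
def build_index_ddl(index_name, table, key_columns, include_columns, where=None):
    """Build a PostgreSQL-style CREATE INDEX statement from a column spec."""
    ddl = (
        f"CREATE INDEX {index_name} ON {table} "
        f"({', '.join(key_columns)}) INCLUDE ({', '.join(include_columns)})"
    )
    if where:
        ddl += f" WHERE {where}"
    return ddl + ";"


# Index on the case type work table (index and table names are hypothetical).
work_index = build_index_ddl(
    "idx_work_archive",
    "pc_my_case_work",
    ["pxobjclass", "pyresolvedtimestamp"],
    ["pystatuswork", "pzinskey"],
    where="pxcoverinskey IS NULL",
)

# First of the pr_metadata indexes (index name is hypothetical).
metadata_index = build_index_ddl(
    "idx_metadata_parent",
    "pr_metadata",
    ["pyisparent", "pyarchivestatus"],
    ["pzinskey"],
)

print(work_index)
print(metadata_index)
```

The remaining pr_metadata indexes follow the same pattern with their own key and include columns.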

    When running case archive jobs, use the following settings:

    • Configure dataarchival/LimitPerPolicy to change the rate at which you archive eligible cases.
    • Configure dataarchival/purgeQueryLimit to change the rate at which you delete records after they are archived to an external repository.

      For more information, see Settings that control case archiving processes.

    Plan to run your case archive jobs at an initially fast rate to clear your backlog of resolved cases and free up database space.

  1. Identify the number of cases in your initial backlog and the rate at which cases become eligible for archiving.
  2. Create a targeted archiving policy for your case archiving needs.
    For example, your system contains a backlog of 10,000 cases and adds 500 cases a day. For more information about creating an archiving policy, see Creating an archiving policy.
  3. Schedule a case archiving process that finishes during a time period of low-system load.
    For example, your system might experience low system load every day for five hours or every weekend for 12 hours. For more information about scheduling an archiving policy, see Scheduling a case archiving policy.
  4. Determine the upper range for case archive processing by adjusting the dataarchival/LimitPerPolicy and dataarchival/purgeQueryLimit settings:
    1. Set low values in the dataarchival/LimitPerPolicy and dataarchival/purgeQueryLimit dynamic system settings.
    2. Run a process with the pyPegaArchiver, pyPegaIndexer, and pyPegaPurger jobs by using the low value for the number of cases.
    3. Monitor the progress of your case archiving process. For more information, see Monitoring the progress of your case archiving process.
    4. Evaluate the performance impact and time that the low rate in your case archiving process has on your system.
    5. Increase the case archiving settings, and then run the process again with increased values.
      A higher rate processes more cases in each run, which uses more system resources and makes each run take longer to complete.
      Note: The pyPegaPurger job can take a significant amount of time, and even a small value, such as 5000, for the dataarchival/purgeQueryLimit setting can cause timeout errors. If you encounter an error, use a smaller value for dataarchival/purgeQueryLimit and then schedule multiple runs of only the pyPegaPurger job until pr_metadata is empty. An empty pr_metadata indicates that pyPegaPurger purged all the cases that the pyPegaArchiver, pyPegaIndexer, and pyPegaPurger jobs processed.
    6. Continue to increase the rate to determine the maximum value with which you can complete the process within a low system load duration and with an acceptable system impact.
    7. Optional: If you need to preserve archived case data, place an exclusion on archived cases to prevent permanent deletion of the case data.
  5. Clear your backlog by using the maximum rate you determined in the previous step.
    The maximum rate of case archiving determines the time frame for clearing your backlog.
  6. Adjust the case archive process after clearing your backlog:
    • If your current case archiving rate meets or exceeds the rate at which cases become eligible for archiving, keep that archiving rate.
    • If your current case archiving rate is slower than the rate at which cases become eligible for archiving, plan to run your case archiving jobs more frequently to archive faster.
    For example, if your archiving rate is slower than the rate at which cases become eligible for archiving, run your case archiving jobs twice a day instead of once.
    You can also quickly test an archiving policy using a test mode on non-production systems.
  7. To test an archiving policy in minutes instead of days, select the Enable testing mode check box.
    Note: You cannot enable testing mode on a production environment.
    1. Select the Enable Archiving check box to set a testing-mode archive.
    2. In the Archive cases that are resolved earlier than (in days) field, enter the number of minutes that a case must remain in a Resolved-* status before it is eligible for archiving. In testing mode, this value is read as minutes instead of days.

    Use the testing mode when you want to create and resolve sample cases, and then run an immediate case archive process to test the functionality.

    Note: While in testing mode, Pega Platform immediately and permanently deletes archived data from the external repository.
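
The arithmetic behind steps 4 through 6 can be sketched as follows. This is a planning aid only, not a Pega Platform API: the function names are hypothetical, the trial measurements are invented for illustration, and the sketch assumes a single archiving policy in which each run archives as many cases as the tested limit allows. The backlog of 10,000 cases, the 500 newly eligible cases per day, and the five-hour low-load window come from the examples in the steps above.

```python
import math


def pick_max_limit(trial_runs, window_hours):
    """Return the largest tested limit whose run finished inside the window.

    trial_runs maps a tested limit value to the measured run time in hours.
    """
    fitting = [limit for limit, hours in trial_runs.items() if hours <= window_hours]
    return max(fitting) if fitting else None


def days_to_clear(backlog, eligible_per_day, archived_per_day):
    """Days needed to clear the backlog, or None if the backlog never shrinks."""
    net = archived_per_day - eligible_per_day
    if net <= 0:
        return None
    return math.ceil(backlog / net)


def runs_per_day_needed(eligible_per_day, cases_per_run):
    """Minimum daily runs so that archiving keeps pace with newly eligible cases."""
    return math.ceil(eligible_per_day / cases_per_run)


# Hypothetical measurements from step 4: tested limit -> run time in hours.
trials = {500: 1.0, 1000: 2.5, 2000: 4.5, 4000: 7.0}

limit = pick_max_limit(trials, window_hours=5)          # step 4: maximum rate
clear_days = days_to_clear(10_000, 500, limit)          # step 5: time to clear backlog
steady_runs = runs_per_day_needed(500, cases_per_run=300)  # step 6: assumed slower rate

print(limit, clear_days, steady_runs)
```

With these assumed measurements, the largest limit that fits the five-hour window is 2,000 cases per run, which clears the example backlog in seven daily runs. If a run archived only 300 cases, two runs a day would be needed to keep pace with 500 newly eligible cases.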
What to do next:
  1. Continue to add and remove case exclusions as needed.
  2. Run your expunger job to permanently delete case data from your external repository.
    1. Enable a permanent deletion policy for your archived case data.

      For more information, see Creating an archiving policy.

    2. Schedule pyPegaExpunger to run after the pyPegaPurger job completes.

      When the job runs, it permanently deletes the eligible data based on your data retention policy.
