Support Play: Tuning the IBM JVM 1.4.2 for performance
Summary
As a Java application, Process Commander runs in the Java Virtual Machine (JVM). A Java application allocates all of its objects in the memory heap, which is managed by the Garbage Collector. To maximize the efficiency of an application running in a JVM, it is important to tune the JVM - specifically, the memory allocation and garbage collection functions.
You can install Process Commander with either of the most popular JVMs on the market: the Sun JVM and the IBM JVM. This article describes tuning that you can perform with the IBM JVM, and presents example messages that identify areas which could benefit from tuning.
As of January 2007, two IBM JVM versions are widely used by sites: 1.4.2 and Java 5. Although the concepts of garbage collection are the same for both, the implementation, and the garbage collection information reported in the log file, differ radically.
As of this writing, Pegasystems has not had enough experience with the Java 5 JVM to publish specific parameter information on the best way to tune with that JVM. Therefore, although this article mentions Java 5 in a few places, and although the concepts described apply to both versions, most of the specific information is for Java 1.4.2.
Quick Links
- Mark
- Sweep
- Compaction
- Parallel Mark
- Concurrent Mark
How To View Garbage Collection Statistics
Viewing verboseGC from the log
- Compaction data
- Concurrent Mark data
Viewing verboseGC from Process Commander Monitoring Tools
- Garbage Collection Statistics
- Log Usage Statistics
Troubleshooting an Out-Of-Memory Error
Suggested Approach
IBM has published the IBM Developer Kit and Runtime Environment Java 2 Technology Edition Version 1.4.2 Diagnostics Guide (Eighth Edition, April 2006). You can download this book as a PDF file from http://www-128.ibm.com/developerworks/java/jdk/diagnosis/142.html
This Support Play quotes the Diagnostics Guide in a number of places; each quotation is identified by page number.
Background
The IBM Java Virtual Machine (JVM) allocates space for applications to run in virtual memory. This space is known as the “heap.” On initialization, the JVM allocates the whole heap in a single contiguous area of virtual storage; the amount of memory that is allocated is determined by the setting of the -Xmx parameter (details on this setting appear later in this Play). Within that limit, the heap may expand or contract, depending upon the space required for various objects. (Specifics on precisely when the heap expands or contracts are outside the scope of this Play, but are described in the IBM Diagnostics Guide.) For the purposes of this Support Play, it is enough to know that the heap will handle this resizing efficiently, if the application is well written.
When a Java application runs, every time it creates a Java object for any purpose, memory space for that object must be allocated from the heap. Depending upon the type of object – its size, and the length of time it will remain “live” – space will be allocated in a different part of the heap; similar object allocations are grouped together for performance optimization.
Objects in Java are allocated space in the heap wherever there is enough contiguous space for that object. Similarly to data being stored on a PC’s hard drive, the heap space becomes fragmented over time. For example, the entire heap may contain enough total free space for a 640KB object, but if that space is broken up into 128KB chunks, it is not contiguous and cannot be used for this object. (Unlike a PC, which can store one large file in several locations, the space for a Java object allocation must be all together.)
Thus, the heap has a space for small objects (the Small Object Area, or SOA), where it is expected that many small objects will be stored, deleted, and replaced by new ones, and a space for large objects (the Large Object Area, called the “wilderness” in versions before 1.4.2, and now called the “LOA”), which is designed to hold objects of 64KB or larger. (For example, the Process Commander caches are generally allocated in the LOA.) Depending upon the usage of these areas (the number of small objects vs. large objects), the heap will automatically shrink or expand them as needed.
In addition to being grouped by size, objects are stored in different locations in the heap based on their tenure. Some objects are short-lived, and are discarded and ready for garbage collection quickly. Others, like the Process Commander caches, may be in use for as long as the application or the JVM is running. Beginning in Java 5, as objects “age,” they are moved from the short-term allocation area (the “nursery”) to the long-term area (“tenured”).
JVM Parameters
There are many parameters available for the IBM JVM, which allow customization of each system. The parameters detailed in this section are the ones that Process Commander recommends adjusting. Other parameters may also be adjusted at the discretion of your Java administrator.
- -Xms<size> Sets the initial size of the heap.
- -Xmx<size> Sets the maximum size of the heap.
The Java heap size is one of the most important tunable parameters of your JVM. It is especially important if several processes and JVMs are being run on your system. The heap contains all Java objects (live and discarded), along with free memory.
Some manuals recommend setting -Xmx and -Xms to the same value. However, using the same values is not generally a good idea; since garbage collection is based on how full your heap is, a large minimum value will delay the start of garbage collection until the heap is full. The first time that the Garbage Collector runs on this large heap will therefore take a long time, which makes it an expensive operation. Also, a large heap is more likely to become badly fragmented, leading to the need for a heap compaction - again, a very expensive operation. The recommendation is to start your application with the minimum heap size that it needs. When it starts up, the Garbage Collector will run often and, because the heap is small, efficiently. The heap will then expand to an appropriate size for your application.
The JVM should start with a small Java heap; it expands the heap based on an application’s allocation requests until it reaches the value specified by -Xmx. Expansion occurs after garbage collection if GC is unable to free enough heap storage for an allocation request, or if the JVM determines that expanding the heap is required for better performance. (IBM Diagnostics Guide, p. 28)
What you consider to be an acceptable heap size depends on your application; you will certainly need to experiment. In addition to balancing the frequency and length of garbage collections, you must also remember that memory that is allocated to one application’s heap is not available to other applications. This is an example of fixing a bottleneck in one area (by increasing heap size to decrease the frequency of garbage collection) and causing problems somewhere else; for example, other processes might have to use paging to supplement their diminished memory. Under no circumstances should heap size be larger than physical memory. (IBM p. 141)
Examples of the use of -Xms and -Xmx are:
-Xms2m -Xmx64m: Heap starts at 2 MB and grows to a maximum of 64 MB.
-Xms100m -Xmx100m: Heap starts at 100MB and can never grow.
-Xms20m -Xmx1024m: Heap starts at 20MB and grows to a maximum of 1GB.
-Xms50m: Heap starts at 50MB and grows to the default maximum.
-Xmx256m: Heap starts at default initial value and grows to a maximum of 256MB.
(IBM pp. 493-4)
IMPORTANT NOTE: When entering the -Xmx and -Xms information, make sure to type a lowercase "m" after the number: -Xms100m. If you use an upper-case M, Java will not read it, and will then report "out of memory" errors, because the memory was not set.
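For reference, here is how these options might appear together on a Java command line or in an application server's generic JVM arguments field (the values below are illustrative only, not a recommendation for any particular system):
-Xms256m -Xmx768m
With this illustrative setting, the heap starts at 256MB and can expand to a maximum of 768MB.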
Pinned Clusters
Objects on the Java heap are usually mobile; that is, the Garbage Collector can move them around if it decides to resequence the heap. Some objects, however, cannot be moved, either permanently or temporarily; these immovable objects are known as pinned objects.
Some objects become “pinned” because there are references to them that the Garbage Collector cannot verify; thus, they cannot be collected as garbage. There are also types of objects (such as class definitions) which are always pinned.
Pinned objects are stored in two areas:
- the kCluster
- the pCluster
The kCluster is an area of storage (outside the other allocation areas in the Java heap, separate from the SOA and the LOA) which is used exclusively for class blocks – loaded class definitions. When a class is referenced for execution, the JVM classloader must load the class and store its byte code signature within the JVM; that information is stored in the kCluster. By default, the kCluster area holds 1280 entries (class references).
When this area is full, the JVM begins allocating class definitions into the pCluster. The pCluster is an area of storage that is used to allocate any pinned objects; by default, it is 16 KB long.
When the kCluster is full, the Garbage Collector allocates class blocks in the pCluster. When the pCluster is full, the Garbage Collector allocates a new pCluster of 2 KB. Because this new pCluster can be allocated anywhere in the heap, it can cause problems. The pinned objects effectively deny the GC the ability to combine free space during heap compaction, and could result in a heap that contains a lot of free space, but in relatively small discrete amounts, so that an allocation that appears to be well below the total free heap space will fail. (IBM p. 12)
NOTE: The kCluster command-line option is an integer, representing the number of class entries. The pCluster option refers to the size of the area allocated (which is why there is a lower-case “k” at the end, signifying kilobytes).
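For example (the values shown here are for illustration only; the Process Commander-specific recommendations appear in the Tuning Strategy section below):
-Xk22000 reserves kCluster space for 22000 class entries.
-Xp16k,2k allocates an initial pCluster of 16 KB, with 2 KB overflow pClusters (the default behavior described above).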
Garbage Collection
As stated above, objects are allocated space in the heap, used, and eventually discarded. An object continues to be “live” while a reference (or pointer) to it exists somewhere in the system; when an object ceases to be referenced from the active state, it becomes garbage. Rather than keeping all the discarded objects in the heap (which would eventually take up all the space in the system), the Garbage Collector runs and removes the objects from allocation, freeing up the memory they occupied for reuse.
Memory management in the IBM JVM comprises the Garbage Collector and the Allocator, and is responsible for allocating memory as well as collecting garbage. Because the task of memory allocation is small compared with the task of garbage collection, the term “garbage collection” is also generally used to mean “memory management.”
The IBM Garbage Collector operates in a “stop-the-world” mode, because all application threads are paused while the garbage is collected – no user processing is accomplished during this time. Thus, it is important that garbage collections be as efficient as possible.
A high-level overview of the garbage collection process is as follows:
- The application needs to instantiate some new Java object, and tries to allocate this object in the heap.
- Not enough contiguous space exists in the heap for this object, so there is an allocation failure (memory allocation fault). (NOTE: This “failure” does not, at this point, show an error that a user would see; it will only be visible in the GC log, if verboseGC is enabled.)
- Due to the allocation failure, a garbage collection is triggered. The Garbage Collector collects and discards all the garbage in the heap that it can, freeing up allocation space.
- The allocation for the new Java object is retried.
Garbage collection can occur in three phases:
- Mark
- Sweep
- Compaction (optional)
NOTE: A Java program can explicitly invoke the Garbage Collector by using the System.gc() call. However, both Pegasystems and IBM recommend that this call not be used, especially in a runtime environment. The Garbage Collector is carefully adjusted to deliver maximum performance to the JVM; forcing it to run with an explicit call during processing will severely degrade JVM performance (IBM, p. 27).
Mark
In the mark phase, all the objects that are referenced in the system are identified, and also scanned for other references that they make. This identifies all the “live” objects, as well as the objects which are no longer referenced. Marking is a fairly expensive operation in terms of system resources.
Sweep
In this phase, the system goes through and collects all the unreferenced objects as “garbage,” freeing up the space they were allocated. Sweeps are quick, and do not require a lot of system resources.
Compaction
As stated above, a garbage collection is generally triggered by the system trying and failing to allocate heap space to a Java object. After the system marks all the objects and does a sweep to free up space, the allocation is attempted again. If the space in the heap is very fragmented, and there is still not enough contiguous space to allocate the object after a standard garbage collection, then a compaction occurs. This is similar to defragmenting a PC hard drive: all of the referenced objects are moved within the heap space so that they are compacted together, freeing up more contiguous space. Compaction occurs rarely, and requires a lot of system resources.
Parallel Mark
The majority of garbage collection time is spent marking objects. Therefore, a parallel version of mark has been developed. The goal of Parallel Mark is not to degrade mark performance on a uniprocessor, and to increase typical mark performance on a multiprocessor system. (IBM p. 15)
Rather than using just one thread for the mark phase, Parallel Mark uses one application thread and multiple helper threads. Parallel Mark is most useful on a multiprocessor system; a platform with N processors will have N-1 helper threads created by default. (It is possible to change the number of helper threads, but N-1 is optimal and should not be changed for most systems.) More marking is accomplished in less time, which results in performance improvements, because the mark work is split across several threads and the stop-the-world mark phase is correspondingly shorter.
For multiple-CPU systems, Parallel Mark is enabled by default; this setting can also be used in conjunction with Concurrent Mark (next section).
Concurrent Mark
Concurrent Mark allows reduced garbage collection times by starting a marking phase before the heap is full. Marking unreferenced items as garbage is done in the background, so that when an allocation failure occurs and garbage must be collected, only the much quicker sweep phase must be run, and the other threads are “stopped” for a much shorter time. The Garbage Collector tries to start the concurrent mark phase so that it completes at the same time as the heap is exhausted (IBM p. 16).
Note that although this processing is done in the background, it still will require some CPU. The overhead varies depending upon how much idle CPU time is available for the background thread.
By default, Concurrent Mark is disabled. To enable Concurrent Mark, use the following parameter:
-Xgcpolicy:optavgpause
How to View Garbage Collection Statistics
When doing any tuning of a JVM, begin by using the verboseGC setting to create a garbage collection log file (known as the “verboseGC log” or “verboseGC file”).
Important note: VerboseGC logging overhead is small, so enabling this switch in any environment is strongly recommended.
You can use these settings:
- -verbosegc, -verbose:gc Prints garbage collection information. The format for the generated information is not architected and therefore varies from platform to platform and release to release. (IBM p. 491)
- -Xverbosegclog:<path to file><filename> Causes verboseGC output to be written to the specified file. If the file cannot be found, verboseGC tries to create the file, and then continues as normal if it is successful. If it cannot create the file (for example, if an invalid filename is passed into the command), it will redirect the output to stderr. (Note: The environment variable IBM_JVMST_VERBOSEGC_LOG has been removed from V1.4.1 and above.) (IBM p. 494)
- -Xverbosegclog:<path to file><filename>,X,Y X and Y are integers. This option works similarly to -Xverbosegclog:<path to file><filename>, but, in addition, the verboseGC output is redirected to X files, each containing verboseGC output from Y GC cycles. (Note: The environment variable IBM_JVMST_VERBOSEGC_LOG has been removed from 1.4.1 and above.) (IBM p. 494)
To enable these settings, put them into the Java command line. For Process Commander, that generally involves using the application server Administrative Console (for WebSphere or WebLogic) or editing the setenv file (for Tomcat).
Whenever the setting is enabled, the verboseGC log file will be created. The default name begins with “native_stderr” and ends with “.log”; custom file names may be used by setting -Xverbosegclog.
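For example, on Tomcat the options might be added to the setenv file through a line like the following (the JAVA_OPTS variable name and the log path are illustrative only; use whatever mechanism your application server provides for passing JVM arguments):
JAVA_OPTS="$JAVA_OPTS -verbose:gc -Xverbosegclog:/opt/pega/logs/gc.log"
On WebSphere or WebLogic, the same options would be added through the Administrative Console's generic JVM arguments.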
The location of this file will vary according to the application server. For example, the WebSphere Diagnostic Data paper says the following:
“Native code running in a WebSphere Application Server process can write data to the process logs (also called native logs). Native code is non-Java code, typically found in files with .dll, .exe, and .so extensions. The process logs are named native_stdout.log and native_stderr.log. They are located in the <WAS_install_root>/profiles/<profile>/logs/<server> directory.”
(Sadtler, Carla, and David Titzler: WebSphere Application Server V6: Diagnostic Data. IBM Redbooks Paper, 2005, p. 12.)
NOTE: The output provided by -verbose:gc differs between Java 1.4.2 and Java 5. This Support Play covers the output log for Java 1.4.2.
Viewing verboseGC from the log
Below is sample output from a verboseGC log:
<AF[4]: Allocation Failure. need 8192 bytes, 19877299 ms since last AF>
<AF[4]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(4): freeing class com.ibm.ws.Transaction.JTA.XARecUtil(0x102c4b80)>
<GC(4): unloaded and freed 1 class>
<GC(4): GC cycle started Wed May 3 21:36:09 2006
<GC(4): freed 557940264 bytes, 74% free (598205504/805304832), in 162 ms>
<GC(4): mark: 150 ms, sweep: 12 ms, compact: 0 ms>
<GC(4): refs: soft 0 (age >= 32), weak 0, final 29094, phantom 0>
<AF[4]: completed in 163 ms>
<AF[5]: Allocation Failure. need 512 bytes, 18672452 ms since last AF>
<AF[5]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(5): GC cycle started Thu May 4 02:47:22 2006
<GC(5): freed 562285064 bytes, 74% free (602550304/805304832), in 149 ms>
<GC(5): mark: 137 ms, sweep: 12 ms, compact: 0 ms>
<GC(5): refs: soft 0 (age >= 32), weak 2, final 27321, phantom 0>
<AF[5]: completed in 150 ms>
Each GC message starts with an Allocation Failure, which is what generally triggers a garbage collection.
<AF[4]: Allocation Failure. need 8192 bytes, 19877299 ms since last AF>
All of the lines in the first section of this example begin with the number “4” (“AF[4]” “GC(4)”), signifying that this was the fourth garbage collection in this JVM. (In the next section, all the lines begin with “5”.)
The first line in a section shows the size of the object that needs allocation (“8192 bytes”), and the amount of time, in milliseconds, since the last Allocation Failure (“19877299 ms” or around 5 ½ hours).
<AF[4]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
The next line states that the system is getting ready to do something about the Allocation Failure. The numbers at the end show the free space in the heap: the first set of numbers shows the free space in the main part of the heap (“0/765039592” or no free space available out of 765MB) and the second set shows the free space in the LOA (40265240/40265240, or the entire 40MB is available).
The action shows what the Garbage Collector is going to do about the allocation failure. There are seven different actions (all detailed in the IBM Guide, p. 302), but the most common are:
Action | Description |
---|---|
1 | This action performs a garbage collection without using the LOA. It is designed to avoid compactions by keeping the LOA available for a large allocation request. |
2 | This action means that the Garbage Collector has tried to allocate out of the LOA and failed. |
Thus, in the above log example, the Garbage Collector was trying to allocate 8192 bytes in the regular area of the heap, and failed.
The next section of the log shows the garbage collection activities:
<GC(4): freeing class com.ibm.ws.Transaction.JTA.XARecUtil(0x102c4b80)>
<GC(4): unloaded and freed 1 class>
<GC(4): GC cycle started Wed May 3 21:36:09 2006
<GC(4): freed 557940264 bytes, 74% free (598205504/805304832), in 162 ms>
<GC(4): mark: 150 ms, sweep: 12 ms, compact: 0 ms>
<GC(4): refs: soft 0 (age >= 32), weak 0, final 29094, phantom 0>
The date and time of the start of the garbage collection cycle is shown. The next line shows how much memory was freed up by the garbage collection, and the percentage now free in the specified area (the regular area of the heap – in this case, 74% is now free – although not necessarily contiguous).
This section also shows how long the garbage collection took. For the above example, the collection took 162 milliseconds, or 0.162 seconds; of that, the mark phase took 150 ms and the sweep phase took 12 ms. No compaction was done.
The final line refers to details of the garbage collection, which this Support Play does not cover. (See the IBM Guide for full details on these settings.)
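If you want to analyze a verboseGC log outside of the Process Commander monitoring tools, a small utility can total the pause times and bytes freed. The following is a minimal sketch only, assuming the 1.4.2 log format shown above; the class name and the regular expression are illustrative and are not part of any Pegasystems or IBM tool.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sums the "freed ... in NNN ms" figures from a 1.4.2 verboseGC log to give
// a rough total of stop-the-world garbage collection time and bytes freed.
public class GcLogSummary {
    public static void main(String[] args) throws Exception {
        // Matches lines such as:
        // <GC(4): freed 557940264 bytes, 74% free (598205504/805304832), in 162 ms>
        Pattern gcLine = Pattern.compile("freed (\\d+) bytes.*in (\\d+) ms");
        long totalFreed = 0;
        long totalPauseMs = 0;
        int collections = 0;

        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            Matcher m = gcLine.matcher(line);
            if (m.find()) {
                totalFreed += Long.parseLong(m.group(1));
                totalPauseMs += Long.parseLong(m.group(2));
                collections++;
            }
        }
        in.close();

        System.out.println("Collections:    " + collections);
        System.out.println("Bytes freed:    " + totalFreed);
        System.out.println("Total GC pause: " + totalPauseMs + " ms");
    }
}
```

Run it with the log file path as the argument (for example, java GcLogSummary native_stderr.log) to get a quick total of stop-the-world time for comparison against the elapsed run time.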
Compaction data
Some allocations will require compaction:
<AF[40]: Allocation Failure. need 329232 bytes, 1551 ms since last AF>
<AF[40]: managing allocation failure, action=2 (367741616/805304832)>
<GC(40): GC cycle started Fri May 26 12:13:15 2006
<GC(40): freed 46126728 bytes, 51% free (413868344/805304832), in 2941 ms>
<GC(40): mark: 256 ms, sweep: 22 ms, compact: 2663 ms>
<GC(40): refs: soft 0 (age >= 32), weak 25, final 86, phantom 0>
<GC(40): moved 8134202 objects, 377239360 bytes, reason=16, used 170232 more bytes>
<AF[40]: managing allocation failure, action=3 (413868344/805304832)>
<GC(40): need to expand mark bits for 946272768-byte heap>
<GC(40): expanded mark bits by 2203648 to 14786560 bytes>
<GC(40): need to expand alloc bits for 946272768-byte heap>
<GC(40): expanded alloc bits by 2203648 to 14786560 bytes>
<GC(40): need to expand FR bits for 946272768-byte heap>
<GC(40): expanded FR bits by 4407296 to 29573120 bytes>
<GC(40): expanded heap by 140967936 to 946272768 bytes, 58% free, ratio:0.332>
<AF[40]: completed in 3181 ms>
In the above example, the system needed 329KB of contiguous space. As this is a large amount of space, it tried to allocate this into the LOA, and failed; evidently, although there is a lot of space available (367MB), there wasn’t enough contiguous space. Therefore, a GC cycle was started.
Obviously, if there is 367MB “free” but it is not possible to allocate a 329KB object, the LOA space is probably quite fragmented. Thus, the compaction runs. The system shows that many objects were moved to different locations in the heap:
<GC(40): moved 8134202 objects, 377239360 bytes, reason=16, used 170232 more bytes>
In this garbage collection, not only was a compaction necessary, but (based on the rules set up in Java) the heap needed to be expanded in order to allocate this object (as shown by the last few messages in this section).
Concurrent Mark data
A system which has Concurrent Mark enabled will show slightly different verboseGC output in the log:
<AF[4]: Allocation Failure. need 528 bytes, 84540 ms since last AF or CON>
<AF[4]: managing allocation failure, action=1 (0/510025904) (26843472/26843472)>
<GC(4): Bytes Traced =35614619 (Foreground: 27422093+ Background: 8192526) State = 1 >
<GC(4): GC cycle started Thu Sep 21 17:29:17 2006
<GC(4): wait for concurrent tracers: 1 ms>
<GC(4): freed 400674608 bytes, 79% free (427518080/536869376), in 150 ms>
<GC(4): mark: 121 ms, sweep: 29 ms, compact: 0 ms>
<GC(4): refs: soft 0 (age >= 32), weak 85, final 1590, phantom 1>
<AF[4]: completed in 154 ms>
The “bytes traced” number shows the number of bytes for objects the concurrent marking phase has tracked down to determine whether they can be garbage-collected or not.
The state is a significant number in the concurrent mark GC log. There are a number of possible states:
State | IBM Value | Description |
---|---|---|
0 | HALTED | The concurrent marking process was halted as a result of the allocation failure. |
1 | EXHAUSTED | The system went through the entire heap and marked everything that was possible to mark as garbage; there is nothing more to mark in the concurrent marking process. (NOTE: If the system gets to either EXHAUSTED state, a GC is automatically started.) |
2 | EXHAUSTED_BK_HELPER | Using background/helper threads, the system went through the entire heap and marked everything that was possible to mark as garbage; there is nothing more to mark in the concurrent marking process. (NOTE: If the system gets to either EXHAUSTED state, a GC is automatically started.) |
3 | ABORTED | An allocation failure occurred before the concurrent marking process was able to mark enough to be efficient. |
Viewing verboseGC from Process Commander Monitoring Tools
The Process Commander monitoring tool (the Monitor Servlet in Version 4.2, or the System Management Application in Version 5.1) allows data from the verboseGC log to be viewed in increments, by specifying a start time and a stop time.
There are three buttons on this window:
Field | Description |
---|---|
View | View the requested data on the page. |
Collect | Force an explicit garbage collection. NOTES: This button is provided for debugging purposes only – generally it is recommended to allow the JVM to perform the garbage collection on its own. As this button applies to garbage collection, it is grayed out when Log Usage Statistics is chosen. |
CSV | Create a comma-delimited report of the requested data for download. |
Garbage Collector Statistics
This view uses the logfile that is generated by the VerboseGC JVM parameter, and organizes the GC data (broken out for each hour) in order to give a picture of the garbage collection activity over time.
NOTE: Verbose GC output must be enabled in the JVM in order to use this feature.
Field | Description |
---|---|
GC Log File | Enter the full path to the garbage collection log output file, including the log file name. If the location of the log file is included in the JVM options specified when starting the application server, then this field will default to that log file path. |
JVM Start Time | Displays the start time for the system JVM. |
Start Time | Displays the start time parameter for the system JVM, which is when garbage collection began. Change this time to report the garbage collection data to investigate a particular time/date segment. |
Stop Time | Displays the stop time parameter for the system JVM, which is when garbage collection ended. Change this time to report the garbage collection data to investigate a particular time/date segment. |
To view the statistics:
1. Click the radio button next to the Garbage Collector Statistics section.
2. In the GC Log File field, enter the path to the log file, including the file name.
Example: f:\websphere\tempdir\logs\gclog.out
3. Set the Start and Stop times.
4. Click View to view the report on the screen, or CSV to create a CSV file to download the report to a local machine.
NOTE: The garbage collection statistics all apply to one JVM (one node) in the Process Commander system.
GC Statistic | Description |
---|---|
JVMStartTime | The start time for the JVM of the system. |
StartTime | The beginning of the time period chosen for this report. |
StopTime | The end of the time period chosen for this report |
TotalNumberOfCollections | The total of all garbage collections done in this JVM/node for the specified time period. |
TotalNumberOfMajorCollects | The number of major (or “full”) garbage collections done in this JVM/node for the specified time period |
TotalNumberOfMinorCollects | The number of minor garbage collections done in this JVM/node for the specified time period. NOTE: Minor Collects are not done in the IBM JVM (the statistic is here for Sun JVMs). |
TotalStorageFreed | The amount of space, in bytes, released in this JVM/node by all garbage collections done in the specified time period. When an object is released in Java, it goes to garbage collection, and the space required to hold that object in the JVM is freed. |
MajorStorageFreed | The amount of space, in bytes, released in this JVM/node by major garbage collections done in the specified time period. |
MinorStorageFreed | The amount of space, in bytes, released in this JVM/node by minor garbage collections done in the specified time period. NOTE: Minor Collects are not done in the IBM JVM (the statistic is here for Sun JVMs). |
TotalCollectionTime | The amount of time, in seconds, required by all garbage collections done in this JVM/node in the specified time period. |
MajorCollectionTime | The amount of time, in seconds, required to do the major garbage collections in this JVM/node in the specified time period. |
MinorCollectionTime | The amount of time, in seconds, required to do the minor garbage collections in this JVM/node in the specified time period. NOTE: Minor Collects are not done in the IBM JVM (the statistic is here for Sun JVMs). |
ElapsedTime | The amount of time, in seconds, between the Start and Stop times specified. |
FinalCollectTime | The last time any garbage collection was run in this JVM/node. |
CollectionElapsedTimePercentage | A comparison of the elapsed time to the total collect time. IMPORTANT: If the system is spending more than 3% of its processing time doing garbage collection, it is spending too much time, and a problem exists somewhere in the system setup. Check this measurement, and then if that is high, look at the other collection times to see the collection pattern. |
The Hourly Garbage Collection Statistics show the pattern of garbage collection over time. If the workload on the system is consistent, then the garbage collection should also be consistent. If the workload is consistent but there is a spike in the garbage collection, that should be investigated to see what kind of processing caused the spike.
Column heading | Description |
---|---|
Hour | The hour (or portion of the hour) included in the specified time period. |
Number of GC | The Total Number of Collections value for this hour. |
Amount | The Total Storage Freed, in bytes, for this hour. |
Collect Time | The Total Collection Time for this hour. |
Percentage | The Collection Elapsed Time Percentage for this hour. |
Log Usage Statistics
Log Usage shows overall system activity. Based on the number of interactions, the Log Usage will show various Performance Analyzer counts of system usage, so the system administrator can see what activities are going on from a system-level perspective.
For example, if a user complains that the system is slow at 4 p.m., the sysadmin can choose Start and Stop parameters to highlight that time and see whether there was a sudden spike in the number of users, or in activities run by existing users, or some other occurrence that might cause a system slowdown.
To view the statistics:
- Click the radio button next to the Log Usage Statistics section.
- In the Node Name field, enter the name of the node to be viewed.
- Set the Start and Stop times.
- Click View to view the report on the screen, or CSV to create a CSV file to download the report to a local machine.
NOTE: The Log Usage statistics all apply to one JVM (one node) in the Process Commander system.
For definitions of these statistics, see Using Performance Tools .
Tuning Strategy
It is important that the application be tuned before tuning the JVM, to make sure that its processing is efficient. If the JVM is tuned first, then some application problems may be masked.
For example, due to a complex work object structure with many drop-down lists requiring data from the BLOB, one site's application was essentially loading data from the entire database for each new work object - over 30MB of data per work object. They tuned the JVM so that it ran efficiently under this load and handled the massive amount of garbage this setup created, but it would have been much better had they looked at how their application worked, and adjusted that, before tuning the JVM itself.
The Installation Guide details starting settings for the upper and lower parameters of the heap, as follows:
- -Xmx768m
- -Xms256m
As pointed out in the Install Guide, these parameters are minimum amounts only, to enable you to begin using Process Commander. Your settings may need to be higher, based on your server hardware and the number of other applications on the server. Each application’s memory usage will vary, depending upon the load (number of users) and type of processing being done. Therefore, each application will require specific tuning, using the verboseGC information.
In addition, for a large application like Process Commander, which loads many classes, the default kCluster size of 1280 entries is too small. To avoid problems (described in the kCluster section), use the kCluster command-line option (-Xk) to expand this area:
-Xk20000
The kCluster size should be expanded to cover the number of classes loaded by a production-level Process Commander installation. That number may be found by checking the Class Summary section of the Class Loader Management page (in the System Console for 4.2, or the System Management Application for 5.1). This should display the number of classes that are loaded at any given time. IBM recommends a kCluster size of 22000, which should work fine for Process Commander.
The default pCluster setting is appropriate for Process Commander:
-Xp16k,2k
Do not change it.
NOTE: The kCluster command-line option is an integer, representing the number of class entries. The pCluster option refers to the size of the area allocated (which is why there is a lower-case “k” at the end, signifying kilobytes).
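Pulling these recommendations together, an illustrative starting set of JVM arguments might look like the following (a sketch only; adjust the heap sizes and the log file path for your own hardware, load, and application server):
-Xms256m -Xmx768m -Xk20000 -verbose:gc -Xverbosegclog:/opt/pega/logs/gc.log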
Tuning Your Application
Unlike some of the other performance areas (which should be checked all along as the application is being built), tuning for memory usage should be done after the application is completed.
Begin the tuning process by enabling verboseGC. Note your start time (so you can find info in the log file), and then start the application and run it with several users for some amount of time, making sure to sample all the processing (create, process, and close each type of work object; send correspondence; run reports, etc.). After all the features have been exercised, close the application, noting the stop time, and review the GC log.
There are a number of application issues that may be highlighted by the data in the verboseGC log, including:
- the volume of garbage
- quick allocation failures
- the same object being loaded over and over
- allocation failures consistently doubling
- the size of allocations vs. the number of users
- the heap continuing to grow
When any of these issues occurs, it is important to know not only what the JVM is doing, but also what the application is doing. Since it is difficult to infer application processing directly from the verboseGC log, the Process Commander monitoring tool described above shows the verboseGC log information (the garbage collection perspective) juxtaposed with the Log Usage information (the application perspective). This combined view shows an hourly breakdown of the activity in the system, and allows the system administrator to see what the application is doing versus what the JVM is doing, and how the two relate. If there is a garbage collection problem, and too much garbage is being generated, is that directly or indirectly related to the activities being run? Is a spike in garbage collection paralleled by a spike in processing? Is the growth across the two areas consistent - i.e., more activities run means more garbage - or is one growing faster than the other? Use this tool to trace garbage collection anomalies to actions in the application.
Volume of Garbage
As shown in the Viewing VerboseGC sections of this Support Play, each allocation failure message states how much space (in bytes) is needed for this allocation.
<AF[3]: Allocation Failure. need 10336 bytes, 7898744 ms since last AF>
If this number is ridiculously high (“need 759983392 bytes”) in numerous messages, then the recommendation is to look at what the application is trying to load. If the number is large enough, the processing it reflects may be triggering some of the ALERT messages in the ALERT log, such as the Interaction Byte Threshold (loading too much from the database into memory). Check both the Pega log and the ALERT log for pointers to determine what the application is doing that requires so much data.
Another facet of the volume is the amount of time that the system spends collecting the garbage. In the Garbage Collector Statistics in the monitoring tool, the CollectionElapsedTimePercentage is shown. If this percentage is higher than 3% for the system, then the system is spending too much time collecting garbage – the amount of time spent will become perceptible to users as a system slowdown.
Check the number of collects, and the amount of garbage collected (“Total Storage Freed”). High numbers here may point to issues like inefficient database queries (loading the entire BLOB into the clipboard in order to retrieve one property, and then throwing all that data away).
Important: The goal of tuning should be that garbage collections take less than 1% of processing time.
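As a worked example (the figures are illustrative): if the report covers one hour (3,600 seconds of elapsed time) and the total collection time for that hour is 144 seconds, the CollectionElapsedTimePercentage is 144 / 3,600 = 4%, which is above the 3% threshold and well above the 1% tuning goal. Roughly 36 seconds of garbage collection per hour would meet the 1% goal.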
Quick Allocation Failures
The first line of the allocation failure message also displays how long since the last Allocation Failure (the last garbage collection).
<AF[3]: Allocation Failure. need 10336 bytes, 7898744 ms since last AF>
In the above example, over two hours had passed since the last allocation failure, which is fine. However, there might be allocation failures which happen more quickly:
<AF[12]: Allocation Failure. need 524 bytes, 1009 ms since last AF>
Garbage collections should not happen more often than about every 5 seconds. If there is a string of allocation failures in the log which are all happening in quick succession, there may not be enough free space in the heap, or not enough contiguous free space, to allocate objects, causing frequent failures. Investigate why so much of the heap remains allocated (see the next section). Are objects not being released when they should be, or is the heap too small?
Size of Allocations vs. Number of Users
When running a small number of users on an application, each user should only require a reasonably small “footprint” in the system. To check this, look at the amount of garbage that is thrown away, but also check the amount of memory that is still allocated after a garbage collection.
<AF[180]: Allocation Failure. need 1594144 bytes, 15356 ms since last AF>
<AF[180]: managing allocation failure, action=2 (286688376/805304832)>
<GC(180): GC cycle started Mon May 22 21:25:50 2006
<GC(180): freed 17379958 bytes, 38% free (304068334/805304832), in 283 ms>
<GC(180): mark: 171 ms, sweep: 20 ms, compact: 92 ms>
<GC(180): refs: soft 0 (age >= 32), weak 47, final 112, phantom 0>
<GC(180): moved 114274 objects, 30519288 bytes, IC reason=14>
<AF[180]: completed in 284 ms>
In the above system, over 500MB of memory is being used. If there are only 10 users on the system at this point in processing, then each user is using around 50MB of memory on the system. This is excessive, and the reason for this amount of memory usage should be investigated. It is possible that users aren’t logging out correctly, but are creating multiple requestors; it generally takes over an hour for a requestor to automatically log off, so if each user has an extra ten or twenty requestors in the system, that will use up memory quickly. The Requestor Management tool (in the Monitor Servlet in V 4.2, or the System Management Application in V5.1) will show what requestors are currently logged into a node of the system.
Another check for this issue: when all the users are logged off the system, is the allocated memory freed up, or does the log still show only a small “% free” value?
Same Object Loaded Multiple Times
As stated above, the allocation failure message shows space required. If the system is trying to load the exact same size object over and over, something odd may be going on in the application – perhaps the same large flow is being loaded over and over again.
Example:
<AF[2]: Allocation Failure. need 8192 bytes, 3205 ms since last AF>
<AF[2]: managing allocation failure, action=0 (700753704/805304832)>
<GC(2): GC cycle started Wed May 3 13:53:13 2006
<GC(2): freed 59229688 bytes, 94% free (759983392/805304832), in 23 ms>
<GC(2): mark: 18 ms, sweep: 5 ms, compact: 0 ms>
<GC(2): refs: soft 0 (age >= 32), weak 0, final 39, phantom 0>
<AF[2]: completed in 24 ms>
<AF[3]: Allocation Failure. need 8192 bytes, 7898744 ms since last AF>
<AF[3]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(3): GC cycle started Wed May 3 16:04:52 2006
<GC(3): freed 596388880 bytes, 79% free (636654120/805304832), in 123 ms>
<GC(3): mark: 112 ms, sweep: 11 ms, compact: 0 ms>
<GC(3): refs: soft 0 (age >= 32), weak 25, final 12305, phantom 0>
<AF[3]: completed in 123 ms>
<AF[4]: Allocation Failure. need 8192 bytes, 19877299 ms since last AF>
<AF[4]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(4): freeing class com.ibm.ws.Transaction.JTA.XARecUtil(0x102c4b80)>
<GC(4): unloaded and freed 1 class>
<GC(4): GC cycle started Wed May 3 21:36:09 2006
<GC(4): freed 557940264 bytes, 74% free (598205504/805304832), in 162 ms>
<GC(4): mark: 150 ms, sweep: 12 ms, compact: 0 ms>
<GC(4): refs: soft 0 (age >= 32), weak 0, final 29094, phantom 0>
<AF[4]: completed in 163 ms>
<AF[5]: Allocation Failure. need 8192 bytes, 18672452 ms since last AF>
<AF[5]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(5): GC cycle started Thu May 4 02:47:22 2006
<GC(5): freed 562285064 bytes, 74% free (602550304/805304832), in 149 ms>
<GC(5): mark: 137 ms, sweep: 12 ms, compact: 0 ms>
<GC(5): refs: soft 0 (age >= 32), weak 2, final 27321, phantom 0>
<AF[5]: completed in 150 ms>
In the above example, the system tried four separate times to allocate space for an 8192-byte object. The application should be examined to see whether the same object is being loaded over and over, rather than being cached and reused.
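If the investigation shows that the same item really is being rebuilt on every request, the usual remedy is to cache it after the first load. Below is a minimal sketch of that pattern in plain Java; the class and method names are hypothetical and are not part of the Process Commander API.

```java
import java.util.HashMap;
import java.util.Map;

// Caches expensive-to-build objects by key so that repeated requests reuse one
// instance instead of allocating a new copy (and new garbage) on every call.
public class FlowCache {
    private final Map cache = new HashMap();

    public synchronized Object get(String key) {
        Object value = cache.get(key);
        if (value == null) {
            value = loadFromDatabase(key); // the expensive load happens once per key
            cache.put(key, value);
        }
        return value;
    }

    // Hypothetical placeholder for whatever expensive load the application performs.
    private Object loadFromDatabase(String key) {
        return new byte[8192]; // stands in for the repeatedly allocated 8192-byte object
    }
}
```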
Allocation Failures Doubling
Allocation failures may show a doubling in size:
<AF[6]: Allocation Failure. need 4000 bytes, 18868609 ms since last AF>
<AF[6]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(6): GC cycle started Thu May 4 08:01:50 2006
<GC(6): freed 561908936 bytes, 74% free (602174176/805304832), in 145 ms>
<GC(6): mark: 133 ms, sweep: 12 ms, compact: 0 ms>
<GC(6): refs: soft 0 (age >= 32), weak 0, final 27605, phantom 0>
<AF[6]: completed in 145 ms>
<AF[7]: Allocation Failure. need 8000 bytes, 19147611 ms since last AF>
<AF[7]: managing allocation failure, action=1 (0/773092640) (32212192/32212192)>
<GC(7): GC cycle started Thu May 4 13:20:58 2006
<GC(7): freed 567926568 bytes, 74% free (600138760/805304832), in 152 ms>
<GC(7): mark: 141 ms, sweep: 11 ms, compact: 0 ms>
<GC(7): refs: soft 0 (age >= 32), weak 0, final 28016, phantom 0>
<AF[7]: completed in 154 ms>
<AF[8]: Allocation Failure. need 16000 bytes, 19343908 ms since last AF>
<AF[8]: managing allocation failure, action=1 (0/781145688) (24159144/24159144)>
<GC(8): GC cycle started Thu May 4 18:43:22 2006
<GC(8): freed 575231280 bytes, 74% free (599390424/805304832), in 153 ms>
<GC(8): mark: 142 ms, sweep: 11 ms, compact: 0 ms>
<GC(8): refs: soft 0 (age >= 32), weak 0, final 28296, phantom 0>
<AF[8]: completed in 153 ms>
<AF[9]: Allocation Failure. need 32000 bytes, 19634399 ms since last AF>
<AF[9]: managing allocation failure, action=1 (0/789198736) (16106096/16106096)>
<GC(9): GC cycle started Fri May 5 00:10:37 2006
<GC(9): freed 581341368 bytes, 74% free (597447464/805304832), in 154 ms>
<GC(9): mark: 142 ms, sweep: 12 ms, compact: 0 ms>
<GC(9): refs: soft 0 (age >= 32), weak 0, final 28722, phantom 0>
<AF[9]: completed in 154 ms>
This could be a pointer to an issue in custom Java code in the application. If a developer used string buffers in the custom Java code, but neglected to set a specific initial capacity, then when data appended to the buffer is larger than its current capacity, the buffer automatically doubles. Thus, the next time it is used, it requires double the allocation space in memory. If the data is again larger than the buffer, the buffer doubles in size again, and again the allocation space doubles.
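The fix is to give the buffer a realistic initial capacity so the doubling never starts. A minimal illustration follows; the sizes and names are arbitrary examples, not code from any Process Commander application.

```java
public class BufferSizing {
    // Builds output from a large input string, illustrating buffer presizing.
    static String copyOf(String largeText) {
        // Default capacity (16 characters): appending large data forces the internal
        // character array to double repeatedly; each doubling allocates a new, larger
        // array on the heap, which shows up as ever-larger allocation requests.
        StringBuffer unsized = new StringBuffer();
        unsized.append(largeText);

        // Presized buffer: a single allocation of roughly the right size, so no doubling.
        StringBuffer presized = new StringBuffer(largeText.length() + 64);
        presized.append(largeText);
        return presized.toString();
    }
}
```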
Heap Keeps Growing
When the system is first started, the verboseGC log often shows some heap expansion messages:
<AF[2]: Allocation Failure. need 8192 bytes, 3205 ms since last AF>
<AF[2]: managing allocation failure, action=0 (700753704/805304832)>
<GC(2): GC cycle started Wed May 3 13:53:13 2006
<GC(2): freed 59229688 bytes, 94% free (759983392/805304832), in 23 ms>
<GC(2): mark: 18 ms, sweep: 5 ms, compact: 0 ms>
<GC(2): refs: soft 0 (age >= 32), weak 0, final 39, phantom 0>
<AF[2]: completed in 24 ms>
<GC[2]: Expanded System Heap by 65536 bytes
<GC[2]: Expanded System Heap by 65536 byte
However, after the system has been running for some time, most of the important rules and data should be cached, and the heap should not need to keep expanding. If the heap keeps growing, this should be investigated. There may be a memory leak, which happens when objects are not “dereferenced,” so the system doesn’t know to collect them as garbage.
Memory Leak
As stated above, there may be a memory “leak,” where objects which are no longer being used are still referenced somewhere in the system, so they cannot be collected as garbage. In this case, the heap will continue to expand until it hits its limit, and then begin giving Java Out Of Memory errors.
The first test for a memory leak is: as all the users log off the system, does the garbage collection number go down, and the “% free” number go up? Check the verboseGC log first thing in the morning, to see what the first couple of allocation failures show. If the “% free” number is very high first thing (“94% free”), then there is probably not a memory leak – some other problem is causing your symptoms.
If the “% free” number remains low (for example, “25% free” first thing in the morning with one user), then look for issues which are known to cause memory leaks:
- Check the version of the JVM that is running, and see if there are memory leak issues with that version.
- Check whether the system is running Jasper, and which version; there are known memory leak issues with some versions.
Tuning Your JVM
Tune the JVM after the application has been tuned. Once the application is as efficient as it can be, run the test again with verboseGC enabled and no load (one or two users) for some specified time, exercising all functionality, including starting up and shutting down the application. Check the verboseGC log to see the heap size at this stage - this should provide a rough guide to the starting size of the heap (the -Xms parameter) that is needed.
Then run the system through the same test under a heavy load (many users), exercising the same functionality, and examine the verboseGC log again.
As stated previously, the heap should grow during processing. The heap expands until it reaches a steady state, then remains in that state, which should give a heap occupancy (the amount of live data on the heap at any given time) of around 70%. At this level, the frequency and pause time of garbage collection should be acceptable. (IBM p. 9)
Example:
<AF[4]: Allocation Failure. need 8192 bytes, 19877299 ms since last AF>
<AF[4]: managing allocation failure, action=1 (0/765039592) (40265240/40265240)>
<GC(4): GC cycle started Wed May 3 21:36:09 2006
<GC(4): freed 557940264 bytes, 74% free (598205504/805304832), in 162 ms>
<GC(4): mark: 150 ms, sweep: 12 ms, compact: 0 ms>
<GC(4): refs: soft 0 (age >= 32), weak 0, final 29094, phantom 0>
<AF[4]: completed in 163 ms>
If the heap runs out of space, and “Java Out Of Memory” errors result, check the application processing. If processing is efficient, it may be that the heap just needs more space. Increase the -Xmx parameter until the “Out of Memory” errors cease.
NOTE: Do not increase the heap size (-Xmx parameter) until the other tuning has been done! It is vital not to mask other performance issues by expanding the heap size, perhaps at the expense of other applications running in the same JVM.
If the heap is fully expanded (at a steady state) but the occupancy level is greater than 70%, increase the -Xmx value so that the heap is not more than 70% occupied.
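As an illustrative calculation: the log line above shows roughly 207MB still in use after collection (805,304,832 total bytes at 74% free), which is about 26% occupancy and comfortably inside the 70% guideline. If, instead, a steady-state heap showed 600MB of live data, a maximum heap of at least 600 / 0.7, or roughly 860MB, would be needed to keep occupancy at or below 70%.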
NOTE: The heap should not be made larger than the physical memory in the machine. If it is larger, then some allocations will be paged to disk, which gives slower performance.
Also, set up the system to avoid compactions. Compaction is usually caused by requests for large memory allocations. Analyze requests for large allocations and avoid them if possible (IBM p. 10).
Enabling Concurrent Mark
Concurrent Mark processing should not be automatically enabled in a system. Background marking of the garbage can make garbage collection more efficient, but since there are background processes using CPU resources, it can also be less efficient.
The first factor to consider when determining whether to enable Concurrent Mark is what sort of processing will be done by the application. Since Concurrent Mark does background garbage marking, it uses some resources for overhead, but it also decreases the amount of time users cannot access the system (during garbage collection), thus improving response time. Therefore, for any system where response time to users is important, Concurrent Mark should be considered.
If there are no direct users for the Process Commander application – in other words, if Process Commander is being used as a batch processor – then Concurrent Mark should not be enabled. For batch processing, throughput is more important than response time, and the uninterrupted stop-the-world garbage collection gives better throughput than background collection.
For applications with direct users, where response time is important, the next step is to analyze the GC log to determine whether Concurrent Mark is useful for your particular application.
The background process of concurrent marking does take some time – at least 30 seconds. If allocation failures are occurring in the system faster than that, then the verboseGC log will consistently show that the concurrent marking has been ABORTED – State=3:
<AF[2]: Allocation Failure. need 524 bytes, 478 ms since last AF or CON>
<AF[2]: managing allocation failure, action=0 (528854792/536869376)>
<GC(2): Bytes Traced =0 (Foreground: 0+ Background: 0) State = 3 >
<GC(2): GC cycle started Thu Sep 21 17:27:44 2006
<GC(2): freed 3577488 bytes, 99% free (532432280/536869376), in 45 ms>
<GC(2): mark: 37 ms, sweep: 8 ms, compact: 0 ms>
<GC(2): refs: soft 0 (age >= 32), weak 0, final 7, phantom 0>
<AF[2]: completed in 45 ms>
In the above example, the allocation failures were 0.5 seconds apart. This was not enough time for the concurrent mark process to do anything, so there were no bytes traced, and the concurrent mark state is “3” (ABORTED). In this case, Concurrent Mark did not improve performance, but probably worsened it.
The developer should first reduce the frequency of allocation failures so that they are at least 30 seconds apart, and only then enable Concurrent Mark with -Xgcpolicy:optavgpause. This will allow the system to take advantage of background garbage marking.
Ongoing Tuning
Although both the application and the JVM may have been tuned initially, the system administrator’s work is not done. Tuning does not need to be done every week or every month. However, whenever a large change is made to the system, the sysadmin should re-examine the verboseGC output to ascertain whether additional tuning needs to be done. Such changes might include:
- Adding a group of new users. Adding one user will probably not require retuning the system, but adding multiple users may.
- Adding or changing hardware.
- Putting new RuleSets into production. New RuleSets mean new processing, which may require additional allocations.
- Upgrading to a new Process Commander version.
Troubleshooting an Out-of-Memory Error
As stated above, out-of-memory errors may be caused by a number of factors.
Investigate the garbage collection information.
- Did this problem occur due to a spike in allocation (suddenly, 100Mb needed to be allocated!), or was there a gradual filling of the heap until there was no more memory left?
- If the heap gradually filled, how many requestors are running? (Is each user's memory footprint too large?)
- What happened just before the user got the out-of-memory error? (Did the previous user’s processing try to load the whole database? The database data might have just fit into the memory with no errors, but now anything else that comes along will cause an error.)
Additional Resources
As stated above, IBM has published the IBM Developer Kit and Runtime Environment Java 2 Technology Edition Diagnostics Guides. There are different versions of this Guide available as PDF files from the IBM website.
For Java 5: Download the PDF from http://www-128.ibm.com/developerworks/java/jdk/diagnosis/
Of Special Interest For This Support Play:
- Chapter 2: Understanding the Garbage Collector
- Chapter 30: Garbage Collector Diagnostics
For Java 1.4.2: Download the PDF from http://www-128.ibm.com/developerworks/java/jdk/diagnosis/142.html
Of Special Interest For This Support Play:
- Chapter 2: Understanding the Garbage Collector
- Chapter 31: Garbage Collector Diagnostics
WebSphere Application Server v6 Diagnostic Data
http://www.redbooks.ibm.com/abstracts/redp4085.html?Open
Diagnostic Tool for Java Garbage Collector:
http://www.alphaworks.ibm.com/tech/gcdiag
IBM Pattern Modeling and Analysis Tool for Java Garbage Collector:
http://www.alphaworks.ibm.com/tech/pmat
HeapRoots:
http://www.alphaworks.ibm.com/tech/heaproots
HeapAnalyzer:
http://www.alphaworks.ibm.com/tech/heapanalyzer
Additional PDN Articles:
- Tech Note: Using Performance Tools (PAL), for V4.2 and for V5.1. Click a link on the System Tools page.
- How to detect and remedy the performance impact of Obj-List methods and list view report
- Platform Bulletin: IBM JVM Issues
Need Further Help?
If you have followed this Support Play, but still require additional help, contact Global Customer Support by logging a Support Request.