Understanding the pzPvStream (BLOB) database column
This presentation is part of the Designing and Building for Performance Self-Study Course.
Transcript
The full PRPC object image is stored in the database table within the pzPVStream column, defined as a BLOB datatype. BLOBs are very efficient and flexible mechanisms for object storage because:
- BLOBs are compressed by PRPC for low storage overhead.
- BLOBs can hold any amount of information since there are no physical limit constraints, such as maximum column lengths or page sizes.
- The object structure can be very complex, holding many levels of nested structures and repeating groups of information.
- The object structure can change from instance to instance without the involvement of the DBA in making complex changes to the database schema.
The following is an example of what occurs when requesting an object that requires reading the stream.
- The DBMS executes a Select to retrieve the object; a typical BLOB for a work object is 20,000 bytes or more in size.
- During this operation, three extra I/O operations are performed to read the pzPvStream, as this 20k of data is stored in 8k chunks on their own data pages.
- This 20k of data is transmitted as a "stream" across the network.
- The JVM allocates a large chunk of memory to receive this data. The stream is uncompressed using CPU resources and requires even more memory to hold the uncompressed data. This stream is parsed to find the data for the two referenced properties.
- The two requested clipboard property instances are created, consuming more CPU.
- Finally, the temporary memory is discarded, and the JVM soon invokes the garbage collector and takes more CPU to reclaim this 40k.
When reading just one instance of an object, such as when opening a work object, this overhead is hardly noticed. When using a List View, this process happens for each row in the list. The typical cap on List View reports is 500 rows, which means it is easy to generate (500 x 40,000 or 20MB of garbage). Multiply this by 100 users and 2 GB of garbage is generated!
This is one reason why memory allocation in production systems becomes excessive and how "garbage collection" within the JVM becomes strained. High CPU usage and high network utilization also affect user response time. Fortunately, by exposing columns, this issue can be easily remedied.