Importing data into a Postgres database
You can import data into a Postgres database from a .csv file by using the Data Upload utility. The Data Upload utility is useful for uploading data that changes frequently. For example, you can use it to upload changes to your product catalog. This utility is available only for tables that do not contain BLOB columns and only for Postgres databases.
To import data into a database table, perform the following high-level tasks:
- Create and compress the .csv data files
- Create the manifest.xml file that describes the .csv files to upload and how to upload them
- Configure the file directories that are used by the Data Upload utility
- Update the service package data instance access group
- Update the file listener data instance
- Copy the files to the configured file directories
- Start the file listener
- Optional: Configure a logging category or appender
Creating and compressing the .csv data files
Export your on-premises database tables to separate .csv files by using your database tools and processes. Optionally, compress the .csv files into a .zip file. Compressing large files reduces the time it takes to upload them. A .zip file is the only supported format for compression.
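As a sketch of this step, the following Python snippet bundles exported .csv files into a single .zip archive. The file names and the compress_csv_files helper are illustrative assumptions, not part of the utility:

```python
import csv
import tempfile
import zipfile
from pathlib import Path

def compress_csv_files(files, zip_path):
    """Bundle exported .csv files into one .zip archive
    (.zip is the only compression format the utility supports)."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=Path(f).name)
    return Path(zip_path)

# Demonstration with throwaway files; in practice, point at your exported .csv files.
workdir = Path(tempfile.mkdtemp())
for name in ("EmployeeAll_1.csv", "EmployeeAll_2.csv"):
    with open(workdir / name, "w", newline="") as f:
        csv.writer(f).writerows([["id", "name"], ["1", "Alice"]])

archive = compress_csv_files(sorted(workdir.glob("*.csv")), workdir / "Employee_all.zip")
```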
Creating the manifest file
Create a manifest file in XML format that describes the .csv data files to be uploaded and how to upload them. The manifest file name must be unique so that an existing manifest file is not overwritten if a new manifest file is uploaded before the previous upload has finished. The file name must be in the format Manifest*.xml, where * must be unique.
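One simple way to guarantee a unique name is a timestamp plus a random suffix. The unique_manifest_name helper below is a hypothetical illustration, not part of the utility:

```python
import time
import uuid

def unique_manifest_name():
    """Build a Manifest*.xml name where * is unique, so a new upload
    never overwrites a manifest that is still being processed."""
    # The random suffix keeps names unique even within the same second.
    return f"Manifest_{time.strftime('%Y%m%d%H%M%S')}_{uuid.uuid4().hex[:8]}.xml"

name = unique_manifest_name()
```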
Include the following attributes in the manifest file.
XML attribute | Details |
---|---|
process_option | Specify how to upload the files. The sample manifest below uses sequence. |
upload_date | Specify the date on which the .csv files are uploaded. You can use any date format. |
className | Specify the Pega® Platform class name that corresponds to the table. |
compressed | Specify whether the .csv file is compressed: yes or no. |
delimiter | Specify the delimiter that is used in the .csv file to delimit the columns. |
fileName | Specify the .csv file name. |
mode | Specify how to add data to the table: replace to replace the table's existing data, or append to add rows to the existing data. |
The following sample manifest file shows how to configure the XML to upload data files:
<?xml version="1.0" encoding="UTF-8"?>
<uploadfiledetails>
  <process_option>sequence</process_option>
  <upload_date>01-Jun-2016</upload_date>
  <uploadfiles>
    <uploadfile>
      <className>Employee-Temp</className>
      <compressed>no</compressed>
      <delimiter>,</delimiter>
      <fileName>Employee_test_01.csv</fileName>
      <mode>replace</mode>
    </uploadfile>
    <uploadfile>
      <compressed>yes</compressed>
      <fileName>Employee_all.zip</fileName>
      <subfiles>
        <subfile>
          <className>Employee-All</className>
          <compressed>no</compressed>
          <delimiter>,</delimiter>
          <fileName>EmployeeAll_1.csv</fileName>
          <mode>append</mode>
        </subfile>
        <subfile>
          <className>Employee-All</className>
          <compressed>no</compressed>
          <delimiter>,</delimiter>
          <fileName>EmployeeAll_2.csv</fileName>
          <mode>append</mode>
        </subfile>
      </subfiles>
    </uploadfile>
  </uploadfiles>
</uploadfiledetails>
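If you generate manifests programmatically, a sketch with Python's standard library could mirror the sample above. The build_manifest helper is a hypothetical illustration; only the element names come from the sample:

```python
import xml.etree.ElementTree as ET

def build_manifest(process_option, upload_date, files):
    """Build an uploadfiledetails manifest like the sample above.
    `files` is a list of dicts with className, compressed, delimiter,
    fileName, and mode keys."""
    root = ET.Element("uploadfiledetails")
    ET.SubElement(root, "process_option").text = process_option
    ET.SubElement(root, "upload_date").text = upload_date
    uploadfiles = ET.SubElement(root, "uploadfiles")
    for f in files:
        uf = ET.SubElement(uploadfiles, "uploadfile")
        for key in ("className", "compressed", "delimiter", "fileName", "mode"):
            ET.SubElement(uf, key).text = f[key]
    return ET.tostring(root, encoding="unicode")

xml_text = build_manifest("sequence", "01-Jun-2016", [
    {"className": "Employee-Temp", "compressed": "no",
     "delimiter": ",", "fileName": "Employee_test_01.csv", "mode": "replace"},
])
```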
The following file is the corresponding sample .xsd file:
<?xml version='1.0' encoding='UTF-8'?>
<xs:schema elementFormDefault="qualified" attributeFormDefault="unqualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="uploadfiledetails">
    <xs:complexType>
      <xs:all>
        <xs:element type="xs:string" name="process_option"/>
        <xs:element type="xs:string" name="upload_date"/>
        <xs:element name="uploadfiles">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="uploadfile" maxOccurs="unbounded" minOccurs="0">
                <xs:complexType>
                  <xs:all>
                    <xs:element type="xs:string" name="className" minOccurs="0"/>
                    <xs:element type="xs:string" name="compressed" minOccurs="0"/>
                    <xs:element type="xs:string" name="delimiter" minOccurs="0"/>
                    <xs:element type="xs:string" name="fileName" minOccurs="0"/>
                    <xs:element type="xs:string" name="mode" minOccurs="0"/>
                    <xs:element name="subfiles" minOccurs="0">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="subfile" maxOccurs="unbounded" minOccurs="0">
                            <xs:complexType>
                              <xs:all>
                                <xs:element type="xs:string" name="className" minOccurs="0"/>
                                <xs:element type="xs:string" name="compressed" minOccurs="0"/>
                                <xs:element type="xs:string" name="delimiter" minOccurs="0"/>
                                <xs:element type="xs:string" name="fileName" minOccurs="0"/>
                                <xs:element type="xs:string" name="mode" minOccurs="0"/>
                              </xs:all>
                            </xs:complexType>
                          </xs:element>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:all>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
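Python's built-in xml.etree cannot validate against an .xsd, but a lightweight structural pre-check before dropping a manifest into the listener directory can catch obvious mistakes. The check_manifest helper below is a hypothetical sketch; the .xsd above remains authoritative:

```python
import xml.etree.ElementTree as ET

REQUIRED = ("compressed", "fileName")

def check_manifest(xml_text):
    """Quick structural sanity check of a manifest.
    Returns a list of problems (an empty list means the check passed).
    Use an XSD-aware tool for full validation against the schema."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "uploadfiledetails":
        problems.append("root element must be uploadfiledetails")
    for tag in ("process_option", "upload_date"):
        if root.find(tag) is None:
            problems.append(f"missing {tag}")
    for i, uf in enumerate(root.iter("uploadfile")):
        for tag in REQUIRED:
            if uf.find(tag) is None:
                problems.append(f"uploadfile {i}: missing {tag}")
    return problems

sample = ("<uploadfiledetails><process_option>sequence</process_option>"
          "<upload_date>01-Jun-2016</upload_date><uploadfiles><uploadfile>"
          "<compressed>no</compressed><fileName>a.csv</fileName>"
          "</uploadfile></uploadfiles></uploadfiledetails>")
```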
Configuring the file directories
Configure the directories to use for decompressing and copying the files in Data-Admin-System-Settings (DASS). Because the file listener and Data Upload utility move and rename files, the Pega Platform server must have read and write access to the configured file directories.
Create the following Dynamic System Settings:
- File directory
- Stage directory
- Manifest file directory
To create the settings:
- In the Records Explorer, click
- Click Create.
- In the Short description field, enter a short description.
- In the Owning Ruleset field, enter the owning ruleset, for example, "Pega-RulesEngine".
- In the Setting purpose field, enter the system setting key, for example, "File upload directory".
- Click Create and open.
- In the Value field, enter the file path as shown in the table, for example, "DataUploadUtility/env/loc/uploadfile".
- Click Save.
- Repeat this procedure for the remaining settings.
Updating the service package data instance
Update the access group for the service package data instance. This access group must have access to all the Pega Platform classes into which data is uploaded.
- In the Explorer panel, click .
- In the SERVICE PACKAGE NAME column, find and click UploadUtilityServicePackage.
- In the Service access group field, select the operator's access group.
- Click Save.
Updating the file listener data instance
Update the file listener data instance so that it runs at startup.
- In the Explorer panel, click .
- In the LISTENER NAME column, find and click UploadUtilityManifestListener.
- Clear the Block startup check box to enable this file listener.
- In the Startup option field, select Run on all nodes.
- In the Requester login section, add the username and password for the administrator of the application.
- Click Save.
Copying the files to the configured file directories
Copy the .csv files, and then copy the manifest file to the file directories that are specified in the DASS. The manifest file must be copied last because its arrival triggers the Data Upload utility to upload the data.
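The ordering requirement can be sketched as follows. The stage_upload helper and the directory variables are illustrative; the real directories are the ones you configured in the DASS:

```python
import shutil
import tempfile
from pathlib import Path

def stage_upload(csv_files, manifest, file_dir, manifest_dir):
    """Copy the .csv files first and the manifest last: the listener
    starts the upload as soon as it sees the manifest, so the data
    files must already be in place."""
    for f in csv_files:
        shutil.copy2(f, file_dir)
    shutil.copy2(manifest, manifest_dir)  # copied last; triggers the upload

# Throwaway demonstration with temporary directories.
src = Path(tempfile.mkdtemp())
file_dir = Path(tempfile.mkdtemp())
manifest_dir = Path(tempfile.mkdtemp())
(src / "Employee_test_01.csv").write_text("id,name\n1,Alice\n")
(src / "Manifest_demo.xml").write_text("<uploadfiledetails/>")
stage_upload([src / "Employee_test_01.csv"], src / "Manifest_demo.xml",
             file_dir, manifest_dir)
```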
Starting the file listener
Start the file listener in the System Management Application (SMA).
- Log in to SMA.
- Connect to the node on which the file listener is running, and select Listener Management.
- If the UploadUtilityManifestListener listener is not listed as running, select it from the list of available listeners.
- Click .
Optional: Configuring a logging category
Errors that occur when you run this utility are logged to the Pega log file. Errors that occur because of the content of a .csv file, such as text data in a numeric field, result in a JDBC exception in the log file. You can configure a new logging category named DataUploader to write the output of this utility to a separate log file. Add the DataUploader category, with the level set to "debug", to the prlogging.xml file (Pega 7.2 or earlier) or the prlog4j2.xml file (Pega 7.3 or later). For more information about setting the logging level, see About the logging level settings tool.
You can also create a logging appender to write the output of this utility to another file or target by configuring a new appender that contains the DataUploader category with the level set to "debug." For more information, see Configuring Pega file logging appenders.
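For Pega 7.3 and later, the category and a dedicated appender in prlog4j2.xml might look like the following sketch. The appender name, log file path, and layout pattern here are illustrative assumptions, not platform-mandated values:

```xml
<!-- Sketch only: appender name, file path, and pattern are illustrative assumptions. -->
<Appenders>
  <RollingFile name="DATAUPLOADER"
               fileName="DataUploader.log"
               filePattern="DataUploader-%d{MM-dd-yyyy}.log">
    <PatternLayout pattern="%d [%t] (%c) %-5p - %m%n"/>
    <TimeBasedTriggeringPolicy/>
  </RollingFile>
</Appenders>
<Loggers>
  <!-- Routes DataUploader output to the dedicated file instead of the Pega log. -->
  <Logger name="DataUploader" level="debug" additivity="false">
    <AppenderRef ref="DATAUPLOADER"/>
  </Logger>
</Loggers>
```

These elements belong inside the existing Configuration element; merge them into the Appenders and Loggers sections that are already present rather than adding duplicate sections.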