File Data Set file path pattern

Updated on July 5, 2022

To process data in parallel by using multiple threads, the File Data Set operates on a collection of files instead of a single file. The file path pattern can include tokens that the system uses to match existing files against the pattern or to generate names for new files.

File path pattern tokens

When you specify the file path pattern for a File Data Set, you can use the following file path pattern tokens:

Date / time token
The date / time token is represented by %{FORMAT} where FORMAT is any date time format accepted by java.time.format.DateTimeFormatter with the exception of timezone specifications "zzzz", "V", "X", "x", "Z", and "O".
The date / time token matches file names that contain a date and time in the specified format.
For new files, the date / time token generates the current date and time in the specified format.
Wildcard token
The wildcard token is represented by the * (asterisk) character.
For existing files, the wildcard token matches any non-empty string.
For new files, it generates a unique ID to guarantee file name uniqueness.
Implicit token
If the wildcard token is not present, the implicit token is added to the pattern to guarantee file name uniqueness.
The token is added at the end of the file name, immediately before the extension.
An implicit token matches the file names that do not contain a token value, or contain a value consisting of the - (hyphen) character followed by any string.
New file names are generated with - (hyphen), followed by a string that guarantees file name uniqueness.
A new file name without a token value is never generated. For example, for the path file.json, a file called file.json is never generated.
Note: The following constraints apply to tokens:
  • Tokens are not allowed in the directory name.
  • Only one date / time token is allowed.
  • Only one wildcard token is allowed.
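
The constraints above can be checked mechanically before a pattern is used. The following Java sketch is only an illustration and is not part of any product API: the class name PatternConstraintSketch, the method violation, and the returned messages are hypothetical, and it assumes that / is the directory separator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
final class PatternConstraintSketch {
    // A date / time token has the form %{FORMAT}.
    private static final Pattern DATE_TOKEN = Pattern.compile("%\\{[^}]*\\}");

    /** Returns null when the pattern satisfies the constraints, otherwise a short reason. */
    static String violation(String pathPattern) {
        // Assumption: '/' separates the directory from the file name.
        int slash = pathPattern.lastIndexOf('/');
        String dir = slash >= 0 ? pathPattern.substring(0, slash) : "";
        String name = pathPattern.substring(slash + 1);

        if (dir.contains("*") || DATE_TOKEN.matcher(dir).find()) {
            return "directory has a token";
        }
        if (count(DATE_TOKEN.matcher(name)) > 1) {
            return "more than one date / time token";
        }
        // Remove date / time tokens first so that only wildcard tokens are counted.
        String withoutDates = DATE_TOKEN.matcher(name).replaceAll("");
        long wildcards = withoutDates.chars().filter(c -> c == '*').count();
        if (wildcards > 1) {
            return "more than one wildcard token";
        }
        return null;
    }

    private static int count(Matcher m) {
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }
}
```

For example, violation("dir*/file.json") reports that the directory has a token, while violation("file_*_123_%{yyyyMMdd}_456.json") returns null because the pattern contains one wildcard token and one date / time token.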

File name uniqueness when generating files

All files that the system generates in a file repository by using a File Data Set include a unique ID in the file name. This ID is a string that guarantees the uniqueness of the file names. For example, if the file name pattern for a File Data Set is file.json, the system generates file names with unique IDs after a hyphen (-) and before the file extension:

  • file-vlmmssaprdnd7421164395985556020230321.json
  • file-vlmmssaprdnd7421164395986666420230321.json
  • file-vlmmssaprdnd7421164395983234420230321.json

If the file name pattern includes a wildcard character (*), for example, file*.json, the system generates file names by replacing the wildcard character with a unique ID:

  • filevlmmssaprdnd7421164395985556020230321.json
  • filevlmmssaprdnd7421164395986666420230321.json
  • filevlmmssaprdnd7421164395983234420230321.json

The system does not generate any file names without a unique ID. For example, file.json is not a valid file name for new files and is never generated.
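
The placement rules described in this section can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the class FileNameGenerationSketch and the method generate are hypothetical, the unique ID is passed in by the caller because the way the system builds the ID is not described here, and the method handles only the file name part of the pattern (no directory).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
final class FileNameGenerationSketch {
    private static final Pattern DATE_TOKEN = Pattern.compile("%\\{([^}]*)\\}");
    // Temporary stand-in for the date / time token, so that characters inside
    // the rendered date are never mistaken for a wildcard or an extension dot.
    private static final String PLACEHOLDER = String.valueOf((char) 0);

    static String generate(String namePattern, String uniqueId, LocalDateTime now) {
        String name = namePattern;
        String renderedDate = null;

        Matcher m = DATE_TOKEN.matcher(namePattern);
        if (m.find()) {
            // Render the date / time token with the given date and time.
            renderedDate = DateTimeFormatter.ofPattern(m.group(1)).format(now);
            name = namePattern.substring(0, m.start()) + PLACEHOLDER
                    + namePattern.substring(m.end());
        }

        int star = name.indexOf('*');
        if (star >= 0) {
            // Wildcard token: replaced by the unique ID.
            name = name.substring(0, star) + uniqueId + name.substring(star + 1);
        } else {
            // Implicit token: hyphen plus unique ID, right before the extension.
            int dot = name.lastIndexOf('.');
            name = dot >= 0
                    ? name.substring(0, dot) + "-" + uniqueId + name.substring(dot)
                    : name + "-" + uniqueId;
        }
        return renderedDate == null ? name : name.replace(PLACEHOLDER, renderedDate);
    }
}
```

With the ID abc11658845611263 and the date and time 2022-07-26 16:43:21, generate returns file-abc11658845611263.json for the pattern file.json and file20220726_123-abc11658845611263.json for the pattern file%{yyyyMMdd}_123.json.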

Examples of file path patterns

To better understand the different tokens, refer to the following examples. For each pattern, the examples show whether the pattern is allowed, which existing files it matches or does not match, and which file names it generates:

Pattern: file.json
Allowed: Yes
Matched files:
  • file.json
  • file-abc11658845611263.json
  • file-abc.json
Not matched files:
  • fileabc11658845611263.json (no hyphen)
Generated files:
  • file-abc11658845611263.json

Pattern: file
Allowed: Yes
Matched files:
  • file
  • file-abc11658845611263
  • file-abc
Not matched files:
  • fileabc11658845611263 (no hyphen)
Generated files:
  • file-abc11658845611263

Pattern: dir/file.json
Allowed: Yes
Matched files:
  • dir/file.json
  • dir/file-abc11658845611263.json
  • dir/file-abc.json
Not matched files:
  • dir/fileabc11658845611263.json (no hyphen)
  • wrongdir/file-abc11658845611263.json (dir does not match pattern)
Generated files:
  • dir/file-abc11658845611263.json

Pattern: file_*_123.json
Allowed: Yes
Matched files:
  • file_abc11658845611263_123.json
  • file_abc_123.json
Not matched files:
  • file__123.json (no wildcard token value)
Generated files:
  • file_abc11658845611263_123.json

Pattern: file%{yyyyMMdd}_123.json
Allowed: Yes
Matched files:
  • file20220726_123.json
  • file19800102_123.json
Not matched files:
  • file20220726010203_123.json (wrong date / time token format)
  • file_123.json (missing date / time token)
  • fileabc_123.json (not a date / time token value)
Generated files:
  • file20220726_123-abc11658845611263.json

Pattern: file%{yyyy-MM-dd-HH-mm-ss}_123.json
Allowed: Yes
Matched files:
  • file2022-07-26-16-43-21_123.json
  • file1980-01-02-03-04-05_123.json
Not matched files:
  • file20220726164321_123.json (wrong date / time token format)
Generated files:
  • file2022-07-26-16-43-21_123-abc11658845611263.json

Pattern: file_*_123_%{yyyyMMdd}_456.json
Allowed: Yes
Matched files:
  • file_abc11658845611263_123_20220726_456.json
  • file_abc11658845611263_123_19800102_456.json
  • file_a_123_20220726_456.json
Not matched files:
  • file__123_20220726_456.json (no wildcard token value)
  • file_abc11658845611263_123_2022_456.json (wrong date / time token format)
Generated files:
  • file_abc11658845611263_123_20220726_456.json

Pattern: file*abc*xyz.json
Allowed: No (two wildcard tokens)
Matched files: Not applicable
Not matched files: Not applicable
Generated files: Not applicable

Pattern: file%{yyyyMMdd}abc%{yyyy-MM-ddHH:mm:ss}xyz.json
Allowed: No (two date / time tokens)
Matched files: Not applicable
Not matched files: Not applicable
Generated files: Not applicable

Pattern: dir*/file.json
Allowed: No (directory has a token)
Matched files: Not applicable
Not matched files: Not applicable
Generated files: Not applicable
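
The matching behavior shown in these examples can be approximated with a regular expression plus a date check. The Java sketch below is an illustration under stated assumptions, not the system's matching logic: the class FileMatchSketch and the method matches are hypothetical, / is assumed to be the directory separator, the pattern is assumed to be an allowed one, and the implicit token is interpreted as an optional hyphen followed by any string, including an empty one. The date / time value is validated by parsing it with java.time.format.DateTimeFormatter.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
final class FileMatchSketch {
    // Either a date / time token (group 1 holds FORMAT) or a wildcard token.
    private static final Pattern TOKEN = Pattern.compile("%\\{([^}]*)\\}|\\*");

    static boolean matches(String pathPattern, String path) {
        // Tokens are not allowed in the directory, so it must match literally.
        int slash = pathPattern.lastIndexOf('/');
        String dir = pathPattern.substring(0, slash + 1);
        String name = pathPattern.substring(slash + 1);
        if (!path.startsWith(dir) || path.indexOf('/', dir.length()) >= 0) {
            return false;
        }
        String fileName = path.substring(dir.length());

        StringBuilder regex = new StringBuilder();
        String dateFormat = null;
        boolean hasWildcard = false;
        int last = 0;
        Matcher t = TOKEN.matcher(name);
        while (t.find()) {
            regex.append(Pattern.quote(name.substring(last, t.start())));
            if (t.group(1) != null) {
                dateFormat = t.group(1);
                regex.append("(?<date>.+?)"); // validated by parsing below
            } else {
                hasWildcard = true;
                regex.append(".+");           // any non-empty string
            }
            last = t.end();
        }
        String tail = name.substring(last);
        if (hasWildcard) {
            regex.append(Pattern.quote(tail));
        } else {
            // Implicit token: optional "-<any string>" right before the extension.
            int dot = tail.lastIndexOf('.');
            String stem = dot >= 0 ? tail.substring(0, dot) : tail;
            String ext = dot >= 0 ? tail.substring(dot) : "";
            regex.append(Pattern.quote(stem)).append("(?:-.*)?").append(Pattern.quote(ext));
        }

        Matcher m = Pattern.compile(regex.toString()).matcher(fileName);
        if (!m.matches()) {
            return false;
        }
        if (dateFormat == null) {
            return true;
        }
        try {
            DateTimeFormatter.ofPattern(dateFormat).parse(m.group("date"));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

For example, matches("file%{yyyyMMdd}_123.json", "file20220726_123.json") returns true, and matches("file%{yyyyMMdd}_123.json", "fileabc_123.json") returns false because abc cannot be parsed as a date. The sketch checks only the first way in which the regular expression splits the file name, so a file name that can be split in several ways might be rejected even though another split would contain a valid date.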