Troubleshooting: "Content is not allowed in prolog" from Connector Wizard with XSD or WSDL input
Symptom
When the Connector Wizard is run for Web Services (SOAP), an XML or XSD parsing error may occur if one of the files in the Web service definition cannot be parsed.
The Connector wizard reports the error on the form:
The PegaRULES log contains messages associated with the failure:
19:28:28,337 [ttp-8080-Processor20] (pegarules.parserule.XMLParserBase) ERROR lfitzixp|10.60.51.106 - Caught exception parsing XML stream org.xml.sax.SAXParseException: Content is not allowed in prolog.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at com.pega.pegarules.parserule.XMLParserBase.parseDocument(XMLParserBase.java:437)
at com.pega.pegarules.parserule.XMLParserBase.parseDocument(XMLParserBase.java:428)
at com.pega.pegarules.services.XSDParser.loadSchemaDocument(XSDParser.java:839)
at com.pega.pegarules.services.XSDParser.loadExternalDocuments(XSDParser.java:792)
at com.pega.pegarules.services.XSDParser.findAllSchemaNodes(XSDParser.java:670)
at com.pega.pegarules.services.XSDParser.parseDocument(XSDParser.java:529)
at com.pega.pegarules.services.XSDParser.parse(XSDParser.java:500)
at com.pega.pegarules.services.XSDParser.parse(XSDParser.java:492)
at com.pega.pegarules.services.XSDParser.parse(XSDParser.java:481)
Solution
Diagnostic Steps / Workaround
The "Content is not allowed in prolog" error indicates that the SAX parser was unable to properly read one of the XML files needed to build the SOAP connector.
This error can arise from many causes. However, they all stem from invalid or unreadable content in the XML files that make up the SOAP service (either WSDL or XSD schema files).
To identify the cause, follow these steps:
- If the exception message does not specify which file caused the exception (this is common), download all files that make up the SOAP service definition (the WSDL file and all XSD files) to your computer for analysis. Because XSD schema files often reference other XSD schema files, follow the import statements in each XSD document to make sure you get all necessary files.
- Open each file with a text editor or XML editor. The WSDL file and all XSD schema files are XML files and must contain only valid XML content. Often the cause of this message is an improperly terminated comment or comment text that is not within comment tags. You may find extra text at the start of the file that makes it not well-formed XML. Remove extraneous text to make each file a well-formed XML document, and then rerun the Connector wizard using the modified files.
- If your visual inspection using a text editor does not uncover any problems, use a more sophisticated tool to test the validity of the XML, such as Altova's XMLSpy. (Many other freeware and commercial software products can test the validity of an XML document.) If this step identifies a problem with one or more of the files, update them to make them valid, and rerun the Connector wizard. However, if all files pass XML validation, continue with the next step.
- Certain versions of the SAX parser (JDK 1.3.x and 1.4.x) do not correctly handle Unicode document signatures, also called BOMs (Byte Order Marks). A BOM is required to signal byte order in UTF-16 and wider encodings, but is not needed in UTF-8 encoding. Depending on the program that generated the WSDL or XSD files, they may contain a BOM even when it is not required. If a BOM exists in the file, some SAX parser versions misinterpret the BOM as invalid content and throw the exception listed above. To determine whether this problem is the source of the errors, open the WSDL and XSD files again using a hexadecimal editor. (Most text-only editors and browsers correctly handle the BOM and don't display it.) When viewing a UTF-8 encoded document that contains a BOM, you will see bytes similar to those in the image below. The first three bytes (EF BB BF) are the BOM, which some versions of the SAX parser cannot read:
Delete these characters from the file. Typically this requires deleting all text on the first line of the file and then retyping it or copying the necessary parts from a known good file. Recheck with the hexadecimal editor that the BOM is gone, and then rerun the Connector wizard on the modified files.
- If these steps do not uncover the cause of the exception, contact Global Customer Support.
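The well-formedness and BOM checks in the steps above can also be scripted. The following is a minimal sketch in Python, not part of the product: the function names (detect_bom, strip_utf8_bom, well_formed_error, report) are hypothetical, and the check uses Python's bundled expat parser rather than the Xerces parser that PegaRULES uses. A file that passes here may therefore still be rejected by the Connector wizard, most notably when it starts with a BOM, which expat accepts.

```python
import codecs
import xml.parsers.expat

# Known byte order marks. UTF-32 entries come first because the UTF-32 LE
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data):
    """Return (name, length_in_bytes) of a leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None


def strip_utf8_bom(data):
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def well_formed_error(data):
    """Return None if data is well-formed XML, else the parser's message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None


def report(path):
    """Print the BOM and well-formedness status of one WSDL or XSD file."""
    with open(path, "rb") as handle:
        data = handle.read()
    bom = detect_bom(data)
    error = well_formed_error(data)
    print(path)
    print("  BOM:", bom[0] if bom else "none")
    print("  well-formed:", "yes" if error is None else "no (" + error + ")")
```

Call report() once for each downloaded WSDL and XSD file. To remove a UTF-8 BOM, read the file in binary mode, pass the bytes through strip_utf8_bom(), and write the result to a copy so that the original stays available for comparison.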
Additional Information and Resources
These resources describe the meaning and use of BOMs, the limitation with many current versions of the SAX parser, and causes of the exception message detailed above.
W3C Display problems caused by the BOM FAQ: http://www.w3.org/International/questions/qa-utf8-bom
Unicode.org UTF Byte Order Mark FAQ: #BOM
Sun Microsystems SAX parser bug item for BOM interpretation: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6206835