Example – Basic XML


Situation

Document to be recognized

<?xml version="1.0" encoding="UTF-8"?>
<INVOICES>
  <INVOICE>
    <SENDERID>8710400000006</SENDERID>
    <SENDERIDCODELIST>14</SENDERIDCODELIST>
    <RECIPIENTID>8712345678906</RECIPIENTID>
    <RECIPIENTIDCODELIST>14</RECIPIENTIDCODELIST>
    <TESTINDICATOR>TRUE</TESTINDICATOR>
    <BASEEDIVERSION>96A</BASEEDIVERSION>
    <BASEEDIVERSIONSTANDARD></BASEEDIVERSIONSTANDARD>
    <GROUP>ESC</GROUP>
    <DOCUMENTTYPE>380</DOCUMENTTYPE>
    <INVOICENUMBER>104109</INVOICENUMBER>
    <MSGFUNCTION>9</MSGFUNCTION>
    <ACK>NA</ACK>
    <INVOICEDATE>20201105</INVOICEDATE>
    <INVOICETIME>1635</INVOICETIME>
    <TAXRATE></TAXRATE>
    <TAXCATEGORY></TAXCATEGORY>
    <EXCISEFREE></EXCISEFREE>
    <DATES>
      <DELIVERYDATE>20201105</DELIVERYDATE>
    </DATES>
  </INVOICE>
</INVOICES>

Configuration steps

  1. Preparation
    1. Examine how to recognize data (sub)types.
    2. Examine which metadata is necessary and available (not necessary for documents that will be discarded).
  2. Configuration
    1. Configure document recognition.
    2. Configure how to extract and set metadata.
  3. Test your configuration.

Step 1: Preparation

Step 1a: Requirements

SmartBridge extracts metadata from the document, so it needs to know where each value is located. Go through your document and prepare XPath expressions that locate at least the following metadata:

  • Document format (XML, etc.)
  • Type of document (invoice, etc.)
  • (Optional: Version of the type of document, e.g. D96A)
  • Sender identifier (and the type of the identifier)
  • Recipient identifier (and the type of the identifier)
  • Envelope number or document number
  • (Optional: Value to identify test documents)

Note any required metadata that cannot be found in the document. You will have to set your own value for this metadata.
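To prepare the XPath expressions, it can help to list every leaf node of the document together with its value. The following is a minimal sketch that does this outside SmartBridge with Python's standard library. The `leaf_paths` helper and the shortened `SAMPLE` document are illustrative assumptions, not part of SmartBridge.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document on this page (assumption: the
# omitted nodes are not needed to illustrate the idea).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<INVOICES>
  <INVOICE>
    <SENDERID>8710400000006</SENDERID>
    <SENDERIDCODELIST>14</SENDERIDCODELIST>
    <RECIPIENTID>8712345678906</RECIPIENTID>
    <RECIPIENTIDCODELIST>14</RECIPIENTIDCODELIST>
    <TESTINDICATOR>TRUE</TESTINDICATOR>
    <BASEEDIVERSION>96A</BASEEDIVERSION>
    <BASEEDIVERSIONSTANDARD></BASEEDIVERSIONSTANDARD>
    <DOCUMENTTYPE>380</DOCUMENTTYPE>
    <INVOICENUMBER>104109</INVOICENUMBER>
    <DATES>
      <DELIVERYDATE>20201105</DELIVERYDATE>
    </DATES>
  </INVOICE>
</INVOICES>
"""


def leaf_paths(element, prefix=""):
    """Yield (absolute path, text) for every element that has no children."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


root = ET.fromstring(SAMPLE.encode("utf-8"))
# A dict keeps one value per path, which is enough for this single-invoice sample.
paths = dict(leaf_paths(root))
for path, value in paths.items():
    print(f"{path} = {value!r}")
```

Each printed path is a candidate XPath for the metadata list above. Empty nodes show up with an empty value, which makes it easy to spot metadata that needs a static value instead.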

Step 1b: Create a new Document Structure

  1. Click the + at the bottom of the page to add a document structure, then select ‘Add XML’.

  2. Enter a descriptive name.

 

| Field | Description | Example value |
| --- | --- | --- |
| Name | Give a name to the document structure you are defining. The name is for your own reference only. | Exact DESADV |

Step 2: Configuration

Step 2a: Configure how to recognize the document

Identify what content sets this document apart from other XML documents

SmartBridge may also process other XML documents that closely resemble this one. To let SmartBridge tell them apart, configure one or more of the following fields as a recognition method:

 

 

 

| Recognize XML document using | Possible identifier |
| --- | --- |
| Doctype (DTD) reference | N/a in example; leave empty |
| Namespace reference | N/a in example; leave empty |
| XML Schema reference | N/a in example; leave empty |
| XML node | /INVOICES (we could also use /INVOICES/INVOICE or /INVOICES/INVOICE/INVOICENUMBER) |
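The recognition choices above can be checked against a sample file before you configure them. The sketch below is a rough approximation in Python, not SmartBridge's own recognition logic: `path_exists` and `recognition_hints` are hypothetical helpers, `OTHER` is an invented second document, and the DOCTYPE check is a crude text search.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Trimmed version of the example document, plus an invented document of a
# different type (assumption) to show a non-match.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<INVOICES>
  <INVOICE>
    <INVOICENUMBER>104109</INVOICENUMBER>
  </INVOICE>
</INVOICES>
"""
OTHER = b"<ORDERS><ORDER><ORDERNUMBER>1</ORDERNUMBER></ORDER></ORDERS>"


def path_exists(xml_bytes, node_path):
    """Return True if the absolute path (e.g. /INVOICES/INVOICE) exists."""
    root = ET.fromstring(xml_bytes)
    steps = node_path.strip("/").split("/")
    if steps[0] != root.tag:
        return False
    if len(steps) == 1:
        return True
    # ElementTree paths are relative to the root element.
    return root.find("/".join(steps[1:])) is not None


def recognition_hints(xml_bytes):
    """Collect the identifiers a document offers for recognition."""
    root = ET.fromstring(xml_bytes)
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else None
    schema = (root.get(f"{{{XSI}}}schemaLocation")
              or root.get(f"{{{XSI}}}noNamespaceSchemaLocation"))
    return {
        "doctype": b"<!DOCTYPE" in xml_bytes,  # crude textual check
        "namespace": namespace,
        "schema": schema,
        "root": root.tag,
    }


hints = recognition_hints(SAMPLE)
print(hints)
print(path_exists(SAMPLE, "/INVOICES"), path_exists(OTHER, "/INVOICES"))
```

For the example document, only the root node is available as an identifier, which is why the XML node field is the one to fill in.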

 

Step 2b: Setting metadata

Skip this step for files that are not used for further processing.

 

When comparing the example XML document against the requirements, you will find that you need to:

  • Have SmartBridge dynamically set values for standard metadata, by extracting values from the document.
  • Provide your own static values for standard metadata.
  • Skip adding custom metadata.

The nodes of our example document contain the following data:

| Required metadata | Remark | Where to find it | Additionally set |
| --- | --- | --- | --- |
| Document format (XML, etc.) | XML is automatically recognized. No need to configure this. | n/a | n/a |
| Type of document (invoice, etc.) | The example document has a node that contains the document type, but in a proprietary format: document type ‘380’ stands for ‘invoice’. We therefore treat this information as unavailable and label the document manually with this property. | No XPath available; needs to be assigned. Set value ‘INVOIC’. | Xpath = False |
| Optional: Version of the type of document, e.g. D96A | Can be extracted from the document. | /INVOICES/INVOICE/BASEEDIVERSION | Xpath = True |
| Sender identifier | Can be extracted from the document. | /INVOICES/INVOICE/SENDERID | Xpath = True |
| Type of Sender identifier | Can be extracted from the document. | /INVOICES/INVOICE/SENDERIDCODELIST | Xpath = True |
| Recipient identifier | Can be extracted from the document. | /INVOICES/INVOICE/RECIPIENTID | Xpath = True |
| Type of Recipient identifier | Can be extracted from the document. | /INVOICES/INVOICE/RECIPIENTIDCODELIST | Xpath = True |
| Envelope number or document number | Can be extracted from the document. | /INVOICES/INVOICE/INVOICENUMBER | Xpath = True |
| Optional: Value to identify test documents | Can be extracted from the document. Also identify the possible values for the test indicator. For example, ‘1’ might stand for test documents and ‘0’ or empty for official documents. Use this value in the ‘Value retrieval options’. | /INVOICES/INVOICE/TESTINDICATOR | Xpath = True; Value retrieval: test indicator |
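The metadata settings listed above amount to a small set of rules: each value is either a static string (Xpath = False) or a path evaluated against the document (Xpath = True). The sketch below mimics that in Python to show what the settings should yield for the example document. The key names, the `resolve` and `extract_metadata` helpers, and the `TEST_VALUES` set are assumptions made for illustration; they are not SmartBridge field names or behavior.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example document on this page.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<INVOICES>
  <INVOICE>
    <SENDERID>8710400000006</SENDERID>
    <SENDERIDCODELIST>14</SENDERIDCODELIST>
    <RECIPIENTID>8712345678906</RECIPIENTID>
    <RECIPIENTIDCODELIST>14</RECIPIENTIDCODELIST>
    <TESTINDICATOR>TRUE</TESTINDICATOR>
    <BASEEDIVERSION>96A</BASEEDIVERSION>
    <DOCUMENTTYPE>380</DOCUMENTTYPE>
    <INVOICENUMBER>104109</INVOICENUMBER>
  </INVOICE>
</INVOICES>
"""

# name -> (value, is_xpath). The names are hypothetical labels for this sketch.
METADATA_RULES = {
    "document_type": ("INVOIC", False),  # static value, not read from the file
    "document_version": ("/INVOICES/INVOICE/BASEEDIVERSION", True),
    "sender_id": ("/INVOICES/INVOICE/SENDERID", True),
    "sender_id_type": ("/INVOICES/INVOICE/SENDERIDCODELIST", True),
    "recipient_id": ("/INVOICES/INVOICE/RECIPIENTID", True),
    "recipient_id_type": ("/INVOICES/INVOICE/RECIPIENTIDCODELIST", True),
    "document_number": ("/INVOICES/INVOICE/INVOICENUMBER", True),
    "test_indicator": ("/INVOICES/INVOICE/TESTINDICATOR", True),
}

# Assumed values that mark a test document; verify these against your own data.
TEST_VALUES = {"TRUE", "1"}


def resolve(root, value, is_xpath):
    """Return the static value, or the text of the node the path points to."""
    if not is_xpath:
        return value
    steps = value.strip("/").split("/")
    if steps[0] != root.tag:
        return None
    node = root if len(steps) == 1 else root.find("/".join(steps[1:]))
    if node is None or node.text is None:
        return None
    return node.text.strip()


def extract_metadata(xml_bytes):
    """Apply every rule and translate the test indicator into a boolean."""
    root = ET.fromstring(xml_bytes)
    metadata = {
        name: resolve(root, value, is_xpath)
        for name, (value, is_xpath) in METADATA_RULES.items()
    }
    metadata["is_test"] = metadata.pop("test_indicator") in TEST_VALUES
    return metadata


metadata = extract_metadata(SAMPLE)
for name, value in metadata.items():
    print(f"{name}: {value}")
```

Comparing this output with the results of the test in Step 3 is a quick way to spot a mistyped path or a test indicator value that was not anticipated.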

 

Click ‘Save’ to save your settings. 

 

Step 3: Test your Document Structure



  1. Have the test file ready (see Step 1).
  2. In the upper right-hand corner, click the test button. A new window opens.

  3. Click Browse... and select the test file that the new Document Structure should match.

  4. (Optional, unless your Document Structure uses communication attributes) Configure the Inhouse Recognition Parameters section.
  5. Click Test to analyze your file. You will see all the information SmartBridge can extract from the document using the Document Structure that you created.

  6. Review the Results section. If the results are unexpected (e.g. they show the name of a different Document Structure), correct your Document Structure and test again.

 

You might run into unexpected results when you test a Document Structure definition that contains Macros.

In most cases, SmartBridge can set these Macros when a communication module processes the document first. This testing method does not use a communication module, so you will likely see unprocessed Macros in the test results. This is expected behavior.

Read the FAQ to learn more »


Visual example of end result

 

Example of a document structure for an XML file.

