Editing advanced XML document comparison pipeline configurations

DeltaXML’s XML Compare product includes a Document Comparator, specially enhanced for comparing XML documents with predominantly narrative content. To get the best results from DeltaXML’s XML Document Comparator, you can tailor its configuration, both for specific input document types and for specific outputs. This article shows how to develop the Document Comparator Pipeline (DCP) configuration file using features from DeltaXML’s free XSLT/XPath extension for the popular Visual Studio Code editor. For the rest of this article, I’ll refer to Visual Studio Code as ‘VS Code’ and assume this extension is installed.

Editing DCP with embedded XPath expressions in VS Code

When to use VS Code

DeltaXML’s DCP format is defined in an XML Schema (1.0 and 1.1 XSD versions are included).

An XML-Schema 1.1 compatible XML editor like OxygenXML (which many of our customers use) is more than sufficient if you are developing a relatively basic DCP file for DeltaXML’s Document Comparator, perhaps with just one or two ‘pipeline parameters’.

VS Code’s DCP editing features provide the same basic capabilities as other such editors. However, VS Code comes into its own when you need a more adaptable DCP configuration, possibly with embedded XPath expressions and a number of pipeline parameters. This might be the case, for example, when you’re using the Document Comparator’s REST API.

DCP Editing Features in VS Code

Creating a new DCP File

In Visual Studio Code, create a new file and save it as ‘test.dcp’. The ‘.dcp’ filename extension is recognised and causes ‘DCP’ to be shown in the ‘Language Mode’ part of the status bar at the bottom of the editor.

At the top of the new file, enter ‘<’ (as you would normally do when adding a new XML element). You will be presented with a list of available ‘code snippets’; select ‘Standard Configuration DCP’ from the list. A basic element structure for a standard DCP is inserted into the editor, with the ‘id’ attribute of the ‘documentComparator’ element pre-selected so you can edit it directly. You now have a valid DCP file that you can save again and then invoke from the command line using the given ‘id’ value.

In the editor, type ‘<’ to see a list of available DCP snippets

Built-in DCP Auto-Completion and Validation

As you edit the DCP file, you will be presented with auto-completion options according to the element or attribute context. If an element-name, attribute-name or attribute-value is invalid, a warning will be shown.

Note: You can also get this behaviour with other XML editors like OxygenXML. You just need to copy our DCP schema to your file system and then either create a ‘validation scenario’ or add the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" namespace declaration and an xsi:noNamespaceSchemaLocation attribute to the root element.
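As a minimal sketch of the second option, the root element of a DCP file could carry the schema reference as shown here. The schema path 'schemas/dcp.xsd' and the id value are placeholders chosen for illustration, and any other attributes or child elements that the DCP schema requires are left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- 'schemas/dcp.xsd' is a hypothetical local path: point it at wherever
     you copied the DCP schema. The id value is a placeholder. -->
<documentComparator
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="schemas/dcp.xsd"
    id="my-dcp">
  <!-- DCP content goes here -->
</documentComparator>
```

A relative schema location is resolved against the location of the DCP file itself, so keeping the schema in a folder alongside your DCP files is usually the simplest arrangement.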

VS Code’s context-aware auto-completion list

Pipeline Parameter Reference Checking

The DCP format supports stringParameter and booleanParameter elements, each with a name attribute. These elements have default attributes with values that can be overridden from the GUI or command-line user interfaces, or from our REST or Java APIs.

If a Parameter element is declared but its name is not referenced within the DCP, then the name attribute is shown greyed out.

Any DCP element that corresponds to a DocumentComparator setting or an XSLT filter parameter can have one of three attributes: literalValue, parameterRef or xpath.

The VS Code editor will mark as invalid any parameterRef attributes that do not match the name of the appropriately typed (string or boolean) Parameter element. Other XML editors can show associated XML Schema 1.1 assertion error messages but they cannot highlight the specific parameterRef attributes at fault.
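The fragment below is a rough illustration of the cases described above. It is not a complete DCP: the 'exampleSetting' element is a hypothetical stand-in for a real setting element, the parameter names are invented, and 'defaultValue' is an assumed name for the default attribute. Check the DCP schema for the actual names.

```xml
<!-- Fragment only. 'exampleSetting', the parameter names and the
     'defaultValue' attribute name are assumptions for illustration. -->

<!-- Declared and referenced below -->
<booleanParameter name="strip-comments" defaultValue="false"/>

<!-- Declared but never referenced: its name would be shown greyed out -->
<stringParameter name="report-title" defaultValue="Draft"/>

<!-- Matches a declared boolean parameter -->
<exampleSetting parameterRef="strip-comments"/>

<!-- Misspelt name, so no matching declaration: marked as invalid -->
<exampleSetting parameterRef="strip-coments"/>
```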

XPath Validation and Auto-Completion

DocumentComparator or XSLT filter parameters can be set with an xpath attribute. Other attributes (like name on stringParameter and file elements) allow XPath expressions to be enclosed within curly braces, in the same way as XSLT’s attribute value templates (AVTs).

For XPath expressions within the DCP, the VS Code editor provides special syntax-highlighting, basic XPath syntax-checking, variable-reference checking and auto-completion.
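To make the three ways of supplying a value more concrete, here is a hedged fragment. The 'exampleSetting' element and the parameter names are hypothetical, and it assumes that declared pipeline parameters are referenced as XPath variables of the same name. The curly-brace example uses the path attribute purely for illustration; which attributes accept such expressions is defined by the DCP schema.

```xml
<!-- Fragment only. 'exampleSetting' and the parameter names
     ('strip-comments', 'report-title', 'doc-type') are hypothetical. -->

<!-- 1. A fixed value -->
<exampleSetting literalValue="true"/>

<!-- 2. The value of a declared pipeline parameter -->
<exampleSetting parameterRef="strip-comments"/>

<!-- 3. A value computed by an XPath expression (parameters assumed
        to be available as variables) -->
<exampleSetting xpath="$strip-comments and ($report-title ne '')"/>

<!-- An XPath expression in curly braces, in the style of an AVT -->
<file path="filters/{$doc-type}-cleanup.xsl"/>
```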

To help navigate larger DCP files, the Outline View in the Explorer pane shows a tree-view of the DCP file, with special highlighting for parameters etc. Several other VS Code navigation features have been extended with specific support for DCP.

XSLT Filter Paths

XSLT filters can be inserted into specific parts of the Document Comparator pipeline with the DCP format. XSLT filters for user-defined files or built-in resources are declared as file or resource elements within an element representing a specific ‘extension-point’ in the pipeline.
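In outline, the arrangement might look like the following fragment. The 'exampleExtensionPoint' element name, the file path and the resource name are all placeholders, and any intermediate wrapper elements that the schema requires are omitted.

```xml
<!-- Fragment only. 'exampleExtensionPoint' stands in for one of the
     real extension-point elements defined by the DCP schema. -->
<exampleExtensionPoint>
  <!-- A user-defined XSLT filter, located by a relative path -->
  <file path="filters/normalise-ids.xsl"/>
  <!-- A built-in XSLT filter, located by resource name (placeholder) -->
  <resource name="example-built-in-filter.xsl"/>
</exampleExtensionPoint>
```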

Filter Files

To add a new XSLT file to the pipeline, enter the relative path for the XSLT in the path attribute that is added with auto-completion. If the XSLT file does not exist, a squiggly line appears under the path. Hover over the path to show an ‘action popup’, click ‘Follow Link’ on the popup, and then click the ‘Create File’ button.

Click on the ‘Follow Link’ popup-action to create a new XSLT file

You can choose to edit the XSLT filter within VS Code with full XSLT language support from DeltaXML’s extension (see our XSLT/XPath User Guide) or use your own preferred XSLT editor. In VS Code, an XSLT snippet creates a boiler-plate ‘identity transform’ XSLT filter that can be quickly modified to suit your requirement.
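For reference, a typical identity-transform filter looks something like the sketch below. This is the standard XSLT pattern rather than the exact text the snippet inserts, and the comment-stripping template is only a hypothetical example of a modification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every attribute and node unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Example override (hypothetical): drop all comments -->
  <xsl:template match="comment()"/>

</xsl:stylesheet>
```

Starting from an identity transform means the filter passes its input through untouched by default, so each additional template only has to describe the one change you want to make.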

Filter Resources

The Document Comparator is distributed with a set of built-in XSLT resources. When inserting a built-in XSLT resource, auto-completion is provided for the name attribute of the resource element. The auto-completion list is populated with the names of all built-in XSLT resources.

Choosing an XSLT filter resource-name from the auto-completion list

Summary

This article has outlined how DeltaXML’s VS Code extension supports XSLT and DCP language features, providing a dedicated environment to help manage the configuration of a Document Comparator Pipeline. If you want to try it out, please install the extension from the VS Code Marketplace or directly from VS Code.
