File Formats and ConversionQA Functionality

What is ConversionQA?

ConversionQA is a new tool from DeltaXML that checks whether a content conversion has preserved your content. It compares two inputs in any XML formats, ignoring the XML structure, and identifies any content differences introduced during conversion. Differences are listed in a simple HTML report that helps you identify where the transformation is going wrong.

File Types

ConversionQA handles any XML input (with some caveats), allowing you to check conversions between any two formats. The base input type is ‘XML’, which is any single-file XML input, e.g., DocBook or XHTML. The initial version offers special handling for two other XML input types: Microsoft Word .docx and DITA maps. If you have other inputs that need special handling, please let us know, and we’ll do our best to accommodate them.

Single file XML

These are treated as-is, with no special handling. Each input is parsed and then passed directly into ConversionQA, where content is extracted from the XML structure for comparison.
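
As a rough illustration of what "extracting content from the XML structure" can mean, the sketch below strips the markup from a DocBook-like and an XHTML-like fragment and compares the remaining text. It uses only the Python standard library; the function name `extract_text` and the sample fragments are assumptions made for this example and are not part of ConversionQA.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_string: str) -> str:
    """Return all character data in document order, discarding the markup."""
    root = ET.fromstring(xml_string)
    return "".join(root.itertext())


# Two hypothetical inputs with different vocabularies but the same content.
docbook = "<article><para>Hello <emphasis>world</emphasis>.</para></article>"
xhtml = "<html><body><p>Hello <b>world</b>.</p></body></html>"

same_content = extract_text(docbook) == extract_text(xhtml)
print(same_content)
```

Because only the character data is compared, the different element names in the two fragments do not count as a difference.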

Microsoft Word .docx

XML-based Word documents are passed to ConversionQA in their plain form, which is a zip file. The relevant parts of the zip are extracted, parsed and processed before being passed to the content comparison engine.
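
To make the zip-based structure concrete, here is a minimal sketch that opens a .docx archive, reads the main document part (`word/document.xml`) and collects the text of each paragraph. It is not ConversionQA's implementation; the helper names are hypothetical, and `make_minimal_docx` builds a stripped-down archive that is only good enough for this demonstration (Word itself would not open it).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the WordprocessingML main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(docx_bytes: bytes) -> list[str]:
    """Return the text of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


def make_minimal_docx(text: str) -> bytes:
    """Hypothetical helper: a zip holding only word/document.xml (demo only)."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        f"<w:p><w:r><w:t>{text}</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    return buffer.getvalue()


print(docx_paragraphs(make_minimal_docx("Hello from Word")))
```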

DITA maps

The whole file structure of DITA maps must be packaged as a zip before passing to ConversionQA. As well as the zip file, you need to specify a path within the zip that points to the top-level map. Once passed to ConversionQA, the zip file is unpacked, and the map is published using the DITA-OT to a single-file DITA document. This single file is then treated in the same way as a normal single file XML input.
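
The packaging requirement can be sketched as follows: a zip containing the whole map structure, plus a path inside the zip that identifies the top-level map. The code unpacks the archive, checks that the map path exists and lists the topics the map references. The DITA-OT publishing step is deliberately left out, and the function names, file names and sample content are all assumptions made for illustration.

```python
import io
import tempfile
import zipfile
from pathlib import Path
import xml.etree.ElementTree as ET


def unpack_and_find_map(zip_bytes: bytes, map_path: str, target_dir: str) -> Path:
    """Unpack the zip and return the location of the top-level map on disk."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        if map_path not in archive.namelist():
            raise FileNotFoundError(f"{map_path} is not in the zip")
        archive.extractall(target_dir)
    return Path(target_dir) / map_path


def topic_hrefs(map_file: Path) -> list[str]:
    """List the href of every topicref in the map."""
    root = ET.parse(map_file).getroot()
    return [ref.get("href") for ref in root.iter("topicref") if ref.get("href")]


# Build a small hypothetical package in memory: one map and one topic.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "docs/main.ditamap",
        '<map><title>Guide</title><topicref href="topics/intro.dita"/></map>',
    )
    archive.writestr(
        "docs/topics/intro.dita",
        '<topic id="intro"><title>Intro</title></topic>',
    )

with tempfile.TemporaryDirectory() as tmp:
    top_level_map = unpack_and_find_map(buffer.getvalue(), "docs/main.ditamap", tmp)
    hrefs = topic_hrefs(top_level_map)

print(hrefs)
```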

Functionality

The primary purpose of ConversionQA is to check that a document conversion has been successful, i.e., it hasn’t lost any content in the process. This means that changes that don’t affect the content itself are ignored. For example:

  • Merging two paragraphs together
  • Changing the text formatting of one or more words
  • Converting a bulleted list into a numbered list

There are occasions where these changes might result in a content change as well, e.g., adding a space in between two words when merging paragraphs or adding a period at the end of each sentence when converting a bulleted list into a paragraph. These changes will be highlighted in the report.
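
The distinction between a structural change and a content change can be shown with a small sketch. It assumes a deliberately simple model in which content is the concatenated character data of the document, and it uses Python's `difflib` to list differences. The names `content_of` and `content_changes` are hypothetical, and this is not how ConversionQA itself is implemented.

```python
import difflib
import xml.etree.ElementTree as ET


def content_of(xml_string: str) -> str:
    """Concatenate all character data, ignoring the element structure."""
    return "".join(ET.fromstring(xml_string).itertext())


def content_changes(source_xml: str, converted_xml: str) -> list[tuple[str, str, str]]:
    """Return (operation, source text, converted text) for each difference."""
    a, b = content_of(source_xml), content_of(converted_xml)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


source = "<doc><p>Hello</p><p>world</p></doc>"
merged = "<doc><p>Helloworld</p></doc>"              # structure changed only
merged_with_space = "<doc><p>Hello world</p></doc>"  # a space was added too

print(content_changes(source, merged))
print(content_changes(source, merged_with_space))
```

Under this model the plain merge produces no differences, while the merge that also adds a space is reported as an inserted character.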

Other significant changes, like added or deleted paragraphs, will also be included in the report. The changed content is included alongside an XPath showing where to find the change in the input.
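
As a hedged illustration of pairing changed content with a location, the sketch below records a simple positional XPath for each piece of text and reports the text that is present in the source but absent from the converted document. The path format and the function name `text_with_xpaths` are assumptions; the actual report layout may differ.

```python
import xml.etree.ElementTree as ET


def text_with_xpaths(xml_string: str) -> list[tuple[str, str]]:
    """Return (xpath, text) pairs for every non-empty piece of text."""
    root = ET.fromstring(xml_string)
    results = []

    def walk(element, path):
        if element.text and element.text.strip():
            results.append((path, element.text.strip()))
        counts = {}
        for child in element:
            # Positional index among siblings with the same tag name.
            # Note: namespaced tags would need prefix handling, omitted here.
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")
            # Text following a child belongs to the parent element.
            if child.tail and child.tail.strip():
                results.append((path, child.tail.strip()))

    walk(root, f"/{root.tag}[1]")
    return results


source = "<doc><p>First</p><p>Second</p></doc>"
converted = "<doc><p>First</p></doc>"

converted_text = {text for _, text in text_with_xpaths(converted)}
missing = [(path, text) for path, text in text_with_xpaths(source)
           if text not in converted_text]
print(missing)  # [('/doc[1]/p[2]', 'Second')]
```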

Running ConversionQA

We’ve designed ConversionQA to fit into the larger content conversion workflow. You can build it into your applications either through a Java API or by hosting it as a REST service, which you can call from any programming language. Each call returns a Boolean indicating whether the content conversion was successful; where there are differences, the HTML report is also available.

If you need to check content after conversion, get started with a ConversionQA trial today.
