Move detection when comparing XML files

May 2024 – DeltaXML brings advanced move detection to XML comparison for enhanced document precision.

XML Compare has always excelled at structural comparison, accurately identifying changes in element order. However, many users also need to track content blocks, such as paragraphs, that move within a document. We are excited to introduce the latest feature in our XML Compare product: “Move Detection.” This new capability accurately tracks content that has been relocated within a document, even when unique identifiers are not present.

The ability to detect finer-grained changes, such as the movement of paragraphs or sections, is crucial across various industries. Whether you are managing extensive standards documents, facilitating seamless communication between authors and editors, or ensuring the integrity of legal contracts, the precision offered by move detection is invaluable.

Our new “Move Detection” feature offers enhanced configurability and flexibility, allowing you to identify and track moved content with ease.

Benefits of enhanced move detection during comparison

  • Detects moves without the need for unique identifiers, making it easier to track changes in documents without additional markup.
  • Produces cleaner comparison results by identifying moved content, reducing the clutter of added and deleted elements.
  • Lets users enable or disable move detection, configure which elements to track, and set the scope of detection based on their specific needs.
  • Facilitates clearer communication between authors and editors by indicating moved content instead of deletions and additions.
  • Gives users confidence in the accuracy of their document comparisons by providing detailed move detection and change identification.
  • Helps particularly with documents that undergo extensive content reorganisation, making it easier to track and review changes over long release cycles.

How will moves be shown within my results?

If you are familiar with DeltaXML, you know that the strength of our comparison solutions lies in their robust configuration and integration capabilities, largely enabled by our intuitive deltaV2 result file. At its core, the deltaV2 format simplifies the comparison of ‘A’ and ‘B’ documents by combining them into a single document. In this format, deltaxml:deltaV2 attributes (within the DeltaXML namespace) are added to elements where differences exist. These attributes can take one of the following values: A, B, A=B, or A!=B. Here, ‘A’ or ‘B’ signifies the document source, while the ‘=’ or ‘!=’ separator indicates whether the matching source elements are identical or different. Once move handling is enabled, the Move Source is identified using a @deltaxml:move-id attribute and the Move Target is identified using a @deltaxml:move-idref attribute that has the same value as the @deltaxml:move-id. We call this a Move Pair.

With the introduction of move detection, a new attribute, deltaxml:movedText, is added to elements where moves have occurred. This attribute highlights the moved content, ensuring that both the original and new locations are clearly identified within the deltaV2 file.

By maintaining results in this XML format, you retain the flexibility to use XSLT to transform these results into any format or process that suits your needs. This approach ensures that the powerful insights provided by move detection are easily integrated into your existing workflows and systems, enhancing your ability to manage and review document changes efficiently.
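XSLT is the natural tool for this kind of post-processing, but the Move Pair convention described above can also be illustrated in a few lines of Python. The following is a minimal sketch rather than a description of how XML Compare works internally. The sample fragment is hand-written, not real tool output. It assumes the namespace URI shown for the deltaxml prefix, and it assumes that a Move Source appears as A-only content and a Move Target as B-only content. Check both assumptions against your own result file.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify against your own results.
DX = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hand-written fragment in the style of a deltaV2 result (illustrative only).
SAMPLE = f"""
<doc xmlns:deltaxml="{DX}" deltaxml:deltaV2="A!=B">
  <p deltaxml:deltaV2="A" deltaxml:move-id="m1">Safety notice.</p>
  <p deltaxml:deltaV2="A=B">Introduction.</p>
  <p deltaxml:deltaV2="B" deltaxml:move-idref="m1">Safety notice.</p>
</doc>
"""

def find_move_pairs(root):
    """Return {move id: (source element, target element)} for complete pairs."""
    sources = {}
    targets = {}
    for el in root.iter():
        move_id = el.get(f"{{{DX}}}move-id")
        move_ref = el.get(f"{{{DX}}}move-idref")
        if move_id is not None:
            sources[move_id] = el
        if move_ref is not None:
            targets[move_ref] = el
    # Keep only ids that have both a source and a target.
    return {k: (sources[k], targets[k]) for k in sources if k in targets}

root = ET.fromstring(SAMPLE)
pairs = find_move_pairs(root)
for move_id, (src, tgt) in pairs.items():
    # prints: m1 A -> B
    print(move_id, src.get(f"{{{DX}}}deltaV2"), "->", tgt.get(f"{{{DX}}}deltaV2"))
```

The same lookup in XSLT would typically be a key on the move-id attribute, which keeps the pairing step inside an existing transformation pipeline.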

Why is move detection so beneficial?

Detecting content moves is essential for maintaining the integrity and readability of documents. Without move detection, relocated content might be marked as deleted from its original location and added elsewhere, creating confusion and clutter in the comparison results. By accurately identifying moves, this feature ensures a cleaner, more intuitive output, highlighting genuine changes in the content.

Enable or Disable Move Detection

The “Move Detection” feature can be enabled or disabled based on user preferences. This allows you to decide when move detection is necessary for your documents, giving you greater control over the comparison process.

Identify Move Candidates

Users have full control over which elements should be considered for move detection through XPath configuration. This allows for both broad and precise selection criteria. For example, a technical writer updating a large manual can configure the system to detect moves in paragraphs but exclude those containing images by using an XPath such as //p[not(descendant::image)]. This specificity ensures that only relevant content moves are tracked, streamlining the review process.
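To see which elements an expression such as //p[not(descendant::image)] would select, here is a small sketch in Python. It does not call XML Compare. The sample document and the function name move_candidates are invented for illustration. Python's built-in ElementTree supports only a subset of XPath and has no not() function, so the predicate is written out as an ordinary Python filter.

```python
import xml.etree.ElementTree as ET

# Invented sample document for illustration.
DOC = """
<manual>
  <section id="s1">
    <p>Plain paragraph.</p>
    <p>See <image href="fig1.png"/> for details.</p>
  </section>
  <section id="s2">
    <p>Another plain paragraph.</p>
  </section>
</manual>
"""

def move_candidates(root):
    """Emulate //p[not(descendant::image)]: every p with no image descendant."""
    return [p for p in root.iter("p") if p.find(".//image") is None]

root = ET.fromstring(DOC)
candidates = move_candidates(root)
# prints: ['Plain paragraph.', 'Another plain paragraph.']
print([p.text for p in candidates])
```

The paragraph that contains an image is left out, so it would not be considered as a move candidate under this selection.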

Configure a ‘Move Class’

The move class feature enables users to limit the scope of move detection to specific sections of a document. By setting a move class with an XPath such as ancestor::section/@id or ancestor::section/title/text(), users can ensure that moves are only recognised within defined boundaries. This is particularly useful for standards organisations that need to track changes within individual sections of long documents, ensuring that content moves are accurately captured without generating unnecessary noise.
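The idea of a move class can be pictured as a grouping key computed for each candidate. The sketch below is a simplified model, not the product's algorithm: it assumes sections are not nested, emulates ancestor::section/@id by walking up to the enclosing section, and treats two candidates as eligible to form a move only when their keys match. The sample document and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample document; sections are assumed not to be nested.
DOC = """
<standard>
  <section id="scope">
    <p>This document applies to all devices.</p>
    <p>Terms are defined in the next section.</p>
  </section>
  <section id="terms">
    <p>A device is any powered unit.</p>
  </section>
</standard>
"""

def build_parent_map(root):
    """ElementTree has no parent pointers, so build a child -> parent lookup."""
    return {child: parent for parent in root.iter() for child in parent}

def move_class(el, parents):
    """Emulate ancestor::section/@id for non-nested sections."""
    node = parents.get(el)
    while node is not None:
        if node.tag == "section":
            return node.get("id")
        node = parents.get(node)
    return None

def may_pair(a, b, parents):
    """In this model, two candidates can only form a move within one class."""
    return move_class(a, parents) == move_class(b, parents)

root = ET.fromstring(DOC)
parents = build_parent_map(root)
paragraphs = list(root.iter("p"))

groups = defaultdict(list)
for p in paragraphs:
    groups[move_class(p, parents)].append(p.text)
print(dict(groups))
```

Here the two paragraphs in the "scope" section share a class, while the paragraph in "terms" belongs to a different one, so a relocation across that boundary would be reported as a deletion and an addition rather than a move.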

Optionally Remove the Move Source

For users who prefer a cleaner comparison output, this feature allows the removal of the original location of moved content, displaying only the new location (the ‘move target’). This option is beneficial in scenarios such as collaborative editing, where an author might want to see only the final position of edited content without the clutter of its previous location. For example, in academic publishing, this can help streamline the review of moved content in research papers, making it easier to focus on the current structure.
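The effect of this option can be approximated as a post-processing step on a result file. The sketch below is an illustration only and does not show how the built-in option is implemented. It assumes the Move Source carries a deltaxml:move-id attribute as described earlier, uses an assumed namespace URI for the deltaxml prefix, and works on a hand-written fragment rather than real tool output.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the deltaxml prefix; verify against your own results.
DX = "http://www.deltaxml.com/ns/well-formed-delta-v1"

# Hand-written fragment in the style of a deltaV2 result (illustrative only).
RESULT = f"""
<doc xmlns:deltaxml="{DX}" deltaxml:deltaV2="A!=B">
  <p deltaxml:deltaV2="A" deltaxml:move-id="m1">Safety notice.</p>
  <p deltaxml:deltaV2="A=B">Introduction.</p>
  <p deltaxml:deltaV2="B" deltaxml:move-idref="m1">Safety notice.</p>
</doc>
"""

def drop_move_sources(root):
    """Remove every element that carries deltaxml:move-id, in place.

    Note: removing an element also discards its tail text, which is only
    whitespace in this sample.
    """
    attr = f"{{{DX}}}move-id"
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get(attr) is not None:
                parent.remove(child)
    return root

root = drop_move_sources(ET.fromstring(RESULT))
remaining = [p.text for p in root.iter("p")]
# prints: ['Introduction.', 'Safety notice.']
print(remaining)
```

Only the Move Target remains, so a reviewer sees the moved paragraph once, in its new position.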

Advanced Configuration Options

For those needing more granular control, advanced configuration options are available. These settings allow you to define the circumstances under which a move candidate is included. For instance, you can choose between an ‘unrestricted’ mode, where even content moved within deleted sections is detected, and a ‘restricted’ mode, where only directly marked deletions are considered for moves. This advanced configurability ensures the move detection process can be tailored to fit specific document management needs.

Let’s get comparing

If you’re already a DeltaXML customer, you can access this new “Move Detection” feature by simply updating to the latest version. Our comprehensive documentation will guide you through everything you need to know to make the most of this powerful enhancement.

For those new to DeltaXML, we invite you to trial our products today. Take advantage of our free samples and discover how easily you can manage your changing content.

If you have any questions or would like a demo, please don’t hesitate to get in touch. We’re here to help you make the most of your XML comparison processes.

We’d love to hear your feedback on this feature and any ideas you have for future improvements, so please share your thoughts in the comments section below. Your input is important in helping us make our solutions even better for you. Thank you for your continued support and collaboration. To make sure you never miss a new feature, sign up to our newsletter.
