Setting a New Standard in XML Comparison

An introduction to DeltaXML and the industry standard in XML Comparison

Introduction to DeltaXML

Founding and Early Development

Firstly, a little bit of history. In 1991 our founder and CEO, Robin La Fontaine, working with a small team, began providing consultancy services to the US standards organisation EIA (Electronic Industries Alliance). This work led to the development of the EDIF (Electronic Design Interchange Format) standard, a vendor-neutral standard used for the exchange of electronic circuit design data.

Later, in 1998, when XML emerged as an international standard for the representation of data, Robin realised that the methods used in developing file comparison in EDIF could be migrated to the new format and, in 2000, he obtained his first patent. This patent covered a novel algorithm, developed by Robin and his team, for the intelligent comparison of two XML files. Then, in 2001, our first commercial product, XML Core, was released, immediately gaining our first major customer, in the semiconductor industry.

Innovation and Patent Achievements

Since then, XML Core, now renamed XML Compare, has undergone continuous development, with many improvements and extensions to its functionality. A second patent, covering the comparison of three or more XML files, was obtained in 2007, and a third followed in 2013.

Figure: Example of an SVG comparison

Over the years, XML has developed into a number of standards for the representation of documents as well as data. In response, in 2010 we released DITA Compare and DocBook Compare. These products use the same patented algorithms as XML Compare but are optimised for handling document files in the two formats. In 2015 we released XML Merge, which allows for the comparison and merging of three or more versions of an XML file. In 2019 we introduced XML Data Compare, a low-code version of XML Compare optimised to handle structured data, such as product information, rather than documents. More recently, we have added graphical comparison of SVG (Scalable Vector Graphics) files to our XML, DITA and DocBook Compare products.

Industry Shifts

Since the early 2000s, JSON (JavaScript Object Notation) has emerged as a rival to XML for the representation and exchange of structured data, so in 2020 we released DeltaJSON. This product uses the same patented algorithms as XML Compare, this time optimised for comparing data files in the JSON format.

As this brief summary shows, DeltaXML has spent its entire history of more than 20 years providing solutions to the complex problem of managing changes in structured XML content. Our products are the most intelligent and technically advanced available, and are unrivalled for accuracy and flexibility. They have been continuously developed in response to technological developments and customer feedback.

Marketplace

There are many other software products on the market which can carry out comparisons between XML files. These range from free-to-use online tools, through licensed SaaS offerings, to differencing modules integrated into XML editors or Component Content Management Systems (CCMS).

Simple online tools are well suited to low volume applications where file sizes are small, the differences between files are fairly simple, and a certain amount of manual intervention is acceptable. Support is usually limited and, of course, data security may be compromised by the necessity to export and re-import the users’ data.

SaaS tools certainly have their place too and, in fact, DeltaXML offers a range of Content Compare products as a low-cost alternative to our full on-premises implementations. These products have been optimised for a number of XML grammars, including S1000D, JATS and XSL-FO. As with online tools, SaaS products in general are better suited to low volume applications with small file sizes, and data security may be an issue. Our SaaS products, however, have been designed with data security as a priority. With SaaS, support is usually provided and software maintenance is done entirely by the supplier, reducing the overheads associated with IT management. A free 90-day trial is available for our SaaS tools.

Integrated differencing modules work well if you are using the appropriate editor or CCMS and don’t need to integrate with any other software. However, for full integration within a complex workflow a comprehensive API is essential, such as the REST APIs included in all of DeltaXML’s on-premises products. Also, some of these modules can struggle with complex sequences of changes between file versions and with merging files from multiple sources. DeltaXML’s patented change management algorithms can handle these issues with minimal manual intervention. XSLT pipelining and unique, patented delta file outputs enable our software to be flexibly configured to cope with the most demanding use cases.

Key Features

Intelligent, structurally aware comparison algorithms

XML Compare is our flagship XML change management product, and our other products use the same technology adapted for their particular applications. XML Compare provides a powerful solution for identifying and processing the differences between any two XML files that share the same root element. Its primary use is as a toolkit for integration into other systems or applications via comprehensive APIs, but it may also be run standalone from the command line or a simple GUI. Its comparison features work through your two XML files, analysing their structure and matching up all the corresponding elements between them. XML Compare identifies all of the differences in your content according to your specific configuration. It then writes them to an output file which combines your original content with new markup detailing the changes.
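To make the idea of an output that "combines your original content with new markup detailing the changes" more concrete, here is a minimal Python sketch. It is an illustration only, not DeltaXML's patented algorithm or its delta format: it pairs child elements naively by position and tag, and the `change` attribute and its values are hypothetical names chosen for this example.

```python
import copy
import xml.etree.ElementTree as ET


def mark(elem, status):
    """Copy an element and tag it with the hypothetical 'change' attribute."""
    marked = copy.deepcopy(elem)
    marked.set("change", status)
    return marked


def annotate(a, b):
    """Build a copy of b in which every compared element records how it differs from a."""
    out = ET.Element(b.tag, dict(b.attrib))
    out.text = b.text
    # An element has changed itself if its text or attributes differ.
    own_change = (a.text or "").strip() != (b.text or "").strip() or a.attrib != b.attrib
    child_change = False
    kids_a, kids_b = list(a), list(b)
    # Naive alignment (an assumption of this sketch): pair children by position.
    for i in range(max(len(kids_a), len(kids_b))):
        ca = kids_a[i] if i < len(kids_a) else None
        cb = kids_b[i] if i < len(kids_b) else None
        if ca is not None and cb is not None and ca.tag == cb.tag:
            child = annotate(ca, cb)
            child_change = child_change or child.get("change") != "unchanged"
            out.append(child)
        else:
            # No counterpart at this position: report a deletion and/or an addition.
            if ca is not None:
                out.append(mark(ca, "deleted"))
            if cb is not None:
                out.append(mark(cb, "added"))
            child_change = True
    out.set("change", "modified" if own_change or child_change else "unchanged")
    return out


old = ET.fromstring("<doc><title>Guide</title><p>One</p><p>Two</p></doc>")
new = ET.fromstring("<doc><title>Guide</title><p>One!</p><p>Two</p><p>Three</p></doc>")
result = annotate(old, new)
print(ET.tostring(result, encoding="unicode"))
```

Positional pairing breaks down as soon as content is inserted in the middle of a sequence or moved, which is exactly where a more intelligent matching step is needed.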

Accurate results

Accuracy is somewhat subjective, depending on requirements, but because of our products’ structural awareness, our intelligent algorithms don’t produce false positives or trip up over complex changes. While other basic comparison tools have their own idea of what is technically accurate and correct, with DeltaXML you are able to define what is usefully accurate and correct.

To name just one example, simple line-by-line comparisons often produce false positives when content is added or removed. If you add a line to a document, the lines below it are pushed down and no longer line up with their counterparts, which can result in the entire document being shown as changed. The problem is compounded when additions, deletions, edits and moves are combined: simple tools cannot tell them apart and end up displaying lines that have merely been edited or moved as additions or deletions.

Anybody who has tried using Git Merge to resolve conflicts when complex changes have been made by their development team will understand this pain (and will be pleased to hear that we’ve got a solution for this!).

By taking into account the structure of an XML document, our solutions eliminate these errors. By comparing both structure and content, and tracking changes in both, we can match up content that has moved and been altered in all sorts of complicated ways.
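One kind of false positive is easy to reproduce with Python's standard library. In the sketch below, the two sample documents (invented for this example) differ only in indentation and attribute order. A line-based diff reports changes, whereas comparing the canonical (C14N) form of each document shows that the structure and content are identical. This illustrates the general principle only and is not how DeltaXML's products are implemented.

```python
import difflib
import xml.etree.ElementTree as ET

# Same content; only the layout and the attribute order differ.
old = '<doc><p id="1" class="x">Hello</p></doc>'
new = '<doc>\n  <p class="x" id="1">Hello</p>\n</doc>'

# Line-based view: every line is reported as removed or added.
line_diff = list(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
)


def canonical(xml_text):
    """Canonical form: attributes sorted, surrounding whitespace in text stripped."""
    return ET.canonicalize(xml_text, strip_text=True)


# Structure-aware view: the two documents are equivalent.
same_structure = canonical(old) == canonical(new)

print(f"line diff reports {len(line_diff)} lines of output")
print(f"structurally identical: {same_structure}")
```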

On that note, let’s look at configuration…

Highly configurable

As mentioned previously, sometimes what is technically accurate is not usefully accurate – for example, when you are getting false positives, or have collections of changes you are expecting that are not significant.

DeltaXML can be configured to handle these situations, even in the most complex use cases. The extensive built-in functionality includes a large number of features, both general and industry-specific. All of these have guides and samples, ready to be downloaded and used. Examples include unordered comparison, MathML and ignoring white space; you can turn these on and off as required.
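As a rough illustration of what switching such features on and off means, the Python sketch below builds a comparable "signature" for an element under two options. The function name and the `ordered` and `ignore_whitespace` parameters are hypothetical, invented for this example; they are not DeltaXML configuration settings.

```python
import xml.etree.ElementTree as ET


def signature(elem, *, ordered=True, ignore_whitespace=True):
    """Reduce an element to a nested tuple that can be compared for equality.

    Text following a closing tag (the element's tail) is ignored in this sketch.
    """
    text = elem.text or ""
    if ignore_whitespace:
        # Collapse runs of white space and trim the ends.
        text = " ".join(text.split())
    children = [
        signature(c, ordered=ordered, ignore_whitespace=ignore_whitespace)
        for c in elem
    ]
    if not ordered:
        # Unordered comparison: the sequence of children is not significant.
        children.sort()
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, tuple(children))


a = ET.fromstring("<parts><part id='1'/><part id='2'/></parts>")
b = ET.fromstring("<parts>\n  <part id='2'/>\n  <part id='1'/>\n</parts>")

print(signature(a) == signature(b))                                # order matters
print(signature(a, ordered=False) == signature(b, ordered=False))  # order ignored
```

With the default ordered comparison the two parts lists differ; with `ordered=False` they are treated as equal, which is the kind of "usefully accurate" result described above.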

All of our products follow a similar pattern whether they are for two-way comparison or for n-way merging. Almost all of our XML processing is carried out in modular transformation stages using either XSLT or Java’s SAX filters. Each XML input passes through its own pipeline before it reaches the comparison or merge operation. The input pipelines can be individually defined for each input, or you can use the same one for every input. Once the compare or merge has taken place, the result is passed through an output pipeline where transformation of the delta result takes place. This could be to ignore changes, modify the way that changes are represented or, in the case of our format-specific products, convert our delta markup into native markup for the format.
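The shape of this pipeline can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the stage functions, the stand-in `compare_text` operation and the record format are invented, and plain Python callables take the place of the XSLT and SAX filter stages used in the real products.

```python
from functools import reduce
import xml.etree.ElementTree as ET


def run_stages(value, stages):
    """Pass a value through each stage in turn."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


def normalise_space(root):
    """Input stage: collapse runs of white space in element text."""
    for elem in root.iter():
        if elem.text:
            elem.text = " ".join(elem.text.split())
    return root


def drop_notes(root):
    """Input stage: remove <note> elements so they are never compared."""
    for parent in list(root.iter()):
        for child in [c for c in parent if c.tag == "note"]:
            parent.remove(child)
    return root


def compare_text(a, b):
    """Stand-in comparison: pair child elements by position and record their text."""
    return [{"tag": x.tag, "old": x.text, "new": y.text} for x, y in zip(a, b)]


def only_changes(records):
    """Output stage: keep only the records that actually differ."""
    return [r for r in records if r["old"] != r["new"]]


def compare_with_pipelines(xml_a, xml_b, input_a, input_b, output):
    # Each input passes through its own pipeline before the comparison.
    a = run_stages(ET.fromstring(xml_a), input_a)
    b = run_stages(ET.fromstring(xml_b), input_b)
    # The comparison result then passes through the output pipeline.
    return run_stages(compare_text(a, b), output)


result = compare_with_pipelines(
    "<doc><p>Hello   world</p><note>internal</note><p>Bye</p></doc>",
    "<doc><p>Hello world</p><p>Goodbye</p></doc>",
    input_a=[drop_notes, normalise_space],
    input_b=[normalise_space],
    output=[only_changes],
)
print(result)
```

Keeping each stage small and independent is what makes it practical to define a different pipeline for each input, or to reuse one pipeline for all of them.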

In 2017 the Danish legal publishing house, Karnov Group, purchased a similar Swedish company, Norstedts Juridik. This meant that they needed to merge the two companies’ databases, comprising over 16,000 files of legislative information stored as XML. They used XML Compare to automate this task and the configurable pipelining functionality was absolutely essential for the successful completion of the project.

“I have close to 30 years in the field and in my experience XML Compare doesn’t have an equal at what it does.”

Source: Atlassian, How Karnov Group Merged Two Legal Publishing Companies’ Incompatible Content Databases, Ari Nordstrom, Case Study

Flexible outputs

XML Compare is not just an XML differencing tool; it is an enterprise solution for finding and processing changes between two XML documents. It can output differences in a number of pre-defined formats, including, for example, our HTML side-by-side view, or alternatively to an XML file so that differences can be processed through an XSLT pipeline to represent changes however you wish, wherever you need. Many different options are available for output, such as PDF, HTML and Redlines for human inspection, or database and others for machine processing. We also offer plug-ins to major XML editors such as Arbortext, Oxygen and FrameMaker; these produce outputs in each of their proprietary tracked changes formats.
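The idea of post-processing an XML record of changes into a reviewer-friendly view can be shown with a small Python sketch. The input format below, a `change` attribute on each paragraph, is hypothetical and is not DeltaXML's delta format; in practice this kind of transformation would typically be an XSLT stage rather than Python.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical change-annotated input, invented for this example.
delta = (
    "<doc>"
    '<p change="unchanged">Intro</p>'
    '<p change="deleted">Old &amp; stale</p>'
    '<p change="added">New</p>'
    "</doc>"
)


def render_html(delta_xml):
    """Render each paragraph as <ins>, <del> or <span> according to its status."""
    wrappers = {"added": "ins", "deleted": "del"}
    parts = []
    for p in ET.fromstring(delta_xml):
        tag = wrappers.get(p.get("change"), "span")
        parts.append(f"<{tag}>{escape(p.text or '')}</{tag}>")
    return "<div>" + " ".join(parts) + "</div>"


html_view = render_html(delta)
print(html_view)
```

Because the changes are held as data rather than as a fixed report, the same input could equally be rendered as redlines, loaded into a database, or converted to an editor's tracked changes format.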

Easily integrated

DeltaXML’s products are optimised for integration into any enterprise workflow or content management system. There are many examples, such as DeltaXML partners Docufy and Ixiasoft, of companies using our products to gain an unparalleled competitive edge. All of the products’ functionality is accessed through Java or REST APIs, which are fully documented and come with complete, working examples that you can download from our documentation site and start working with immediately. We also offer easy integration with XML editors such as Oxygen, FrameMaker and Arbortext.

We’ve designed our APIs with the clear intent of making it quick and simple to integrate DeltaXML’s toolkits into your own environment, whether you are a Java developer or not. Our REST APIs also offer a great deal of flexibility, allowing you to use DeltaXML products from a much wider range of programming languages. In addition, the on-premises REST API comes with built-in queuing functionality, as well as endpoints to check on the status of any job.
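A client of a queued, job-based API usually submits a job and then polls for its status. The Python sketch below shows that general pattern only. It deliberately does not name any DeltaXML endpoint: `get_status` is a caller-supplied function standing in for the HTTP call, and the status values are hypothetical, so consult the REST API documentation for the real endpoints and states.

```python
import time

# Hypothetical terminal states, invented for this example.
TERMINAL_STATES = ("FINISHED", "FAILED")


def wait_for_job(get_status, job_id, *, interval=1.0, timeout=60.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) until the job reaches a terminal state or times out."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout}s")
        sleep(interval)


# Usage with a fake status source standing in for the real HTTP request.
fake_statuses = iter(["QUEUED", "RUNNING", "FINISHED"])
final = wait_for_job(lambda job_id: next(fake_statuses), "job-1",
                     sleep=lambda seconds: None)
print(final)
```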

Patented algorithms

Our patented comparison algorithms are the best in the industry and are backed up by over 20 years of intensive research & development.

Patent granted 2001270901; EP1325432; 60134999.7; US8,196,135B2; CA 2416876; US 8,423,518 B2; EP2174238; 602008031420.0. Patents pending 1315520.5; 14275178.3; 14/474,377

Industry experience

DeltaXML’s products are essential tools for major companies in many industries where the accuracy of data and documentation is absolutely vital. Accuracy is essential where it is necessary to satisfy statutory, regulatory, commercial and safety requirements. Typical examples include finance, aerospace, pharmaceuticals and publishing, including many household names, such as large pharmaceutical companies and aircraft manufacturers in North America and Europe.

Support

DeltaXML provides a full support service to all our customers, including comprehensive documentation, user guides and examples, and a full-time Support Desk. You can also get in touch with us easily, and if you need advice on implementation, infrastructure, or specialist expertise, you can speak with our solution architect partners.

DeltaXML has partnered with some of the leading companies in the field of XML content creation and management. Together we can provide integrated solutions with proven reliability and accuracy. In addition to the partners who have integrated our software, we have contacts throughout the industry who can help with specialist expertise in all things to do with XML and XSLT.

Certifications

DeltaXML is certified to ISO 9001 for quality management and ISO 27001 for data security.

Deployment

DeltaXML products are available with a range of different licensing arrangements, including on-premises, SaaS or cloud deployment. Our concurrent operation licensing solution is designed to work on any kind of machine, from personal laptops and computers to physical servers and scalable cloud environments such as Amazon Web Services. To find out more, please get in touch.

Key Benefits

  • Patented delta output file format enables flexible, application-specific post-processing
  • 100% accurate automated content comparison saves time, money and resources
  • Flexibility of configuration reduces manual intervention to the minimum
  • Ease of integration into existing enterprise-wide CMS or workflow eliminates bespoke coding
  • Choice of output formats streamlines authoring, reviewing and editing processes
  • Our technology is fully scalable, making it ideal for large document volumes or big-data projects
  • Comprehensive support package reduces system administration overhead

Summary

DeltaXML produces a range of software products based on our industry-leading, patented technology. These products are aimed at enterprise-wide, high-volume applications that demand the most accurate identification and management of changes in XML or JSON content. They are designed for ease of integration into existing Content Management Systems, content creation workflows and data management systems. They are equally suited to handling documentation or structured data, and their flexibility and configurability enable them to tackle the most complex and demanding processes with the minimum of manual intervention. DeltaXML has specialised in XML comparison technology for over 20 years and has wide experience of applications across a range of industries where accuracy is crucial, for example aerospace and pharmaceuticals. All DeltaXML products can be downloaded for a free trial with full support. If accuracy and traceability in your XML data or documentation are essential to you, please contact us.

Need to compare your documents or data?

Schedule a personalised, guided demo with a DeltaXML expert today. Discover firsthand how our comparison tools can elevate your revision and version control processes.
