How to compare JSON values, objects and arrays

Comparing two JSON files is fairly straightforward, though there are a few areas where it is not quite as simple as it seems. The three literal names, true, false and null, are not a problem, though note they must be lower case. Before comparing two numbers, they should be normalised so that 1 and 1.0 do not show up as a change. Similarly, 100 and 1e2 should be deemed equal. For example:

JSON A

{
  "a": 1,
  "b": 100.0
}

=

JSON B

{
  "a": 1.0,
  "b": 1e2
}
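One way to get this behaviour is to parse every number into a decimal type before comparing. The following is a minimal sketch using Python's standard json module; the helper name load_normalised is made up for illustration, and parsing into Decimal rather than float is an assumption chosen to avoid binary floating-point rounding.

```python
import json
from decimal import Decimal


def load_normalised(text):
    # Hypothetical helper: parse every JSON number as a Decimal, so that
    # 1, 1.0, 100.0 and 1e2 are compared by numeric value, not by spelling.
    return json.loads(text, parse_int=Decimal, parse_float=Decimal)


json_a = '{"a": 1, "b": 100.0}'
json_b = '{"a": 1.0, "b": 1e2}'

# The raw texts differ, but the parsed values compare as equal.
numbers_equal = load_normalised(json_a) == load_normalised(json_b)
```

Here numbers_equal is True even though the two texts spell the numbers differently.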

Strings may also need some normalisation to handle special character encodings, so that, for example, an escaped character such as \u0027 compares as equal to the literal apostrophe it represents:

JSON A

{
  "a": "Here are some apostrophes ( ' and ' and \u0027 )"
}

=

JSON B

{
  "a": "Here are some apostrophes ( \u0027 and \u0027 and ' )"
}
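In practice, a conforming parser already performs this normalisation, because escape sequences are decoded when the string is read. The sketch below assumes Python's standard json module and writes each string value on a single line, since a JSON string cannot contain a literal line break.

```python
import json

# Raw strings keep the backslash so that the JSON parser, not Python,
# decodes the \u0027 escape sequences.
json_a = r'''{"a": "Here are some apostrophes ( ' and ' and \u0027 )"}'''
json_b = r'''{"a": "Here are some apostrophes ( \u0027 and \u0027 and ' )"}'''

texts_equal = json_a == json_b                            # compares the raw text
strings_equal = json.loads(json_a) == json.loads(json_b)  # compares decoded values
```

A plain text comparison reports a difference (texts_equal is False), whereas comparing the parsed values does not (strings_equal is True).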

Objects also compare well, because each member is identified by a name string that should be unique within the object (JSON does not strictly require the names to be unique, but behaviour is unpredictable when they are duplicated). Corresponding members can therefore be identified without ambiguity, even if the order of the members is different.

JSON A

{
  "firstname": "Andrew",
  "lastname" : "Other"
}

=

JSON B

{
  "lastname" : "Other",
  "firstname": "Andrew"
}
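The order-independent comparison, together with a guard against duplicate names, can be sketched as follows. This assumes Python's standard json module; reject_duplicates is a hypothetical helper, and raising an error on a duplicate is just one possible policy.

```python
import json


def reject_duplicates(pairs):
    # object_pairs_hook receives every (name, value) pair of one object,
    # duplicates included, so repeated names can be detected here.
    members = {}
    for name, value in pairs:
        if name in members:
            raise ValueError(f"duplicate member name: {name!r}")
        members[name] = value
    return members


json_a = '{"firstname": "Andrew", "lastname": "Other"}'
json_b = '{"lastname": "Other", "firstname": "Andrew"}'

# Python dictionaries compare equal regardless of member order.
objects_equal = (
    json.loads(json_a, object_pairs_hook=reject_duplicates)
    == json.loads(json_b, object_pairs_hook=reject_duplicates)
)
```

Without such a hook, Python's parser silently keeps the last value for a repeated name, which is one example of the unpredictable behaviour mentioned above.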

Any collection of objects in which each item has a unique key member should ideally be represented as a single object, with each key pulled out as the member name. This leads to unambiguous comparison. See the contacts example below.

Arrays present more of a problem for comparison. This is because arrays are used for different purposes. For example, if an array is used to represent an x,y coordinate, then the expectation is that [ 34, 56 ] is not the same as [ 56, 34 ]. However, if the array is being used as an unordered set of numbers, then the arrays should be considered equal. So comparing by position or as unordered items are alternative approaches to be applied depending on the interpretation of the array data.
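The two interpretations can be sketched as two separate equality tests. The function names are made up for illustration, and serialising each item with sorted member names is just one assumed way to make items countable.

```python
import json
from collections import Counter


def canonical(value):
    # Serialise with sorted member names so that equal objects produce the
    # same text. Numbers are not normalised here: 1 and 1.0 stay distinct.
    return json.dumps(value, sort_keys=True)


def equal_by_position(a, b):
    # Ordered interpretation, e.g. an x,y coordinate.
    return a == b


def equal_as_unordered(a, b):
    # Unordered interpretation: same items with the same counts, any order.
    return Counter(map(canonical, a)) == Counter(map(canonical, b))


as_coordinates = equal_by_position([34, 56], [56, 34])   # False
as_collection = equal_as_unordered([34, 56], [56, 34])   # True
```

Counting occurrences means that repeated items still matter, so [34, 34, 56] and [34, 56, 56] remain different. If the array is a true set, where repeats are irrelevant, the items could be collected into a set instead.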

Furthermore, comparing by position is not always what is needed when we use an array as a list, where the item order is significant. In this case, comparing [1,3,2,4,5] with [1,3,4,5] by position would give three differences: 2 != 4, 4 != 5 and 5 is a deleted item.

[ 1, 3, 2, 4, 5 ]
  |  |  |  |  x
[ 1, 3, 4, 5 ]
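The positional comparison just described can be sketched in a few lines. The function name and the shape of the result tuples are assumptions made for illustration.

```python
from itertools import zip_longest

MISSING = object()  # sentinel for a position that exists in only one array


def positional_diff(first, second):
    # Compare items index by index and record every position that differs.
    diffs = []
    pairs = zip_longest(first, second, fillvalue=MISSING)
    for index, (x, y) in enumerate(pairs):
        if y is MISSING:
            diffs.append((index, "deleted", x))
        elif x is MISSING:
            diffs.append((index, "added", y))
        elif x != y:
            diffs.append((index, "changed", (x, y)))
    return diffs


diffs = positional_diff([1, 3, 2, 4, 5], [1, 3, 4, 5])
# diffs == [(2, "changed", (2, 4)), (3, "changed", (4, 5)), (4, "deleted", 5)]
```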

A more intelligent ordered comparison might just say that a single item, 2, has been inserted (or deleted, depending on which array is treated as the original).

[ 1, 3, 2, 4, 5 ]
  |  |  +  |  |
[ 1, 3,    4, 5 ]
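An alignment of this kind can be approximated with Python's standard difflib module, which finds matching runs of items rather than comparing by index. This is only a sketch of the idea, not a description of how any particular comparison product works. The shorter array is treated as the original here, so the extra item is reported as an insertion.

```python
from difflib import SequenceMatcher

old = [1, 3, 4, 5]
new = [1, 3, 2, 4, 5]

# autojunk=False disables the popularity heuristic, which is not wanted
# when the sequence items are data values.
matcher = SequenceMatcher(None, old, new, autojunk=False)

# Each opcode is (tag, old_start, old_end, new_start, new_end).
changes = [op for op in matcher.get_opcodes() if op[0] != "equal"]
inserted_items = [new[j1:j2] for tag, i1, i2, j1, j2 in changes if tag == "insert"]
# changes == [("insert", 2, 2, 2, 3)] and inserted_items == [[2]]
```

SequenceMatcher requires the items to be hashable, so arrays of objects would first need each item converted to a hashable form, for example a canonical string.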

So it is arrays that cause most problems in comparing JSON data.

When JSON is generated, arrays are often used where the data could be represented as objects. Converting such an array into an object may therefore be a sensible pre-comparison step in order to get only ‘real’ changes identified.

For example:

{"contacts": [
  {
    "id": "324",
    "first_name": "AN",
    "last_name": "Other"
  },
  {
    "id": "127",
    "first_name": "John",
    "last_name": "Doe"
  }
]}

would be much better represented for comparison purposes as:

{"contacts": {
    "324": {
      "first_name": "AN",
      "last_name": "Other"
    },
    "127": {
      "first_name": "John",
      "last_name": "Doe"
    }
}}

It may not look quite so natural, but the corresponding contacts will be aligned properly.
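This pre-comparison step can be sketched as follows. The helper name key_by and its key parameter are made up for illustration, and the sketch assumes that every item carries the key member with a unique string value.

```python
def key_by(items, key="id"):
    # Turn a list of objects into one object keyed by each item's unique key.
    keyed = {}
    for item in items:
        name = item[key]
        if name in keyed:
            raise ValueError(f"key {name!r} is not unique")
        # Copy every member except the key itself.
        keyed[name] = {k: v for k, v in item.items() if k != key}
    return keyed


contacts_a = [
    {"id": "324", "first_name": "AN", "last_name": "Other"},
    {"id": "127", "first_name": "John", "last_name": "Doe"},
]
# The same contacts listed in a different order.
contacts_b = [contacts_a[1], contacts_a[0]]

arrays_equal = contacts_a == contacts_b                 # False: order differs
keyed_equal = key_by(contacts_a) == key_by(contacts_b)  # True: aligned by id
```

Compared as arrays, the reordered contacts look like changes. Once keyed by id, they compare as equal.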

