Posted 13 September 2024
by Phil G. Fearon

Beyond Step-Through XSLT Debugging

To get different viewpoints when analysing XSLT code behaviour, we sometimes need alternatives to the step-through debugger built into XSLT code editors like Oxygen. ‘Print debugging’ is one alternative: it involves inserting ‘print commands’ at points within the code to print out the values of specific variables. In this article I describe how, with the help of some tools, we can use print debugging more effectively in XSLT 3.0.

Why we need print debugging

To help explain what print debugging brings for programming, I will use an analogy from the time I spent fault-finding electronic circuits when studying engineering many years ago.

With a multimeter, a measurement probe is placed at a point in the circuit to take readings (say voltage and resistance). This is broadly equivalent to placing a breakpoint in a specific part of the code and noting the values of specific variables.

On the other hand, with a multi-trace oscilloscope, several measurement probes are inserted into the circuit at one time, and we see a trace for each probe simultaneously. This is equivalent to inserting print commands at several places in the code before running it.

So, step-through debugging gives us a small viewpoint into our code at a specific time; we run code from one breakpoint to another or even step through line by line, but this requires heavy user interaction and, importantly, we cannot normally step backwards.

In contrast, with print-debugging we see variable values at many points in our code; moreover, we can see how these values change over time by scrolling through the printed values.

Given these differences, it is almost inevitable that the way we debug will affect the way we write or modify our code. Print-debugging gives a potentially wider overall picture of how functions and templates work with each other, and using this knowledge can help improve the high-level structure of our code.

Why XSLT print-debugging is unpopular

So, for certain scenarios, print-debugging gives us more ‘at a glance’ information to debug and maintain our code, and avoids the need to manually step through code or manage breakpoints. Why then do we not use it all the time?

Let us summarise the main challenges:

Inserting print-commands for each ‘watch variable’ is tedious and error prone
Print-commands add ‘noise’ to our code making it harder to maintain
Finding the print command in our code that printed a specific value is difficult
The serialisation of types like XPath maps is not easy to read
The sheer volume of printed variable information can inhibit effective analysis

With the right tools, we can effectively overcome each of these challenges, as I set out in the following sections.

Tools to help with print-debugging

DeltaXML’s XSLT/XPath extension for Visual Studio Code is covered here because it has features intended specifically for print-debugging in XSLT; these features are documented in the User Guide. (All screenshots included are from Visual Studio Code and the code being debugged is ‘real’, that is, from my current XSLT project.)

With some customisation, similar print-debugging features can be used in other XSLT editors like Oxygen, combined with an external terminal application with extended colour support if the editor lacks an integrated one.

Auto-generation of XSLT print commands

In XSLT, the <xsl:message> instruction is the main ‘print command’ (the trace() function for XPath is available but tends to be used less frequently). Using an XSLT text value template within a single <xsl:message> we can print a useful header label followed by several variable names and their corresponding runtime values, paired together in two left-justified columns:
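As a minimal sketch of such an instruction (the variable names, their values and the header label are hypothetical and not taken from the project described in this article), a text value template enabled with expand-text="yes" might look like this:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <!-- Entry point when the processor is started with the default initial template -->
  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical watch variables, for illustration only -->
    <xsl:variable name="segmentName" as="xs:string" select="'intro'"/>
    <xsl:variable name="nodeCount" as="xs:integer" select="42"/>

    <!-- Header label first, then one name/value pair per line.
         The padding after each name keeps the values in their own column. -->
    <xsl:message>
==== demo: initial-template ====
segmentName: {$segmentName}
nodeCount:   {$nodeCount}
    </xsl:message>
  </xsl:template>

</xsl:stylesheet>
```

Because the curly-brace expressions sit directly in the text of the instruction, adding or removing a watch variable is a one-line edit.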

Using <xsl:message> in this way makes the instruction readable and also easy to edit in future. While it would take a minute or two to type an instruction like this manually, a code editor’s auto-completion can use the code context at the cursor position to insert all of it in an instant with a single user action.

In VS Code, for example, all variables declared up to the cursor position can be printed using the <xsl:message> auto-completion action. The <xsl:message> instruction shown below was generated in such a way.

An extract from the printed output from the generated <xsl:message> is shown below:

The printed output shows the state of all parameters and variables for the named function; text is justified so variable names appear in a column distinct from the values. The full printed output spanned several pages, but we can view the state each time the function is invoked by scrolling up or down.

Printing variable values in the terminal

In the previous example, each ‘watch variable’ in the <xsl:message> instruction was included as an argument to the custom XSLT function ‘ext:print()’. For example: ext:print($nodesInSegB,11,' '). The ext:print() function is an alternative to the standard ‘serialize()’ function, designed specifically for print-debugging. It is integrated into VS Code but maintained and documented in an independent open source repository.

In XSLT, complex data structures can be composed using maps, arrays and sequences. To debug code using these types we need to quickly comprehend their content as our code runs. The ext:print() function is designed for this; it outputs these types using a concise notation like JSON and adds formatting and optional colouring in the terminal to help.

Map sequence example with ext:print()

The print output example below shows a sequence of two XPath maps; each map has a ‘location’ property whose value is also a sequence of maps. The ext:print() function uses adaptive line-formatting, indentation, and bracket-pair colourisation to show the structure but keep the output compact.

Map sequence example with serialize()

In contrast, the same sequence printed using the standard ‘serialize()’ function gives the same information but, for me at least, is more difficult to take in. I have had to use word-wrap on the single line so that it fits this page:
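For readers who want to reproduce this kind of output, here is a minimal sketch of calling serialize() on a sequence of maps. The data is hypothetical, and the choice of the ‘adaptive’ output method is an assumption (the ‘json’ method cannot serialise a sequence of more than one item); the article does not state which serialisation parameters were used for the screenshot.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical data: two maps, each with a 'location' entry
         that holds one or more nested maps -->
    <xsl:variable name="places" as="map(*)*" select="
        map { 'name': 'A', 'location': (map { 'x': 1, 'y': 2 }, map { 'x': 3, 'y': 4 }) },
        map { 'name': 'B', 'location': map { 'x': 5, 'y': 6 } }"/>

    <!-- Assumption: the 'adaptive' method, which accepts any sequence of items -->
    <xsl:message select="serialize($places, map { 'method': 'adaptive' })"/>
  </xsl:template>

</xsl:stylesheet>
```

Each map is written on a single line using map-constructor syntax, which is why nested structures quickly become hard to scan.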

Preventing Information Overload

With print-debugging, we can use a range of strategies to reduce the volume of printed output. We can also use text-search facilities in our terminal to quickly find specific labels in the output.

Knowing that an <xsl:message> instruction capable of printing many variables can be auto-generated in an instant, we should remove these instructions once there is no immediate need for them; they can be quickly reinstated if required later.

Enclosing a generated <xsl:message> instruction within an <xsl:if> conditional lets us specify the conditions under which the instruction is invoked. We can also modify the XPath expression that references a variable to filter the output. For example, with XPath sequences it is easy to add a predicate to the expression inside the <xsl:message> so that only the last ten items in a sequence are printed.
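Both techniques can be combined, as in the following sketch. The variable name, the threshold of 100 and the header label are hypothetical and chosen only for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical watch variable holding a long sequence -->
    <xsl:variable name="items" as="xs:integer*" select="1 to 250"/>

    <!-- Condition: only print when the sequence is unexpectedly long -->
    <xsl:if test="count($items) gt 100">
      <xsl:message>
==== demo: long sequence ====
count:   {count($items)}
lastTen: {$items[position() gt last() - 10]}
      </xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The predicate position() gt last() - 10 keeps only the final ten items, so the printed line stays short however long the sequence grows.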

XSLT 3.0 debugging in practice

I use print-debugging almost exclusively for XSLT 3.0 development now. There are however occasions when, if the XSLT output is XML, and the code of concern is solely responsible for generating that output, it is easier to add special ‘trace attributes’ to the XML elements in the output to indicate what generated those elements and why.
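A minimal sketch of the ‘trace attributes’ idea follows. The element names and the attribute names trace-rule and trace-source are hypothetical; any naming scheme that is easy to strip out later would do.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical rule: source <para> elements become <p> elements -->
  <xsl:template match="para">
    <!-- trace-rule records which template produced the element;
         trace-source records the path of the source node it came from -->
    <p trace-rule="match=para" trace-source="{path()}">
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

The trace information then travels with the output itself, so each element can be inspected in place rather than matched against lines in a terminal.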

I find print-debugging especially useful with XPath maps, which are valuable in many scenarios that involve processing non-trivial interim data. Also, it is easy to add a single print command to track the state of all parameters in code components like an <xsl:iterate> instruction or a recursive template or function at development time, to check the code is performing as intended.

In the past, hand-crafting <xsl:message> instructions was painstaking and error-prone. I would therefore keep <xsl:message> instructions on a branch in the version control system until I was very confident they were no longer needed. I would also frequently comment out these instructions only to restore them later. These <xsl:message> instructions therefore made the code bloated and less easy to read as a whole.

Now, however, I delete <xsl:message> instructions as soon as the immediate task is complete, safe in the knowledge that they can be regenerated just as quickly if needed again.
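As an illustration of tracking iteration state with one print command, here is a minimal sketch using <xsl:iterate>. The parameter name and the input sequence are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <xsl:template name="xsl:initial-template">
    <xsl:iterate select="1 to 5">
      <!-- Hypothetical iteration parameter carrying state between passes -->
      <xsl:param name="runningTotal" as="xs:integer" select="0"/>

      <xsl:on-completion>
        <total>{$runningTotal}</total>
      </xsl:on-completion>

      <!-- Single print command: context item and parameter state on each pass -->
      <xsl:message>[iterate] item: {.}   runningTotal: {$runningTotal}</xsl:message>

      <xsl:next-iteration>
        <xsl:with-param name="runningTotal" select="$runningTotal + ."/>
      </xsl:next-iteration>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

Scrolling through the printed lines shows how the parameter evolves from one pass to the next, which is the ‘oscilloscope’ view described earlier.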

With print-debugging, because XSLT processors can and do perform ‘lazy execution’, we should be aware that print commands can either affect the order of execution of our code, or indeed force execution of code that would otherwise be skipped. So long as we are aware of this we should not get caught out. Step-through debuggers face a similar problem, in that they tell us a story about the order of execution which may not be the reality once full optimisation is in place at runtime.

XSLT Print-debugging future

Improved Serialisation

The ‘ext:print()’ function demonstrated earlier is almost indispensable for print debugging; it does, however, have room for improvement. For example, the way XML elements and their values are represented is something of a compromise. It would also be nice to see a built-in equivalent when XSLT 4.0 comes along.

Generation of <xsl:message>

The <xsl:message> instruction auto-generated in VS Code has a ‘header’ line that could contain more useful information. For example, if there is a context position, this could be included along with the context size.
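Until such a header is generated automatically, it can be written by hand. The sketch below is not what VS Code currently generates; it simply shows position() and last() added to a hypothetical header line.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical input sequence -->
    <xsl:for-each select="('alpha', 'beta', 'gamma')">
      <!-- Header shows context position and context size -->
      <xsl:message>
==== demo: item {position()} of {last()} ====
value: {.}
      </xsl:message>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```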

The XPath trace() function

In XSLT 3.0 and the proposed XSLT 4.0, XPath plays a much more prominent role as it has become more powerful and expressive. As more processing is performed within more complex XPath expressions, the importance of XPath’s trace() function will increase. VS Code (and other XSLT editors) should be extended in future to help with wrapping XPath expressions in a trace() function, and removing that trace() function once it is no longer required.
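To show what such wrapping looks like when done by hand, here is a minimal sketch in which part of an XPath expression is wrapped in trace(). The variable names, values and label are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    expand-text="yes">

  <xsl:template name="xsl:initial-template">
    <xsl:variable name="prices" as="xs:decimal*" select="(4.50, 12.00, 7.25)"/>

    <!-- trace() returns its first argument unchanged, so it can wrap a
         sub-expression without altering the result of the whole expression -->
    <xsl:variable name="total" as="xs:decimal"
        select="sum(trace($prices[. gt 5], 'prices over 5'))"/>

    <!-- Using $total in the result ensures the traced expression is evaluated -->
    <total>{$total}</total>
  </xsl:template>

</xsl:stylesheet>
```

Where the trace output appears is implementation-dependent. Removing the trace is a matter of deleting the function name, the label argument and one pair of parentheses, which is the kind of edit an editor could automate.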
