The Mystical Principles of XSLT: Enlightenment through Software Visualization

Evan Lenz, President
August 2, 2016

Grokking XSLT
 Easy to learn just enough to be dangerous
− Lots of terrible code in the wild
 Understanding template rules is elusive
− Not what procedural programmers expect
− Lots of explanations that are partial, imprecise, or ambiguous
 XSLT really isn’t that difficult a language
− But only if you grok template rules
 A lot that’s invisible

Avoiding xsl:apply-templates

Befriending xsl:apply-templates

By analogy to OO
 From XSLT 1.0 Pocket Reference:
− http://lenzconsulting.com/how-xslt-works/#applying_template_rules
• “In object-oriented (OO) terms, xsl:apply-templates is like a function that
iterates over a list of objects (nodes) and, for each object, calls the same
polymorphic function. Each template rule in your stylesheet defines a different
implementation of that single polymorphic function. Which implementation is
chosen depends on the runtime characteristics of the object (node). Loosely
speaking, you define all the potential bindings by associating a “type” (pattern)
with each implementation (template rule).”
− http://lenzconsulting.com/how-xslt-works/#modes
• “Furthering the OO analogy, a mode name identifies which polymorphic
function to execute repeatedly in a call to xsl:apply-templates. When
mode="foo" is set, foo acts as the name of a polymorphic function, and each
template rule with mode="foo" defines an implementation of the foo ‘function.’”
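The mode analogy can be sketched in a small stylesheet. This is an illustration only, not taken from the talk: the element names (doc, section, title, para) and the mode names toc and content are hypothetical. The same section nodes are processed twice, once per mode, and the rule with a predicate shows an implementation being chosen by the run-time characteristics of the node.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entry point: process the same <section> children twice,
       once per "polymorphic function" (mode) -->
  <xsl:template match="/doc">
    <result>
      <toc>
        <xsl:apply-templates select="section" mode="toc"/>
      </toc>
      <body>
        <xsl:apply-templates select="section" mode="content"/>
      </body>
    </result>
  </xsl:template>

  <!-- Implementation of the "toc" function for <section> nodes -->
  <xsl:template match="section" mode="toc">
    <entry><xsl:value-of select="title"/></entry>
  </xsl:template>

  <!-- A more specific "toc" implementation; the predicate gives this
       pattern a higher default priority than plain "section" -->
  <xsl:template match="section[@type eq 'appendix']" mode="toc">
    <entry class="appendix"><xsl:value-of select="title"/></entry>
  </xsl:template>

  <!-- Implementation of the "content" function for <section> nodes -->
  <xsl:template match="section" mode="content">
    <div><xsl:apply-templates select="para" mode="content"/></div>
  </xsl:template>

  <!-- Implementation of the "content" function for <para> nodes -->
  <xsl:template match="para" mode="content">
    <p><xsl:value-of select="."/></p>
  </xsl:template>

</xsl:stylesheet>
```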

By analogy to other instructions
 This:
<xsl:apply-templates select="animal"/>
…
<xsl:template match="animal[@type eq 'cat']">…</xsl:template>
<xsl:template match="animal[@type eq 'dog']">…</xsl:template>
 is (more or less) equivalent to this:
<xsl:for-each select="animal">
<xsl:choose>
<xsl:when test="@type eq 'cat'">…</xsl:when>
<xsl:when test="@type eq 'dog'">…</xsl:when>
</xsl:choose>
</xsl:for-each>

De-mystifying XSLT wizardry
 Mysticism:
− “a theory postulating the possibility of direct and intuitive
acquisition of ineffable knowledge or power”
 In relation to XSLT:
− Belief in the possibility that there is a way to make XSLT’s
behavior more directly and intuitively understandable
• By making implicit behavior explicit and the abstract concrete, through an
interactive visualization

Elusiveness of the task
 Alternating between faith and doubt that what I’m
imagining is possible or useful
 Grand schemes that collapse into themselves
 Making abstract ideas concrete
− Shapes in my mind converted to shapes on the screen
 Disillusionment
− Finding out that certain ideas weren’t well-formed
− The crucible of incarnation
 The feeling that there’s nothing really “there”

Like a debugger?
 Debugging is like the scientific method
− The debugger is the instrument of measurement
− Start with a hypothesis
− Set:
• breakpoints
• watch variables
− One step at a time, careful not to “step over” when we want to
“step into”—otherwise we’ve missed our chance and have to start
all over (because we can’t go backwards in time)

Not like a debugger
 Software visualization for XSLT:
− A holistic experience, rather than a narrow line of inquiry
− A space in which to freely explore, in any direction
− Not bound by our previous steps
− No need to start with a question
− Just start playing around and see what we will see

Not like a debugger
 Sri Aurobindo on gnostic knowledge:
− “For while the reason [like a debugger] proceeds from moment to
moment of time and loses and acquires and again loses and
again acquires, the gnosis [or visualization tool] dominates time
in a one view and perpetual power and links past, present and
future in their indivisible connections, in a single continuous map
of knowledge, side by side.” (The Synthesis of Yoga, p. 464)

Escaping the bounds of time
 A transformation is fundamentally:
− a data set
− not a process
 XSLT’s functional nature
− More like an abstract sculpture, showing relationships
− Less like a movie
 Time is just one way to traverse the relationships:
− we can go forward and backward in time (result document order)
− we can step outside of time
− we can traverse the relationships using a different dimension

Transformation as microcosm
 “The Big Bang”
− the initial context node
 “The Engine of Creation”
− template rules
 “Emergence/Manifestation”
− the result tree

Scope and method
 Scope:
− Template rules (and maybe xsl:for-each)
• XSLT’s core processing model
− XPath, though foundational, is out of scope
 Method:
− Explode a transformation into all its parts
− Put them back together

Fundamental unit: the focus
 As defined in the XSLT Recommendation, consists of:
− the context item (.),
− the context position (position()), and
− the context size (last()).
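The three parts of the focus can be made visible with a few lines of XSLT. This is a sketch under assumed names: the list and item elements in the input, and the foci and focus elements in the output, are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/list">
    <foci>
      <!-- Each selected <item> becomes the context item in turn;
           position() and last() report where it sits in the selected list -->
      <xsl:for-each select="item">
        <focus position="{position()}" size="{last()}">
          <xsl:value-of select="."/>
        </focus>
      </xsl:for-each>
    </foci>
  </xsl:template>

</xsl:stylesheet>
```

For an input list with three item children, this yields three focus elements with positions 1 to 3, each reporting a size of 3.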

An expanded definition of focus
 We will define “focus” as:
− the particular instance of a focus
− i.e. each instantiation of a focus-changed sequence constructor
• body of <xsl:template>, body of <xsl:for-each>, etc.
 Then we can consider the shallow result chunk implied by
a focus to be an intrinsic property of that focus
− 1:1 relationship

For example
 Given a single instantiation of this rule:
<xsl:template match="heading">
<title>
<xsl:value-of select="."/>
</title>
</xsl:template>
 We could represent the “focus” like this:
<trace:focus context-id="n1a833c7b3a005151"
context-position="1"
context-size="1"
rule-id="nfa49863d62353482"
invocation-id="bc08a736-13f7-e97b-a272">
<title>This is the title</title>
</trace:focus>

For example
 Reading the <trace:focus> representation, part by part:
− invocation-id: cosmic address (GUID) for the current invocation of <xsl:apply-templates/>
− context-size: how many nodes are in the list being processed by that invocation; in other words, how many foci belong to the current invocation
− context-position: the position of the current focus in that list
− context-id: cosmic address (generated ID) of the node being processed (a <heading> element)
− rule-id: cosmic address (generated ID) of the matching template rule in the stylesheet
− element content: the shallow result chunk that this focus creates
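One way a rule could emit such a representation itself is sketched below. It is an assumption for illustration, not the visualizer's actual code: the trace namespace URI, the invocation-id parameter and its default, and the literal rule-id value are all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:trace="http://example.com/trace">

  <xsl:template match="heading">
    <!-- Hypothetical parameter: the caller would pass an ID that
         identifies its xsl:apply-templates invocation -->
    <xsl:param name="invocation-id" select="'unknown'"/>
    <!-- Wrap the rule's own result in a record of the focus that produced it -->
    <trace:focus context-id="{generate-id(.)}"
                 context-position="{position()}"
                 context-size="{last()}"
                 rule-id="rule-heading"
                 invocation-id="{$invocation-id}">
      <!-- The original body of the rule, unchanged -->
      <title>
        <xsl:value-of select="."/>
      </title>
    </trace:focus>
  </xsl:template>

</xsl:stylesheet>
```

Here generate-id() supplies the node's generated ID, while position() and last() supply the context position and size.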

Why it’s called a “shallow” result chunk
 It doesn’t include the results of nested invocations:
<trace:focus context-id="n292122aca28a94ca"
context-size="1"
rule-id="n8ef8c66cac078ff"
invocation-id="4dd7580b-d44e-a0e4-714d">
<html>
<head>
<trace:invocation id="bc08a736-13f7-e97b-a272"/>
</head>
<body>
<trace:invocation id="6129c19f-2632-e99c-53b5"/>
</body>
</html>
</trace:focus>
 Each <trace:invocation> placeholder:
− Links to zero or more other foci having the given invocation ID

Mystical principle: To see is to create
 Inner seeing results in outer manifestation
− They are not separate; they are two sides of the same coin.
 To focus on the input is to create the output
− (as defined by the template rule)
 They are inextricably linked, 1-to-1
− As defined here, the focus and the result chunk are intrinsic to
each other.

Mystical principle: All is accessible
 Each focus is connected to every other focus via
successive invocation-id links
 Every entity (node, rule, or focus) has a “cosmic address”
(GUID or generated node ID)
− Unique, globally accessible address within the world of the
transformation
 You can begin anywhere in the world and traverse to
anywhere else in the world, including past, present, future
 Nothing is lost through the passage of time

Mystical principle: Inner upholds outer
 The result tree is not only isomorphic to the tree of foci
(the XSLT execution tree)
 It is an adornment of that tree
− The result is the part we see
− The inner structure upholds the outer result
 The visible world is a decoration of the invisible
− a manifestation of a deeper, unseen reality

Mystical principle: There is no separation
 Distinctions are only by definition and thus arbitrary
 A focus and result chunk are “intrinsic” to each other only
because we are looking at it that way
 A focus is “extrinsic” to another focus only because we
defined it that way
 Ultimately, there is only an unbroken whole
 But:
− Division and distinction are functions we possess
− There is joy in the division and joy in the reunion
− That’s why we do it 

Bringing this down to earth: a case study
 Client project to rewrite an enrichment engine
 Technical articles are automatically enriched with search
terms based on their content
− (using a taxonomy of terms and reverse queries in MarkLogic)
 There are various business rules about which types of
articles should be enriched with which types of terms
 The rules need to evolve over time
 Most importantly, the rule behavior needs to be
transparent to the analysts and taxonomy managers

XSLT pipeline for enrichment rules
 For the rewritten engine’s enrichment rules, I decided to
use XSLT as the rules language (surprise, surprise!)
− declarative and powerful
− simplified by using a multi-stage pipeline
− good in conjunction with the modified identity transform
 The article passes through unchanged, except that new
metadata (“enrichment”) is inserted into each section
 Example stages:
− get-terms
− add-inherited
− remove-subsumed
− remove-excluded
− add-more-terms
− filter-terms
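A multi-stage pipeline built on the modified identity transform might look like the sketch below. It is not the client engine's code: only two of the stage names above are borrowed as mode names, and the section, enrichment, and term elements and the excluded attribute are hypothetical. The terms are hard-coded where the real engine would query for them.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Run the stages in sequence: each stage's result is the next stage's input -->
  <xsl:template match="/">
    <xsl:variable name="stage1">
      <xsl:apply-templates select="node()" mode="get-terms"/>
    </xsl:variable>
    <xsl:variable name="stage2">
      <xsl:apply-templates select="$stage1/node()" mode="remove-excluded"/>
    </xsl:variable>
    <xsl:sequence select="$stage2"/>
  </xsl:template>

  <!-- Modified identity transform: by default, both stages copy everything -->
  <xsl:template match="@* | node()" mode="get-terms remove-excluded">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Stage 1 override: insert enrichment metadata into each section
       (hard-coded placeholder terms stand in for a real lookup) -->
  <xsl:template match="section" mode="get-terms">
    <xsl:copy>
      <xsl:apply-templates select="@*" mode="#current"/>
      <enrichment>
        <term>placeholder-term</term>
        <term excluded="yes">unwanted-term</term>
      </enrichment>
      <xsl:apply-templates select="node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Stage 2 override: drop terms flagged as excluded -->
  <xsl:template match="term[@excluded eq 'yes']" mode="remove-excluded"/>

</xsl:stylesheet>
```

Because every stage falls back to the identity rule, each stage only needs rules for the nodes it changes, which keeps the individual stages small.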

Distinction between “rules” and “engine”
 Entire enrichment engine consists of XSLT template rules
− (and imported XQuery libraries)
 But not all template rules are created equal
 Some modes are “engine-level”
− not meant to be regularly customized
 Some modes are “rules-level”
− direct implementations of the business rules
− meant to be customized to handle different content scenarios
 The latter are the ones that need to be made transparent

Making the business rules transparent
 Modes are grouped by pipeline stage
 Each mode has a default behavior
− e.g. mode="terms" by default uses a reverse query to find all
matching terms
− can be overridden to provide a different behavior
• (such as to fix the term for a particular article type, regardless of the content)
 The default rules don’t need to be shown
− Only the custom overrides

Trace-enabling the XSLT
 Used XSLT to pre-process (“trace-enable”) the original
engine and rules XSLT
 The trace-enabled stylesheet additionally generates:
− inline trace data
• <trace:match-start> and <trace:match-end> markers in the result tree
− out-of-band trace data
• using xdmp:set() in MarkLogic
 The “rules-level” modes are annotated in the original
engine code
− so the trace-enabler knows which template rules to augment
− [show example, line 538 of engine.xsl]
− [also show trace data example in oXygen]
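The pre-processing idea can be sketched as a stylesheet that transforms another stylesheet. This is a simplified assumption, not the actual trace-enabler: the trace namespace URI and the trace:enable="yes" annotation on rules are hypothetical, and the markers carry only a rule ID.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:trace="http://example.com/trace">

  <!-- Default: copy the original stylesheet through unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Augment only the template rules carrying the (hypothetical)
       trace:enable="yes" annotation -->
  <xsl:template match="xsl:template[@match][@trace:enable eq 'yes']">
    <xsl:copy>
      <!-- Keep the rule's attributes, minus the annotation itself -->
      <xsl:apply-templates select="@* except @trace:enable"/>
      <!-- xsl:param elements must stay first in the rule body -->
      <xsl:apply-templates select="xsl:param"/>
      <!-- The ID is computed now, so the generated stylesheet
           contains it as a literal attribute value -->
      <trace:match-start rule-id="{generate-id(.)}"/>
      <xsl:apply-templates select="node() except xsl:param"/>
      <trace:match-end rule-id="{generate-id(.)}"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

In the generated stylesheet the two markers are ordinary literal result elements, so they appear in the result tree around whatever the rule produces. That works only where extra elements in the rule's result are harmless, for example not in a rule that constructs attributes.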

Other techniques used
 Documenting the rules inline, so they get fed straight to
the tracer interface
 Storing the trace data for each input document into the
(MarkLogic) database for faster subsequent renders
 Automatically invalidating (and thus forcing re-generation
of) the cached trace data whenever a rule or data change
is detected

Generalizing to arbitrary XSLT

Characteristics to preserve
 Colored lines to depict relationships
 Visually represent XML as XML
 Interactive slider for building/un-building the result

Difficulties
 How to visualize temporary trees?
− i.e. results stored in <xsl:variable>
 Modifying template rule results could break the stylesheet
− type errors
− node count dependencies, etc.

Solution: Store trace data out-of-band
 Via side effects
− Write a document to a database, or
− Update a global variable
 Benefits:
− avoid type errors
− avoid unexpected (altered) stylesheet behavior
− shredded trace data may suggest more scalable visualization
implementations
• (e.g. for large source documents)
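As a portable stand-in for a database write or a global-variable update, the sketch below sends trace data through xsl:message. This is a swapped-in technique chosen for illustration, not the approach described above. The trace namespace URI and the literal rule-id value are hypothetical, and how messages are delivered depends on the processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:trace="http://example.com/trace">

  <xsl:template match="heading">
    <!-- Trace data leaves through the message channel,
         not through the result tree -->
    <xsl:message>
      <trace:focus context-id="{generate-id(.)}"
                   context-position="{position()}"
                   context-size="{last()}"
                   rule-id="rule-heading"/>
    </xsl:message>
    <!-- The rule's result is exactly what the untraced rule would produce -->
    <title>
      <xsl:value-of select="."/>
    </title>
  </xsl:template>

</xsl:stylesheet>
```

Since the result tree is untouched, the type errors and node-count dependencies listed under Difficulties do not arise.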

Links
 Online demo here:
− http://xmlportfolio.com/xslt-visualizer-demo/
 Code on GitHub:
− http://github.com/evanlenz/xslt-visualizer
