XQuery/Topological Sort

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Motivation[edit | edit source]

You have a Directed Acyclic Graph (DAG) to track things such as a dependency graph. You want to sort in input DAG of nodes so that in the output reflects the dependency structure.

The Topological Sort of a Directed Acyclic Graph puts nodes in a sequence such that every node references only preceding nodes.

This ordering is needed for example in scheduling processes in a Pipeline.

For example, given a DAG defined as

<node id="a">
       <ref id="b"/>
       <ref id="c"/>
</node>
<node id="b">
       <ref  id="c"/>
</node>
<node id="c"/>

the topological order would be:

<node id="c"/>
<node id="b">
       <ref  id="c"/>
 </node>
<node id="a">
       <ref id="b"/>
       <ref id="c"/>
</node>

The definition of topological order can be simply expressed in XQuery:

declare function local:topological-sorted($nodes) as xs:boolean {
    every $n in $nodes satisfies
          every $id in $n/ref/@id 
                 satisfies $id = $n/preceding::node/@id
};

A recursive algorithm is also straightforward:

declare function local:topological-sort($unordered, $ordered )   {
    if (empty($unordered))
    then $ordered
    else
        let $nodes := $unordered [ every $id in ref/@id satisfies $id = $ordered/@id]
        return 
          if ($nodes)
          then local:topological-sort( $unordered except $nodes, ($ordered, $nodes ))
          else ()  (: cycles so no order possible :)
};

which is invoked as

let $graph :=
 <graph>
    <node id="a">
       <ref id="b"/>
       <ref id="c"/>
    </node>
    <node id="b">
       <ref  id="c"/>
    </node>
    <node id="c"/>
 </graph>
let $sortedNodes := <graph>{local:topological-sort($graph/node,())}</graph>
return local:topological-sorted($sortedNodes/node)

Explanation[edit | edit source]

$ordered is initially the original sequence, $ordered is empty. At each iteration, the set of nodes which are dependent only on the ordered nodes are calculated and these are removed from the unordered nodes and added to the ordered nodes.

References[edit | edit source]