XQuery/Sequences


Motivation

You want to manipulate a sequence of items. These items may be very similar to each other or they may be of very different types.

Method

We begin with some simple examples of sequences. We then look at the most common sequence operators. XQuery uses the word sequence as a generic name for an ordered container of items.

Understanding how sequences work in XQuery is central to understanding how the language works. The use of generic sequences of items is central to functional programming and stands in sharp contrast to other programming languages such as Java or JavaScript that provide multiple methods and functions to handle key-value pairs, dictionaries, arrays and XML data. The wonderful thing about XQuery is that you only need to learn one set of concepts and a very small list of functions to learn how to quickly manipulate data.

Examples

Creating sequences of characters and strings

You use parentheses to contain a sequence, commas to delimit items and quotes to contain string values:

   let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')

Note that you can use single or double quotes, but for most character strings a single quote is used.

   let $sequence := ("apple", 'banana', "carrot", 'dog', "egg", 'fig')

You can also intermix data types. For example the following sequence has three strings and three integers in the same sequence.

   let $sequence := ('a', 'b', 'c', 1, 2, 3)

You can then pass the sequence to any XQuery function that works with sequences of items. For example the "count()" function takes a sequence as an input and returns the number of items in the sequence.

   let $count := count($sequence)

To see the results of these items you can create a simple XQuery that displays the items using a FLWOR expression.

Viewing items in a sequence

xquery version "1.0";
let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
let $count := count($sequence)
return
   <results>
      <count>{$count}</count>
      <items>
       {for $item in $sequence
        return
          <item>{$item}</item>
        }
      </items>
   </results>

Results:

   <results>
      <count>6</count>
      <items>
         <item>a</item>
         <item>b</item>
         <item>c</item>
         <item>d</item>
         <item>e</item>
         <item>f</item>
      </items>
   </results>

Viewing select items in a sequence

Items within a sequence can be selected using a predicate expression.

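Items can be selected by position (counting starts at 1). To select several positions at once, compare "position()" with a sequence of numbers:

xquery version "1.0";
let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
return
   <items>{
      for $item in $sequence[position() = (1, 3, 4)]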
      return
           <item>{$item}</item>
      }
   </items>

Results:

<items>
    <item>a</item>
    <item>c</item>
    <item>d</item>
</items>

or by value:

xquery version "1.0";
let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
return
    <items>{
       for $item in $sequence[. = ('a', 'e')]
       return
          <item>{$item}</item>
       }
    </items>

Results:

<items>
    <item>a</item>
    <item>e</item>
</items>

Adding XML elements to your sequence

You can also store XML elements in a sequence:

let $sequence := ('apple', <banana/>, <fruit type="carrot"/>, <animal type='dog'/>, <vehicle>car</vehicle>)

Although you can use parentheses to create a sequence of XML items, you can also use XML tags to begin and end a sequence and to store all items as XML elements.

Here is an example of this:

let $items := 
   <items>
      <banana/>
      <fruit type="carrot"/>
      <animal type='dog'/>
      <vehicle>car</vehicle>
   </items>

One layout convention is to put all individual items in their own item element tags and to place each item on a separate line if the list of items gets long:

let $items := 
   <items>
      <item>banana</item>
      <item>
         <fruit type="carrot"/>
      </item>
      <item>
         <animal type='dog'/>
      </item>
      <item>
         <vehicle>car</vehicle>
      </item>
   </items>

The following FLWOR expression can then be used to display each of these items:

xquery version "1.0";

let $sequence :=
    <items>
       <item>banana</item>
       <item>
          <fruit type="carrot"/>
       </item>
       <item>
          <animal type='dog'/>
       </item>
       <item>
          <vehicle>car</vehicle>
       </item>
    </items>
  

return
   <results>{
      for $item in $sequence/item
      return
         <item>{$item}</item>
   }</results>

This will return the following XML:

<results>
    <item>
        <item>banana</item>
    </item>
    <item>
        <item>
            <fruit type="carrot"/>
        </item>
    </item>
    <item>
        <item>
            <animal type="dog"/>
        </item>
    </item>
    <item>
        <item>
            <vehicle>car</vehicle>
        </item>
    </item>
</results>

Note that when the resulting XML is returned, attribute values are written with double quotes only, even where the query used single quotes.

Common Sequence Functions

There are only a handful of functions you will need to use with sequences. We will review these functions and also show you how to create new functions using combinations of these functions.

Here are the three most common non-mathematical functions used with sequences. These three are the real workhorses of XQuery sequences. You can spend days writing XQueries and never need any others:

  count($seq as item()*) - used to count the number of items in a sequence.
   Returns a non-negative integer.
  distinct-values($seq as xs:anyAtomicType*) - used to remove duplicate values from a sequence.
   Returns another sequence.
  subsequence($seq as item()*, $startingLoc as xs:double, $length as xs:double) - used to return only a subset of items in a sequence.
   Returns another sequence. $startingLoc and $length are typed as xs:double; non-integer values are rounded to the nearest integer.
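As a minimal sketch of these three functions working together (the sample values are invented for this illustration):

```xquery
xquery version "1.0";
(: sample data invented for this illustration :)
let $seq := ('a', 'b', 'b', 'c', 'c', 'c')
return
   <results>
      <count>{count($seq)}</count>
      <distinct>{distinct-values($seq)}</distinct>
      <subset>{subsequence($seq, 2, 3)}</subset>
   </results>
```

Here the count is 6, the distinct values are a, b and c (the specification leaves the order of the values returned by "distinct-values()" up to the implementation), and the subset holds the three items starting at position 2, that is b b c.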

All of these functions take a sequence as their first argument; the "*" in a type such as item()* is read "zero or more". Note that both the "distinct-values()" function and the "subsequence()" function take in a sequence and return a sequence. This comes in very handy when you are creating recursive functions.

Along with "count()" there are also a few aggregate functions that calculate the sum, average, minimum and maximum of a sequence:

  sum($seq as item()*) - used to sum the values of numbers in a sequence
  avg($seq as item()*) - used to calculate the average (arithmetic mean) of numbers in a sequence
  min($seq as item()*) - used to find the minimum value of a sequence of numbers
  max($seq as item()*) - used to find the maximum value of a sequence of numbers

These functions are designed to work on the numeric values of items, and when given numbers they return a numeric value (you may want to use the "number()" function when the items are strings that contain numbers).
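The following sketch applies the four aggregate functions to a small sequence of integers, and then sums a sequence of numeric strings by converting each one with "number()" first. The sample values are invented for this illustration:

```xquery
xquery version "1.0";
(: sample data invented for this illustration :)
let $numbers := (3, 1, 4, 1, 5)
let $strings := ('10', '20', '30')
return
   <stats>
      <sum>{sum($numbers)}</sum>
      <avg>{avg($numbers)}</avg>
      <min>{min($numbers)}</min>
      <max>{max($numbers)}</max>
      <sum-of-strings>{sum(for $s in $strings return number($s))}</sum-of-strings>
   </stats>
```

The sum is 14, the average is 2.8, the minimum is 1, the maximum is 5, and the converted strings add up to 60.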

You may find that you can perform many tasks just by learning these few XQuery functions. You can also create most other sequence operators from these functions.

Occasionally Used Sequence Functions

In addition, there are some functions that return modified versions of the original sequence or locate items within it:

  insert-before($seq as item()*, $position as xs:integer, $inserts as item()*) - for inserting
     new items anywhere in a sequence
  remove($seq as item()*, $position as xs:integer) - removes an item from a sequence
  reverse($seq as item()*) - reverses the order of items in a sequence
  index-of($seq as xs:anyAtomicType*, $target as xs:anyAtomicType) - returns a sequence of integers that
     indicate where an item is within a sequence (index counting starts at 1)
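A short sketch of the four functions above, applied to small literal sequences chosen for this illustration:

```xquery
xquery version "1.0";
(: sample data invented for this illustration :)
let $seq := ('a', 'b', 'c')
return
   <results>
      <inserted>{insert-before($seq, 2, 'x')}</inserted>
      <removed>{remove($seq, 2)}</removed>
      <reversed>{reverse($seq)}</reversed>
      <positions>{index-of(('a', 'b', 'a'), 'a')}</positions>
   </results>
```

The elements contain "a x b c", "a c", "c b a" and "1 3" respectively. None of these functions changes $seq itself; each one returns a new sequence.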

The following two functions are typically used inside a bracketed predicate expression '[ ]', where they give access to position information within a sequence:

  last() - returns the number of items in the sequence being filtered, so a predicate of [last()]
     selects the last item: (1,2,3)[last()] returns 3
  position() - returns the position of the current item within the sequence being filtered, so
     for $x in ('a', 'b', 'c', 'd')[position() mod 2 eq 1] return $x returns ('a', 'c')
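As a small sketch, here are a few more predicates built from "last()" and "position()" (the element names are made up for this illustration):

```xquery
xquery version "1.0";
(: sample data and element names invented for this illustration :)
let $seq := ('a', 'b', 'c', 'd')
return
   <results>
      <last>{$seq[last()]}</last>
      <second-to-last>{$seq[last() - 1]}</second-to-last>
      <after-second>{$seq[position() gt 2]}</after-second>
   </results>
```

This gives d for the last item, c for the second-to-last item, and "c d" for the items after position 2.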

Example of Sum Function

Let's imagine that we have a basket of items and we want to find the total number of items in the basket:

let $basket :=
   <basket>
      <item>
         <department>produce</department>
         <type>apples</type>
         <count>2</count>
      </item>
      <item>
         <department>produce</department>
         <type>banana</type>
         <count>3</count>
      </item>
      <item>
         <department>produce</department>
         <type>pears</type>
         <count>5</count>
      </item>
      <item>
         <department>hardware</department>
         <type>nuts</type>
         <count>7</count>
      </item>
      <item>
         <department>packaged-goods</department>
         <type>nuts</type>
         <count>20</count>
      </item>
   </basket>

To sum the counts of each item we will need to use an XPath expression to get the item counts:

  $basket/item/count

We can then total this sequence and return the result:

return
   <total>
      {sum($basket/item/count)}
   </total>

The result is 37.

Finding if an Item is in a Sequence

Users find that XQuery is easy to use since it tries to do the right thing based on the data types you give it. XQuery checks if you have a sequence, an XML element or a single string, and performs the most logical operation. This behaviour keeps your code compact and easy to read. If you are comparing an element with a string, XQuery will look inside the element and get the string for you, so you do not need to tell XQuery explicitly to use the content of the element. When you compare a sequence of items with a string using the "=" operator, XQuery will look for that string in the sequence and return "true()" if any item in the sequence matches. It just works!

For example, given the sequence:

 let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')

If we execute:

 <result>{$sequence = 'd'}</result>

we get:

   <result>true</result>

"true()" is returned because 'd' is in the sequence. However:

  <result>{$sequence = 'x'}</result>

will return "false()" because 'x' is not in the sequence.
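The same comparison works when the sequence holds elements rather than strings, because XQuery compares the content of each element. A minimal sketch, using made-up letter elements:

```xquery
xquery version "1.0";
(: the letter elements are invented for this illustration :)
let $letters := (<letter>a</letter>, <letter>d</letter>)
return
   <results>
      <has-d>{$letters = 'd'}</has-d>
      <has-x>{$letters = 'x'}</has-x>
   </results>
```

The first comparison gives true and the second gives false.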

You can use the "index-of()" function to get the position of an item in a sequence. If the item is in the sequence, the function returns one positive integer for each position where it occurs. If not, it returns the empty sequence.

xquery version "1.0";
   let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
   let $item := 'x'
   return 
      <result>{index-of($sequence, $item)}</result>

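Because 'x' is not in the sequence, "index-of()" returns the empty sequence and the result element is empty.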

Sorting Sequence

XQuery 1.0 has no "sort" function (a "sort()" function was added in XQuery 3.1). To sort a sequence, you create a new sequence with a FLWOR expression that contains an "order by" clause.

  • For example if you have a list of items with titles as one of the elements you can use the following to sort the items by title:
  let $sorted-items :=
     for $item in $items
     order by $item/title/text()
     return $item
  • You can also use "descending" with "order by" to reverse the order:
  for $item in $items
  order by name($item) descending
  return $item
  • You can sort in your own order by creating a separate order sequence and then using the "index-of()" function in the "order by" clause, so that each item $i is arranged according to where its value appears in the order sequence:
  for $i in /root/*
  let $order := ("b", "a", "c")
  order by index-of($order, $i)
  return $i
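The three sorting patterns above can be tried together in one self-contained query. This is only a sketch: the items, their titles and the custom order are invented for this illustration.

```xquery
xquery version "1.0";
(: hypothetical items and custom order, invented for this illustration :)
let $items := (
   <item><title>pear</title></item>,
   <item><title>apple</title></item>,
   <item><title>fig</title></item>
)
let $order := ('fig', 'pear', 'apple')
return
   <results>
      <ascending>{
         for $item in $items
         order by $item/title
         return string($item/title)
      }</ascending>
      <descending>{
         for $item in $items
         order by $item/title descending
         return string($item/title)
      }</descending>
      <custom>{
         for $item in $items
         order by index-of($order, string($item/title))
         return string($item/title)
      }</custom>
   </results>
```

The ascending sort gives "apple fig pear", the descending sort gives "pear fig apple", and the custom sort follows $order to give "fig pear apple".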

Set Operations: Concatenation, Unions, Intersections and Exclusions

XQuery also provides functions to join sets and to find items that are in both sets.

Assume that we have two sets that contain overlapping items:

  let $sequence-1 := ('a', 'b', 'c', 'd')
  let $sequence-2 := ('c', 'd', 'e', 'f')

Concatenation

You can concatenate both sequences by doing the following:

  let $both := ($sequence-1, $sequence-2)

or, alternatively:

  for $item in ( ($sequence-1, $sequence-2)) return $item

Which will return:

 a b c d c d e f

Union

You can also create a union set that removes duplicates for all items that are in both sets by using the "distinct-values()" function:

  distinct-values(($sequence-1, $sequence-2))

This will return the following:

  a b c d e f

Note that the 'c d' pair is not repeated.

Intersection

You can now use a variation of this to find the intersection of all items in $sequence-1 that are also in $sequence-2:

  distinct-values($sequence-1[.=$sequence-2])

This will return only items that are in BOTH $sequence-1 AND $sequence-2:

  c d

The way you read this is "for each item in $sequence-1, if this item (.) is also in $sequence-2, then return it."

Exclusion

The last set operation you might want to do is the exclusion function, where we find all items in the first sequence that are NOT in the second sequence:

  distinct-values($sequence-1[not(.=$sequence-2)])

This will return:

  a b
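The four operations can be combined into a single query for experimenting. This sketch reuses the two sequences from above; the wrapper element names are made up for this illustration:

```xquery
xquery version "1.0";
(: wrapper element names are invented for this illustration :)
let $sequence-1 := ('a', 'b', 'c', 'd')
let $sequence-2 := ('c', 'd', 'e', 'f')
return
   <sets>
      <concatenation>{($sequence-1, $sequence-2)}</concatenation>
      <union>{distinct-values(($sequence-1, $sequence-2))}</union>
      <intersection>{distinct-values($sequence-1[. = $sequence-2])}</intersection>
      <exclusion>{distinct-values($sequence-1[not(. = $sequence-2)])}</exclusion>
   </sets>
```

Each element should hold the same values as the corresponding example above.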

Returning Duplicates

The following example returns a list of all items that occur more than once in a sequence. This process is known as "duplicate detection."

Method 1: using distinct-values()

We can use the distinct-values() function on a sequence to find all the unique values in a sequence. We can then check to see if there are any items that occur twice.

xquery version "1.0";

let $seq := ('a', 'b', 'c', 'd', 'e', 'f', 'b', 'c')
let $distinct-value := distinct-values($seq)

(: for each distinct item if the count is greater than 1 then return it :)
let $duplicates :=
   for $item in $distinct-value
   return
      if (count($seq[.=$item]) > 1) then
         $item
      else 
         ()

return
   <results>
      <sequence>{string-join($seq, ', ')}</sequence>
      <distinct-values>{$distinct-value}</distinct-values>
      <duplicates>{$duplicates}</duplicates>
   </results>

This returns:

<results>
   <sequence>a, b, c, d, e, f, b, c</sequence>
   <distinct-values>a b c d e f</distinct-values>
   <duplicates>b c</duplicates>
</results>

You can instead return only the items that occur exactly once by moving $item to the "else" branch of the "if" expression and putting "()" in the "then" branch:

     if (count($seq[. = $item]) > 1) then 
        ()
     else
        $item

Method 2: Using index-of()

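The following method uses the "index-of()" function to find the duplicates. For each item, "index-of($values, .)" returns every position at which that value occurs, and "[2]" picks the second of those positions, if there is one. Because this result is a number, the outer predicate treats it as a position, so an item is kept only when it is itself the second occurrence of its value: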

xquery version "1.0";
<duplicates>
  {
  let $values := (3, 4, 6, 6, 2, 7, 3, 1, 2)
  for $duplicate in $values[index-of($values, .)[2]]
  return
      <duplicate>{$duplicate}</duplicate>
  }
</duplicates>

This returns the values 6, 3 and 2, each in its own duplicate element.

Creating Sequences of Letters

You can use the codepoints functions to convert letters to numbers and numbers to letters. For example, to generate a list of all the letters from 'a' to 'z' you can write the following XQuery:

xquery version "1.0";
<letters> 
   {
   let $number-for-a := string-to-codepoints('a')
   let $number-for-z := string-to-codepoints('z')
   for $letter in ($number-for-a to $number-for-z)
   return
      codepoints-to-string($letter)
   } 
</letters>

This will return the sequence rendered as text:

  <letters>a b c d e f g h i j k l m n o p q r s t u v w x y z</letters>

Creating Letter Collections

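You can also use this technique to create one subcollection per letter. The example below uses "xmldb:create-collection()", which is an eXist-db extension function rather than part of standard XQuery: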

let $data-collection := '/db/apps/terms/data'
let $number-for-a := string-to-codepoints('a')
let $number-for-z := string-to-codepoints('z')
for $letter in ($number-for-a to $number-for-z)
  return
     xmldb:create-collection($data-collection, codepoints-to-string($letter) )

This is a very common way to store related files in subcollections.

Counting Items

It is very common to need to count your items as you go through them. You can do this by adding "at $count" to the "for" clause of your FLWOR expression:

for $item at $count in $sequence
return
  <item>
     <count>{$count}</count>
     {if ($count mod 2) then <odd/> else <even/>}
  </item>

Note that the "mod" operator:

 ($count mod 2)

returns 1 for odd numbers, which gets converted to "true()", and zero for even numbers, which gets converted to "false()". You can use this technique to make alternating rows of tables different colors.
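As a sketch of the alternating-rows idea, the query below puts an "odd" or "even" class on each table row. The item values and the class names are assumptions made for this illustration; any stylesheet that colors the rows would have to define matching classes.

```xquery
xquery version "1.0";
(: sample items and class names invented for this illustration :)
let $sequence := ('apple', 'banana', 'cherry')
return
   <table>{
      for $item at $count in $sequence
      return
         <tr class="{if ($count mod 2) then 'odd' else 'even'}">
            <td>{$item}</td>
         </tr>
   }</table>
```

The first and third rows get the class "odd" and the second row gets the class "even".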

Removing Numbers

You can filter out specific types of numbers by simply adding a predicate to the end of a sequence of numbers. For example, if you wanted to remove all odd numbers from a sequence of numbers, the expression you would use would be:

  $my-sequence-of-integers[. mod 2 = 0]

Which says "of all the current numbers in the sequence, if the current number modulo 2 has a value of 0 (i.e., 'if the current number is not odd'), then keep it in the result sequence".

Here is a full example:

xquery version "1.0";
declare function local:remove-odd($in as xs:integer*) as xs:integer* {
    $in[. mod 2 = 0]
};

let $thirty := 1 to 30
 
return
   <results>
       <in>{$thirty}</in>
       <out>{local:remove-odd($thirty)}</out>
   </results>

which returns:

<results>
   <in>1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30</in>
   <out>2 4 6 8 10 12 14 16 18 20 22 24 26 28 30</out>
</results>

Converting Sequences to a String

One of the most common things we do with a sequence is to convert it to a single string for display, and we frequently want to put a separator string between the values but not after the final value. XQuery includes a very handy function for this called "string-join()". Its format is:

  string-join($input-sequence as xs:string*, $separator-string as xs:string) as xs:string

For example, the output of:

 let $sequence := ('a', 'b', 'c')
 return string-join($sequence, '--')

would be:

 a--b--c

Note that there is no "--" after the last string. The separator is only used between the items of a sequence.
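In XQuery 1.0, "string-join()" expects a sequence of strings, so a sequence of numbers has to be converted first. A minimal sketch, with sample numbers invented for this illustration:

```xquery
xquery version "1.0";
(: sample data invented for this illustration :)
let $numbers := (1, 2, 3)
return
   string-join(for $n in $numbers return string($n), ', ')
```

This returns the single string "1, 2, 3".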

Combining Sequence Operations

It is very common to need to "chain" sequence operations in a linear sequence of steps. For example, if you wanted to sort a sequence of items and then select the first 10 items, your query might look like the following:

xquery version "1.0";
let $input-sequence := doc('vessels.xml')//Vessel
let $sorted-items :=
  for $item in $input-sequence
  order by $item/name
  return $item
return
   <ol>
     {
     for $item at $count in subsequence($sorted-items, 1, 10)
     return
      element li  {
         attribute class {if ($count mod 2) then 'odd' else 'even'} ,  (: this puts an even or odd class attribute in the li :)
         $item/name/text()
      }
     }
   </ol>

This technique can be used to paginate search results so that users see only the first 10 results. A control can then be used to fetch the next N items from the result.
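One possible way to fetch an arbitrary page is to compute the starting position from a page number. The function below is a sketch: the name local:page and its parameters are hypothetical, and the integers 1 to 25 stand in for real search results.

```xquery
xquery version "1.0";

(: hypothetical helper: returns the items on page $page, counting pages from 1 :)
declare function local:page(
   $items as item()*,
   $page as xs:integer,
   $page-size as xs:integer
) as item()* {
   subsequence($items, ($page - 1) * $page-size + 1, $page-size)
};

(: the integers 1 to 25 stand in for a sorted result sequence :)
<pages>
   <page-1>{local:page(1 to 25, 1, 10)}</page-1>
   <page-3>{local:page(1 to 25, 3, 10)}</page-3>
</pages>
```

Page 1 holds the items 1 through 10. Page 3 holds only 21 through 25, because "subsequence()" simply stops at the end of the sequence when fewer items remain than were requested.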