XQuery/Sequences

From Wikibooks, open books for an open world

Motivation

You want to manipulate a sequence of items. These items may be very similar to each other or they may be of very different types.

Method

We begin with some simple examples of sequences. We then look at the most common sequence operators. XQuery uses the word sequence as a generic name for an ordered container of items.

Understanding how sequences work in XQuery is central to understanding how the language works. The use of generic sequences of items is central to functional programming and stands in sharp contrast to languages such as Java or JavaScript, which provide many separate methods and functions for handling key-value pairs, dictionaries, arrays and XML data. In XQuery you only need to learn one set of concepts and a small list of functions to manipulate data quickly.

Examples

Creating sequences of characters and strings

You use parentheses to contain a sequence, commas to delimit items and quotes to contain string values:

   let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')

Note that you can use single or double quotes, but for most character strings a single quote is used.

   let $sequence := ("apple", 'banana', "carrot", 'dog', "egg", 'fig')

You can also intermix data types. For example the following sequence has three strings and three integers in the same sequence.

   let $sequence := ('a', 'b', 'c', 1, 2, 3)

You can then pass the sequence to any XQuery function that works with sequences of items. For example the count() function takes a sequence as an input and returns the number of items in the sequence.

   let $count := count($sequence)

To see the results of these items you can create a simple XQuery that displays the items using a FLWOR expression.

Viewing items in a sequence

   xquery version "1.0";
 
   let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
 
   let $count := count($sequence)
 
   return
   <results>
      <count>{$count}</count>
      <items>
      {for $item in $sequence
       return
          <item>{$item}</item>
      }
      </items>
   </results>

Results:

   <results>
      <count>6</count>
      <items>
         <item>a</item>
         <item>b</item>
         <item>c</item>
         <item>d</item>
         <item>e</item>
         <item>f</item>
      </items>
   </results>

Viewing specified items inside a sequence

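You can select individual items within a sequence by using the bracketed predicate expression '[]' and indicating the position of the items you are interested in. A single integer such as $sequence[3] selects one item. To select several positions at once, compare position() with a sequence of integers; a bare list such as $sequence[1, 3, 4] raises a type error in XQuery 1.0.

   xquery version "1.0";
 
   let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
 
   let $position := $sequence[position() = (1, 3, 4)]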
 
   return
   <results>
      <count>{$position}</count>
      <items>
      {for $item in $sequence
       return
          <item>{$item}</item>
      }
      </items>
   </results>

Results:

   <results>
      <count>a c d</count>
      <items>
         <item>a</item>
         <item>b</item>
         <item>c</item>
         <item>d</item>
         <item>e</item>
         <item>f</item>
      </items>
   </results>

Adding XML elements to your sequence

You can also store XML elements in a sequence.

let $sequence := ('apple', <banana/>, <fruit type="carrot"/>, <animal type='dog'/>, <vehicle>car</vehicle>)

Although you can use parentheses to create a sequence of XML items, a common practice is to use XML tags to begin and end the sequence and to store all items as XML elements. One suggestion is to use items as the element name to hold generic sequences of items. The wrapper is itself a single element, so the sequence of items is reached through its children, for example $items/*.

Here is an example of this:

let $items := 
<items>
  <banana/> <fruit type="carrot"/> <animal type='dog'/> <vehicle>car</vehicle>
</items>

The other convention is to put all individual items in their own item element tags and to place each item on a separate line if the list of items gets long.

let $items := 
    <items>
     <item>banana</item>
     <item>
       <fruit type="carrot"/>
     </item>
     <item>
       <animal type='dog'/>
     </item>
     <item>
       <vehicle>car</vehicle>
     </item>
    </items>

The following FLWOR expression can then be used to display each of these items:

xquery version "1.0";
 
let $sequence :=
    <items>
     <item>banana</item>
     <item>
       <fruit type="carrot"/>
     </item>
     <item>
       <animal type='dog'/>
     </item>
     <item>
       <vehicle>car</vehicle>
     </item>
    </items>
 
 
return
  <results>
  {for $item in $sequence/item
    return
      <item>{$item}</item>
    }
  </results>

This will return the following XML:

<results>
    <item>
        <item>banana</item>
    </item>
    <item>
        <item>
            <fruit type="carrot"/>
        </item>
    </item>
    <item>
        <item>
            <animal type="dog"/>
        </item>
    </item>
    <item>
        <item>
            <vehicle>car</vehicle>
        </item>
    </item>
</results>

Note that when the resulting XML is returned, only double quotes are present in the output.

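Counting Items

You can count the number of items in a sequence by passing it to the count() function. When the items are stored as children of a wrapper element, add /* to the end of the path so that the child elements are counted rather than the wrapper itself, for example count($items/*).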
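
A minimal sketch of both ways of counting, assuming the items are wrapped in an items element as in the examples above. The element names plain-count, wrapper-count and child-count are made up for this illustration.

```xquery
xquery version "1.0";

(: a plain sequence: count() is applied directly :)
let $sequence := ('a', 'b', 'c')

(: a wrapper element: /* selects its child elements :)
let $items :=
    <items>
        <banana/>
        <fruit type="carrot"/>
        <vehicle>car</vehicle>
    </items>

return
    <results>
        <plain-count>{count($sequence)}</plain-count>
        <wrapper-count>{count($items)}</wrapper-count>
        <child-count>{count($items/*)}</child-count>
    </results>
```

The three counts are 3, 1 and 3. The middle value shows why the /* step matters: without it only the single wrapper element is counted.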

Common Sequence Functions

There are only a handful of functions you will need to use with sequences. We will review these functions and also show you how to create new functions using combinations of these functions.

Here are the three most common non-mathematical functions used with sequences. These three are the real workhorses of XQuery sequences. You can spend days writing XQueries and never need functions beyond these three functions.

  count($seq as item()*) - used to count the number of items in a sequence.
   Returns a non-negative integer.
  distinct-values($seq as xs:anyAtomicType*) - used to remove duplicate values in a sequence.
   Returns another sequence, in an implementation-dependent order.
  subsequence($seq as item()*, $start as xs:double, $num as xs:double) - used to return only a subset of items in a sequence.
   Returns another sequence.

The type item()* is read "zero or more items", and xs:anyAtomicType* is read "zero or more atomic values"; nodes passed to distinct-values() are atomized to their values first. Note that both the distinct-values() function and the subsequence() function take in a sequence and return a sequence. This comes in very handy when you are creating recursive functions.
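
The three functions can be tried together in one short query. This is only a sketch; the sample data and the result element names are invented for the illustration.

```xquery
xquery version "1.0";

let $seq := ('a', 'b', 'b', 'c', 'c', 'c')
return
    <results>
        <!-- number of items, duplicates included -->
        <count>{count($seq)}</count>
        <!-- each value once -->
        <distinct>{distinct-values($seq)}</distinct>
        <!-- two items starting at position 1 -->
        <first-two>{subsequence($seq, 1, 2)}</first-two>
        <!-- functions compose because sequences go in and come out -->
        <distinct-count>{count(distinct-values($seq))}</distinct-count>
    </results>
```

The count is 6, the first two items are a and b, and the distinct count is 3. The distinct values are a, b and c, although the specification leaves the order in which distinct-values() returns them up to the implementation.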

Along with count() there are also a few aggregate functions that calculate sums, averages, minimums and maximums:

  sum($seq as xs:anyAtomicType*) - used to sum the values of numbers in a sequence
  avg($seq as xs:anyAtomicType*) - used to calculate the average (arithmetic mean) of numbers in a sequence
  min($seq as xs:anyAtomicType*) - used to find the minimum value of a sequence of numbers
  max($seq as xs:anyAtomicType*) - used to find the maximum value of a sequence of numbers


These functions are designed to work on the numeric values of items and, given numeric input, return numeric values. When the items are strings that contain numbers, you may want to convert them with the number() function first.
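
As an illustration, the following sketch applies the four aggregate functions to a small sequence of integers and shows one way of using number() on a sequence of strings. The sample values and element names are assumptions made for the example.

```xquery
xquery version "1.0";

let $numbers := (2, 3, 5, 7, 20)
let $strings := ('2', '3', '5')
return
    <results>
        <sum>{sum($numbers)}</sum>
        <avg>{avg($numbers)}</avg>
        <min>{min($numbers)}</min>
        <max>{max($numbers)}</max>
        <!-- sum($strings) would raise a type error, so convert each string first -->
        <sum-of-strings>{sum(for $s in $strings return number($s))}</sum-of-strings>
    </results>
```

The values are 37, 7.4, 2 and 20 for the integers, and 10 for the converted strings.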

You may find that you can perform many tasks just by learning these few XQuery functions. You can also create most other sequence operators from these functions.

Occasionally Used Sequence Functions

In addition there are some functions that are occasionally used:

  insert-before($seq as item()*, $position as xs:integer, $inserts as item()*) - inserts
     new items anywhere in a sequence
  remove($seq as item()*, $position as xs:integer) - removes the item at a position from a sequence
  reverse($seq as item()*) - reverses the order of items in a sequence
  index-of($seq as xs:anyAtomicType*, $target as xs:anyAtomicType) - returns a sequence of integers that
     indicate where an item is within a sequence (index counting starts at 1); the result is the
     empty sequence if the item is not found

The following two functions are used inside the bracketed predicate expression '[]', which operates on an item's position within a sequence.

  last() - returns the number of items being filtered, so when used alone in a predicate it selects the last item: (1,2,3)[last()] returns 3
  position() - returns the position of the current item within the sequence being filtered, so $seq[position() le 2] returns the first two items; inside a FLWOR expression use "at $count" instead
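
The functions in this section can be compared side by side in one query. This is a sketch with invented sample data and result element names, not a complete reference.

```xquery
xquery version "1.0";

let $seq := ('a', 'b', 'c', 'b')
return
    <results>
        <!-- insert x and y before position 2 -->
        <inserted>{insert-before($seq, 2, ('x', 'y'))}</inserted>
        <!-- drop the item at position 1 -->
        <removed>{remove($seq, 1)}</removed>
        <reversed>{reverse($seq)}</reversed>
        <!-- every position at which 'b' occurs -->
        <found>{index-of($seq, 'b')}</found>
        <!-- no match gives the empty sequence, so the count is 0 -->
        <not-found>{count(index-of($seq, 'z'))}</not-found>
        <last>{$seq[last()]}</last>
        <first-two>{$seq[position() le 2]}</first-two>
    </results>
```

In order, the elements contain "a x y b c b", "b c b", "b c b a", "2 4", "0", "b" and "a b". None of these functions changes $seq; each returns a new sequence.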

Example of Sum Function

Let's imagine that we have a basket of items and we want to count the total number of items in the basket.

let $basket :=
<basket>
   <item>
      <department>produce</department>
      <type>apples</type>
      <count>2</count>
   </item>
   <item>
      <department>produce</department>
      <type>banana</type>
      <count>3</count>
   </item>
   <item>
      <department>produce</department>
      <type>pears</type>
      <count>5</count>
   </item>
   <item>
      <department>hardware</department>
      <type>nuts</type>
      <count>7</count>
   </item>
   <item>
      <department>packaged-goods</department>
      <type>nuts</type>
      <count>20</count>
   </item>
</basket>

To sum the counts of each item we will need to use an XPath expression to get the item counts:

  $basket/item/count


We can then total this sequence and return the result:

return
   <total>
     {sum($basket/item/count)}
   </total>
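
Put together as one runnable query, the example might look like the sketch below. The basket is the same as above, written one item per line to save space. The produce-total element and its department filter are additions for illustration only.

```xquery
xquery version "1.0";

let $basket :=
    <basket>
        <item><department>produce</department><type>apples</type><count>2</count></item>
        <item><department>produce</department><type>banana</type><count>3</count></item>
        <item><department>produce</department><type>pears</type><count>5</count></item>
        <item><department>hardware</department><type>nuts</type><count>7</count></item>
        <item><department>packaged-goods</department><type>nuts</type><count>20</count></item>
    </basket>

return
    <totals>
        <!-- sum over every count element in the basket -->
        <total>{sum($basket/item/count)}</total>
        <!-- a predicate restricts the sum to one department -->
        <produce-total>{sum($basket/item[department = 'produce']/count)}</produce-total>
    </totals>
```

The total is 37 and the produce total is 10. The count elements are untyped, so sum() converts their content to numbers without an explicit number() call.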


Tests on Sequences

You can also test to see if a sequence contains one or all of the items in another set. There are several methods to do this.

Finding if an Item is in a Sequence

Users find that XQuery is easy to use since it tries to do the right thing based on the data types you give it. XQuery checks whether you have a sequence, an XML element or a single string and performs the most logical operation. This behavior keeps your code compact and easy to read. If you compare an element with a string, XQuery looks inside the element and gets its string value for you, so you do not need to tell XQuery explicitly to use the content of the element. If you compare a sequence of items with a string using the "=" operator, XQuery looks for that string in the sequence and returns true() if any item in the sequence is equal to it. It just works!

For example if we have the sequence:

 let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')

Now if we execute:

  if ($sequence = 'd')
     then true()
     else false()

the result is true(), because 'd' is in the sequence of letters, while:

  if ($sequence = 'x')
     then true()
     else false()

returns false(), because 'x' is not in the sequence.

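  • You can use the index-of() function to see where the item appears in the sequence. It returns the positions at which the item occurs, or the empty sequence if the item does not occur at all. Wrapping the call in exists() gives true() if the item is in the sequence and false() if it is not.
  let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
  let $item := 'x'
  return exists(index-of($sequence, $item))

Recall that index-of() returns the empty sequence, not 0, if the $item is not found in the $sequence. Testing the result directly with if (...) fails with a type error when the item occurs more than once, because a sequence of several integers has no boolean value.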

  • You can use a "Quantified" expression.
  some $str in $sequence satisfies ($str = 'e')

This will also return the correct result. See the Wikibook article XQuery/Quantified_Expressions.
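
The membership tests in this section can be compared in one query. The sketch below also includes an "every" expression for contrast. The result element names are made up for the example.

```xquery
xquery version "1.0";

let $sequence := ('a', 'b', 'c', 'd', 'e', 'f')
let $item := 'd'
return
    <tests>
        <!-- general comparison: true if any item equals $item -->
        <general-comparison>{$sequence = $item}</general-comparison>
        <!-- exists() turns the list of positions into a boolean -->
        <index-of>{exists(index-of($sequence, $item))}</index-of>
        <!-- quantified expressions -->
        <some>{some $str in $sequence satisfies $str = $item}</some>
        <every>{every $str in $sequence satisfies $str = $item}</every>
    </tests>
```

The first three tests give true and the last gives false, since not every item in the sequence is equal to 'd'.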

Sorting Sequence

XQuery 1.0 has no "sort" function (XQuery 3.1 adds fn:sort()). To sort your sequence you create a new sequence with a FLWOR expression over your items that contains an order by clause.

  • For example if you have a list of items with titles as one of the elements you can use the following to sort the items by title:
  let $sorted-items :=
     for $item in $items
     order by $item/title/text()
     return $item
  • You can return the items sorted by their element name:
  let $sorted-items :=
     for $item in $items
     order by name($item)
     return $item
  • You can also use descending with order by to reverse the order:
  for $item in $items
    order by name($item) descending
    return $item
  • If you want to sort in your own order, create a separate sequence that lists the names in that order and use the index-of function to find where each item's name is in that sequence:
  let $order := ("b", "a", "c")
  for $i in /root/*
     let $name := name($i)
     order by index-of($order, $name)
     return $i
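
A self-contained sketch of the custom ordering technique follows. It builds its own root element instead of reading /root from a document, and the sorted wrapper element is an assumption made for the example.

```xquery
xquery version "1.0";

(: sample input, standing in for a document with a root element :)
let $root := <root><a/><b/><c/></root>

(: the desired order of element names :)
let $order := ('b', 'a', 'c')

return
    <sorted>{
        for $i in $root/*
        (: the position of the element name in $order is the sort key :)
        order by index-of($order, name($i))
        return $i
    }</sorted>
```

The elements come back in the order b, a, c. An element whose name is missing from $order has an empty sort key, and whether such items sort first or last depends on the implementation unless "empty least" or "empty greatest" is added to the order by clause.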

Set Operations: Concatenation, Unions, Intersections and Exclusions

XQuery also provides functions to join sets and to find items that are in both sets.

Assume that we have two sets that contain overlapping items:

  let $sequence-1 := ('a', 'b', 'c', 'd')
  let $sequence-2 := ('c', 'd', 'e', 'f')

Concatenation

You can concatenate the two sequences by doing the following:

  let $both := ($sequence-1, $sequence-2)

or

  for $item in ( ($sequence-1, $sequence-2)) return $item

Which will return:

 a b c d c d e f

Union

You can also create a "union" set that removes duplicates for all items that are in both sets by using the distinct-values() function:

  distinct-values(($sequence-1, $sequence-2))

This will return the following:

  a b c d e f

Note that the "c d" pair is not repeated.

Intersection

You can now use a variation of this to find the intersection, that is, all items in sequence-1 that are also in sequence-2:

  distinct-values($sequence-1[.=$sequence-2])

This will return only items that are in BOTH sequence-1 AND sequence-2:

  c d

The way you read this is "for each item in sequence-1, if this item (.) is also in sequence-2 then return it."

Exclusion

The last set operation you might want to do is the "exclusion" function, where we find all items in the first sequence that are NOT in the second sequence.

  distinct-values($sequence-1[not(.=$sequence-2)])

This will return

  a b
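
The four operations can be collected into one query for comparison. This is a sketch that reuses the two sequences defined at the start of this section; the result element names are invented for the illustration.

```xquery
xquery version "1.0";

let $sequence-1 := ('a', 'b', 'c', 'd')
let $sequence-2 := ('c', 'd', 'e', 'f')
return
    <results>
        <!-- keeps duplicates and order -->
        <concatenation>{($sequence-1, $sequence-2)}</concatenation>
        <!-- each value once -->
        <union>{distinct-values(($sequence-1, $sequence-2))}</union>
        <!-- items of sequence-1 that equal some item of sequence-2 -->
        <intersection>{distinct-values($sequence-1[. = $sequence-2])}</intersection>
        <!-- items of sequence-1 that equal no item of sequence-2 -->
        <exclusion>{distinct-values($sequence-1[not(. = $sequence-2)])}</exclusion>
    </results>
```

The concatenation is "a b c d c d e f", the intersection holds c and d, and the exclusion holds a and b. The union holds the six letters a to f, in an order that distinct-values() leaves to the implementation.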

Returning Duplicates

The following example returns a list of all items that occur more than once in a sequence. This process is known as "duplicate detection".

xquery version "1.0";
 
let $seq := ('a', 'b', 'c', 'd', 'e', 'f', 'b', 'c')
let $distinct-value := distinct-values($seq)
 
(: for each distinct item if the count is greater than 1 then return it :)
let $duplicates :=
  for $item in $distinct-value
  return
     if (count($seq[.=$item]) > 1)
        then $item
        else ()
 
return
 <results>
    <sequence>{string-join($seq, ', ')}</sequence>
    <distinct-values>{$distinct-value}</distinct-values>
    <duplicates>{$duplicates}</duplicates>
</results>

This returns:

<results>
   <sequence>a, b, c, d, e, f, b, c</sequence>
   <distinct-values>a b c d e f</distinct-values>
   <duplicates>b c</duplicates>
</results>

You can instead return only the items that occur exactly once by swapping the two branches of the if expression, putting () in the then branch and $item in the else branch:

    if (count($seq[.=$item]) > 1)
       then ()
       else $item
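
As a complete query, the swapped version might look like the following sketch. The unique element name is an assumption made for the example.

```xquery
xquery version "1.0";

let $seq := ('a', 'b', 'c', 'd', 'e', 'f', 'b', 'c')
return
    <unique>{
        for $item in distinct-values($seq)
        return
            (: skip any value that occurs more than once :)
            if (count($seq[. = $item]) > 1)
                then ()
                else $item
    }</unique>
```

The result contains a, d, e and f. The values b and c are dropped entirely because each occurs twice in $seq.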

Creating Sequences of Letters

You can use the codepoints functions to convert letters to numbers and numbers to letters. For example to generate a list of all the letters from a to z you can write the following XQuery:

let $number-for-a := string-to-codepoints('a')
let $number-for-z := string-to-codepoints('z')
for $letter in ($number-for-a to $number-for-z)
return
  codepoints-to-string($letter)

This will return a sequence of the following:

  ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z')



Creating Letter Collections

You can also use this to create a list of subcollections:

let $data-collection := '/db/apps/terms/data'
let $number-for-a := string-to-codepoints('a')
let $number-for-z := string-to-codepoints('z')
for $letter in ($number-for-a to $number-for-z)
  return
     xmldb:create-collection($data-collection, codepoints-to-string($letter) )

This is a very common way to store related files in subcollections. Note that xmldb:create-collection() is an eXist-db extension function, not part of standard XQuery.

Counting Items

It is very common to need to count your items as you go through them. You can do this by adding the "at $count" to your FLWOR loop:

for $item at $count in $sequence
return
  <item>
     <count>{$count}</count>
     {if ($count mod 2) then <odd/> else <even/>}
  </item>

Note that the modulo operator, which gives the remainder after division:

 ($count mod 2)

returns 1 for odd numbers, which gets converted to true(), and zero for even numbers, which gets converted to false(). You can use this technique to make alternating rows of tables different colors.
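
One way to apply this to table rows is sketched below. The sample data and the class names odd and even are assumptions; a stylesheet would still be needed to give the two classes different colors.

```xquery
xquery version "1.0";

let $sequence := ('apple', 'banana', 'carrot')
return
    <table>{
        for $item at $count in $sequence
        return
            (: the class alternates between odd and even from row to row :)
            <tr class="{if ($count mod 2) then 'odd' else 'even'}">
                <td>{$count}</td>
                <td>{$item}</td>
            </tr>
    }</table>
```

The rows for apple and carrot get the class odd and the row for banana gets the class even.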

Combining Sequence Operations

It is very common to need to "chain" sequence operations in a linear sequence of steps. For example, if you wanted to sort a sequence of items and then select the first 10 items your query might look like the following:

(: this gets a list of item names from the input :)
let $input-sequence := doc('/db/apps/items-manager/data')//item/name/text()
 
let $sorted-items :=
  for $item in $input-sequence
    order by $item
    return $item
return
<ol>{
for $item at $count in subsequence($sorted-items, 1, 10)
return
  (: the class attribute (even or odd) must be constructed before the text content of the li :)
  <li>{if ($count mod 2) then attribute class {'odd'} else attribute class {'even'}}{$item}</li>
}</ol>

This technique can be used to paginate results for search results so that users see the first 10 results of a search. A control can then be used to get the next N items from the search result.
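
A minimal pagination sketch, assuming the start position and page size arrive as the hypothetical variables $start and $page-size (in a real application they would usually come from request parameters). The generated item names stand in for a sorted search result.

```xquery
xquery version "1.0";

(: hypothetical paging parameters: show the second page of ten :)
let $start := 11
let $page-size := 10

(: stand-in for a sorted search result of 35 items :)
let $items := for $n in (1 to 35) return concat('item-', $n)

(: subsequence() picks out just the requested page :)
let $page := subsequence($items, $start, $page-size)

return
    <ol start="{$start}">{
        for $item at $count in $page
        return
            <li class="{if ($count mod 2) then 'odd' else 'even'}">{$item}</li>
    }</ol>
```

This returns item-11 through item-20. Asking for a page beyond the end of the result does not raise an error, since subsequence() simply returns fewer items or the empty sequence.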
