XQuery/XQuery from SQL

From Wikibooks, open books for an open world
Jump to navigation Jump to search

The Scott/Tiger example[edit | edit source]

A classic database widely used in teaching Relational databases concerns Employees, Departments and SalaryGrades. In Oracle training material it is known by the demo username and password, Scott/Tiger.

These three tables converted to XML (via the XML add-in to Excel XQuery/Excel and XML ) are:

Employees XML Table MySQL

Departments XML Table MySQL

Salary Grades XML Table MySQL

A port of the Oracle SQL file to MySQL can be found here.

Execution environments[edit | edit source]

The eXist demo server is used for the XQuery examples. These are returned either as plain XML or converted to table format.

The equivalent SQL queries are executed on an w:MySQL server,also based at the University of the West of England in Bristol

Basic Queries[edit | edit source]

Counting Records[edit | edit source]

Task: How many Employees?[edit | edit source]

SQL: select count(*) from Emp; MySQL

XQuery: count(//Emp) XML

Task: How many Departments?[edit | edit source]

SQL: select count(*) from dept MySQL

XQuery: count(//Dept) XML

Selecting records[edit | edit source]

Task: Show all Employees with a salary greater than 1000[edit | edit source]

SQL: select * from emp where sal > 1000; MySQL

XQuery: //Emp[Sal>1000] XML Table

Task: Show all Employees with a salary greater than 1000 and less than 2000[edit | edit source]

SQL: select * from emp where sal between 1000 and 2000; MySQL

XQuery: //Emp[Sal>1000][Sal<2000] XML Table

Here, successive filter conditions replace the anded conditions implied by 'between'.

Although there is no 'between' function in XQuery, it is a simple matter to write one:

declare function local:between($value as xs:decimal, $min as xs:decimal, $max as xs:decimal) as xs:boolean {
  $value >= $min and $value <= $max
};

which simplifies the query to //Emp[local:between(Sal,1000,2000)] XML Table

and has the advantage that the conversion of Sal to a number is now implicit in the function signature.

Task: Show all employees with no Commission[edit | edit source]

SQL: select * from emp where comm is null; MySQL

XQuery: //Emp[empty(Comm/text())] XML Table

Note that empty(Comm) is not enough, since this is true only if the element itself is absent, which in this sample XML it is not.

XQuery: //Emp[empty(Comm)] XML

Task: Select the first 5 employees[edit | edit source]

SQL: select * from emp limit 5; MySQL

XQuery: //Emp[position() <=5]

XML Table

Selecting Columns[edit | edit source]

List Employee names and salaries[edit | edit source]

SQL: Select ename,sal from emp MySQL

Surprisingly, selecting only a subset of children in a node (pruning) is not supported in XPath.

//Emp/(Ename,Sal) XML retrieves the required elements, but the parent Emp nodes have been lost.

//Emp/(Ename|Sal) XML is better since it keeps the elements in sequence, but it does not return Emp nodes with only the Ename and Sal children as required.

//Emp/*[name(.) = ("Ename","Sal")] XML uses reflection on the element names.


XQuery:

    for $emp in //Emp 
    return 
       <Emp>
         {$emp/(Ename|Sal)}
       </Emp>

XML Table

Here an XQuery FLWOR expression is used to create a new EMP element from the original elements.

Computing values[edit | edit source]

Computing the Annual Salary[edit | edit source]

Task: Compute the Annual Salaries of all employees. The Annual Salary is computed from 12 times the Monthly salary plus Commission. Since commission may be null, it must be replaced by a suitable numeric value:

SQL: select 12 * sal + ifnull(comm,0) from emp; MySQL

XQuery: //Emp/(12*number(Sal)+(if(exists(Comm/text())) then number(Comm) else 0)) XML


The SQL function COALESCE is the same as IFNULL but will accept multiple arguments:

SQL: select 12 * sal + coalesce(comm,0) from emp; MySQL

XQuery: //Emp/(12*number(Sal)+ number((Comm/text(),0)[1])) XML

The lack of a schema in this simple example to carry information on the type of the items, leads to the need for explicit conversion of strings to numbers.

Note the XQuery idiom:

 (Comm/text(),0)[1] 

computes the first non-null item in the sequence, the counterpart of COALESCE.

Selecting and Creating Columns[edit | edit source]

Task: List the employee names with their Annual Salary.

SQL: select ename, 12 * sal + ifnull(comm,0) as "Annual Salary" from emp; MySQL

XQuery:

for $emp in //Emp 
return <Emp>
         {$emp/Ename}
         <AnnualSalary>
            {12*number($emp/Sal)+ 
             (if (exists($emp/Comm/text())) 
             then number($emp/Comm)
             else 0)
            }
         </AnnualSalary>
       </Emp> 

XML Table

Again we have the problem of tree-pruning, but now with added grafting, which again requires the explicit construction of an XML node.


SQL Operators[edit | edit source]

IN[edit | edit source]

Task: Show all employees whose Job is either ANALYST or MANAGER

SQL: select * from emp where job in ("ANALYST","MANAGER") MySQL


XQuery: //Emp[Job = ("ANALYST","MANAGER")]

XML Table

NOT IN[edit | edit source]

Task :Select all employees whose Job is not 'ANALYST' or 'MANAGER'

SQL: select * from emp where job not in ("ANALYST","MANAGER") MySQL

This doesn't work:

XQuery: //Emp[Job !=("ANALYST","MANAGER")] XML Table

The generalised equals here is always true since everyone is either not an ANALYST or not a MANAGER. This works:

XQuery: //Emp[not(Job =("ANALYST","MANAGER"))] XML Table

Distinct values[edit | edit source]

Task: Show the different Jobs which Employees have

MySQL: select distinct job from emp; MySQL

XQuery: distinct-values(//Emp/Job) XML

Pattern Matching[edit | edit source]

Task: List all Employees with names starting with "S"

MySQL: select * from emp where ename like "S%"; MySQL

XQuery: //Emp[starts-with(Ename,"S")] XML Table

See starts-with()

Task: List all Employees whose name contains "AR"

MySQL: select * from emp where ename like "%AR%"; MySQL

XQuery: //Emp[contains(Ename,"AR")] XML Table

See contains()

Task: List all Employees whose name contains "ar" ignoring the case

MySQL: select * from emp where ename like "%ar%"; MySQL

LIKE in SQL is case insensitive, but fn:contains() is not, so the case needs to be converted:

XQuery: //Emp[contains(upper-case(Ename),upper-case("ar"))] XML Table

See upper-case()


More complex patterns need regular expressions.

MySQL: select * from emp where ename regexp "M.*R"; MySQL

XQuery: //Emp[matches(Ename,"M.*R")] XML Table

See matches()

Similarly, SQL's REGEXP is case-insensitive, whereas additional flags control matching in the XQuery matches()

MySQL: select * from emp where ename regexp "m.*r"; MySQL

XQuery: //Emp[matches(Ename,"m.*r",'i')] XML Table

('i' makes the regex match case insensitive.)

Table Joins[edit | edit source]

Simple Inner joins[edit | edit source]

Task: Find the name of the department that employee 'SMITH' works in:

SQL :

select dept.dname 
from emp, dept 
where dept.deptno = emp.deptno  
   and ename='SMITH';

MySQL


XPath : //Dept[DeptNo = //Emp[Ename='SMITH']/DeptNo]/Dname

XML

Perhaps a FLWOR expression in XQuery would be more readable:


let $dept := //Emp[Ename='SMITH']/DeptNo
return //Dept[DeptNo = $dept ]/Dname

XML

Task: To find the names of all employees in Accounting

SQL:

select emp.ename 
from emp,dept 
where dept.deptno = emp.deptno 
  and dname='Accounting';

MySQL

XPath: //Emp[DeptNo = //Dept[Dname='Accounting']/DeptNo]/Ename

XML

XQuery:

let $dept := //Dept[Dname='Accounting']/DeptNo
return //Emp[DeptNo = $dept]/Ename

XML


Note that in this release of eXist, the order of the operands in the equality is significant - to be fixed in a later release.

XQuery: //Emp[Dname='Accounting'/DeptNo = //Emp/DeptNo]/Ename


XML

More complex Inner Join[edit | edit source]

Task: List the name of each Employee, together with the name and location of their department.

SQL:

  select ename, dname,location
    from emp, dept
   where emp.deptno = dept.deptno;

MySQL

Where elements must be selected from several nodes, XPath is insufficient and XQuery is needed:

XQuery:

This join could be written as:

  for $emp in //Emp
  for $dept in //Dept
  where $dept/DeptNo= $emp/DeptNo
  return
    <Emp>
      {$emp/Ename}
      {$dept/(Dname|Location)}
    </Emp> 

XML Table

But it would be more commonly written in the form of a sub-selection:

  for $emp in //Emp
  let $dept := //Dept[DeptNo=$emp/DeptNo]
  return
    <Emp>
      {$emp/Ename}
      {$dept/(Dname|Location)}
    </Emp> 

XML Table

Inner Join with Selection[edit | edit source]

Task: List the names and department of all Analysts

SQL:

  select ename, dname
    from emp, dept
   where emp.deptno = dept.deptno
     and job="ANALYST";

MySQL

XQuery:

  for $emp in //Emp[Job='ANALYST']
  let $dept := //Dept[DeptNo= $emp/DeptNo]
  return
    <Emp>
      {$emp/Ename}
      {$dept/Dname}
    </Emp> 

XML Table

1 to Many query[edit | edit source]

Task: List the departments and the number of employees in each department

SQL:

  select dname,
         (select count(*)
            from emp
           where deptno = dept.deptno
         ) as headcount
    from dept;

MySQL

XQuery:

  for $dept in //Dept 
  let $headCount := count(//Emp[DeptNo=$dept/DeptNo])
  return  
    <Dept>
       {$dept/Dname}
       <HeadCount>{$headCount}</HeadCount>
    </Dept>


XML Table

Theta (Inequality) Join[edit | edit source]

Task: List the names and salary grade of staff in ascending grade order

Grades are defined by a minimum and maximum salary.

SQL:

select ename, grade
  from emp, salgrade
 where emp.sal between salgrade.losal and salgrade.hisal;

MySQL

XQuery:

for $emp in //Emp
let $grade :=  //SalGrade[number($emp/Sal) > number(LoSal)][number($emp/Sal) < number(HiSal)]/Grade
order by $grade
return
  <Emp>
    {$emp/Ename}
    {$grade}
  </Emp> 

XML Table

Recursive Relations[edit | edit source]

The relationship between an employee and their manager is a recursive relationship.

Task: List the name of each employee together with the name of their manager.

SQL:

select e.ename, m.ename
  from emp e
  join emp m on e.mgr = m.empno

MySQL

XQuery:

for $emp in //Emp
let $manager := //Emp[EmpNo = $emp/MgrNo]
return
  <Emp>
    {$emp/Ename}
    <Manager>{string($manager/Ename)}</Manager>
  </Emp> 

XML Table

The XQuery result is not quite the same as the SQL result. King, who has no manager, is missing from the SQL inner join. To produce the same result in XQuery, we would filter for employees with Managers:

for $emp in //Emp[MgrNo]
let $manager := //Emp[EmpNo = $emp/MgrNo]
where $emp/MgrNo/text()
return
  <Emp>
    {$emp/Ename}
    <Manager>{string($manager/Ename)}</Manager>
  </Emp> 

XML Table

Alternatively, an outer join returns all employees, including King:

SQL:

select e.ename, m.ename
  from emp e
  left join emp m on e.mgr = m.empno

MySQL

Conversion to an organisational tree[edit | edit source]

The manager relationship defines a tree structure, with King at the root, her direct reports as her children and so on. A recursive function in XQuery solves this task.

XQuery:

 declare function local:hierarchy($emp) {
     <Emp name='{$emp/Ename}'>
       <Reports>
       {for $e in  //Emp[MgrNo = $emp/EmpNo]
        return
           local:hierarchy($e)
       }
       </Reports>
     </Emp>
 };
 
 local:hierarchy(//Emp[empty(MgrNo/text())])

XML

Conversion to a Department/Employee Hierarchy[edit | edit source]

For export, a single XML file could be created with Employees nested within Departments. This is possible without introducing redundancy or loss of data because the Dept/Emp relationship is exactly one to many.

XQuery:

<Company>
  {for $dept in //Dept
  return
    <Department>
      {$dept/*}   
      {for $emp in //Emp[DeptNo = $dept/DeptNo]  
      return $emp
      }
    </Department>
  }
</Company>

XML

With this simple approach, the foreign key DeptNo in Emp has been included but it is now redundant. The except operator is useful here:

<Company>
   {for $dept in //Dept
    return
       <Department>
          {$dept/*}   
          {for $emp in //Emp[DeptNo = $dept/DeptNo]  
           return 
              <Employee>
                  {$emp/* except $emp/DeptNo}
              </Employee>
          }
       </Department>
    }
</Company>

XML

Note that this assumes there are no attributes to be copied. If there are, these would be copied with $emp/@*

Working with the hierarchical data[edit | edit source]

This hierarchical data can be queried directly in XQuery.

Path to Employee[edit | edit source]

Almost all the queries remain the same (except for the change of element name to Employee). This is because the path used to select Emps in the Emp.xml document is //Emp and is now //Employee in the merged document. If a full path had been used (/EmpList/Emp), this would need to be replaced by /Company/Department/Employee

Simple Navigation[edit | edit source]

Task: To find the department name of employee 'Smith'

XQuery: //Employee[Ename='SMITH']/../Dname XML

Task: To find the names of employees in the Accounting department

XQuery: //Department[Dname='Accounting']/Employee/Ename XML

Department /Employee Join[edit | edit source]

The main changes are in queries which require a join between Employee and Departments because they are already nested and thus become navigation up (from Employee to Department ) or down (from Department to Employee) the tree.

many - one[edit | edit source]

The query to list the Employees and the location of their Department with separate documents was:

 for $emp in //Emp
 for $dept in //Dept
  where $dept/DeptNo=$emp/DeptNo
  return
    <Emp>
      {$emp/Ename}
      {$dept/(Dname|Location)}
    </Emp> 

XML Table

With one nested document, this becomes:

 for $emp in //Employee
 return
    <Employee>
      {$emp/Ename}
      {$emp/../Location}
    </Employee> 

XML Table using the parent access to move up the tree.

1 - many[edit | edit source]

To list departments and the number of employees in the separate tables is :

for $dept in //Dept 
let $headCount := count(//Emp[DeptNo=$dept/DeptNo])
return  
    <Dept>
       {$dept/Dname}
       <HeadCount>{$headCount}</HeadCount>
    </Dept>

XML Table

which becomes:

for $dept in //Department
let $headCount := count($dept/Employee)
return  
    <Department>
       {$dept/Dname}
       <HeadCount>{$headCount}</HeadCount>
    </Department>

XML Table

Summarising and Grouping[edit | edit source]

Summary data[edit | edit source]

Task: Show the number, average (rounded), min and max salaries for Managers.

SQL: SELECT count(*), round(avg(sal)), min(sal), max(sal) FROM emp WHERE job='MANAGER'; MySQL

XQuery:

   (count(//Emp[Job='MANAGER']),round(avg(//Emp[Job='MANAGER']/Sal)),min(//Emp[Job='MANAGER']/Sal),max( //Emp[Job='MANAGER']/Sal))

XML

Better to factor out the XPath expression for the subset of employess:

let $managers := //Emp[Job='MANAGER']
return (count($managers),round(avg($managers/Sal)),min($managers/Sal),max($managers/Sal))

XML

It would be better to tag the individual values computed:

let $managers := //Emp[Job='MANAGER']
return 
   <Statistics>
     <Count>{count($managers)}</Count>
     <Average>{round(avg($managers/Sal))}</Average>
     <Min>{min($managers/Sal)}</Min>
     <Max>{max($managers/Sal)}</Max>
   </Statistics>

XML

Grouping[edit | edit source]

Task: Show the number, average (rounded), min and max salaries for each Job.

SQL: SELECT job, count(*), round(avg(sal)), min(sal), max(sal) FROM emp GROUP BY job; MySQL


In XQuery, grouping must be done by iterating over the groups. Each Group is identified by the Job and we can get the set (sequence) of all Jobs using the distinct-values function:

for $job in distinct-values(//Emp/Job)
let $employees  := //Emp[Job=$job]
return 
   <Statistics>
         <Job>{$job}</Job>
         <Count>{count($employees )}</Count>
         <Average>{round(avg($employees/Sal))}</Average>
          <Min>{min($employees/Sal)}</Min>
          <Max>{max($employees/Sal)}</Max>
     </Statistics>

XML Table

Hierarchical report[edit | edit source]

Task: List the departments , their employee names and salaries and the total salary in each department

This must generate a nested table.

SQL: ?

XQuery:

<Report>
  {
  for $dept in //Dept
  let $subtotal := sum(//Emp[DeptNo = $dept/DeptNo]/Sal)
  return
     <Department>
        {$dept/Dname}
        {for $emp in //Emp[DeptNo = $dept/DeptNo]
         return
            <Emp>
              {$emp/Ename}
              {$emp/Sal}
            </Emp>
         }
         <SubTotal>{$subtotal}</SubTotal>
      </Department>
   }
   <Total>{sum(//Emp/Sal)}</Total>
</Report>

XML


Note that the functional nature of the XQuery language means that each total must be calculated explicitly, not rolled up incrementally as might be done in an imperative language. This has the advantage that the formulae are explicit and independent and can thus be placed anywhere in the report, such as at the beginning instead of at the end:

<Report>
  <Total>{sum(//Emp/Sal)}</Total>
  {
  for $dept in //Dept
  let $subtotal := sum(//Emp[DeptNo = $dept/DeptNo]/Sal)
  return
     <Department>
         <SubTotal>{$subtotal}</SubTotal>
        {$dept/Dname}
        {for $emp in //Emp[DeptNo = $dept/DeptNo]
         return
            <Emp>
              {$emp/Ename}
              {$emp/Sal}
            </Emp>
         }
      </Department>
   }
 </Report>

XML

Restricted Groups[edit | edit source]

Task: Show the number, average (rounded), min and max salaries for each Job where there are at least 2 employees in the group.

SQL:

SELECT job, count(*), round(avg(sal)), min(sal), max(sal)
  FROM emp
 GROUP BY job
HAVING count(*) > 1;

MySQL

XQuery:

for $job in distinct-values(//Emp/Job)
let $employees := //Emp[Job=$job]
where count($employees) > 1
return 
   <Statistics>
      <Job>{$job}</Job>
      <Count>{count($employees )}</Count>
      <Average>{round(avg($employees /Sal))}</Average>
      <Min>{min($employees /Sal)}</Min>
      <Max>{max($employees /Sal)}</Max>
   </Statistics>

XML Table

Date Handling[edit | edit source]

Selecting by Date[edit | edit source]

Task: list all employees hired in the current millenium

SQL: SELECT * from job where hiredate >= '2000-01-01' MySQL

XQuery: //Emp[HireDate >= '2000-01-01']

Actually this comparison is a string comparison because of the lack of a schema to define HireDate as an xs:date.

XML Table