Apache Ant/Print version



Apache Ant

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Apache_Ant

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.



Background

What is Apache Ant?

  • An operating system and language neutral XML based "Build Tool"
  • A scripting language for executing any complex file-system processes that can be run from a command line interface
  • A.N.T. – Another Neat Tool
  • Used to build a project: compile it, test it, package it and install it

History of Ant

  • Created by James Duncan Davidson
  • Born of his frustration with the UNIX "make" tool
  • Invented while turning a product from Sun into open source
  • Make requires a "tab" character at the start of each command line
  • Tabs frequently got converted to spaces during copy/paste operations, which broke the makefile

Why is Ant Strategic?

Ant is important because it helps organizations create repeatable build processes.

Repeatability is critical to organizations that want to reach the next level of Carnegie Mellon University's Capability Maturity Model:

  1. Initial
  2. Repeatable
  3. Defined
  4. Managed
  5. Optimizing

Ant helps you get from the Initial to the Repeatable level.

Ant is a Process Discipline

  • Process discipline helps ensure that existing practices are retained during times of stress
  • When these practices are in place, projects are performed and managed according to their documented plans
  • Answers the Question: How did the prior developer compile, test and install their system?
  • Excellent aid for software archeologists

Software Project Lifecycle

  • Version 1 and version 2 of a software package are frequently done by different groups
  • Sometimes version 1 and 2 are done years apart by different teams in different countries
  • Contractors and internal staff need to use the same tools
  • Shared development processes, like those used in the Open Source community, would be almost impossible without tools like Ant

Ant is Operating System and Language Neutral

  • Builds run on both Windows and UNIX/Linux systems
  • Should run anywhere a Java VM runs
  • Ant files "know" about file separators "/" vs. "\"
  • Build targets can still do OS-specific tasks
  • Works with anything that can be done from the command line
  • Easy to extend (with Java)
  • It is possible to extend with other languages as long as they have Java bindings but in practice, most people use Java to extend Ant
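
As a rough sketch of how a build can stay portable while still doing OS-specific work, the build file below sets a property only on Windows and then uses the if and unless attributes of <target> to choose between two targets. The project, target and property names are invented for this illustration.

  <project name="OsDemo" default="report">
     <!-- is.windows is a made-up property name; it is set only when Ant runs on Windows -->
     <condition property="is.windows">
        <os family="windows"/>
     </condition>
     <!-- runs only if is.windows has been set -->
     <target name="report-windows" if="is.windows">
        <echo>Running on Windows</echo>
     </target>
     <!-- runs only if is.windows has NOT been set -->
     <target name="report-other" unless="is.windows">
        <echo>Running on a non-Windows system</echo>
     </target>
     <target name="report" depends="report-windows, report-other"/>
  </project>

Typing "ant" should print exactly one of the two messages, depending on the system the build runs on.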

Ant and XML

  • If you are familiar with XML (or even HTML) you will probably learn Ant quickly
  • If you are not already familiar with XML you will need to learn some XML before you use Ant
  • One of the best ways of doing this is to read many small Ant sample tasks
  • This book should help you do this




Adoption

Organizational Ant Adoption

We have found that most people new to Ant go through several stages of learning Ant.

The three stages of learning core Ant functionality

  1. Learn the grammar and syntax of XML and of a build file. This goes quickly if you know any XML or HTML (about 15 minutes)
  2. Learn the Ant concepts of properties, dependencies and tasks (about two hours)
  3. Build your vocabulary: learn the basic Ant tasks you need to get your job done (duration depends on what you are doing)

Transform Your Organization!

  • Integrate Ant into your development/QA process
  • Make it a requirement of all projects
  • Specify that all vendor deliverables MUST include a reproducible build process
  • This can be problematic for Microsoft developers who are not familiar with Ant
  • It can also be problematic for "visual only" development environments (Microsoft Visual Studio, Microsoft Analysis Services)




XML Summary

Brief Overview of XML

This chapter summarizes what you need to know about XML to use Apache Ant. It is not a lot of information, and users who are already familiar with HTML will pick up the concepts very quickly.

You do not need to know a lot about XML to use Ant, just a few items of syntax.

First, you must be very careful to match up the "begin" and "end" tags. Begin tags start with a "<" and end tags have the same name but start with a "</". Here is a simple example:

  <MyDataElement>
     Data Inside…
  </MyDataElement>

Train your eye to look for the </ that ends a tag. If it does not match up something is wrong. Tags that do not match up will cause errors.

Data Element Nesting

All XML Data Elements must be paired using matching start and end tags:

  <Parent_XML_Element>
     <Child_XML_Element>
        <Sub_Child_XML_Element>
        </Sub_Child_XML_Element>
     </Child_XML_Element>
  </Parent_XML_Element>

Understanding this paired nesting structure is critical to creating working Ant build files.

XML Attributes

The XML begin tag may also have attributes.

  <MyTag attribute1="nothing" attribute2="nothing else">Text</MyTag>
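
When an element carries only attributes and has no text or child elements, XML allows the begin and end tags to be collapsed into a single empty-element tag that ends with "/>". The two lines below are equivalent to an XML parser (MyTag is just the placeholder name used above):

  <MyTag attribute1="nothing" attribute2="nothing else"></MyTag>
  <MyTag attribute1="nothing" attribute2="nothing else"/>

Ant build files use the short form very often, for example when a task is configured entirely through attributes.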




Getting Started

This section contains three chapters:

  1. Installation: how to download and install Apache Ant
  2. Testing: how to test your Apache Ant installation
  3. Hello World: how to run a small Ant program that prints "Hello World!"



Getting Started/Installation

Before starting, you will need to have a running version of the Java Development Kit (JDK) 1.2 or later. For Ant 1.7, JDK 1.5 is recommended. Ant will need a JAXP-compliant XML parser installed and available in your classpath. The binary distribution of Ant includes the latest version of the Apache Xerces2 XML parser.

To install Ant on Windows, you can use WinAnt, the Windows installer for Apache Ant. Download and run the latest version of WinAnt, and follow the directions in the installer. WinAnt will place the "ant" executable on your system path, which allows you to run the command "ant" from the command line in any directory on your system. Alternatively, you can follow these directions:

The first step is to download Apache Ant.

There are two options: you can compile the program from source, or you can download a prebuilt binary distribution packaged as a zip file. For beginners we recommend the binary distribution.

You can find the download page here: http://ant.apache.org/bindownload.cgi

You will need to download a file such as

  apache-ant-1.7.0-bin.zip

This is a compressed archive file. You will need to uncompress it.

If you are not familiar with the process of uncompressing a file you should consult your computer operating system manual.

After this is done, you will see a folder such as:

  apache-ant-1.7.0

This folder contains another folder called "bin". Within that there is a file called ant.bat (for Windows) and a shell script called ant (for UNIX/Linux) that you can run directly from the command line.




Getting Started/Testing

To test that Apache Ant is installed correctly, type "ant -version" at the command line:

  C:\> ant -version
  Apache Ant version 1.6.5 compiled on June 2, 2005
  C:\>

If you do not see this, you have to check your PATH variable to make sure that ant is in your path. You can do this by opening a prompt and typing the "set" command.

To add the path in Windows: Right click on "My Computer" and find the Environment Variables button. Find the System Variable "Path", and add the path for Ant's bin folder (C:\ant\bin or whatever it is) separating it from other paths with a semicolon.




Getting Started/Hello World

Hello World in Ant

Create a directory "My Project". Use a text editor such as Kate, Gedit or Notepad to create a file called build.xml within directory "My Project":

   <?xml version="1.0"?>
   <project name="My Project" default="hello">
       <target name="hello">
          <echo>Hello World!</echo>
       </target>
   </project>

The first line in the file is flush left (no indentation). It tells ant that this is an XML file:

  <?xml version="1.0"?>

The next line names the (required) project "My Project" and its default target "hello":

  <project name="My Project" default="hello">

The central three lines name and define the only target ("hello") and task ("echo") in the file:

      <target name="hello">
         <echo>Hello World!</echo>
      </target>

You can now open a shell, cd to the "My Project" directory you created, and type "ant".

Output of Hello World

  Buildfile: build.xml
  
  hello:
     [echo] Hello World!
  
  BUILD SUCCESSFUL
  Total time: 0 seconds

Variations

Try changing the echo line to be the following:

  <echo message="Hello There!"></echo>

What is the result? Try the following also:

  <echo message="Hello There!"/>




Core Concepts

There are several things you must learn to use Apache Ant successfully:

  1. Basic terminology
  2. The structure of a build file
  3. Using properties
  4. Setting up dependencies
  5. Using filesets



Core Concepts/Terminology

Ant Terminology

  • Ant Task – something that Ant can execute, such as a compile, copy or replace. Most tasks have very convenient default values. See the Ant manual for a complete list of tasks.
  • Ant Target – a fixed series of Ant tasks in a specified order that can depend on other named targets. Targets can depend only on other targets, not on projects or tasks. A target represents a particular item to be created; it can be a single item, like a jar, or a group of items, like classes.
  • Ant Project – a collection of named targets. Ant works out the order in which to run them from their dependencies, and many tasks use the time stamps of files in the file system to skip work that is already up to date. Each build file contains one project.




Build File Structure

Here is the structure of a typical build.xml file:

  <?xml version="1.0"?>
  <project name="MyFirstAntProject" default="MyTarget">
     <target name="init">
        <echo>Running target init</echo>
     </target>
     <target name="MyTarget" depends="init">
        <echo>Running target MyTarget</echo>
     </target>
  </project>

Here are a few things to note:

  1. The begin and end tags for project (<project> and </project>) MUST start and end the file.
  2. The begin tag <project> normally has an attribute called default, which names the target to run when no target is given on the command line. (Since Ant 1.6 this attribute is optional.)
  3. A build file normally has at least one target.
  4. The begin and end tags for <target> and </target> must also match EXACTLY.
  5. Each target MUST have a name.
  6. Targets depend only on other targets and reference them by their target name. Targets NEVER depend on projects or tasks.
  7. The depends attribute of a target is optional.
  8. Anything between the <echo> and </echo> tags is printed to the console if the surrounding target is run.
  9. Tasks normally belong to a target. Since Ant 1.6, tasks such as <property> may also appear directly under <project>; these run before any target.

You can execute this from a DOS or UNIX command prompt by creating a file called build.xml and typing:

  ant

Ant will search for the build file in the current directory and run the build.xml file.

Here is a sample output of this build:

  Buildfile: C:\AntClass\Lab01\build.xml
  init:
       [echo] Running target init
  MyTarget:
       [echo] Running target MyTarget
  BUILD SUCCESSFUL
  Total time: 188 milliseconds

Optionally, you can also pass Ant the name of the target to run as a command-line argument:

  ant init

This runs only the init target:

  Buildfile: C:\AntClass\Lab01\build.xml
  init:
       [echo] Running target init
  BUILD SUCCESSFUL
  Total time: 188 milliseconds




Property

Ant does not have variables like in most standard programming languages. Ant has a structure called properties. Understanding how properties work is critical to understanding how (and why) Ant works so well.

Here is a simple demonstration of how to set and use properties:

  <project name="My Project" default="MyTarget">
     <!-- set global properties -->
     <property name="SrcDir" value="src"/>
     <property name="BuildDir" value="build"/>
     <target name="MyTarget">
        <echo message="Source directory is = ${SrcDir}"/>
        <echo message="Build directory is ${BuildDir}"/>
     </target>
  </project>

Note that to use a property you have to put a dollar sign and a left curly brace before its name and a right curly brace after it. Do not confuse these with parentheses.

When you run this you should get the following:

  Buildfile: C:\AntClass\PropertiesLab\build.xml
  MyTarget:
       [echo] Source directory is = src
       [echo] Build directory is build
  BUILD SUCCESSFUL
  Total time: 204 milliseconds

Ant properties are immutable, meaning that once they are set they cannot be changed within a build process! This may seem somewhat odd at first, but it is one of the core reasons that targets, once written, tend to run consistently without side effects. Which targets run, and in what order, depends on which target was requested, so a property that could be reassigned might hold a different value from one build to the next.

Properties do not have to be used only inside a target. They can be set anywhere in a build file (or an external property file) and referenced anywhere in a build file after they are set.

Here is a small Ant project that demonstrates the immutability of a property:

   <project name="My Project" default="MyTarget">
      <target name="MyTarget">
         <property name="MyProperty" value="One"/>
         <!-- check to see that the property gets set -->
         <echo>MyProperty = ${MyProperty}</echo>
         <!-- now try to change it to a new value -->
         <property name="MyProperty" value="Two"/>
         <echo>MyProperty = ${MyProperty}</echo>
      </target>
   </project>

When you run this, you should get the following output:

  Buildfile: C:\AntClass\PropertiesLab\build.xml
  MyTarget:
       [echo] MyProperty = One
       [echo] MyProperty = One
  BUILD SUCCESSFUL
  Total time: 343 milliseconds

Note that despite trying to change MyProperty to be "Two", the value of MyProperty does not change. Ant will not warn you of this in its normal output.

For newcomers this might seem strange, but this is ideally suited for building up complex trees of values that are set once and used over and over again. It makes your build scripts easy to maintain and reliable.
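
As a small sketch of such a tree of values, the build file below derives two directory names from a single base property. The names ClassesDir and DistDir are invented for this example.

  <project name="PropertyTree" default="show">
     <!-- each value is set once; later values build on earlier ones -->
     <property name="BuildDir" value="build"/>
     <property name="ClassesDir" value="${BuildDir}/classes"/>
     <property name="DistDir" value="${BuildDir}/dist"/>
     <target name="show">
        <echo>Classes go to ${ClassesDir}</echo>
        <echo>Distribution goes to ${DistDir}</echo>
     </target>
  </project>

Immutability also gives you a simple way to override a value for one run. A property given on the command line is set before the build file is read, so the definition in the build file is ignored:

  ant -DBuildDir=out

With this command the two derived properties become out/classes and out/dist without any change to the build file.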

Ant also has a nice set of "built in" properties that you can use. The following build file demonstrates how to read them:

   <project name="MyProject" default="Display-Builtins">
      <target name="Display-Builtins" description="Display Builtin Properties">
         <!-- the absolute path to the location of the buildfile -->
         <echo>${basedir}</echo>
         <!-- the absolute path of the buildfile -->
         <echo>${ant.file}</echo>
         <!-- ant.version - the version of Ant that you are running -->
         <echo>${ant.version}</echo>
         <!-- ant.project.name - the name of the project that is currently executing; it is set in the name attribute of <project>. -->
         <echo>${ant.project.name}</echo>
         <!-- ant.java.version - the JVM version Ant detected; currently it can hold the values "1.1", "1.2", "1.3", "1.4" and "1.5". -->
         <echo>${ant.java.version}</echo>
      </target>
   </project>

When you run this program you should get an output similar to the following:

  Buildfile: C:\AntClass\PropertiesLab\build.xml
  Display-Builtins:
    [echo] C:\AntClass\PropertiesLab
    [echo] C:\AntClass\PropertiesLab\build.xml
    [echo] Apache Ant version 1.6.2 compiled on July 16, 2004
    [echo] MyProject
    [echo] 1.5
  BUILD SUCCESSFUL
  Total time: 188 milliseconds

See the Ant reference manual for a full list of built-in Ant and Java properties. For the Java system properties, you can also consult the documentation of the Java method System.getProperties.




Depends

The depends attribute can be included in the target tag to specify that this target requires another target to be executed prior to being executed itself. Multiple targets can be specified and separated with commas.

  <target name="one" depends="two, three">

Here, target "one" will not be executed until the targets named "two" and "three" have been executed.

Example of using the depends attribute

Here is an example of a build file that executes three targets in order, first, middle and last. Note that the order the targets appear in the build file is unimportant:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="three">
     <target name="one">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  
     <target name="three" depends="two">
        <echo>Running Three</echo>
     </target>
  </project>

Sample Output:

  Buildfile: build.xml
  
  one:
     [echo] Running One
  
  two:
     [echo] Running Two
  
  three:
     [echo] Running Three
  
  BUILD SUCCESSFUL
  Total time: 0 seconds

Redundant dependency

Ant keeps track of which targets have already run during a build and will not run a target a second time, even if several targets depend on it. For example:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="three">
     <target name="one">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  
     <target name="three" depends="one, two">
        <echo>Running Three</echo>
     </target>
  </project>

This produces the same output as above: the target "one" is not executed twice, even though both "two" and "three" are run and each specifies a dependency on "one".

Circular dependency

Similarly, Ant guards against circular dependencies, where one target depends on another which, directly or indirectly, depends on the first. So the build file:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="one">
     <target name="one" depends="two">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  </project>

will yield an error:

  Buildfile: build.xml
  
  BUILD FAILED
  Circular dependency: one <- two <- one
  
  Total time: 1 second




Fileset

FileSets are Ant's way of creating groups of files to do work on. The files are found in a directory tree that starts at a base directory, and they are matched by patterns taken from a number of PatternSets and by Selectors.

A FileSet identifies the base directory tree with its dir attribute. The elements nested inside the FileSet then choose files and folders within that tree: include and exclude patterns (which can be grouped and named as PatternSets) match on file names using wildcards, and Selectors choose files by criteria such as name, size or modification date.

If any selector within the FileSet does not select a given file, that file is not considered part of the FileSet. This makes a FileSet equivalent to an <and> selector container.

Wildcards

Wildcards are used by Ant to specify groups of files that have a pattern to their names.

  • ? : is used to match any single character.
  • * : is used to match zero or more characters.
  • ** : is used to match zero or more directories.
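
To illustrate how the three wildcards combine, here is a sketch of a fileset with one comment per pattern. The property src.dir and the directory name "generated" are assumptions made for this example.

  <!-- src.dir is a hypothetical property that points at a source tree -->
  <fileset dir="${src.dir}">
     <!-- XML files directly in the base directory only -->
     <include name="*.xml"/>
     <!-- Java files in the base directory and in any subdirectory -->
     <include name="**/*.java"/>
     <!-- leaves out Test1.java or TestA.java anywhere, but not Test10.java -->
     <exclude name="**/Test?.java"/>
     <!-- leaves out everything below the directory "generated" -->
     <exclude name="generated/**"/>
  </fileset>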

Examples

The FileSets below all select the Java source files under ${server.src} that do not have "Test" in their name, except the third, which applies the same named pattern set to ${client.src}.

  <fileset dir="${server.src}" casesensitive="yes">
    <include name="**/*.java"/>
    <exclude name="**/*Test*"/>
  </fileset>

  <fileset dir="${server.src}" casesensitive="yes">
    <patternset id="non.test.sources">
      <include name="**/*.java"/>
      <exclude name="**/*Test*"/>
    </patternset>
  </fileset>

  <fileset dir="${client.src}">
    <patternset refid="non.test.sources"/>
  </fileset>

  <fileset dir="${server.src}" casesensitive="yes">
    <filename name="**/*.java"/>
    <filename name="**/*Test*" negate="true"/>
  </fileset>

  <fileset dir="${server.src}" casesensitive="yes">
    <filename name="**/*.java"/>
    <not>
      <filename name="**/*Test*"/>
    </not>
  </fileset>

Finally

FileSets can appear as children of the project element or inside tasks that support this feature.
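
Both placements can be sketched in one small build file: a fileset declared as a child of the project and given an id, and a <copy> task that refers to it with refid. The directory names src and backup are assumptions for this example, and the src directory must exist when the target runs.

  <project name="FilesetDemo" default="backup">
     <!-- declared once at project level and named with an id -->
     <fileset id="java.sources" dir="src" includes="**/*.java"/>
     <target name="backup">
        <mkdir dir="backup"/>
        <!-- the copy task accepts nested filesets; here it reuses the one above -->
        <copy todir="backup">
           <fileset refid="java.sources"/>
        </copy>
     </target>
  </project>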




Best Practices

Here are some of the Ant best practices that have been identified for creating maintainable Ant build files. Best practices are not enforced by any compiler, but they are conventions that allow people who maintain many projects to become familiar with your build process quickly.

Learn Ant Best Practices

Building your Ant Vocabulary

  • Study ant build scripts for other Open Source projects
  • Learn domain-specific targets such as building jar files, doing XML transforms or complex installs
  • Depending on diversity of tasks this might take a few hours to a few weeks

What to do about local file system paths

Local Property Files

Local File Systems

Standard Targets



Best Practices/Standard Targets

Standard Targets

One of the things that you learn is that if you name things consistently between projects, it is much easier to find things you are looking for. When you work with other people, you also want to have targets that you both are familiar with.

build.xml

  • Place your main build in a file called build.xml in the main directory of your project.
  • Do not put references to local file systems (Windows C:\ etc.) in your build file. Isolate these all in a local.properties file in the main directory.

Folder standards

  • src - the location of your source code
  • build - the output of a build process

Standard ant targets

init

This target should create all temporary directories within the build folder.

clean

This target should remove all compiled and intermediate files leaving only source files. It should remove anything that can be derived from other files. This would be run just prior to creating a zip file of the project, and in case of gremlins occurring during the build process.

build

This target should compile sources and perform transforms of raw data.

install

The install target should be used to copy files to a testing or production system.
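
Put together, the four standard targets and the two standard folders might look like the following sketch. The property names and the install location are assumptions; in a real project install.dir would normally come from a local property file.

  <project name="StandardTargets" default="build">
     <property name="src.dir" value="src"/>
     <property name="build.dir" value="build"/>
     <!-- install.dir is a hypothetical location used only for this example -->
     <property name="install.dir" value="install"/>

     <target name="init">
        <mkdir dir="${build.dir}/classes"/>
     </target>

     <target name="clean">
        <!-- removes everything that can be derived from the sources -->
        <delete dir="${build.dir}"/>
     </target>

     <target name="build" depends="init">
        <javac srcdir="${src.dir}" destdir="${build.dir}/classes"/>
     </target>

     <target name="install" depends="build">
        <copy todir="${install.dir}">
           <fileset dir="${build.dir}/classes"/>
        </copy>
     </target>
  </project>

Because install depends on build, and build depends on init, typing "ant install" runs all three in order, while "ant clean" can be run on its own at any time.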

Other Standards

Use the description attribute of <target> to describe what each target does, and the <description> element directly under <project> to describe the project as a whole.
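
A minimal sketch of a self-describing build file follows; the project name and the description texts are invented for this example.

  <project name="DescribedProject" default="build">
     <description>Builds and cleans the example project.</description>

     <target name="build" description="Compile the sources">
        <echo>Building...</echo>
     </target>

     <target name="clean" description="Remove all generated files">
        <echo>Cleaning...</echo>
     </target>
  </project>

Running Ant with the -projecthelp option lists the project description together with the targets that have a description, so someone new to the project can see what the build offers without reading the file:

  ant -projecthelp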

If you have more than around 100 targets in your build file, it becomes unwieldy. You could consider calling a separate build file, but that adds other complications such as the dependency between targets.



Best Practices/Local Property Files

Using a Property file

One of the best ways to keep your build files free of local dependencies is to use a local property file:

  <property file="local.properties"/>

Here is a sample of a property file:

 # Property file for Project X
 # Author
 # Date
 # Note that the format of this file adheres to the Java Property file specification
 # http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html#load(java.io.Reader)
 # to use the file put the following in your ant file:
 # <property file="local.properties"/>
 # All file names on local hard drives should be stored in this file
 # Where Ant is installed.  Will not work with 1.5 due to exec/spawn calls
 antHome=C:/Apps/apache-ant-1.6.5
 Saxon8HomeDir=C:/Apps/saxon8
 saxon8jar=${Saxon8HomeDir}/saxon8.jar
 # used to make sure Saxon gets the right XSLT 2.0 processor
 processor=trax
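
On the build file side, a sketch of how these values might be picked up is shown below. The project and target names are made up, and the two fallback values are assumptions that apply only when local.properties does not define them.

  <project name="ProjectX" default="show-paths">
     <!-- load the developer's local settings first; a missing file is not an error -->
     <property file="local.properties"/>
     <!-- fallbacks: ignored when local.properties has already set these names -->
     <property name="Saxon8HomeDir" value="lib/saxon8"/>
     <property name="saxon8jar" value="${Saxon8HomeDir}/saxon8.jar"/>
     <target name="show-paths">
        <echo>Saxon jar: ${saxon8jar}</echo>
     </target>
  </project>

Because properties are immutable, the order matters: the property file is read before the fallbacks so that local settings win.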



Best Practices/Local File Systems

Dealing with Local File System Issues

  • Each developer or user has the right (or is forced by administrators) to put resources such as jar files and libraries in different locations
  • Try to avoid having ANY local file system location dependencies in your build files. Make sure you NEVER put C: in a build file. This is just plain bad behavior
  • Separate local file system access points into an external "property file"
  • Warning: property files are read by Java tools and are not always file separator aware. In a property file a backslash must be written as "\\"; alternatively, because Ant expands existing properties, you can use ${file.separator}, or simply use forward slashes
  • Allow people to check out all the files in a project including the build.xml file, customize their local library paths and build
  • Third party projects such as Ivy and Maven2 Ant tasks try to automate the entire library management process. Consider these on a very large/complex project.
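
One way to keep paths portable inside the build file itself is the location attribute of <property>, which resolves a path relative to the project's base directory and converts the separators to the current platform's convention. The sketch below uses the hypothetical property name lib.dir.

  <project name="LocationDemo" default="show">
     <!-- a developer may point lib.dir somewhere else in local.properties -->
     <property file="local.properties"/>
     <!-- otherwise default to the lib folder next to build.xml -->
     <property name="lib.dir" location="lib"/>
     <target name="show">
        <echo>Libraries are expected in ${lib.dir}</echo>
     </target>
  </project>

When the default is used, the echoed value is an absolute path written with the separators of the system the build runs on.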



Depends

The depends attribute can be included in the target tag to specify that this target requires another target to be executed prior to being executed itself. Multiple targets can be specified and separated with commas.

<target name="one" depends="two, three">

Here, target "one" will not be executed until the targets named "two" and "three" are, first.

Example of using the depends attribute[edit]

Here is an example of a build file that executes three targets in order, first, middle and last. Note that the order the targets appear in the build file is unimportant:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="three">
     <target name="one">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  
     <target name="three" depends="two">
        <echo>Running Three</echo>
     </target>
  </project>

Sample Output:

  Buildfile: build.xml
  
  one:
     [echo] Running One
  
  two:
     [echo] Running Two
  
  three:
     [echo] Running Three
  
  BUILD SUCCESSFUL
  Total time: 0 seconds

Redundant dependency[edit]

Ant keeps track of what targets have already run and will skip over targets that have not changed since they were run elsewhere in the file, for example:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="three">
     <target name="one">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  
     <target name="three" depends="one, two">
        <echo>Running Three</echo>
     </target>
  </project>

will produce the same output as above - the target "one" will not be executed twice, even though both "two" and "three" targets are run and each specifies a dependency on one.

Circular dependency[edit]

Similarly, ant guards against circular dependencies - one target depending on another which, directly or indirectly, depends on the first. So the build file:

  <?xml version="1.0" encoding="UTF-8"?>
  <project default="one">
     <target name="one" depends="two">
        <echo>Running One</echo>
     </target>
  
     <target name="two" depends="one">
        <echo>Running Two</echo>
     </target>
  </project>

Will yield an error:

  Buildfile: build.xml
  
  BUILD FAILED
  Circular dependency: one <- two <- one
  
  Total time: 1 second

Next Chapter, Next Cookbook Chapter



Property

Ant does not have variables like in most standard programming languages. Ant has a structure called properties. Understanding how properties work is critical to understanding how (and why) Ant works so well.

Here is a simple demonstration of how to set and use properties

  <project name="My Project" default="MyTarget">
      <!-- set global properties -->
      <property name="SrcDir" value="src"/>
      <property name="BuildDir" value="build"/>
      <target name="MyTarget">
         <echo message = "Source directory is = ${SrcDir}"/>
         <echo message = "Build directory is ${BuildDir}"/>
      </target>
   </project>

Note that to use a property you have to put a dollar sign and left curly brace before it and a right curly brace after it. Don't get these confused with parens.

When you run this you should get the following:

  Buildfile: C:\AntClass\PropertiesLab\build.xml
  MyTarget:
       [echo] Source directory is = src
       [echo] Build directory is build
  BUILD SUCCESSFUL
  Total time: 204 milliseconds

Ant properties are immutable meaning that once they are set they cannot be changed within a build process! This may seem somewhat odd at first, but it is one of the core reasons that once targets are written they tend to run consistently without side effects. This is because targets only run if they have to and you cannot predict the order a target will run.

Properties do not have to be used only inside a target. They can be set anywhere in a build file (or an external property file) and referenced anywhere in a build file after they are set.

Here is a small Ant project that demonstrates the immutability of a property:

   <project name="My Project" default="MyTarget">
      <target name="MyTarget">
         <property name="MyProperty" value="One"/>
         <!-- check to see that the property gets set -->
         <echo>MyProperty = ${MyProperty}</echo>
         <!-- now try to change it to a new value -->
         <property name="MyProperty" value="Two"/>
         <echo>MyProperty = ${MyProperty}</echo>
      </target>
   </project>

When you run this, you should get the following output:

  Buildfile: C:\AntClass\PropertiesLab\build.xml
  MyTarget:
       [echo] MyProperty = One
       [echo] MyProperty = One
  BUILD SUCCESSFUL
  Total time: 343 milliseconds

Note that despite trying to change MyProperty to be "Two", the value of MyProperty does not change. Ant will not warn you of this.

For newcomers this might seem strange, but this is ideally suited for building up complex trees of values that are set once and used over and over again. It makes your build scripts easy to maintain and reliable.

Ant also has a nice set of "built in" properties that you can use:

This demonstrates how to read system properties

   <project name="MyProject" default="Display-Builtins">
      <target name="Display-Builtins" description="Display Builtin Properties">
         <!-- the absolute path to the location of the buildfile -->
         <echo>${basedir}</echo>
         <!-- the absolute path of the buildfile -->
         <echo>${ant.file}</echo>
         <!-- ant.version - the version of Ant that you are running -->
         <echo>${ant.version}</echo>
         <!-- ant.project.name - the name of the project that is currently executing; it is set in the name attribute of <project>. -->
         <echo>${ant.project.name}</echo>
         <!-- ant.java.version - the JVM version Ant detected; currently it can hold the values "1.1", "1.2", "1.3", "1.4" and "1.5". -->
         <echo>${ant.java.version}</echo>
      </target>
   </project>

When you run this program you should get an output similar to the following:

  Buildfile: C:\eclipse\workspace\Ant Examples\build.xml
  Display-Builtins:
    [echo] C:\AntClass\PropertiesLab
    [echo] C:\AntClass\PropertiesLab\build.xml
    [echo] Apache Ant version 1.6.2 compiled on July 16, 2004
    [echo] MyProject
    [echo] 1.5
  BUILD SUCCESSFUL
  Total time: 188 milliseconds

See the ant reference manual for a full list of built-in ant and Java properties or you can try the following link for the Java properties: getProperties

Next Chapter, Next Cookbook Chapter



Fileset

FileSets are ant's way of creating groups of files to do work on. These files can be found in a directory tree starting in a base directory and are matched by patterns taken from a number of PatternSets and Selectors.

FileSet identifies the base directory tree with its dir attribute. Then the FileSet's enclosed pattern elements, both named (PatternSets) and selected by wildcards (Selectors), choose the files and folders within the base tree.

If any selector within the FileSet do not select a given file, that file is not considered part of the FileSet. This makes FileSets equivalent to an <and> selector container.

Wildcards[edit]

Wildcards are used by ant to specify groups of files that have a pattern to their names.

  •  ? : is used to match any character.
  • * : is used to match zero or more characters.
  • ** : is used to match zero or more directories.

Examples[edit]

The below FileSets all select the files in directory ${server.src} that are Java source files without "Test" in their name.

<fileset dir="${server.src}" casesensitive="yes">
  <include name="**/*.java"/>
  <exclude name="**/*Test*"/>
</fileset>

<fileset dir="${server.src}" casesensitive="yes">
  <patternset id="non.test.sources">
    <include name="**/*.java"/>
    <exclude name="**/*Test*"/>
  </patternset>
</fileset>

<fileset dir="${client.src}">
  <patternset refid="non.test.sources"/>
</fileset>

<fileset dir="${server.src}" casesensitive="yes">
  <filename name="**/*.java"/>
  <filename name="**/*Test*" negate="true"/>
</fileset>

<fileset dir="${server.src}" casesensitive="yes">
  <filename name="**/*.java"/>
  <not>
    <filename name="**/*Test*"/>
  </not>
</fileset>

Finally

FileSets can appear as children of the project element or inside tasks that support this feature.
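
Both placements can be combined: a FileSet declared as a child of the project can be given an id and then referenced from inside a task. The following is a minimal sketch. The project name, the id non.test.java and the directories src and build/src are assumptions chosen for illustration.

```xml
<project name="fileset-demo" default="copy-sources">

   <!-- Declared once at project level; assumes the Java sources live under src/ -->
   <fileset id="non.test.java" dir="src">
      <include name="**/*.java"/>
      <exclude name="**/*Test*"/>
   </fileset>

   <target name="copy-sources">
      <!-- The copy task reuses the FileSet through its id -->
      <copy todir="build/src">
         <fileset refid="non.test.java"/>
      </copy>
   </target>

</project>
```

Declaring the FileSet once keeps the include and exclude patterns in a single place when several tasks need the same group of files.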




XML

Ant provides tasks to validate and transform XML documents.

XMLwellformed - how to use Apache Ant to check an XML file for well-formedness

XMLvalidate - how to use Apache Ant to validate an XML file against an XML Schema

XSLT - how to use Apache Ant to run an XML transform



XMLwellformed

You can use Apache Ant to check a file or group of files for well-formedness. This is different from validation. A well-formedness check only verifies that the file follows the XML syntax rules, such as every begin tag having a matching end tag. No XML Schema file is used.

This is done by using the <xmlvalidate> task. The xmlvalidate ant task will use a standard ant <fileset> and go through each file. In the example below, we specify a directory called "in" using a property. We then use the fileset to find all XML files in that directory and all subdirectories of that directory.

<project default="CheckXML">

   <property name="MYROOTDIR" value="in"/>
   <target name="CheckXML" description="Checks that all files at or below MYROOTDIR are well formed">
     <xmlvalidate>
        <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
        <attribute name="http://xml.org/sax/features/validation" value="false"/>
        <attribute name="http://apache.org/xml/features/validation/schema"  value="false"/>
     </xmlvalidate>
   </target>
 
 </project>

This target will run the default XML parser that comes with Ant (usually Xerces) and report any file that is not well-formed.
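
The <xmlvalidate> task also has a lenient attribute which, according to the Ant manual, restricts the check to well-formedness. The following sketch shows this as an alternative to switching off the two parser features by hand. The target name CheckXMLLenient is an assumption, and the result should be compared with the target above on your own Ant version before relying on it.

```xml
<project default="CheckXMLLenient">

   <property name="MYROOTDIR" value="in"/>
   <target name="CheckXMLLenient" description="Checks well-formedness only, using lenient mode">
      <!-- lenient="true" asks the task to check well-formedness only -->
      <xmlvalidate lenient="true">
         <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
      </xmlvalidate>
   </target>

</project>
```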

To test this example, add a folder called "in" and put several malformed XML files in it. In this case we created a malformed file called MyInputBad.xml. When we typed "build" at the command line, the output was the following:

 CheckXML:
 [xmlvalidate] C:\XMLClass\Ant\in\MyInputBad.xml:5:32: The element type "MyMessage" must be terminated by the matching end-tag "</MyMessage>".




XMLvalidate

Motivation

You want a command-line interface to validate one or more XML files.

Instructor's Note: This file is used as a lab exercise for an Apache Ant class that includes extensive use of XML.

Method

You can use Apache Ant to check a file or group of files for their validity. This is done by using the <xmlvalidate> Apache Ant task. The xmlvalidate task will use a standard Ant <fileset> and go through and check each file. In the example below, we specify a directory called "in" using a property. We then use the fileset to find all XML files in that directory and all subdirectories of that directory. Each file is checked in turn for validity against an XML Schema.

Sample Ant Task to Validate All XML Files in a Folder

<project default="ValidateXML">

   <property name="MYROOTDIR" value="in"/>
   <target name="ValidateXML" description="Checks that all files at or below MYROOTDIR are valid">
     <xmlvalidate>
        <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
        <attribute name="http://xml.org/sax/features/validation" value="true"/>
        <attribute name="http://apache.org/xml/features/validation/schema"  value="true"/>
        <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
     </xmlvalidate>
   </target>
 
 </project>

In the above example, we assume that each XML file has a directive that tells it where to get its XML Schema.

This target will run the default XML parser that comes with Ant (usually Xerces) and report any file that is not valid.

Sample XML Schema MyMessages.xsd

To test this you will need a small XML Schema file. The following schema describes a file that contains one to three messages:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <xs:element name="MyMessage" type="xs:string"/>
    <xs:element name="MyMessages">
    <xs:complexType>
        <xs:sequence>
           <xs:element ref="MyMessage" maxOccurs="3"/>
        </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Sample Valid Data File

Here is a sample message file:

<?xml version="1.0" encoding="UTF-8"?>
<MyMessages xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="MyMessages.xsd">
   <MyMessage>Hello World!</MyMessage>
   <MyMessage>ANT AND XML Schema ROCK</MyMessage>
</MyMessages>

Note that the noNamespaceSchemaLocation attribute of the root element tells the parser where to find the XML Schema file (MyMessages.xsd). Here it is a bare file name, so the schema is expected in the same directory as the data file.

Sample Invalid Data File

If you add a fourth message the file should fail validation according to the rules in the XML Schema above.

<?xml version="1.0" encoding="UTF-8"?>
<MyMessages xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
xsi:noNamespaceSchemaLocation="MyMessages.xsd">
   <MyMessage>Hello From XSLT</MyMessage>
   <MyMessage>From input: Hi</MyMessage>
   <MyMessage>ANT AND XSLT ROCK</MyMessage>
   <MyMessage>I am the fourth message.</MyMessage>
</MyMessages>

To test this example, add a folder called "in" and put several XML files that are not valid in it. In this case we created an invalid file called MyInputBad.xml. When we typed "build" at the command line, the output was the following:

Sample Output

 ValidateXML:
 [xmlvalidate] C:\XMLClass\Ant\validate\in\MyInput.xml:6:15: cvc-complex-type.2.4.d: Invalid content was found starting with element 'MyMessage'. No child element is expected at this point.

This is a sample output. Note that the error message does not say that the limit of three MyMessage elements was exceeded. It only reports that no further child element is expected at that point.

Supplying an XML Schema definition file

If the XML files do not name their own schema, the build file can tell the parser where to find it. This is done with parser properties, which are set with a nested <property> element inside <xmlvalidate>. Nested <attribute> elements set parser features, which take boolean values.

If you are working in the null namespace add the following property:

  <property name="http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation" value="${xsd.file}"/>

If your documents have a namespace use the following. Here the value follows the same syntax as the xsi:schemaLocation attribute, so ${xsd.file} must hold the namespace URI followed by the location of the schema:

  <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
  <property name="http://apache.org/xml/properties/schema/external-schemaLocation" value="${xsd.file}"/>

The following example validates a file that does not reference a schema itself:

 <xmlvalidate file="xml/endpiece-noSchema.xml" lenient="false" failonerror="true" warn="true">
    <attribute name="http://apache.org/xml/features/validation/schema" value="true"/>
    <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
    <property
      name="http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation"
      value="${xsd.file}"/>
 </xmlvalidate>

Schematron Validate

A third-party Ant task is also available to validate against a Schematron rules file. It must be declared with <taskdef> before it can be used:

<taskdef name="schematron"
classname="com.schematron.ant.SchematronTask"
classpath="lib/ant-schematron.jar"/>

<schematron schema="rules.sch" failonerror="false">
   <fileset dir="." includes="schematron-input.xml"/>
</schematron>

See http://www.schematron.com/resource/Using_Schematron_for_Ant.pdf for more details.




XSLT

Apache Ant has a task called <xslt> (or its synonym <style>) that performs an XML transform on a file or group of files.

Here is an example XML transformation target:

<target name="MyXSLT">
   <xslt in="MyInput.xml" 
      out="MyOutput.xml"
      style="MyTransform.xslt">
   </xslt>
</target>

In the <xslt> task there are three files you must specify:

  • in: The name of the source XML input file
  • out: The name of the XML output file
  • style: The name of the XSLT file

To test this you can create a "dummy" input file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <Input>Hi</Input>
</root>

Hello World XSLT Transform

To get started, here is a small "hello world" transform file. The transform matches the document root of the input file but does not actually process any of the input file data elements:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <MyMessage>Hello World!</MyMessage>
    </xsl:template>
</xsl:stylesheet>

You can now execute this from a command line. The following is an example run from a Microsoft Windows command shell:

 C:\XMLClass\XSLT\Lab1>ant
 Buildfile: build.xml
 
 MyXSLT:
      [xslt] Processing C:\XMLClass\XSLT\Lab1\MyInput.xml to C:\XMLClass\XSLT\Lab1\MyOutput.xml
      [xslt] Loading stylesheet C:\XMLClass\XSLT\Lab1\MyTransform.xslt
 
 BUILD SUCCESSFUL
 Total time: 1 second

The output will appear in a file called MyOutput.xml:

<?xml version="1.0" encoding="UTF-8"?>
<MyMessage>Hello World!</MyMessage>
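
The <xslt> task can also transform a group of files in one step, as mentioned at the start of this chapter. The following is a minimal sketch that uses the basedir, destdir, includes and extension attributes of the task. The directory names "in" and "out" and the target name TransformAll are assumptions, not part of the lab files above.

```xml
<project default="TransformAll">
   <target name="TransformAll">
      <!-- Every *.xml file in the assumed folder "in" is transformed into the assumed folder "out" -->
      <xslt basedir="in"
         destdir="out"
         includes="*.xml"
         extension=".xml"
         style="MyTransform.xslt"/>
   </target>
</project>
```

The extension attribute is set explicitly here because the task otherwise gives the output files an .html extension.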

Transforming Files containing external References

Sometimes you may need to transform XML files containing external references, like URLs in DTDs or Schema definitions.

Quite often, parsing or validating against such external files cannot be totally disabled. Saxon, for example, will want to read DTDs even if parsing them is disabled (parameter "-dtd:off" or equivalent).

The development workstation may also be connected to a company intranet that is separated from the internet by a firewall, so that a proxy or SOCKS configuration is needed to reach those external files.

In these cases, the transformation can only succeed if this connection configuration is added to the Ant script.

Example (taken from a bigger build.xml file):

   <target name="xdoclet-merge-top" depends="init, proxy-set" >
     <xslt style="${XDocletDescDir}/merge.xslt" 
       in="${XDocletDescDir}/merge1.xml" 
       out="${XDocletDescDir}/jboss-2.xml" force="true" >			
       <classpath location="${ZubehoerDir}/SaxonHE9-4-0-1J/saxon9he.jar" />	
     </xslt>
   </target>
   <target name="proxy-set">
     <setproxy proxyhost="proxy.mynet.de" proxyport="8080" proxyuser="" proxypassword=""/>
   </target>

Passing Parameters from Ant into an XSLT script

You can also pass parameters from an Ant build file into an XSLT. This is handy if you need to run the same transform with small variations. You can do this by simply adding a <param> element inside the <xslt> task:

<param name="MyParameter" expression="ANT AND XSLT ROCK"/>

The build file now looks like the following:

<?xml version="1.0" encoding="UTF-8"?>
<project default="MyXSLT">
    <target name="MyXSLT">
       <xslt
          in="MyInput.xml"
          out="MyOutput.xml"
          style="MyTransform.xslt">
          <param name="MyParameter" expression="ANT AND XSLT ROCK"/>
        </xslt>
    </target>
</project>

Here is a sample transform that takes a single input parameter:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   <xsl:param name="MyParameter"/>
   <xsl:template match="/">
       <MyMessages>
           <MyMessage>Hello From XSLT</MyMessage>
           <MyMessage>From input: <xsl:value-of select="/root/Input"/>
           </MyMessage>
           <MyMessage>
               <xsl:value-of select="$MyParameter"/>
           </MyMessage>
       </MyMessages>
   </xsl:template>
</xsl:stylesheet>

This will create the following output:

<?xml version="1.0" encoding="UTF-8"?>
<MyMessages>
    <MyMessage>Hello From XSLT</MyMessage>
    <MyMessage>From input: Hi</MyMessage>
    <MyMessage>ANT AND XSLT ROCK</MyMessage>
</MyMessages>

Note that there are three different lines. One came from the transform file, one came from the input XML file and one was passed directly in from the ant file.
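
The parameter value does not have to be a literal string. Ant expands property references in attribute values, so the value can come from a property. In the sketch below the property name message is a hypothetical name chosen for illustration, and force="true" is added as an assumption so that the transform runs again even when only the parameter value has changed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project default="MyXSLT">
    <!-- "message" is a hypothetical property name chosen for this sketch -->
    <property name="message" value="ANT AND XSLT ROCK"/>
    <target name="MyXSLT">
       <!-- force="true" re-runs the transform even if MyOutput.xml is up to date -->
       <xslt
          in="MyInput.xml"
          out="MyOutput.xml"
          style="MyTransform.xslt"
          force="true">
          <param name="MyParameter" expression="${message}"/>
       </xslt>
    </target>
</project>
```

Because a property set on the command line takes precedence over a <property> element in the build file, running ant -Dmessage="Hello again" should pass a different value into the transform without editing the build file.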

Other ways to use XSLT within Apache Ant

Checking dependencies

By default, the XSLT task checks the file time stamps to see if the output file is newer than the input file. If the output is newer, the task does not re-run the transform. But sometimes a transform imports other transform files, and Ant does not check the timestamps of imported files. (Perhaps they will add that as an option in the future.) But all is not lost. We can achieve the same result by using the <dependset> task. Here is an example:

<dependset>
   <srcfilelist dir="${XSLTDir}"
      files="Content2HTML.xsl, HTMLHeader.xsl,PageHeader.xsl,LeftNav.xsl,PageFooter.xsl"/>
   <targetfileset
      dir="${BuildDir}"
      includes="*.htm"/>
</dependset>

In the above example the source transform (Content2HTML.xsl) imports the other four page fragment transforms located in the XSLTDir (HTMLHeader.xsl, PageHeader.xsl, LeftNav.xsl and PageFooter.xsl). The transform creates the .htm files in the BuildDir directory. If any of the listed transform files is newer than any of the output files, <dependset> deletes all of the output files, so they are regenerated the next time the transform runs.

This is a handy way to build a little ant-based web content management system. You just put the HTML content in a directory and the transform can wrap the HTML headers, navigation bars and footers around your content. The HTML for each page can just be a <div> section that is copied into the output using the <xsl:copy-of> command.
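
The dependency check and the transform can be placed in the same target so that stale pages are removed just before they are rebuilt. This is only a sketch: the target name BuildPages, the property ContentDir and the assumption that the page content is stored as *.xml files are not part of the original example.

```xml
<target name="BuildPages">
   <!-- Remove the generated pages when any of the stylesheets has changed -->
   <dependset>
      <srcfilelist dir="${XSLTDir}"
         files="Content2HTML.xsl,HTMLHeader.xsl,PageHeader.xsl,LeftNav.xsl,PageFooter.xsl"/>
      <targetfileset dir="${BuildDir}" includes="*.htm"/>
   </dependset>
   <!-- ContentDir is an assumed property naming the folder with the page content -->
   <xslt basedir="${ContentDir}"
      destdir="${BuildDir}"
      includes="*.xml"
      extension=".htm"
      style="${XSLTDir}/Content2HTML.xsl"/>
</target>
```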




Running Saxon

Motivation

You want to have an Apache Ant task that runs the Saxon XSLT transform.

Method

Download the Saxon jar file. Put the saxon.jar file in a lib folder. Run the following test.

Source Code

Build File

The following is how Saxon is invoked from Apache Ant.

<target name="test-saxon">
   <xslt classpath="lib\saxon8.jar"
      in="in.xml" 
      out="out.html" 
      style="check-version.xsl">
      <factory name="net.sf.saxon.TransformerFactoryImpl"/>
   </xslt>
</target>

Note that if you are running in Eclipse you will have to go to the "Preferences" menu and add the Saxon jar file (for example saxon9.jar) to Ant/Runtime/Ant Home Entries. Just click "Add JARs" and add the jar file to the end of this list.

XSLT Version Check

check-version.xsl:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:template match="/">
        <results>
            <Version><xsl:value-of select="system-property('xsl:version')" /></Version>
            <Vendor><xsl:value-of select="system-property('xsl:vendor')" /></Vendor>
            <Vendor-URL><xsl:value-of select="system-property('xsl:vendor-url')" /></Vendor-URL>
        </results>
    </xsl:template>
</xsl:stylesheet>

Or if you are generating a web page:

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:template match="/">
      <html>
        <head>
         <title>XSL Version</title>
        </head>
        <body>
           <p>Version:
           <xsl:value-of select="system-property('xsl:version')" />
           <br />
           Vendor:
           <xsl:value-of select="system-property('xsl:vendor')" />
           <br />
           Vendor URL:
           <xsl:value-of select="system-property('xsl:vendor-url')" />
           </p>
        </body>
     </html>
  </xsl:template>
</xsl:stylesheet>

Results for XALAN

  Version: 1.0
  Vendor: Apache Software Foundation (Xalan XSLT)
  Vendor URL: http://xml.apache.org/xalan-j

Results for Saxon

  Version: 2.0
  Vendor: SAXON 9.1.0.7 from Saxonica
  Vendor URL: http://www.saxonica.com/



Passing Parameters to XSLT

Motivation

You want to call a transform with a set of parameters. You want to be able to set these parameters from a build file.

Build File Target

<!-- sample target to demonstrate passing parameters from an Ant file to an XSL transform -->
<target name="Parameterized XSLT Test">
	<echo>Running conditional XSLT test</echo>
	<xslt in="null.xml" 
           out="tmp/param-output.xhtml"
	   style="xslt/TransformWithParametersTest.xsl">
           <factory name="net.sf.saxon.TransformerFactoryImpl"/>
           <param name="Parameter1" expression="true"/>
           <param name="Parameter2" expression="Hello World"/>
           <param name="Parameter3" expression="1234567"/>
        </xslt>
        <!-- print the generated file to the console -->
        <concat>
           <fileset dir="tmp" includes="param-output.xhtml"/>
        </concat>
</target>

Input null.xml

XSLT must have an input, but this example does not use it.

   <root/>

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">
    <xsl:param name="Parameter1" select="true()" as="xs:boolean"/>
    <xsl:param name="Parameter2" required="yes" as="xs:string"/>
    <xsl:param name="Parameter3" required="yes" as="xs:integer"/>
    <xsl:output method="xhtml" omit-xml-declaration="yes"/>
    
    <xsl:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
            <head>
                <title>Test of Passing Three Parameters (boolean, string, integer)</title>
            </head>
            <body>
                <h1>Test of Passing Three Parameters (boolean, string, integer)</h1>
                <p>The following parameters have been set by the Apache Ant build file.</p>
                <ul>
                    <li><b>Parameter1: </b><xsl:value-of select="$Parameter1"/>
                    </li>
                    <li><b>Parameter2: </b><xsl:value-of select="$Parameter2"/>
                    </li>
                    <li><b>Parameter3: </b><xsl:value-of select="$Parameter3"/>
                    </li>
                </ul>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>



XQuery

Motivation

You want to transform an XML document with XQuery using an ant task.

Method

We will use the Saxon library to demonstrate this.

Steps:

  1. Download the Saxon library from Sourceforge
  2. Download a sample XQuery from the samples (for example tour.xq from the samples area)
  3. Copy the Saxon jar file into your project. In the example below just a single jar file is copied into the location saxonhe9-2-0-6j/saxon9he.jar

Sample Ant Target

This sample uses the java task to run an XQuery program using the Saxon Java library. In the example below the XQuery tour.xq is executed and the output is copied into the file output.html.

Note that the starting point is set by passing the arg as a parameter to the XQuery.

<target name="run-saxon-xquery">
    <java classname="net.sf.saxon.Query" output="output.html">
             <arg value="tour.xq"/>
             <classpath>
                <pathelement location="saxonhe9-2-0-6j/saxon9he.jar"/>
            </classpath>
             <arg value="start=e5"/> 	
     </java>
    <!-- On Windows, this will open Firefox after the transform is done -->
    <exec executable="C:\Program Files\Mozilla Firefox\firefox.exe">
        <arg value="C:\ws\Saxon-Test\output.html"/>
    </exec>
</target>



Converting Excel to XML

Motivation

You want to automatically extract a well-formed XML file from a binary Excel document.

Method

We will use the java Ant task within a build target.

Input File

We will create a sample Microsoft Excel file that has two columns like the following:

Screen image for spreadsheet input

Save this into a file 'sample.xls'.

Next, download the Apache Tika jar file and put it on your local hard drive.

You can get the downloads from here: http://tika.apache.org/download.html. The main Tika jar file is about 27MB.

I put the Tika jar file in D:\Apps\tika but you can change this.

Create a file called "build.xml"

Sources

<project name="extract-xml-from-xsl" default="extract-xml-from-xsl">
    <description>Sample Extract XML from Excel xls file with Apache Tika</description>
    <property name="lib.dir" value="D:\Apps\tika"/>
    <property name="input-file" value="sample.xls"/>
    
    <target name="extract-xml-from-xsl">
        <echo message="Extracting XML from Excel file: ${input-file}"/>
        <java jar="${lib.dir}/tika-app-1.3.jar" fork="true" failonerror="true"
            maxmemory="128m" input="${input-file}" output="sample.xml">
            <arg value="-x" />
        </java>
    </target> 
</project>

The <java> task will run Tika. The argument "-x" (for XML) will extract the XML from the input.

Other command line options are listed here: http://tika.apache.org/1.3/gettingstarted.html
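
As one example of another option, the Tika command line also accepts "-t" for plain text output. The following sketch reuses the build file above with that argument. The project and target names and the output file name sample.txt are assumptions made for illustration.

```xml
<project name="extract-text-from-xls" default="extract-text-from-xls">
    <description>Sketch: extract plain text from an Excel file with Apache Tika</description>
    <property name="lib.dir" value="D:\Apps\tika"/>
    <property name="input-file" value="sample.xls"/>

    <target name="extract-text-from-xls">
        <echo message="Extracting text from Excel file: ${input-file}"/>
        <!-- "-t" asks Tika for plain text output; sample.txt is an assumed output name -->
        <java jar="${lib.dir}/tika-app-1.3.jar" fork="true" failonerror="true"
            maxmemory="128m" input="${input-file}" output="sample.txt">
            <arg value="-t" />
        </java>
    </target>
</project>
```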

Now open your DOS or UNIX shell and cd into the place with your build file. Type "ant" into a command shell.

Run

$ ant
Buildfile: D:\ws\doc-gen\trunk\build\tika\build.xml

extract-xml-from-xsl:
     [echo] Extracting XML from Excel file: sample.xls

BUILD SUCCESSFUL
Total time: 1 second

Sample Output

Note that the output is a well-formed XHTML file with a table in it:

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta name="meta:last-author" content="Dan" />
        <meta name="meta:creation-date" content="2013-03-04T17:20:19Z" />
        <meta name="dcterms:modified" content="2013-03-04T17:22:01Z" />
        <meta name="meta:save-date" content="2013-03-04T17:22:01Z" />
        <meta name="Last-Author" content="Dan" />
        <meta name="Application-Name" content="Microsoft Excel" />
        <meta name="dc:creator" content="Dan" />
        <meta name="Last-Modified" content="2013-03-04T17:22:01Z" />
        <meta name="Author" content="Dan" />
        <meta name="dcterms:created" content="2013-03-04T17:20:19Z" />
        <meta name="date" content="2013-03-04T17:22:01Z" />
        <meta name="modified" content="2013-03-04T17:22:01Z" />
        <meta name="creator" content="Dan" />
        <meta name="Creation-Date" content="2013-03-04T17:20:19Z" />
        <meta name="meta:author" content="Dan" />
        <meta name="extended-properties:Application" content="Microsoft Excel" />
        <meta name="Content-Type" content="application/vnd.ms-excel" />
        <meta name="Last-Save-Date" content="2013-03-04T17:22:01Z" />
        <title></title>
    </head>
    <body>
        <div class="page"><h1>Sheet1</h1>
            <table>
                <tbody>
                    <tr>
                        <td>Name</td>
                        <td>Phone</td>
                    </tr>
                    <tr>
                        <td>Peg</td>
                        <td>123</td>
                    </tr>
                    <tr>
                        <td>Dan</td>
                        <td>456</td>
                    </tr>
                    <tr>
                        <td>John</td>
                        <td>789</td>
                    </tr>
                    <tr>
                        <td>Sue</td>
                        <td>912</td>
                    </tr>
                </tbody>
            </table>
        </div>
    </body>
</html>



Cleaning up HTML

Motivation

We want to clean up HTML that is not well formed. We will use the Apache Tika tools to convert dirty HTML to well-formed XHTML.

Sample Ant File

<project name="tika tests" default="extract-xhtml-from-html">
    <description>Sample invocations of Apache Tika</description>
    <property name="lib.dir" value="../lib"/>
    
    <property name="input-dirty-html-file" value="input-dirty.html"/>
    <property name="output-clean-xhtml-file" value="output-clean.xhtml"/>
    <target name="extract-xhtml-from-html">
        <echo message="Cleaning up dirty HTML file: ${input-dirty-html-file} to ${output-clean-xhtml-file}"/>
        <java jar="${lib.dir}/tika-app-1.3.jar" fork="true" failonerror="true"
            maxmemory="128m" input="${input-dirty-html-file}" output="${output-clean-xhtml-file}">
            <arg value="-x" />
        </java>
    </target>
</project>

Sample Input

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Dirty HTML</title>
    </head>
    <body>
        <p><b>test</b></p>
        <p><b>test<b></p>
        <p>test<br/>test</p>
        <p>test<br>test<br>test</p>
        <p>This is <B>bold, <I>bold italic, </b>italic, </i>normal text</p>
    </body>
</html>

Sample Output

<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Encoding" content="ISO-8859-1"/>
<meta name="Content-Type" content="application/xhtml+xml"/>
<meta name="dc:title" content="Dirty HTML"/>
<title>Dirty HTML</title>
</head>
<body>
        <p>test</p>

        <p>test</p>

        <p>test
test</p>

        <p>test
test
test</p>

        <p>This is bold, bold italic, italic, normal text</p>

    </body></html>



Converting PDF to XML

Apache Ant Project to Extract Text From PDF

<project name="extract-text-from-pdf" default="extract-text-from-pdf">
    <description>Sample invocations of Apache Tika</description>
    <property name="lib.dir" value="../lib"/>
 
    <property name="input-pdf-file" value="myDocument.pdf"/>
    <property name="output-clean-xhtml-file" value="output-clean.xhtml"/>
    <target name="extract-text-from-pdf">
        <echo message="Extracting XML from PDF: ${input-pdf-file} to ${output-clean-xhtml-file}"/>
        <java jar="${lib.dir}/tika-app-1.3.jar" fork="true" failonerror="true"
            maxmemory="128m" input="${input-pdf-file}" output="${output-clean-xhtml-file}">
            <arg value="-x" />
        </java>
    </target>
</project>
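
The Excel, HTML and PDF build files all call Tika in the same way and differ only in their input and output files. One possible way to avoid repeating the <java> call is an Ant <macrodef>. The following is a sketch under stated assumptions: the macro name tika-to-xhtml, the project and target names and the output name output-pdf.xhtml are hypothetical, and the three input files are assumed to sit next to the build file.

```xml
<project name="tika-macro" default="convert-all">
    <property name="lib.dir" value="../lib"/>

    <!-- tika-to-xhtml is a hypothetical macro name, not part of Tika or Ant -->
    <macrodef name="tika-to-xhtml">
        <attribute name="in"/>
        <attribute name="out"/>
        <sequential>
            <echo message="Extracting XHTML from @{in} to @{out}"/>
            <java jar="${lib.dir}/tika-app-1.3.jar" fork="true" failonerror="true"
                maxmemory="128m" input="@{in}" output="@{out}">
                <arg value="-x" />
            </java>
        </sequential>
    </macrodef>

    <target name="convert-all">
        <tika-to-xhtml in="sample.xls" out="sample.xml"/>
        <tika-to-xhtml in="input-dirty.html" out="output-clean.xhtml"/>
        <tika-to-xhtml in="myDocument.pdf" out="output-pdf.xhtml"/>
    </target>
</project>
```

Inside the macro, @{in} and @{out} refer to the macro's own attributes, while ${lib.dir} is an ordinary Ant property.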



Store XML data

Motivation

You want to upload a file or a hierarchy of files into eXist.

Method

We will use the xdb:store function and demonstrate how to use its options to load subfolders.

Sample Code

Each build file must have four key components:

  1. a reference to local files on your hard drive (ideally in a properties file)
  2. a typedef for your Ant eXist extensions
  3. a path to tell it where to get the jar files
  4. a target to do the load

<project xmlns:xdb="http://exist-db.org/ant" default="upload-collection-to-exist">
 
    <!-- This is where I put my copy of the eXist trunk code -->
    <!-- It is the result of a subversion checkout from https://exist.svn.sourceforge.net/svnroot/exist/trunk -->
    <property name="exist-home" value="C:\ws\exist-trunk"/>
    
    <!-- this tells us where to find the key jar files relative to the ${exist-home} property -->
    <path id="classpath.core">
        <fileset dir="${exist-home}/lib/core">
            <include name="*.jar"/>
        </fileset>
        <pathelement path="${exist-home}/exist.jar"/>
        <pathelement path="${exist-home}/exist-optional.jar"/>
    </path>

    <typedef resource="org/exist/ant/antlib.xml" uri="http://exist-db.org/ant">
        <classpath refid="classpath.core"/>
    </typedef>
    
    <target name="upload-collection-to-exist">
        <echo message="Loading Documents to eXist."/>
        <xdb:store 
            uri="xmldb:exist://localhost:8080/xmlrpc/db/my-project"
            createcollection="true"
            createsubcollections="true"
            user="admin" password="">
            <fileset dir="C:\ws\my-project\trunk\db\my-project"> 
                <include name="**/*.*"/>
            </fileset>
        </xdb:store>
    </target>

</project>

Using a local.properties File to Load XML Data

The script above will work fine if you have a single user with one set of local files. But if you have many users, each user may put their local files in a different location. If that is the case then you will want to isolate all local file references in a file called local.properties.

The following example is from the eXist documentation project for a server running on port 8080 with the context being set to be "/":

# Local Property file for eXist documentation project
#
# this file is loaded into the build.xml file using the <property file="local.properties"/>
# it contains any local references to your
# Properties on a Windows system
exist-home=C:\\ws\\exist-trunk
exist-docs=C:\\ws\\exist-docs
user=admin
password=
uri=xmldb:exist://localhost:8080/xmlrpc/db/apps/exist-docs

The build file loads these properties with <property file="local.properties"/>:

<project xmlns:xdb="http://exist-db.org/ant" default="upload-exist-docs-app" 
   name="eXist Load Example">
    
    <!-- this is where we set our exist-home, user, password and the place that we will load the docs -->
    <property file="local.properties"/>
    
    <!-- this tells us where to find the key jar files relative to the ${exist-home} property -->
    <path id="classpath.core">
        <fileset dir="${exist-home}/lib/core">
            <include name="*.jar"/>
        </fileset>
        <pathelement path="${exist-home}/exist.jar"/>
        <pathelement path="${exist-home}/exist-optional.jar"/>
    </path>
    <typedef resource="org/exist/ant/antlib.xml" uri="http://exist-db.org/ant">
        <classpath refid="classpath.core"/>
    </typedef>
    
    <!-- upload app -->
    <target name="upload-exist-docs-app">
        <echo message="Loading eXist documentation system to eXist."/>
        <xdb:store uri="${uri}" createcollection="true" 
                 createsubcollections="true" user="${user}" password="${password}">
            <fileset dir="${exist-docs}">
                <include name="**/*.*"/>
            </fileset>
        </xdb:store>
    </target>

    <target name="show-properties">
        <echo message="exist-home=${exist-home}"/>
        <echo message="exist-docs=${exist-docs}"/>
        <echo message="uri=${uri}"/>
    </target>
</project>

References

The eXist store task is documented here: http://exist-db.org/exist/apps/doc/ant-tasks.xml#D2.2.10



Reindex a Collection

Motivation

You want a simple ant task that will reindex a collection.

Method

Because there is no Ant task that reindexes a collection directly, we will use the xquery task to execute an XQuery that calls the reindex() function.

The Ant task that runs an XQuery is documented here: http://exist-db.org/ant-tasks.html#N1041F

Call a remote XQuery by file name

<target name="reindex-collection">
    <xdb:xquery user="${user}" password="${password}"
        uri="${test-server}${collection}" queryfile="reindex.xq"
        outputproperty="result">
     </xdb:xquery>
     <echo message="Result = ${result}"/>
</target>

Supply the Body of an XQuery

<target name="inline-query">
   <xdb:xquery uri="${test-server}/db"  
       user="${user}" password="${password}"
       outputproperty="result">
       reindex('/db/mycollection')
     </xdb:xquery>
     <!-- note, this only returns a SINGLE line -->
     <echo message="Result = ${result}"/>
</target>



Execute an XQuery

Motivation

You want to execute an XQuery that is stored in an eXist database.

Remote execution of an inline query

<target name="run-one-inline-test-local">
        <description>Execute a single xUnit test on a local system</description>
        <echo message="Run an inline XQuery"/>
        <xdb:xquery uri="xmldb:exist://localhost/xmlrpc/db" user="${user}" password="${password}"      
            outputproperty="result">
        xquery version "1.0";
        let $message := 'Hello World!'
        return $message
        </xdb:xquery>
        <echo message="Result = ${result}"/>
</target>

Note that you can only return a string in this example. Any XML content in the query will generate an error.

If you want to return XML into a property you will need to wrap your query in a CDATA section:

 <!-- This version uses CDATA to put an XML file into the result property -->
<target name="run-xquery-cdata">
        <xdb:xquery user="admin" password="" uri="${test-server}/db" outputproperty="result"><![CDATA[
            xquery version "1.0";
            let $message := 'Hello World'
            return
              <result>{$message}</result>
        ]]></xdb:xquery>
        <echo message="Result = ${result}"/>
</target>

Execute an XQuery Stored on a Local Drive

hello-world.xq:

xquery version "1.0";
let $message := 'Hello World'
return
   <result>{$message}</result>

This is similar to the version above but you will note that the queryfile attribute has been added.

<target name="run-in-database-query" depends="load-test-resources">
    <xdb:xquery user="${user}" password="${password}"
       uri="xmldb:exist://localhost/xmlrpc/db" queryfile="hello-world.xq"
            outputproperty="result"/>
    <echo message="Result = ${result}"/>
</target>

Note that for the above to work, the file hello-world.xq MUST be in the same directory as the build script.
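
The same-directory requirement comes from the relative path: Ant resolves relative file names against the project's base directory, which defaults to the directory containing the build file. Assuming the queryfile attribute is treated like other Ant file attributes, a query kept elsewhere could be referenced through the built-in basedir property. The queries subdirectory and the target name below are hypothetical.

<target name="run-query-from-subdir">
    <!-- assumption: hello-world.xq has been moved into a "queries" folder next to build.xml -->
    <xdb:xquery user="${user}" password="${password}"
        uri="xmldb:exist://localhost/xmlrpc/db"
        queryfile="${basedir}/queries/hello-world.xq"
        outputproperty="result"/>
    <echo message="Result = ${result}"/>
</target>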

Adding Execute Permissions

<target name="add-execute">
    <!-- make the controller.xql file executable -->
    <xdb:chmod uri="${local-uri}/apps/myapp" resource="controller.xql" permissions="group=+execute,other=+execute"/>
</target>

Here local-uri is something like xmldb:exist://localhost:8080/exist/xmlrpc/db for the default installation path.



Creating a .xar file

Motivation

This example is under development!

You want to create an XML archive file (.xar file) directly from your source code that can be used to load library modules or applications into a native XML database. This makes it much easier for users to install your module or application. The packaging process does all the work of uploading your files into the correct location on a running eXist server and also sets all the permissions of the XQuery files (.xq) for you automatically.

Method

We need to create a "zip" file with all the right components in it.

The format of the package is here:

http://expath.org/spec/pkg

The eXist-specific package documentation is here:

http://demo.exist-db.org/exist/apps/doc/repo.xml

GUI Package vs. On-Disk Library vs. In DB Library

There are three types of installation packages:

  1. An external library that is not in the database
  2. A library that is loaded into the database
  3. A full application with a GUI

A library without a GUI that is deployed into the database needs two settings, a target and type="library". Use the following structure:

 target="some /db path" + type="library"

A simple XQuery library package that only needs to be registered with eXist, and is not deployed into the eXist database, should not specify a target.

 no target + type="library"
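
The two library variants might look as follows in repo.xml. This is a sketch that assumes the settings map to the type and target elements shown in the sample repo.xml file later in this chapter. The descriptions and the mylib target are hypothetical, and each meta element would be a separate repo.xml file.

<!-- variant 1: library deployed into the database (target plus type) -->
<meta xmlns="http://exist-db.org/xquery/repo">
    <description>My library, deployed into the database</description>
    <type>library</type>
    <target>mylib</target>
</meta>

<!-- variant 2: library only registered with eXist (type, no target) -->
<meta xmlns="http://exist-db.org/xquery/repo">
    <description>My library, registered only</description>
    <type>library</type>
</meta>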

Sample Package Structure

The archive must contain two XML descriptor files in the root directory: expath-pkg.xml and repo.xml

Sample expath-pkg.xml file

<package xmlns="http://expath.org/ns/pkg" name="http://example.com/apps/myapp" 
         abbrev="myapp" version="0.1" spec="1.0">
    <title>My Cool Application</title>
    <dependency package="http://exist-db.org/apps/xsltforms"/>
</package>

Note that the file name and the string in the namespace are "pkg" but the element name and the attribute in the dependency are "package". Make sure to keep these clear.

The format of this XML file is described in the EXPath documentation.

Sample repo.xml file that contains instructions for the eXist-specific packaging

<meta xmlns="http://exist-db.org/xquery/repo">
    <description>My eXist application</description>
    <author>Dan McCreary</author>
    <website>http://danmccreary.com</website>
    <status>alpha</status>
    <license>GNU-LGPL</license>
    <copyright>true</copyright>
    <!-- set this to "application" (without quotes) for systems that have a GUI -->
    <type>application</type>
    <target>myapp</target>
    <prepare>pre-install.xql</prepare>
    <finish>post-install.xql</finish>
    <permissions user="admin" password="" group="dba" mode="rw-rw-r--"/>
    <!-- this element is automatically added by the deployment tool -->
    <deployed>2012-11-28T23:15:39.646+01:00</deployed>
</meta>

Sample Apache Ant Target to Generate an Application .xar file

This Ant target needs the following inputs:

 source-dir - the place you keep your source code
 package-dir - a temp dir such as /tmp/my-package to store temporary files
 app-name - the name of your application
 app-version - the version of your application

The packaging steps are:

  1. verify that repo.xml and expath-pkg.xml exist in the source dir and copy them into temp.dir
  2. copy all application files to temp.dir
  3. create a zip file from the contents of temp.dir in the packages area and upload it to repositories if needed

<target name="generate-app-xar" description="Generate Application xar archive file">
   <echo>Making Package for ${app-name} use source from ${source-dir}</echo>
   <zip destfile="${package-dir}/${app-name}-${app-version}.xar">
         <fileset dir="${source-dir}">
            <include name="**/*.*" />
            <exclude name="**/.svn" />
      </fileset>
   </zip>
   <echo>Package is stored at ${package-dir}/${app-name}-${app-version}.xar</echo>
</target>
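
The target above performs only the zip step. The verification and copy steps from the list could be added along the following lines. This is a sketch, not the original author's method: the target name and the staging-dir property are hypothetical, and the staging directory is placed inside package-dir so that the delete task never touches a shared directory such as /tmp.

<target name="generate-app-xar-checked"
        description="Verify descriptors, stage the files, then build the xar">
    <!-- hypothetical staging area, kept inside package-dir -->
    <property name="staging-dir" location="${package-dir}/staging"/>

    <!-- step 1: stop early if either descriptor is missing -->
    <fail message="repo.xml is missing from ${source-dir}">
        <condition>
            <not><available file="${source-dir}/repo.xml"/></not>
        </condition>
    </fail>
    <fail message="expath-pkg.xml is missing from ${source-dir}">
        <condition>
            <not><available file="${source-dir}/expath-pkg.xml"/></not>
        </condition>
    </fail>

    <!-- step 2: copy descriptors and application files into a clean staging area -->
    <delete dir="${staging-dir}"/>
    <mkdir dir="${staging-dir}"/>
    <copy todir="${staging-dir}">
        <fileset dir="${source-dir}"/>
    </copy>

    <!-- step 3: zip the staged files; the descriptors end up in the archive root -->
    <zip destfile="${package-dir}/${app-name}-${app-version}.xar"
         basedir="${staging-dir}"/>
    <echo>Package is stored at ${package-dir}/${app-name}-${app-version}.xar</echo>
</target>

The fileset here relies on Ant's default excludes, which already skip version-control directories such as .svn, and it also picks up files that have no extension.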

Sample Apache Ant Target to Generate a Library .xar file

This script depends on the following Ant properties:

 ant.project.name - the name of the project
 xslt.dir - the directory where the XSLT scripts are stored
 temp.dir - a temp dir such as /tmp to store temporary files
 web.specs.dir - the place to put the results

<target name="generate-xar" description="Generate xar archive">
        <echo>Making ${ant.project.name}.xar...</echo>

        <!-- run a transform on the input specification file to create the a.xml file -->
        <xslt force="true" style="${xslt.dir}/generate-xar-descriptors.xsl" 
              in="${web.specs.dir}/${ant.project.name}/${ant.project.name}.xml" 
              out="${temp.dir}/files/a.xml">
            <param name="module-version" expression="${module-version}" />
            <param name="eXist-main-class-name" expression="${eXist-main-class-name}" />
        </xslt>
        <delete file="${temp.dir}/files/a.xml" />
        
        <!-- now create the .xar file with all our files in the right place -->
        <zip destfile="${temp.dir}/archives/${ant.project.name}-${module-version}.xar">
            <fileset dir="${temp.dir}/files">
                <include name="**/*.*" />
                <exclude name="*-tests.jar" />
            </fileset>
        </zip>
</target>

Sample XSLT Script

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

	<xsl:output method="xml" />

	<xsl:param name="module-version" />
	<xsl:param name="eXist-main-class-name" />

	<xsl:template match="/">
		<xsl:variable name="module-namespace">
			<xsl:copy-of select="//element()[@id = 'module-namespace']" />
		</xsl:variable>
		<xsl:variable name="module-prefix">
			<xsl:copy-of select="//element()[@id = 'module-prefix']" />
		</xsl:variable>
		<xsl:variable name="spec-title">
			<xsl:copy-of select="concat('EXPath ', //element()[local-name() = 'title'])" />
		</xsl:variable>
		<xsl:variable name="author">
			<xsl:copy-of select="//element()[local-name() = 'author'][1]/element()[1]" />
		</xsl:variable>

		<xsl:result-document href="target/files/expath-pkg.xml">
			<package xmlns="http://expath.org/ns/pkg" name="http://expath.org/lib/{$module-prefix}" abbrev="{concat('expath-', $module-prefix)}"
				version="{$module-version}" spec="1.0">
				<title>
					<xsl:value-of select="$spec-title" />
				</title>
				<dependency processor="http://exist-db.org/" />
			</package>
		</xsl:result-document>

		<xsl:result-document href="target/files/repo.xml">
			<meta xmlns="http://exist-db.org/xquery/repo">
				<description>
					<xsl:value-of select="$spec-title" />
				</description>
				<author>
					<xsl:value-of select="$author" />
				</author>
				<website />
				<status>stable</status>
				<license>GNU-LGPL</license>
				<copyright>true</copyright>
				<type>library</type>
			</meta>
		</xsl:result-document>

		<xsl:result-document href="target/files/exist.xml">
			<package xmlns="http://exist-db.org/ns/expath-pkg">
				<jar>
					<xsl:value-of select="concat('expath-', $module-prefix, '.jar')" />
				</jar>
				<java>
					<namespace>
						<xsl:value-of select="$module-namespace" />
					</namespace>
					<class>
						<xsl:value-of select="concat('org.expath.exist.', $eXist-main-class-name)" />
					</class>
				</java>
			</package>
		</xsl:result-document>

		<xsl:result-document href="target/files/cxan.xml">
			<package xmlns="http://cxan.org/ns/package" id="{concat('expath-', $module-prefix, '-exist')}" name="http://expath.org/lib/{$module-prefix}"
				version="{$module-version}">
				<author id="{$author/element()/@id}">
					<xsl:value-of select="$author" />
				</author>
				<category id="libs">Libraries</category>
				<category id="exist">eXist extensions</category>
				<tag>
					<xsl:value-of select="$module-prefix" />
				</tag>
				<tag>expath</tag>
				<tag>library</tag>
				<tag>exist</tag>
			</package>
		</xsl:result-document>

	</xsl:template>
</xsl:stylesheet>

Sample XQuery Script

Acknowledgements

The Apache Ant target and the XSLT script were provided by Claudius Teodorescu.



References

References

Links
Books
  • Ant: The Definitive Guide, 2nd Edition by Steve Holzner (April 14, 2005)
  • Pro Apache Ant by Matthew Moodie (Nov 16, 2005)
  • Java Development with Ant by Erik Hatcher and Steve Loughran (Aug 2002)
  • Ant Developer's Handbook by Alan Williamson, et al. (Nov 1, 2002)
Articles