XQuery/Generating PDF from XSL-FO files
Motivation
You want to generate documents with precise page layout, such as PDF files, from XML documents.
Approach
Typically, the steps required to generate a PDF document are:
- retrieve or compute the base XML document
- transform it to XSL-FO, perhaps using XSLT
- transform the XSL-FO to PDF using Apache FOP
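The three steps can be sketched in a single query. This is only an illustration: the <report> element is hypothetical sample data, step 2 uses plain XQuery element constructors rather than XSLT, and it assumes the XSL-FO module is enabled as described later on this page.

```xquery
xquery version "1.0";

declare namespace fo="http://www.w3.org/1999/XSL/Format";
declare namespace xslfo="http://exist-db.org/xquery/xslfo";

(: Step 1: retrieve or compute the base XML document.
   This inline <report> element is hypothetical sample data. :)
let $report :=
    <report>
        <title>Sample Report</title>
        <para>First paragraph.</para>
        <para>Second paragraph.</para>
    </report>

(: Step 2: transform the XML to XSL-FO, here with XQuery constructors. :)
let $fo :=
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="my-page">
                <fo:region-body margin="1in"/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="my-page">
            <fo:flow flow-name="xsl-region-body">
                <fo:block font-size="18pt" font-weight="bold">{string($report/title)}</fo:block>
                {
                    (: one fo:block per source paragraph :)
                    for $para in $report/para
                    return <fo:block space-before="6pt">{string($para)}</fo:block>
                }
            </fo:flow>
        </fo:page-sequence>
    </fo:root>

(: Step 3: render the XSL-FO to PDF and send it to the browser. :)
let $pdf := xslfo:render($fo, "application/pdf", ())
return response:stream-binary($pdf, "application/pdf", "report.pdf")
```

For larger documents, the XSL-FO is usually easier to maintain when it is produced by a separate stylesheet or function module, so that the layout can change without touching the query that gathers the data.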
Method
We will use a built-in function to convert XSL-FO into PDF. (See Notes on Installing XSL-FO below if the module is not installed and configured.)
Using the xslfo:render() function
The function is xslfo:render(). It is called as follows:
let $pdf-binary := xslfo:render($input-xml-fo-document, 'application/pdf', $parameters)
The resulting binary can be saved directly to the eXist database, where it is stored as a non-searchable binary resource.
You can then make the PDF available by linking to the stored file, or you can send it directly to the browser with the response:stream-binary() function:
return response:stream-binary($pdf-binary, 'application/pdf', 'myGeneratedPDF.pdf')
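As an alternative to streaming, the rendered binary can be stored in the database. The following is a minimal sketch, assuming eXist's xmldb:store() function with a MIME type argument and a hypothetical collection /db/pdf-output that already exists and is writable by the current user.

```xquery
xquery version "1.0";

declare namespace fo="http://www.w3.org/1999/XSL/Format";
declare namespace xslfo="http://exist-db.org/xquery/xslfo";

(: A minimal XSL-FO document to render. :)
let $fo :=
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="my-page">
                <fo:region-body margin="1in"/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="my-page">
            <fo:flow flow-name="xsl-region-body">
                <fo:block>Stored PDF test</fo:block>
            </fo:flow>
        </fo:page-sequence>
    </fo:root>

let $pdf := xslfo:render($fo, "application/pdf", ())

(: /db/pdf-output is a hypothetical collection; it must already exist.
   The MIME type tells eXist to store the result as a binary resource. :)
return xmldb:store("/db/pdf-output", "myGeneratedPDF.pdf", $pdf, "application/pdf")
```

Storing the PDF is useful when rendering is slow and the same document is requested repeatedly; streaming is simpler when every request produces a different document.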
Example XQuery to Generate PDF
The following program will generate a PDF document with the text "Hello World".
xquery version "1.0";

declare namespace fo="http://www.w3.org/1999/XSL/Format";
declare namespace xslfo="http://exist-db.org/xquery/xslfo";

let $fo :=
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="my-page">
                <fo:region-body margin="1in"/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="my-page">
            <fo:flow flow-name="xsl-region-body">
                <fo:block>Hello World!</fo:block>
            </fo:flow>
        </fo:page-sequence>
    </fo:root>
let $pdf := xslfo:render($fo, "application/pdf", ())
return response:stream-binary($pdf, "application/pdf", "output.pdf")
Notes on Installing XSL-FO
Enabling the XSL-FO Module
Make sure that the module extension is loaded. You can do this by opening the $EXIST_HOME/conf.xml file and un-commenting the following lines (around line 769):
<module class="org.exist.xquery.modules.xslfo.XSLFOModule" uri="http://exist-db.org/xquery/xslfo">
    <parameter name="processorAdapter" value="org.exist.xquery.modules.xslfo.ApacheFopProcessorAdapter"/>
</module>
The two possible values for the processorAdapter parameter are:
- org.exist.xquery.modules.xslfo.ApacheFopProcessorAdapter for Apache's FOP
- org.exist.xquery.modules.xslfo.RenderXXepProcessorAdapter for RenderX's XEP
If the module is correctly loaded then you should see it in the function documentation.
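The same check can be made from a query. This sketch assumes eXist's util:registered-modules() function, which returns the namespace URIs of the modules known to the running instance; the two messages are illustrative only.

```xquery
xquery version "1.0";

(: Namespace URI of the XSL-FO module, as configured in conf.xml. :)
let $xslfo-uri := "http://exist-db.org/xquery/xslfo"
return
    (: The general comparison is true if any registered URI matches. :)
    if (util:registered-modules() = $xslfo-uri)
    then "XSL-FO module is loaded"
    else "XSL-FO module is not loaded; check conf.xml"
```

Because this query never calls a function in the xslfo namespace, it compiles and runs whether or not the module is enabled.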
Make sure that you have edited $EXIST_HOME/extensions/build.properties to set include.module.xslfo to true.
Change:
# XSL FO transformations (Uses Apache FOP)
include.module.xslfo = false
To:
include.module.xslfo = true
Make sure that the build file can get access to the correct fop.jar file from the Apache web site.
Downloading XSL-FO Jar Files
eXist comes with a sample Ant task that automatically downloads the FOP distribution zip file, extracts the three jar files we need and removes the rest. Here is the Ant target from the eXist 1.4 $EXIST_HOME/modules/build.xml:
<target name="prepare-libs-xslfo" unless="libs.available.xslfo" if="include.module.xslfo.config">
    <echo message="Load: ${include.module.xslfo}"/>
    <echo message="------------------------------------------------------"/>
    <echo message="Downloading libraries required by the xsl-fo module"/>
    <echo message="------------------------------------------------------"/>
    <!-- Apache FOP .95 -->
    <get src="${include.module.xslfo.url}" dest="fop-0.95-bin.zip" verbose="true" usetimestamp="true" />
    <unzip src="fop-0.95-bin.zip" dest="${top.dir}/${lib.user}">
        <patternset>
            <include name="fop-0.95/build/fop.jar"/>
            <include name="fop-0.95/lib/batik-all-1.7.jar"/>
            <include name="fop-0.95/lib/xmlgraphics-commons-1.3.1.jar"/>
        </patternset>
        <mapper type="flatten"/>
    </unzip>
    <delete file="fop-0.95-bin.zip"/>
</target>
Note that FOP 1.0 is now available, so you can change this task to the following. FOP 1.0 ships xmlgraphics-commons 1.4 rather than 1.3.1, which matches the directory listing shown further down:
<target name="prepare-libs-xslfo" unless="libs.available.xslfo" if="include.module.xslfo.config">
    <echo message="Load: ${include.module.xslfo}"/>
    <echo message="------------------------------------------------------"/>
    <echo message="Downloading libraries required by the xsl-fo module"/>
    <echo message="------------------------------------------------------"/>
    <!-- Download the Apache FOP Processor from the Apache Web Site-->
    <get src="${include.module.xslfo.url}" dest="fop-1.0-bin.zip" verbose="true" usetimestamp="true" />
    <unzip src="fop-1.0-bin.zip" dest="${top.dir}/${lib.user}">
        <patternset>
            <include name="fop-1.0/build/fop.jar"/>
            <include name="fop-1.0/lib/batik-all-1.7.jar"/>
            <include name="fop-1.0/lib/xmlgraphics-commons-1.4.jar"/>
        </patternset>
        <mapper type="flatten"/>
    </unzip>
    <delete file="fop-1.0-bin.zip"/>
</target>
Sample Transcript
The following is a sample transcript:
prepare-xslfo:
[echo] Load: true
[echo] ------------------------------------------------------
[echo] Downloading libraries required by the xsl-fo module
[echo] ------------------------------------------------------
[fetch] Getting: http://apache.cs.uu.nl/dist/xmlgraphics/fop/binaries/fop-1.0-bin.zip
[fetch] To: C:\DOCUME~1\DANMCC~1\LOCALS~1\Temp\FetchTask8407348433221748527tmp
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................................................
[fetch] ....................
[fetch] Expanding: C:\DOCUME~1\DANMCC~1\LOCALS~1\Temp\FetchTask8407348433221748527tmp into C:\ws\exist-trunk\lib\us
At the end of this process you should see the following three jar files in your $EXIST_HOME/lib/extensions folder:
$ cd $EXIST_HOME/lib/extensions
$ ls -l
-rwxrwxrwx+ 1 Dan McCreary None 3318083 2010-12-10 09:23 batik-all-1.7.jar
-rwxrwxrwx+ 1 Dan McCreary None 3079811 2010-12-10 09:23 fop.jar
-rwxrwxrwx+ 1 Dan McCreary None  569113 2010-12-10 09:23 xmlgraphics-commons-1.4.jar
If you do not see these files, you can copy them manually from a download of the FOP binaries.
Now go to the $EXIST_HOME directory and type "build". You should not see any error messages. If you do, go to the build file and fix or remove the errors.
After you restart eXist, the XSL-FO module should be able to convert an XSL-FO file into a PDF file.
Using Config File for External References
When you reference an image, you must either use an absolute reference that the server has read access to, or use a relative path. The base URL against which relative paths are resolved can be set in the FOP configuration, which is passed to xslfo:render() as its fourth argument:
xquery version "1.0";

declare namespace fo="http://www.w3.org/1999/XSL/Format";
declare namespace xslfo="http://exist-db.org/xquery/xslfo";

let $fop-config :=
    <fop version="1.0">
        <!-- Base URL for resolving relative URLs -->
        <base>http://localhost:8080/exist/rest/db/nosql/pdf/images</base>
    </fop>
let $fo := doc('/db/test/xslfo/fo-templates/samle-fo-file-with-external-references.fo')
let $pdf := xslfo:render($fo, "application/pdf", (), $fop-config)
return response:stream-binary($pdf, "application/pdf", "output.pdf")
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <fo:layout-master-set>
        <fo:simple-page-master master-name="my-page">
            <fo:region-body margin="0.5in"/>
        </fo:simple-page-master>
    </fo:layout-master-set>
    <fo:page-sequence master-reference="my-page">
        <fo:flow flow-name="xsl-region-body">
            <fo:block>Test of external SVG reference </fo:block>
            <fo:block>
                SVG Chart Test
                <fo:external-graphic content-width="7.5in" scaling="uniform" src="url(my-test-image.png)"/>
                content-width="7.5in" scaling="uniform" src="url(chart.svg)"
            </fo:block>
        </fo:flow>
    </fo:page-sequence>
</fo:root>
Including SVG Images in your PDF files
When you create PDF documents, you can include "line art" in SVG format directly in the PDF files.
There are some translation issues from SVG to PDF, but much line art converts very well.
To render SVG within eXist, FOP uses Batik, which depends on the Java AWT libraries. See:
http://xmlgraphics.apache.org/fop/0.95/graphics.html#batik
That page explains that on a server without a graphical display you must start the JVM in headless mode:
-Djava.awt.headless=true
In your $EXIST_HOME/startup.bat or $EXIST_HOME/startup.sh, add this flag to JAVA_OPTS. For example, in startup.bat (Windows syntax):
set JAVA_OPTS="-Xms128m -Xmx512m -Dfile.encoding=UTF-8 -Djava.endorsed.dirs=%JAVA_ENDORSED_DIRS% -Djava.awt.headless=true"
If you are using the "wrapper" tool to start your server, you will need to add the following lines to $EXIST_HOME/tools/wrapper/conf/wrapper.conf:
# make AWT load the fonts for SVG rendering inside of XSLFO
wrapper.java.additional.6=-Djava.awt.headless=true
Using Inline SVG
One of the easiest ways to test your configuration is to embed an SVG image inline. You can do this with the fo:instream-foreign-object element, as in the following example:
<fo:block>
    Test of inline SVG reference.
    <fo:block>
        <fo:instream-foreign-object content-width="7.5in" scaling="uniform">
            <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" height="200" width="200">
                <circle cx="100" cy="100" r="40" stroke="black" stroke-width="2" fill="blue"/>
            </svg>
        </fo:instream-foreign-object>
    </fo:block>
    content-width="7.5in" scaling="uniform"
</fo:block>
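To try the inline SVG fragment on its own, it can be wrapped in a complete query. The following is a minimal sketch that reuses the page layout from the "Hello World" example; it assumes the XSL-FO module is enabled and the JVM was started in headless mode as described above. The output file name is arbitrary.

```xquery
xquery version "1.0";

declare namespace fo="http://www.w3.org/1999/XSL/Format";
declare namespace xslfo="http://exist-db.org/xquery/xslfo";

let $fo :=
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="my-page">
                <fo:region-body margin="0.5in"/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="my-page">
            <fo:flow flow-name="xsl-region-body">
                <fo:block>Test of inline SVG reference.</fo:block>
                <fo:block>
                    <!-- The SVG is embedded directly, so no base URL is needed. -->
                    <fo:instream-foreign-object content-width="7.5in" scaling="uniform">
                        <svg xmlns="http://www.w3.org/2000/svg" height="200" width="200">
                            <circle cx="100" cy="100" r="40" stroke="black" stroke-width="2" fill="blue"/>
                        </svg>
                    </fo:instream-foreign-object>
                </fo:block>
            </fo:flow>
        </fo:page-sequence>
    </fo:root>

(: Render and stream to the browser; "inline-svg-test.pdf" is an arbitrary name. :)
let $pdf := xslfo:render($fo, "application/pdf", ())
return response:stream-binary($pdf, "application/pdf", "inline-svg-test.pdf")
```

If this query produces a PDF with a blue circle, SVG rendering works, and any remaining problems with external images are more likely related to the <base> URL or file permissions.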
Sample External SVG Reference
Note this assumes you have configured your <base> URL in the FOP configuration file.
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <fo:layout-master-set>
        <fo:simple-page-master master-name="my-page">
            <fo:region-body margin="0.5in"/>
        </fo:simple-page-master>
    </fo:layout-master-set>
    <fo:page-sequence master-reference="my-page">
        <fo:flow flow-name="xsl-region-body">
            <fo:block>Test of external SVG reference</fo:block>
            <fo:block>
                SVG Chart Test
                <fo:external-graphic content-width="7.5in" scaling="uniform" src="url(chart.svg)"/>
                content-width="7.5in" scaling="uniform" src="url(chart.svg)"
            </fo:block>
        </fo:flow>
    </fo:page-sequence>
</fo:root>
Notes
See XSL-FO Tables and XSL-FO Images for how to add print-quality tables and charts to your document.
When you follow trunk, conf.xml is sometimes reset to its defaults, and you have to re-enable XSL-FO processing in conf.xml. If you miss this, the error reads: "cannot compile xquery: err:xpst0017 call to undeclared function: xslfo:render".
Acknowledgments
The user Dmitriy has been helpful in the creation of the procedure for installation on systems that do not have source code.
Discussion
The steps to enable the FOP module should be listed somewhere in the eXist administrative site and removed from this Wikibook.