XQuery/DocBook to PDF

Motivation

You want to convert your DocBook 5 files to PDF. PDF is a standardized page-layout format, so it lets you print books with predictable pagination.

Method

We will create an XQuery module built around one main typeswitch expression, with a case for each of the main DocBook elements. The module produces an XSL-FO file that can then be converted directly to PDF using the Apache FOP 1.0 processor.

This will be done entirely using XQuery. No XSLT will be required.
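
The general shape of such a typeswitch transform can be sketched as follows. This is a minimal illustration, not the module itself: the function names local:dispatch and local:book, the page dimensions, the font sizes, and the tiny inline test document are all assumptions made for the example. Only a handful of elements are handled, and any other element simply has its children processed in turn.

```xquery
xquery version "1.0";

declare namespace db = "http://docbook.org/ns/docbook";
declare namespace fo = "http://www.w3.org/1999/XSL/Format";

(: Hypothetical wrapper: builds the XSL-FO page skeleton around the book content. :)
declare function local:book($book as element(db:book)) as element(fo:root) {
    <fo:root>
        <fo:layout-master-set>
            <fo:simple-page-master master-name="page"
                page-height="11in" page-width="8.5in" margin="1in">
                <fo:region-body/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="page">
            <fo:flow flow-name="xsl-region-body">
                { local:dispatch($book/node()) }
            </fo:flow>
        </fo:page-sequence>
    </fo:root>
};

(: Hypothetical dispatcher: one typeswitch with a case per DocBook element. :)
declare function local:dispatch($nodes as node()*) as item()* {
    for $node in $nodes
    return
        typeswitch ($node)
            (: drop whitespace-only text so it does not land directly in fo:flow :)
            case text() return
                if (normalize-space($node) = '') then () else $node
            case element(db:book) return local:book($node)
            (: in this sketch only the title is taken from the info block :)
            case element(db:info) return local:dispatch($node/db:title)
            case element(db:title) return
                <fo:block font-size="16pt" font-weight="bold" space-after="6pt">{
                    local:dispatch($node/node())
                }</fo:block>
            case element(db:subtitle) return
                <fo:block font-size="12pt" font-style="italic" space-after="6pt">{
                    local:dispatch($node/node())
                }</fo:block>
            case element(db:para) return
                <fo:block space-after="6pt">{ local:dispatch($node/node()) }</fo:block>
            (: part, chapter, sect1 and anything else: recurse into the children :)
            case element() return local:dispatch($node/node())
            default return ()
};

(: Tiny inline document so the sketch can be run on its own. :)
let $sample :=
    <book xmlns="http://docbook.org/ns/docbook" version="5.0">
        <info><title>Converting DocBook to PDF using XQuery</title></info>
        <chapter>
            <title>Introduction to DocBook</title>
            <para>Printing and pagination will still be important.</para>
        </chapter>
    </book>
return local:dispatch($sample)
```

The recursion is the key design choice: each case wraps its own output and hands the child nodes back to the same dispatcher, so new DocBook elements can be supported by adding a case rather than restructuring the transform.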

Sample Input Document

We will start with a simple DocBook 5 document. This document uses the DocBook namespace and also declares the xlink namespace. A very small sample of such a document might have the following structure:

<book xmlns="http://docbook.org/ns/docbook"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
    <info>
        <title>Converting DocBook to PDF using XQuery</title>
        <author>
            <orgname>Kelly McCreary &amp; Associates</orgname>
            <address>
                <city>Minneapolis</city>
                <country>USA</country>
            </address>
            <email>user@example.com</email>
        </author>
    </info>
    <part>
        <title>Introduction</title>
        <subtitle>Why DocBook and PDF</subtitle>
        <chapter>
            <title>Introduction to DocBook</title>
            <subtitle>Getting Started with DocBook Version</subtitle>
            <sect1>
                <title>Page Layout vs. Scrolling HTML</title>
                <subtitle>Why PDF is Used For Printing</subtitle>
                <para>Printing and pagination will still be important till ePub becomes standardized.</para>
            </sect1>
        </chapter>
        <chapter>
            <title>XQuery Typeswitch Transforms</title>
            <subtitle>How To Get Comfortable with Recursive Programs</subtitle>
            <sect1>
                <title>Why XQuery Can Replace XSLT</title>
                <subtitle>One language for the server</subtitle>
                <para>Text</para>
            </sect1>
            <sect1>
                <title>XSL-FO</title>
                <subtitle>A language for paginated layout</subtitle>
                <para>Text</para>
            </sect1>
        </chapter>
    </part>
</book>
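
Because every element in the sample is in the DocBook namespace, an unprefixed path such as /book/part/chapter would not match anything. The short query below is a sketch of how the namespace can be declared and used in XQuery; the prefix db and the document path /db/docbook/sample.xml are assumptions chosen for the example, so adjust the path to wherever the sample is stored.

```xquery
xquery version "1.0";

(: Bind a prefix to the DocBook 5 namespace used by the sample document. :)
declare namespace db = "http://docbook.org/ns/docbook";

(: Hypothetical location of the sample document shown above. :)
let $book := doc('/db/docbook/sample.xml')/db:book
for $chapter in $book/db:part/db:chapter
return
    (: one summary element per chapter: its title and its number of sect1 children :)
    <chapter title="{ $chapter/db:title }" sections="{ count($chapter/db:sect1) }"/>
```

Run against the sample, this should return one summary element per chapter, the first reporting one section and the second reporting two.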