PHP and MySQL Programming/XML and PHP

From Wikibooks, open books for an open world

Introduction

XML (eXtensible Markup Language) is widely used in mainstream development. It started off as a web standard, but is now also used as a document format in more traditional applications. For example, the OpenDocument format employed by Sun in their StarOffice and OpenOffice suites is based on XML.

Because of its widespread use in the IT industry, it is fitting that we as PHP developers know how to make use of XML files in our PHP applications.

XML Structure

Since XML is extensible, there is no limit to the tags that you can create to describe your data. Here is an example of a simple XML document:

<?xml version="1.0"?>
<document>
   <title>Isn't this simple!</title>
   <body>XML is as simple as pie. :-)</body>
</document>

It looks simple because it is simple! Just as in HTML, elements are enclosed in angle brackets, "<" and ">", and an end tag differs from its start tag only by a forward slash, "/", placed directly after the opening bracket.

Creating an XML Parser in PHP

Defining the XML Parser

In PHP, you define an XML Parser by using the xml_parser_create() function as shown below.

<?php
$parser = xml_parser_create(ENCODING);
?>

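You can think of the $parser variable as the parsing engine for the XML document. ENCODING is optional and sets the character encoding in which data is passed to your handler functions. It can be one of:

1. ISO-8859-1 (the default before PHP 5.0.2)

2. US-ASCII

3. UTF-8 (the default since PHP 5.0.2)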
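
As a quick check of which encoding a parser will use, the following sketch creates two parsers and asks each one for its target encoding. The variable names are made up for this illustration, and the first line of output depends on the PHP version (UTF-8 is expected on PHP 5.0.2 and later).

```php
<?php
// Parser created without an argument: uses the version's default encoding.
$default = xml_parser_create();

// Parser created with an explicit encoding.
$latin1 = xml_parser_create("ISO-8859-1");

// XML_OPTION_TARGET_ENCODING reports the encoding used for data
// that is handed to the handler functions.
echo xml_parser_get_option($default, XML_OPTION_TARGET_ENCODING), "\n";
echo xml_parser_get_option($latin1, XML_OPTION_TARGET_ENCODING), "\n";
```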

Defining the Element Handlers

Element handlers are defined by means of the xml_set_element_handler() function as follows:

<?php
xml_set_element_handler(XML_PARSER, START_FUNCTION, END_FUNCTION);
?>

The three arguments accepted by the xml_set_element_handler() function are:

1. XML_PARSER - The variable that you created when you called the xml_parser_create() function.

2. START_FUNCTION - The name of the function to call when the parser encounters a start element.

3. END_FUNCTION - The name of the function to call when the parser encounters an end element.

For example:

<?php
$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
?>
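
To make the handler arguments concrete, here is a minimal sketch that records every call the parser makes. The handler names recordStart and recordEnd, the two global arrays and the one-line sample document are all assumptions made for this illustration.

```php
<?php
$events = [];          // order in which the handlers were called
$seenAttributes = [];  // attributes received for each element

// Start handler: receives the parser, the element name and its attributes.
function recordStart($parser, $name, $attributes) {
    global $events, $seenAttributes;
    $events[] = "start " . $name;
    $seenAttributes[$name] = $attributes;
}

// End handler: receives the parser and the element name.
function recordEnd($parser, $name) {
    global $events;
    $events[] = "end " . $name;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "recordStart", "recordEnd");

// The third argument tells the parser that this is the complete document.
xml_parse($parser, '<document lang="en"><title>Hi</title></document>', true);

print_r($events);
print_r($seenAttributes["DOCUMENT"]);
```

Element and attribute names arrive in upper case here because the parser converts them by default.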


Defining Character Handlers

Character handlers are created by means of the xml_set_character_data_handler() function as follows:

<?php
xml_set_character_data_handler(XML_PARSER, CHARACTER_FUNCTION);
?>

The two arguments accepted by the xml_set_character_data_handler() function are:

1. XML_PARSER - The variable that you created when you called the xml_parser_create() function.

2. CHARACTER_FUNCTION - The name of the function to call when the parser encounters character data.
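
A character handler should not assume that it receives all the text of an element in one call, because the parser may deliver it in several pieces. The sketch below therefore appends each piece to a buffer. The handler name collectText and the sample document are hypothetical.

```php
<?php
$text = "";

// Character handler: receives the parser and one piece of character data.
function collectText($parser, $data) {
    global $text;
    // Append rather than assign, since the text may arrive in several pieces.
    $text .= $data;
}

$parser = xml_parser_create();
xml_set_character_data_handler($parser, "collectText");

// The entity &amp; is decoded by the parser before the handler sees it.
xml_parse($parser, "<body>Fish &amp; chips</body>", true);

echo $text, "\n";
```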

Starting the Parser

To finally start the parser, we call the xml_parse() function as follows:

<?php
xml_parse(XML_PARSER, XML);
?>

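The two required arguments of the xml_parse() function are:

1. The variable that you created when you called the xml_parser_create() function.

2. The XML that is to be parsed.

An optional third argument, a boolean, tells the parser that this is the last piece of data. It is needed when a document is passed to the parser in several chunks.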

For example:

<?php
// Read the whole file into a string, then hand it to the parser.
$f = fopen ("simple.xml", 'r');
$data = fread($f, filesize("simple.xml"));
fclose($f);
xml_parse($parser, $data);
?>
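
xml_parse() returns 0 when the document cannot be parsed, so it is worth checking the result. The following sketch feeds the parser a deliberately malformed string, invented for this illustration, and builds an error message from the parser's error code and line number.

```php
<?php
$parser = xml_parser_create();

// Deliberately malformed: <title> is never closed.
$data = "<document>\n<title>Oops</document>";

// true marks $data as the final chunk, so the parser reports all errors.
$ok = xml_parse($parser, $data, true);

if (!$ok) {
    $message = sprintf(
        "XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
    echo $message, "\n";
}
```

The exact wording of the error text depends on the XML library that PHP was built with.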

Cleaning Up

After parsing an XML document, it is considered good practice to free the memory that is holding the parser. This is done by calling the xml_parser_free() function as follows:

<?php
xml_parser_free(XML_PARSER);
?>

Example

<?php
# --- Element Functions ---

# Called for every start tag; $attributes holds the tag's attributes.
function startElement($parser, $name, $attributes){
   # ... some code
}

# Called for every end tag.
function endElement ($parser, $name){
   # ... some code
}

# Called for the text found between tags.
function characterData ($parser, $data){
   # ... some code
}

# Reads the whole file into a string.
function load_data($file){
   $f = fopen ($file, 'r');
   $data = fread($f, filesize($file));
   fclose($f);
   return $data;
}

# --- Main Program Body ---
$file = "simple.xml";
$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");
xml_parse ($parser, load_data($file));
xml_parser_free($parser);
?>

Parsing XML Documents

We have seen the steps needed to successfully parse an XML document with PHP. Let's take a moment to reflect on how these steps are interconnected.

When the parser is started, PHP works through the XML document from beginning to end. Each time it finds a start tag, it calls the start-element function that you, the programmer, registered. The same happens with the character-data function for the text between tags, and with the end-element function for end tags.
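
One detail of this hand-off is easy to miss: by default the parser converts element names to upper case before passing them to the handlers, a behaviour known as case folding. The sketch below shows the effect and how to switch it off with xml_parser_set_option(). The helper collectNames() and the sample document are hypothetical, written only for this illustration.

```php
<?php
// Parses $xml and returns the element names in the order their
// start tags were seen. $caseFolding is 1 (on) or 0 (off).
function collectNames($xml, $caseFolding) {
    $names = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler(
        $parser,
        // Start handler: remember the name exactly as the parser reports it.
        function ($parser, $name, $attributes) use (&$names) {
            $names[] = $name;
        },
        // End handler: nothing to do in this sketch.
        function ($parser, $name) {
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}

$xml = '<item><title>News</title></item>';
print_r(collectNames($xml, 1)); // names as delivered with case folding on
print_r(collectNames($xml, 0)); // names as written in the document
```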

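Here is a complete example of parsing an XML document: an RSS reader that displays news articles from any RSS feed that conforms to the RSS 1.0 standard. The handlers compare element names in upper case ("ITEM", "TITLE" and so on) because the parser converts element names to upper case by default.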

Example

<html>
<head>
<title> Google Articles </title>
</head>
<body>
<h2>Google Articles</h2>
<dl>
<?php 

$insideitem = false;
$tag = "";
$title = "";
$description = "";
$link = "";

function startElement($parser, $name, $attrs) {
        global $insideitem, $tag, $title, $description, $link; 
        if ($insideitem) {
                $tag = $name;
        }
        elseif ($name == "ITEM") {
                $insideitem = true;
        }
}

function endElement($parser, $name) {
        global $insideitem, $tag, $title, $description, $link;
        if ($name == "ITEM") {
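                // Escape the link and title before writing them into the page.
                printf("<dt><b><a href='%s'>%s</a></b></dt>",
                htmlspecialchars(trim($link), ENT_QUOTES),
                htmlspecialchars(trim($title), ENT_QUOTES));
                // The description is printed as-is because feeds often put HTML in it.
                printf("<dd>%s</dd>", trim($description));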
                $title = "";
                $description = "";
                $link = "";
                $insideitem = false;
        }
}

function characterData($parser, $data) {
       global $insideitem, $tag, $title, $description, $link;
        if ($insideitem) {
                switch ($tag) {
                        case "TITLE":
                                $title .= $data;
                                break;
                        case "DESCRIPTION":
                                $description .= $data;
                                break;
                        case "LINK":
                                $link .= $data;
                                break;
                }
        }
}

$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
# $fp = fopen("http://www.newsforge.com/index.rss", 'r')
$fp = fopen("http://news.google.co.za/nwshp?hl=en&tab=wn&q=&output=rss", 'r')
        or die("Error reading RSS data.");
while ($data = fread($fp, 4096)) {
        xml_parse($xml_parser, $data, feof($fp))
       or die(sprintf("XML error: %s at line %d",
       xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser)));
}
fclose($fp);
xml_parser_free($xml_parser);
?>
</dl>
</body>
</html>

Dumping Database Contents into an XML File
