This function parses XML data into two parallel array
structures: one (index) containing pointers
to the location of the appropriate values in the
other (values) array. These last two parameters
must be passed by reference.
Note: xml_parse_into_struct() returns 0 for
failure and 1 for success. These are integers, not FALSE
and TRUE, so be careful with strict operators such as ===.
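The note above can be illustrated with a minimal sketch. The sample strings and variable names are assumptions made up for this illustration; only the xml_* calls are part of the extension.

```php
<?php
// Well-formed input: the function reports success with the integer 1.
$parser = xml_parser_create();
$ok = xml_parse_into_struct($parser, "<para><note>simple note</note></para>", $values, $index);

var_dump($ok === true); // bool(false): the integer 1 is not identical to TRUE
var_dump($ok === 1);    // bool(true)

// Malformed input (assumed sample): the <note> tag is never closed.
$badParser = xml_parser_create();
$bad = xml_parse_into_struct($badParser, "<para><note>simple note</para>", $badValues, $badIndex);

if ($bad !== 1) {
    // xml_get_error_code() and xml_error_string() describe what went wrong.
    echo "Parse failed: ", xml_error_string(xml_get_error_code($badParser)), "\n";
}
```

Comparing against the integer 1 (or using a loose truthiness check) avoids the trap of testing the result with === TRUE.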
Below is an example that illustrates the internal structure of
the arrays being generated by the function. We use a simple
note tag embedded inside a
para tag, and then we parse this and print out
the structures generated:
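A minimal sketch of such an example follows. The variable names ($simple, $vals, $index) are chosen for illustration; case folding is left at its default, so tag names come back in upper case.

```php
<?php
// A note tag embedded inside a para tag.
$simple = "<para><note>simple note</note></para>";

$p = xml_parser_create();
xml_parse_into_struct($p, $simple, $vals, $index);

// $index maps each tag name to the positions of its entries in $vals.
echo "Index array\n";
print_r($index);

// $vals holds one entry per parser event: tag, type, level and, if any, value.
echo "\nVals array\n";
print_r($vals);
```

Here the index array has a PARA entry pointing at positions 0 and 2 (the open and close events) and a NOTE entry pointing at position 1, where the values array stores a "complete" element at level 2 with the value "simple note".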
Event-driven parsing (based on the expat library) can get
complicated when the XML document is complex.
This function does not produce a DOM-style object, but it
generates structures that can be traversed in a tree
fashion. Thus, we can easily create objects representing the data
in the XML file. Let's consider the following XML file
representing a small database of amino acid information:
Example 2. moldb.xml - small database of molecular information
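As a rough sketch of what such a file could look like, the snippet below holds a stand-in document in a PHP string and shows how the parser indexes it. The two entries and the child tag names (name, symbol, code, type) are assumptions chosen to match the properties of the AminoAcid class in the parsing script; the actual moldb.xml may differ.

```php
<?php
// Hypothetical stand-in for moldb.xml: two molecule entries whose child
// tags mirror the AminoAcid properties (name, symbol, code, type).
$xml = <<<XML
<moldb>
    <molecule>
        <name>Alanine</name>
        <symbol>ala</symbol>
        <code>A</code>
        <type>hydrophobic</type>
    </molecule>
    <molecule>
        <name>Lysine</name>
        <symbol>lys</symbol>
        <code>K</code>
        <type>charged</type>
    </molecule>
</moldb>
XML;

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);   // drop whitespace-only text
xml_parse_into_struct($parser, $xml, $values, $tags);

// Each consecutive pair of entries marks where one molecule opens and closes
// in $values; the elements between them are that molecule's fields.
print_r($tags["molecule"]);
```

Saving the string to disk, for instance with file_put_contents("moldb.xml", $xml), would give the parsing script a file to read.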
And some code to parse the document and generate the appropriate
objects:
Example 3. parsemoldb.php - parses moldb.xml into an array of molecular objects
<?php
class AminoAcid
{
    public $name;   // aa name
    public $symbol; // three letter symbol
    public $code;   // one letter code
    public $type;   // hydrophobic, charged or neutral

    // copy every tag => value pair into the matching property
    public function __construct($aa)
    {
        foreach ($aa as $k => $v) {
            $this->$k = $v;
        }
    }
}

function readDatabase($filename)
{
    // read the XML database of amino acids
    $data = file_get_contents($filename);
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $data, $values, $tags);
    xml_parser_free($parser);

    // loop through the structures
    $tdb = array();
    foreach ($tags as $key => $val) {
        if ($key == "molecule") {
            $molranges = $val;
            // each contiguous pair of array entries are the
            // lower and upper range for each molecule definition
            for ($i = 0; $i < count($molranges); $i += 2) {
                $offset = $molranges[$i] + 1;
                $len = $molranges[$i + 1] - $offset;
                $tdb[] = parseMol(array_slice($values, $offset, $len));
            }
        } else {
            continue;
        }
    }
    return $tdb;
}

function parseMol($mvalues)
{
    // collect the tag => value pairs of a single molecule
    $mol = array();
    for ($i = 0; $i < count($mvalues); $i++) {
        $mol[$mvalues[$i]["tag"]] = $mvalues[$i]["value"];
    }
    return new AminoAcid($mol);
}

$db = readDatabase("moldb.xml");
echo "** Database of AminoAcid objects:\n";
print_r($db);
?>
After executing parsemoldb.php, the variable
$db contains an array of
AminoAcid objects, and the output of the
script confirms that: