Saturday, December 12, 2009

How to parse XML data using PHP


XML stands for eXtensible Markup Language and is used primarily for storing and organizing data. Its defining feature is that it has no predefined tags: each document defines its own. To use XML data in PHP, we first have to parse it into a PHP data structure. The following is an example of XML parsing using PHP.

A few more rules about XML and we will be on our way to our PHP code. XML documents must be well-formed. This means that there can be only one root element (the topmost element), all child elements must be nested properly (<p>foo <b>bar</b></p>, not <p>foo <b>bar</p></b>), and every element must have an end tag or be written as an empty element such as <br />. In XML, an element is also referred to as a node.
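The nesting rule can be checked from PHP itself. The following is a minimal sketch, assuming the same XML extension used later in this post is installed; isWellFormedXml() is a hypothetical helper name, not part of PHP or of the original code.

```php
<?php
// Hypothetical helper (not from the original post): returns true when
// $xml is well-formed, using PHP's built-in XML parser.
function isWellFormedXml(string $xml): bool
{
    $parser = xml_parser_create();
    // xml_parse() returns 1 on success and 0 on failure; the third
    // argument tells the parser that this is the last chunk of data.
    return xml_parse($parser, $xml, true) === 1;
}

var_dump(isWellFormedXml('<p>foo <b>bar</b></p>')); // bool(true): properly nested
var_dump(isWellFormedXml('<p>foo <b>bar</p></b>')); // bool(false): overlapping tags
```

A check like this is handy before handing data to a function that silently returns an empty array when the input cannot be parsed.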

A sample XML document could look like this:

<?xml version="1.0"?>
<datas>
<sample id="1" />
<sample id="2" />
<sample id="3" />
</datas>

<?xml version="1.0"?> is called the XML declaration. It is optional, but when it is present it must be the very first thing in the file, with nothing before it. It identifies the file as an XML file to a parser.

As we can see, the code is easy to read and understand: every tag is visible and can be read in plain English, or whatever language we are most comfortable with. Each XML file can have its own structure, which may be described by a DTD (Document Type Definition). PHP code that reads the parsed data must be tailored to one particular structure or DTD.

Step 1:

$xmlData is a variable that contains the XML data. If the XML data is stored in a file, read the file and store its contents in this variable, using PHP's file functions.
$xmlData ='<?xml version="1.0"?>
<sections>
<section name="profile">
<subsection name="name" value="Sample name"/>
<subsection name="address" value="sample address" />
<subsection name="email" value="sample@email.com" />
</section>
<section name="personel">
<subsection name="phone" value="123456789" />
<subsection name="age" value="25" />
<subsection name="job" value="Software Professional" />
</section>
</sections>';
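
When the XML lives in a file rather than in a string literal, it can be read into $xmlData first. The following is a minimal sketch of that file operation; loadXmlFile() is a hypothetical helper name, and the temporary file only stands in for a real XML file.

```php
<?php
// Hypothetical helper (not from the original post): reads a whole XML
// file into a string, ready to be stored in $xmlData.
function loadXmlFile(string $path): string
{
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read XML file: $path");
    }
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Failed to read XML file: $path");
    }
    return $contents;
}

// Demo: write a small document to a temporary file, then read it back.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, '<?xml version="1.0"?><sections></sections>');
$xmlData = loadXmlFile($path);
unlink($path);
echo $xmlData, "\n";
```

Throwing an exception for an unreadable file keeps a missing file from being mistaken for an empty document later on.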

Step 2:

Next we parse the XML data. The following function parses the XML and builds a nested array from it: it visits each tag and stores the tag's value and attributes in the array.

To view the array content, print the returned array with print_r().

/*
This function receives XML data as input and returns an array as output.
*/
function xml2array($contents, $get_attributes = 1)
{
    if (!$contents) return array();

    if (!function_exists('xml_parser_create')) {
        return array();
    }
    // Get the XML parser of PHP - PHP must have this module for the parser to work
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $contents, $xml_values);
    xml_parser_free($parser);
    if (!$xml_values) return array(); // Nothing was parsed

    // Initializations
    $xml_array = array();
    $parents = array();
    $current = &$xml_array;

    // Go through the tags.
    foreach ($xml_values as $data) {
        unset($attributes, $value); // Remove existing values, or there will be trouble
        // This command will extract these variables into the foreach scope:
        // tag(string), type(string), level(int), attributes(array).
        extract($data); // We could use the array by itself, but this is cooler.
        $result = '';
        if ($get_attributes) { // The second argument of the function decides this.
            $result = array();
            if (isset($value)) $result['value'] = $value;
            // Set the attributes too.
            if (isset($attributes)) {
                foreach ($attributes as $attr => $val) {
                    // Set all the attributes in an array called 'attr'
                    if ($get_attributes == 1) $result['attr'][$attr] = $val;
                    /** :TODO: should we change the key name to '_attr'?
                        Someone may use the tagname 'attr'. Same goes for 'value' too */
                }
            }
        } elseif (isset($value)) {
            $result = $value;
        }
        // See tag status and do the needed.
        if ($type == "open") { // The start of a tag: '<tag>'
            $parents[$level - 1] = &$current;
            if (!is_array($current) or (!in_array($tag, array_keys($current)))) {
                // Insert new tag
                $current[$tag] = $result;
                $current = &$current[$tag];
            } else { // There was another element with the same tag name
                if (isset($current[$tag][0])) {
                    array_push($current[$tag], $result);
                } else {
                    $current[$tag] = array($current[$tag], $result);
                }
                $last = count($current[$tag]) - 1;
                $current = &$current[$tag][$last];
            }
        } elseif ($type == "complete") { // A tag that opens and closes at once: '<tag />'
            // See if the key is already taken.
            if (!isset($current[$tag])) { // New key
                $current[$tag] = $result;
            } else { // If taken, put all things inside a list (array)
                if ((is_array($current[$tag]) and $get_attributes == 0)
                    // If it is already an array...
                    or (isset($current[$tag][0]) and is_array($current[$tag][0]) and $get_attributes == 1)) {
                    array_push($current[$tag], $result); // ...push the new element into that array.
                } else { // If it is not an array...
                    // ...make it an array using the existing value and the new value
                    $current[$tag] = array($current[$tag], $result);
                }
            }
        } elseif ($type == 'close') { // The end of a tag: '</tag>'
            $current = &$parents[$level - 1];
        }
    }
    return $xml_array;
}

Working:
The xml_parser_set_option() function sets an option on an XML parser.
eg: xml_parser_set_option(parser, option, value);
It returns TRUE if the option was set, and FALSE if the option could not be set. The function above calls it twice: setting XML_OPTION_CASE_FOLDING to 0 keeps tag names in their original case (by default the parser converts them to upper case), and setting XML_OPTION_SKIP_WHITE to 1 skips values that consist only of whitespace.
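
The effect of the case-folding option is easiest to see on the flat list that xml_parse_into_struct() produces. The following is a small sketch, not part of the original function; tagNames() is a hypothetical helper name used only for this illustration.

```php
<?php
// Hypothetical helper (not from the original post): returns the tag names
// reported by xml_parse_into_struct(), with case folding switched on or off.
function tagNames(string $xml, int $caseFolding): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $xml, $values);
    // Each entry of $values describes one tag event; keep only its name.
    return array_column($values, 'tag');
}

$xml = '<sections><section name="profile"/></sections>';
print_r(tagNames($xml, 1)); // SECTIONS, SECTION, SECTIONS
print_r(tagNames($xml, 0)); // sections, section, sections
```

With case folding left on, the array keys built by xml2array() would be upper case, so lookups such as $result['sections'] would not find anything.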

There are three types of tags in the above example: sections, section and subsection. 'sections' is the root tag of the XML data. 'section' is a tag with a name attribute that identifies it. 'subsection' has name and value attributes. The above function parses each tag and subtag and stores their values in the array.

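Step 3:

After executing the above function, the returned array holds everything in the document, including data we do not need (such as the name attributes). So we have to filter the data, using the following code.
$result = xml2array($xmlData, 1);
/*
The array '$result' contains all the parsed data, so we have to refine it
to suit our needs. For that purpose we iterate over the '$result' array,
using a separate loop for each section.
*/
$datalength = sizeof($result['sections']['section'][0]['subsection']);
$ary_profile = array();
// The original listing is cut off after "for($x = 0;$x". The loops below are
// reconstructed from the output shown at the end of this post, and
// $ary_personel is an assumed variable name.
for($x = 0; $x < $datalength; $x++) {
    $ary_profile[] = $result['sections']['section'][0]['subsection'][$x]['attr']['value'];
}
$datalength = sizeof($result['sections']['section'][1]['subsection']);
$ary_personel = array();
for($x = 0; $x < $datalength; $x++) {
    $ary_personel[] = $result['sections']['section'][1]['subsection'][$x]['attr']['value'];
}
print_r($ary_profile);
print_r($ary_personel);
The XML data is now held in $result and can be accessed using a standard PHP loop.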
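
The filtering of one section can also be wrapped in a small function. The following is a sketch under stated assumptions: subsectionValues() is a hypothetical helper name, and $profile is a hand-written stand-in for $result['sections']['section'][0] in the shape that xml2array() builds when $get_attributes is 1. It also covers a case the index-based loop does not: a section with a single subsection is stored as one associative array rather than as a list.

```php
<?php
// Hypothetical helper (not from the original post): collects the "value"
// attribute of every subsection in one parsed section.
function subsectionValues(array $section): array
{
    if (!isset($section['subsection'])) {
        return array();
    }
    $subsections = $section['subsection'];
    // A lone subsection is not wrapped in a list, so wrap it here.
    if (!isset($subsections[0])) {
        $subsections = array($subsections);
    }
    $values = array();
    foreach ($subsections as $subsection) {
        if (isset($subsection['attr']['value'])) {
            $values[] = $subsection['attr']['value'];
        }
    }
    return $values;
}

// Hand-written stand-in for $result['sections']['section'][0].
$profile = array(
    'attr' => array('name' => 'profile'),
    'subsection' => array(
        array('attr' => array('name' => 'name', 'value' => 'Sample name')),
        array('attr' => array('name' => 'address', 'value' => 'sample address')),
        array('attr' => array('name' => 'email', 'value' => 'sample@email.com')),
    ),
);
print_r(subsectionValues($profile));
```

Using foreach instead of a counted loop avoids calling sizeof() on a structure whose shape depends on how many subsections the section happens to contain.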

The following is the output of the above code.

Final Output
Array
(
    [0] => Sample name
    [1] => sample address
    [2] => sample@email.com
)
Array
(
    [0] => 123456789
    [1] => 25
    [2] => Software Professional
)