PHP: How to extract <content type="application/xml"> nodes from an XML file?

I have a valid XML file (generated from SharePoint) which looks like this (in browser):

Sample XML File

<?xml version="1.0" encoding="utf-8"?>
<feed xml:base="https://www.example.com/_api/" xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns:georss="http://www.georss.org/georss" xmlns:gml="http://www.opengis.net/gml">
    <id>9913f043-xxxx-xxxx-xxxx-xxxx-xxxx</id>
    <title />
    <updated>2017-05-23T06:08:01Z</updated>
    <entry m:etag="&quot;23&quot;">
        <id>Web/Lists(guid'13306095-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx')/Items(1)</id>
        <category term="SP.Data.XXXXXXXXXXXXXXXXXXXXX" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
        <link rel="edit" href="Web/Lists(guid'13306095-xxxx-xxxx-xxxx-xxxx-xxxx')/Items(1)" />
        <title />
        <updated>2017-05-23T06:08:01Z</updated>
        <author>
            <name />
        </author>
        <content type="application/xml">
            <m:properties>
                <d:FileSystemObjectType m:type="Edm.Int32">0</d:FileSystemObjectType>
                <d:Id m:type="Edm.Int32">1</d:Id>
                <d:ContentTypeId>0x0100B6A3B67BE96F724682CCDC8FBE9D70C2</d:ContentTypeId>
                <d:Title m:null="true" />
                <d:Topic>How to google?</d:Topic>
                <d:Cats m:type="Collection(Edm.Int32)">
                    <d:element>1</d:element>
                    <d:element>2</d:element>
                    <d:element>3</d:element>
                    <d:element>4</d:element>
                    <d:element>5</d:element>
                    <d:element>6</d:element>
                    <d:element>7</d:element>
                </d:Cats>
            </m:properties>
        </content>
    </entry>
    <entry>
    .
    .
    </entry>
    <entry>
    .
    .
    </entry>
</feed>

(Note: I cut off some repeated nodes here, because it is so long.)

Clearly, we have inner nodes <content type="application/xml"> which also contain data inside.

The Problem (When parsing with PHP)

In PHP, I used this code to parse the file and try to extract the data:

$xml = simplexml_load_file("data.xml");
foreach ($xml->entry as $item) {
    echo $item->updated . PHP_EOL; // <--- This works!
    print_r($item->content);       // <--- This doesn't work as expected.
}

... and it gives me this output:

2017-05-23T06:08:01Z
SimpleXMLElement Object
(
  [@attributes] => Array
    (
      [type] => application/xml
    )
)
2017-05-23T06:08:01Z
SimpleXMLElement Object
(
  [@attributes] => Array
    (
      [type] => application/xml
    )
)
.
.

Question (Help!)

How do I extract (get) the actual data inside those <content type="application/xml"> nodes, please?

Please help. Thank you in advance.

2

There are 2 best solutions below

6
Flocke On

The elements below "content" have a namespace (d:...). I had the same problem a while ago. This should help:

$xml = simplexml_load_file("data.xml");
foreach ($xml->entry as $item) {
    echo $item->updated . PHP_EOL;
    $ns = $item->content->children('http://schemas.microsoft.com/ado/2007/08/dataservices/metadata'); 
    print_r($ns->properties); 
}

I updated the code. I'm sure print_r($ns->properties) doesn't show the complete sub-elements, because they are from another namespace. I guess you can then do this:

$nsd = $ns->properties->children("http://schemas.microsoft.com/ado/2007/08/dataservices");

and proceed with the result.
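Putting the two steps together, a minimal sketch might look like the following. It assumes a trimmed-down copy of the feed from the question, embedded as a string so the snippet runs on its own; the variable names ($feed, $nsM, $nsD, $topics, $cats) are only illustrative.

```php
<?php
// Trimmed-down copy of the feed from the question (assumption: one entry, two fields).
$feed = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <entry>
        <updated>2017-05-23T06:08:01Z</updated>
        <content type="application/xml">
            <m:properties>
                <d:Topic>How to google?</d:Topic>
                <d:Cats m:type="Collection(Edm.Int32)">
                    <d:element>1</d:element>
                    <d:element>2</d:element>
                </d:Cats>
            </m:properties>
        </content>
    </entry>
</feed>
XML;

$nsM = 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata';
$nsD = 'http://schemas.microsoft.com/ado/2007/08/dataservices';

$xml = simplexml_load_string($feed);
$topics = [];
$cats = [];
foreach ($xml->entry as $item) {
    // <content> is in the default (Atom) namespace; its child <m:properties> is not.
    $properties = $item->content->children($nsM)->properties;
    // The actual fields live in the d: namespace.
    $fields = $properties->children($nsD);
    $topics[] = (string) $fields->Topic;
    // The <d:element> children of <d:Cats> are in the d: namespace as well.
    foreach ($fields->Cats->children($nsD) as $element) {
        $cats[] = (int) $element;
    }
}
print_r($topics);
print_r($cats);
```

Casting to (string) or (int) turns each SimpleXMLElement into a plain value, which is usually what you want before storing or comparing it.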

In your example, the namespaces are declared on the document element:
xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
(use the URL between the quotation marks)
d: and m: are used in the document to reference these namespaces.
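If you would rather not hard-code the URLs, one option is to read them from the document itself with getDocNamespaces(). The sketch below assumes a cut-down feed with a single d:Id field, and the variable names are hypothetical. It also shows that a namespaced attribute such as m:type needs the namespace passed to attributes().

```php
<?php
// Cut-down feed (assumption: one entry with a single d:Id field).
$feed = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <entry>
        <content type="application/xml">
            <m:properties>
                <d:Id m:type="Edm.Int32">1</d:Id>
            </m:properties>
        </content>
    </entry>
</feed>
XML;

$xml = simplexml_load_string($feed);

// Map of prefix => namespace URI, as declared on the document element.
$namespaces = $xml->getDocNamespaces();

$properties = $xml->entry->content->children($namespaces['m'])->properties;
$idElement  = $properties->children($namespaces['d'])->Id;

$id = (int) $idElement;
// m:type is a namespaced attribute, so attributes() needs the URI too.
$type = (string) $idElement->attributes($namespaces['m'])->type;

echo $id . ' (' . $type . ')' . PHP_EOL;
```

This relies on the document using the prefixes d and m. Prefixes are only aliases and could differ in another feed, so matching on the URL is the more robust choice when you do not control the source.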

EDIT: There is another namespace involved. I didn't notice that at first. The solution can be adapted, so I changed the code a bit.

0
user2558647 On

I had a very similar issue. I was finally able to get my example working with this.

// Helper: print a value inside <pre> tags so it is readable in a browser.
function pre($array){
    echo "<pre>";
    print_r($array);
    echo "</pre>";
}

// Note: utf8_encode() is deprecated as of PHP 8.2; it is only needed if the response is ISO-8859-1.
$xmlData = utf8_encode(file_get_contents("https://ucf.uscourts.gov/odata.svc/Creditors(guid'81044f71-fb3c-11e5-ac5b-0050569d488e')"));
$xml = new SimpleXMLElement($xmlData);
// <m:properties> is in the metadata namespace ...
$properties = $xml->content->children('http://schemas.microsoft.com/ado/2007/08/dataservices/metadata');
// ... and the fields inside it are in the dataservices namespace.
$fields = $properties->properties->children("http://schemas.microsoft.com/ado/2007/08/dataservices");
pre($fields);
$key = (string)$fields->Key;
$lastName = (string)$fields->LastName;
echo $key. "<br />";
echo $lastName. "<br />";

You would need to replace the URL in file_get_contents, and the Key and LastName fields, with the values you are looking for. I like to use a pre function to make the output easier to read; you can remove that part. Hope this helps someone.
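If you do not know the field names in advance, you could loop over the fields instead of naming each one. This is only a sketch: the XML string stands in for the remote response, and the values abc-123 and Doe are made-up placeholders.

```php
<?php
// Stand-in for the remote response (assumption: a single <entry> with placeholder values).
$entry = <<<'XML'
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
       xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <content type="application/xml">
        <m:properties>
            <d:Key>abc-123</d:Key>
            <d:LastName>Doe</d:LastName>
        </m:properties>
    </content>
</entry>
XML;

$xml = new SimpleXMLElement($entry);
$fields = $xml->content
    ->children('http://schemas.microsoft.com/ado/2007/08/dataservices/metadata')
    ->properties
    ->children('http://schemas.microsoft.com/ado/2007/08/dataservices');

// Iterating yields the local element name (without the d: prefix) as the key.
$row = [];
foreach ($fields as $name => $value) {
    $row[$name] = (string) $value;
}
print_r($row);
```

The resulting array is keyed by field name, which is often easier to work with than the SimpleXMLElement objects.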