Drupal Feeds - Parse XML before importing

I have this XML:

<xml>
  <data>
    <name id="01">
      Pippo
    </name>
    <name id="02">
      Pinco
    </name>
    <surname id="01">
      Franco
    </surname>
    <surname id="02">
      Pallino
    </surname>
  </data>
</xml>

I want to import into the node only the elements with the attribute id="01". I have tried several approaches, none of them successfully. So I am thinking about modifying the XML in a pre-parsing step, to build something like:

<xml>
  <data>
    <name id="01">
      Pippo
    </name>
    <surname id="01">
      Franco
    </surname>
  </data>
  <data>
    <name id="02">
      Pinco
    </name>
    <surname id="02">
      Pallino
    </surname>
  </data>
</xml>

and have the Feeds module create the two nodes from it. But I have found no way, and no suitable hooks, to do that with the Feeds module.

There is 1 solution below

Write a custom parser that extends the XPath Parser. I did something similar with the JSONPath Parser. In its parse function, json_decode converts the source to an array, and that array is queried afterwards. So you can insert your own code after the json_decode call and before the actual parsing to modify the retrieved data; in your case, to remove the items that don't fit your needs.

You will have to figure out how to do this with the XPath Parser, but the workflow shouldn't be very different: extend the XPath Parser class and do your work in the parse method. Here's what I did, hope it helps you.

$array = json_decode($raw, TRUE);

// Support JSON lines format.
if (!is_array($array)) {
  $raw = preg_replace('/}\s*{/', '},{', $raw);
  $raw = '[' . $raw . ']';
  $array = json_decode($raw, TRUE);
}

// ARRAY MODIFICATION: find the amendment matching the active version and
// copy its fields up to the item itself.
$fields_to_move = array('fullText', 'sameAs', 'memo', 'lawSection', 'actClause', 'currentCommittee', 'coSponsors', 'multiSponsors', 'uniBill', 'stricken', 'lawCode');
// Guard the loop so a failed decode does not trigger errors here.
if (is_array($array) && isset($array['result']['items'])) {
  foreach ($array['result']['items'] as &$item_result) {
    if ($item_result['activeVersion'] == '') {
      // No active version set: read from the amendment keyed by the empty string.
      foreach ($fields_to_move as $field_name) {
        $item_result[$field_name] = $item_result['amendments']['items'][''][$field_name];
      }
    }
    else {
      foreach ($item_result['amendments']['items'] as $item) {
        if ($item['version'] == $item_result['activeVersion']) {
          foreach ($fields_to_move as $field_name) {
            $item_result[$field_name] = $item[$field_name];
          }
          break;
        }
      }
    }
  }
  // Break the reference left over from the foreach above.
  unset($item_result);
}
// END OF ARRAY MODIFICATION

if (is_array($array)) {
  require_once drupal_get_path('module', 'feeds_jsonpath_parser') . '/jsonpath-0.8.1.php';
  // more stuff
}
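
For the XML in the question, the same "modify the source before it is queried" idea could look roughly like the sketch below. It is only an illustration using PHP's DOM extension: regroup_by_id() is a hypothetical helper name, not part of Feeds or the XPath Parser, and it assumes the source is well-formed (quoted attributes, a closing </xml> tag) with the elements to regroup sitting directly under <data>. In a custom parser you would presumably call it on the raw source string at the start of the parse method, before the XPath queries run.

```php
<?php

/**
 * Regroups the children of <data> by their "id" attribute, so that the
 * result contains one <data> element per id.
 *
 * Hypothetical helper for illustration; not part of the Feeds module.
 */
function regroup_by_id(string $raw): string {
  $source = new DOMDocument();
  $source->preserveWhiteSpace = FALSE;
  if (!$source->loadXML($raw)) {
    throw new InvalidArgumentException('Source is not well-formed XML.');
  }

  // Bucket every child of <data> that carries an id attribute by that id.
  $groups = array();
  $xpath = new DOMXPath($source);
  foreach ($xpath->query('//data/*[@id]') as $element) {
    $groups[$element->getAttribute('id')][] = $element;
  }

  // Build a new document with the same root name and one <data> per id.
  $target = new DOMDocument('1.0', 'UTF-8');
  $target->formatOutput = TRUE;
  $root = $target->appendChild(
    $target->createElement($source->documentElement->nodeName)
  );
  foreach ($groups as $elements) {
    $data = $root->appendChild($target->createElement('data'));
    foreach ($elements as $element) {
      // importNode() copies the element (and its text) into the new document.
      $data->appendChild($target->importNode($element, TRUE));
    }
  }

  return $target->saveXML();
}
```

With the regrouped document, an XPath context such as //data should then yield one item per id, which is what the question was trying to achieve. If only id="01" is wanted, filtering $groups before the second loop would be the natural place to do it.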