XQuery: Updating nodes based on the updates made in the same query


I would like to manipulate an XML file using some queries. I want to add an attribute to existing nodes if they don't already have it, whether it was present before the query ran or was added during the execution of the same query.

For example:

   // if there is a price and the parents don't have 
   // the attribute named hasPrice then add it to them

   <products hasPrice='yes' >  
        <item hasPrice='yes'> 
            <price>100 </price> 
        </item>  
        <item hasPrice='yes'> 
            <price>100 </price> 
        </item>  
    </products>

I tried the following XQuery, but it fails with the error "Duplicate attribute hasPrice":

declare function local:propagatePrice($x)
{
  copy $t := $x 
  modify (
    for $y in $t//price,
    $z in $y/ancestor::*   
    return if ($z/@hasPrice) then () 
    else (insert node (attribute { 'hasPrice' } {'yes'}) into $z)
  )
  return $t
};

let $db := doc('products.xq')
let $temp := local:propagatePrice($db)
return $temp

There are 3 best solutions below

1
On

I am not quite sure I understand what difficulty you ran into. What you are trying to achieve is certainly possible with XQuery (or with XSLT, which is more convenient for transforming XML trees, especially for multiple transform passes).

In XQuery, you can manipulate XML trees in memory, so if you have your "transform" logic in two functions, you can "chain" them like this:

let $input  := <products> ... </products>
let $temp   := my:first-pass($input)
let $result := my:second-pass($temp)
return
   $result

or more succinctly:

my:second-pass(
   my:first-pass(
      <products> ... </products>))
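
To make the chaining concrete, here is a minimal sketch of what the two passes could look like for the hasPrice example. The functions local:first-pass and local:second-pass are hypothetical stand-ins for my:first-pass and my:second-pass, and the sketch assumes a processor that supports the copy/modify expression. The point it illustrates is that the second pass reads attributes written by the first, because each pass finishes before the next one starts:

```xquery
(: Hypothetical first pass: mark every element that directly contains a price. :)
declare function local:first-pass($x as node()) as node()
{
  copy $t := $x
  modify (
    for $z in $t/descendant-or-self::*[price][not(@hasPrice)]
    return insert node attribute hasPrice { 'yes' } into $z
  )
  return $t
};

(: Hypothetical second pass: mark every unmarked element that has a
   descendant element already marked by the first pass. :)
declare function local:second-pass($x as node()) as node()
{
  copy $t := $x
  modify (
    for $z in $t/descendant-or-self::*[not(@hasPrice)][.//*/@hasPrice]
    return insert node attribute hasPrice { 'yes' } into $z
  )
  return $t
};

local:second-pass(
  local:first-pass(
    <products>
      <item><price>100</price></item>
      <item><price>100</price></item>
    </products>))
```

With this input, the first pass should mark the two item elements and the second pass should then mark products, since it can see the attributes the first pass added.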
3
On

Now the question is different then. It is about XQuery Update (which is a standard extension to XQuery, so usually if you don't mention it, we will not assume you use it).

The way an update works in XQuery Update is different from SQL, or from any language with "direct side effects". In XQuery Update, evaluating a query does something extra besides computing the return value of the XQuery expression: it also computes a "hidden result", the pending update list.

Every time an update instruction (like insert node) is evaluated, the corresponding change is recorded in something like a log. Only after the entire query (or, for a copy/modify expression, the entire modify clause) has been evaluated are all the changes applied at once.

The main effect for a developer is that you cannot see a change made in another part of the query. All read operations will always see the world as it was at the beginning of the evaluation.

This is different from SQL, for instance, but it is very well suited to a functional language. IMHO it is also more aligned with the concept of ACID transactions in databases. So it might look surprising in some cases, but once you know how it works, it is easy: your test condition can never see the new attributes created in the same query.
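
Applied to the query in the question: with the sample input, the shared ancestor products is reached once per price, and each time the test sees no hasPrice attribute, so two inserts are recorded for the same element and applying them fails. One possible way around this, shown here only as a sketch, is to select each ancestor exactly once. It relies on the fact that a path expression returns distinct nodes in document order; the inline sample input is an assumption made for illustration:

```xquery
declare function local:propagatePrice($x as node()) as node()
{
  copy $t := $x
  modify (
    (: The path expression yields each ancestor at most once, so every
       element lacking the attribute receives exactly one insert. :)
    for $z in $t//price/ancestor::*[not(@hasPrice)]
    return insert node attribute hasPrice { 'yes' } into $z
  )
  return $t
};

local:propagatePrice(
  <products>
    <item><price>100</price></item>
    <item><price>100</price></item>
  </products>)
```

The condition not(@hasPrice) still only sees the tree as it was before the modify clause ran, which is sufficient here because no element is targeted twice.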

5
On

You can instead check for the existence of a price descendant element and the absence of the hasPrice attribute in a where clause:

declare function local:propagatePrice($x)
{
  copy $t := $x 
  modify (
    for $z in $t//*
    where not($z/@hasPrice) and $z//price
    return (insert node (attribute { 'hasPrice' } {'yes'}) into $z)
  )
  return $t
};

The main problem I see in your XQuery is that $y/ancestor::* may return the same element for different $y, because several price elements can share the same ancestor. Besides being inefficient, this may be related to the error message you got: the test if ($z/@hasPrice) appears to be evaluated against the original tree rather than the updated one, so the insert statement ends up being executed multiple times on the same ancestor element.
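
The repeated ancestors can be observed with a read-only query that needs no update support. This is only an illustrative sketch; the inline sample input is an assumption based on the example in the question:

```xquery
let $t :=
  <products>
    <item><price>100</price></item>
    <item><price>100</price></item>
  </products>
(: Same iteration as in the question, but returning the name of each
   visited ancestor instead of inserting an attribute. :)
for $y in $t//price,
    $z in $y/ancestor::*
return name($z)
```

This should return the names products, item, products, item: the products element is visited once for each price, which is why two inserts end up targeting it.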