How to match any element?


This works fine:

(sxml-match '(div)
  ((div) #t))

But this fails:

(sxml-match '(div)
  ((,element) #t))

I am wondering how to match any element?


Here is a more concrete example. The following is a snippet from XCB's "xproto.xml" file:

(define xproto '((struct (@ (name "CHAR2B"))
                         (field (@ (type "CARD8") (name "byte1")))
                         (field (@ (type "CARD8") (name "byte2"))))
                 (xidtype (@ (name "WINDOW")))
                 (xidtype (@ (name "PIXMAP")))
                 (xidtype (@ (name "ATOM")))
                 (xidunion (@ (name "DRAWABLE"))
                           (type "WINDOW")
                           (type "PIXMAP"))))

My aim is to extract the names:

(define names '((struct "CHAR2B")
                (xidtype "WINDOW")
                (xidtype "PIXMAP")
                (xidtype "ATOM")
                (xidunion "DRAWABLE")))

So I tried this:

(sxml-match xproto ((,kind (@ (name ,name)) . ,body) ...))

But I get the error:

bad pattern syntax (not an element pattern)

I do not understand what else I should do.

Is sxml-match an insufficient tool for this job?


There are 2 best solutions below

2
On

It follows from the Backus-Naur form of the pattern grammar that (,element) is not syntactically correct. A parenthesized pattern is parsed as an element-pattern, and that rule expects a literal tag-symbol in the car position. It is correct, however, to write (sxml-match '(div) (,element 10)), because a bare ,element matches the pat-var-or-cata rule for a node. So (,something) is a syntax error: ,something satisfies only the pat-var-or-cata rule, not the tag-symbol rule.

These rules look very similar to a preprocessing syntax implemented via some kind of unification.

UPDATE for your example:

I added a constant tag in the car position at each of the first two levels of nesting (@ at the top level and ww for each child element); without these I do not know whether it can be made to work.

(define xproto '(@
            (ww struct (@ (name "CHAR2B"))
                (field (@ (type "CARD8") (name "byte1")))
                (field (@ (type "CARD8") (name "byte2"))))
            (ww xidtype (@ (name "WINDOW")))
            (ww xidtype (@ (name "PIXMAP")))
            (ww xidtype (@ (name "ATOM")))
            (ww xidunion (@ (name "DRAWABLE"))
                (type "WINDOW")
                (type "PIXMAP"))))

 (sxml-match xproto
             ((@ (ww ,t . ,d) ...) (list "???" t))
             (,otherwise "no match"))

 (sxml-match xproto
             ((@ (ww (@ (name ,t)) . ,d) ...) (list "???" t))
             (,otherwise "no match"))

You can zip the results of these two expressions to get the list of names you want.
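A minimal sketch of the zipping step, assuming the two expressions produce one list of element kinds and one list of name strings of equal length. The variables kinds and name-strings are hypothetical stand-ins for those results, not output copied from sxml-match:

```scheme
;; Hypothetical stand-ins for the two lists produced by the expressions above.
(define kinds '(struct xidtype xidtype xidtype xidunion))
(define name-strings '("CHAR2B" "WINDOW" "PIXMAP" "ATOM" "DRAWABLE"))

;; map over several lists walks them in parallel,
;; so mapping list over both lists acts as a zip.
(define zipped (map list kinds name-strings))
;; zipped => ((struct "CHAR2B") (xidtype "WINDOW") (xidtype "PIXMAP")
;;            (xidtype "ATOM") (xidunion "DRAWABLE"))
```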

3
On

You have to include the specific element and attribute names that you want to match and extract information from; they have to be hard-coded and can't be variables. But since you know the DTD, and thus the document's structure, you know what needs to be matched. It's just tedious. A named catamorphism can help keep things structured:

(define (get-names xproto)
  ; match a single struct/xidtype/xidunion element and return
  ; which element and its name attribute
  (define (struct/xid x)
    (sxml-match x
     ((struct (@ (name ,name)) . ,rest) (list 'struct name))
     ((xidtype (@ (name ,name))) (list 'xidtype name))
     ((xidunion (@ (name ,name)) . ,rest) (list 'xidunion name))))
  ; Match a nodeset of 0 or more struct/xidtype/xidunion elements
  (sxml-match xproto
   ((list ,(struct/xid -> element+name) ...)
    `(,element+name ...))))

This returns ((struct "CHAR2B") (xidtype "WINDOW") (xidtype "PIXMAP") (xidtype "ATOM") (xidunion "DRAWABLE")) for your example data.


The documentation says the catamorphism function can return multiple values that can be bound to multiple ids, but my attempts to do so kept raising errors (Multiple values/ids and ... don't seem to play well together), so this just returns lists.
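For comparison, SXML is ordinary list data, so the element name is just the car of each node. The same result can be sketched with standard list operations only, without sxml-match. This assumes that every top-level element carries an (@ ...) attribute list with a name entry as its second item, as in the example data. The helper element-name and the shortened sample list are hypothetical, introduced only for illustration:

```scheme
;; Hypothetical helper: return (tag name) for one SXML element.
;; Assumes the element's second item is an attribute list (@ ...)
;; that contains a (name "...") entry.
(define (element-name node)
  (let* ((tag   (car node))
         (attrs (cdr (cadr node)))          ; drop the leading @
         (name  (cadr (assq 'name attrs)))) ; (name "X") -> "X"
    (list tag name)))

;; Shortened sample in the same shape as the xproto data.
(define sample
  '((struct (@ (name "CHAR2B"))
            (field (@ (type "CARD8") (name "byte1"))))
    (xidtype (@ (name "WINDOW")))
    (xidunion (@ (name "DRAWABLE"))
              (type "WINDOW"))))

(define sample-names (map element-name sample))
;; sample-names => ((struct "CHAR2B") (xidtype "WINDOW") (xidunion "DRAWABLE"))
```

Unlike the sxml-match version, this sketch does no validation: an element without an attribute list or without a name entry raises an error instead of failing to match.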