How to retrieve specific value from extracted document via docFilter?


I am using JavaScript (with DHF), and I have docFilter-extracted XHTML stored in the "extracted" property. Which cts query should I use, and how can I retrieve "This Value" from the FileCreator meta element? Put simply, I would like to end up with: var filecreator = 'This Value'

Any advice will be appreciated.

"extracted": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<html xmlns=\"http://www.w3.org/1999/xhtml\">\n  <head>\n    <meta name=\"content-type\" content=\"application/pdf\"/>\n    <meta name=\"filter-capabilities\" content=\"text subfiles HD-HTML\"/>\n    <meta name=\"FileCreator\" content=\"This Value\"/>\n

There is 1 best solution below


Since the value of the extracted property is a string of escaped XHTML, you can use xdmp.unquote() to parse it into an XML document node, then use .xpath() to select the @content attribute of the meta element that has @name="FileCreator" and assign its value to a variable. A cts query is not needed for this step, since you are reading a value out of a document you already have.

Assuming that your JSON document is obj:

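// xdmp.unquote() returns a Sequence of document nodes; take the first one.
var doc = fn.head(xdmp.unquote(obj.extracted));
// .xpath() also returns a Sequence, so take its first item to get the string.
var fileCreator = fn.head(doc.xpath("/*:html/*:head/*:meta[@name='FileCreator']/@content/string()"));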
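If you need more than one of the filter's meta values (for example content-type as well as FileCreator), the same approach can be wrapped in a small helper. The following is only a sketch: metaContentXPath and getMetaContent are hypothetical names, not part of MarkLogic or DHF, and getMetaContent assumes it runs inside MarkLogic, where xdmp and fn are available as globals.

```javascript
// Hypothetical helper: builds the XPath for a given <meta name="..."> entry.
function metaContentXPath(name) {
  // Only allow simple names so the value can be embedded in the
  // quoted XPath string literal without any escaping.
  if (!/^[\w.-]+$/.test(name)) {
    throw new Error("Unsupported meta name: " + name);
  }
  return "/*:html/*:head/*:meta[@name='" + name + "']/@content/string()";
}

// Hypothetical helper: only works inside MarkLogic (uses xdmp and fn).
// Assumes obj.extracted holds the escaped XHTML string from the filter.
function getMetaContent(obj, name) {
  // Parse the string into a document node.
  const doc = fn.head(xdmp.unquote(obj.extracted));
  // .xpath() returns a Sequence; fn.head() takes its first item, if any.
  return fn.head(doc.xpath(metaContentXPath(name)));
}
```

With this in place, var fileCreator = getMetaContent(obj, "FileCreator") should give the same result as the snippet above, and getMetaContent(obj, "content-type") would read the content type in the same way.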