How do I select <dei:DocumentType from XBRL using Cheerio (or


I'm trying to get the text (in this case it's '10-Q') of an entry from XBRL using cheerio.js with nodejs. The line is below:

<dei:DocumentType contextRef="D2013Q3YTD" id="Fact-DB2A50C2A485F9CC21D51934C6E61D42">10-Q</dei:DocumentType>

I've tried:

$('dei:DocumentType').text

and a few other variations, to no avail. There is no unique id or anything else that I can see to select on.

Sample file:

http://www.sec.gov/Archives/edgar/data/1018724/000144530513002495/amzn-20130930.xml

So how could I go about extracting this text? Thanks.


There are 2 best solutions below

BEST ANSWER

It turns out that parsing the file above is entirely possible with Cheerio.

This works using Cheerio:

$('dei\\:CurrentFiscalYearEndDate').text().trim();

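The colon has to be escaped twice: once with a backslash so the CSS selector engine treats it as part of the tag name rather than as the start of a pseudo-class, and once more because the backslash itself must be escaped inside a JavaScript string literal.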
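To make the escaping concrete, here is a minimal sketch in plain JavaScript with no Cheerio dependency. The helper name qnameSelector is hypothetical, introduced only for illustration; it is not part of Cheerio. It shows that the literal written with two backslashes is, at runtime, a string containing a single backslash before the colon, which is what the selector engine needs to see.

```javascript
// Hypothetical helper (not part of Cheerio): turn an XML qualified name such as
// "dei:DocumentType" into a CSS selector string by escaping each colon.
function qnameSelector(qname) {
  // In source code, '\\:' is a two-character string: one backslash plus a colon.
  return qname.replace(/:/g, '\\:');
}

const selector = qnameSelector('dei:DocumentType');

// The hand-written literal 'dei\\:DocumentType' produces exactly the same string.
console.log(selector === 'dei\\:DocumentType'); // true
console.log(selector);                          // dei\:DocumentType
console.log(selector.length);                   // 17 (only one backslash is stored)
```

The returned string can be passed to Cheerio in place of the hand-written literal, for example $(qnameSelector('dei:DocumentType')).text().trim().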

ANSWER

XBRL is XML, and it cannot be treated as an HTML DOM with libraries like cheerio. You will need an XML parser with XPath support, such as xpath, libxml or o3-xml.

Then you can get the value with an XPath expression like this:

/*/dei:DocumentType/text()