Parsing JSON-LD Microformat from HTML using node.js?


I'm trying to pull a rating from the Microformat JSON-LD data on an Autodesk store webpage (example). The intention is to make a rating badge using badgen.net. However, I cannot get any parser to extract the JSON-LD data.

I'm using Node.js on RunKit to pull and parse the metadata: https://runkit.com/thomasa88/60c049de4daa38001add07ca

I have tried microformat-node and microformats-parser, but neither of them seems to find the JSON-LD data on the page. How can I correctly extract the data?

In the following snippet, I have extracted the relevant HTML and tried to feed it to microformats-parser:

const response = {};
response.body = `<html><body>    <script type="application/ld+json">
        {
        "@context": "http://schema.org/",
        "@type": "SoftwareApplication",
        "name": "ParametricText",
        "image": "https://autodesk-exchange-apps-v-1-5-staging.s3.amazonaws.com/data/content/files/images/JLH9M8296BET/2114937992453312456/resized_930d9014-acf1-4a47-809b-a2373cf161f0_.png?AWSAccessKeyId=AKIAJGORQX2SECMU24IQ&amp;Expires=1623357670&amp;response-content-disposition=inline&amp;response-content-type=image%2Fpng&amp;Signature=T0SHZ2bodYMOIIgZDgc%2BWdWW2mU%3D",
        "operatingSystem": "Win64",
        "applicationCategory": "http://schema.org/DesktopApplication",
            "aggregateRating":{
            "@type": "AggregateRating",
            "ratingValue": "5",
            "ratingCount": "4"
            },
        "offers":{
        "@type": "Offer",
        "price": "0",
        "priceCurrency": "USD"
        }
        }
    </script></body></html>`


const { mf2 } = require("microformats-parser");

const parsed = mf2(response.body, {
  baseUrl: "https://apps.autodesk.com"
});

console.log(parsed);

Result:

{"rels":{},"rel-urls":{},"items":[]}

thomasa88

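Thanks to this answer, I came up with the following solution. JSON-LD is not a microformat, so microformats parsers, which look for class-based markup such as h-card, do not see it. Instead, the script element has to be located and its content passed to JSON.parse.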

I don't know whether superagent or got is preferable :)

const cheerio = require('cheerio');
// require() works with got up to version 11; later versions are ESM-only.
const got = require('got');

// Alternative with superagent (the page body is then in response.text):
// const superagent = require('superagent');
// const response = await superagent("https://apps.autodesk.com/FUSION/en/Detail/Index?id=2114937992453312456&appLang=en&os=Win64");

// Top-level await works on RunKit; elsewhere, wrap this in an async function.
const response = await got("https://apps.autodesk.com/FUSION/en/Detail/Index?id=2114937992453312456&appLang=en&os=Win64");

const $ = cheerio.load(response.body);
// Take the raw content of the first JSON-LD script element on the page.
const jsonRaw = $("script[type='application/ld+json']").first().html();
const result = JSON.parse(jsonRaw);
//console.log(result);
console.log(result.aggregateRating.ratingValue);
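The snippet above assumes that the page has exactly one JSON-LD block and that it always carries an aggregateRating. Below is a more defensive sketch of the same idea, for pages with several blocks or none. It is not part of the original solution: the helper names extractJsonLd and findRating are made up for illustration, and a regular expression stands in for cheerio only to keep the example free of dependencies. On real pages an HTML parser is the more robust choice.

```javascript
// Hypothetical helper: collect every JSON-LD block found in an HTML string.
// The regular expression is an assumption made to avoid dependencies;
// an HTML parser such as cheerio copes better with unusual markup.
function extractJsonLd(html) {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  for (const match of html.matchAll(pattern)) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch (err) {
      // Skip blocks that are not valid JSON instead of aborting.
    }
  }
  return blocks;
}

// Hypothetical helper: return the first aggregate rating found, or null.
function findRating(blocks) {
  for (const block of blocks) {
    const rating = block && block.aggregateRating;
    if (rating && rating.ratingValue !== undefined) {
      // schema.org values often arrive as strings, so convert them.
      return { value: Number(rating.ratingValue), count: Number(rating.ratingCount) };
    }
  }
  return null;
}

// Shortened version of the markup from the question.
const sampleHtml = `<html><body><script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "SoftwareApplication",
  "name": "ParametricText",
  "aggregateRating": { "@type": "AggregateRating", "ratingValue": "5", "ratingCount": "4" }
}
</script></body></html>`;

const rating = findRating(extractJsonLd(sampleHtml));
console.log(rating);
```

With a live page, response.body from got (or response.text from superagent) would take the place of sampleHtml. A null result then means no rating was found, so the badge can fall back to a placeholder.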