What is going wrong with my ETL process?

I'm using GoodData's CloudConnect (based on CloverETL) to read a massive JSON file and write certain elements to a .csv file.

Unfortunately, I'm seeing the error pasted below in the console log. Is the out-of-memory condition a side effect of some other error, or is insufficient memory the actual error?


ERROR [WatchDog_0] - Component [JSONReader:JSONREADER1] finished with status ERROR.
 Java heap space
ERROR [WatchDog_0] - Error details:
org.jetel.exception.JetelRuntimeException: Component [JSONReader:JSONREADER1] finished with status ERROR.
    at org.jetel.graph.Node.createNodeException(Node.java:543)
    at org.jetel.graph.Node.run(Node.java:522)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
    at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.checkThrownException(TreeReader.java:766)
    at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.manageThread(TreeReader.java:757)
    at org.jetel.component.TreeReader$StreamConvertingXPathProcessor.processInput(TreeReader.java:732)
    at org.jetel.component.TreeReader.execute(TreeReader.java:412)
    at org.jetel.graph.Node.run(Node.java:493)
    ... 1 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at net.sf.saxon.tinytree.TinyTree.condense(TinyTree.java:379)
    at net.sf.saxon.tinytree.TinyBuilder.close(TinyBuilder.java:177)
    at net.sf.saxon.event.ReceivingContentHandler.endDocument(ReceivingContentHandler.java:219)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endDocument(AbstractSAXParser.java:745)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:515)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:404)
    at net.sf.saxon.event.Sender.send(Sender.java:193)
    at net.sf.saxon.event.Sender.send(Sender.java:50)
    at net.sf.saxon.Configuration.buildDocument(Configuration.java:2973)
    at net.sf.saxon.sxpath.XPathExpression.evaluate(XPathExpression.java:154)
    at org.jetel.component.tree.reader.xml.XmlXPathEvaluator.iterate(XmlXPathEvaluator.java:79)
    at org.jetel.component.tree.reader.XPathPushParser.handleContext(XPathPushParser.java:104)
    at org.jetel.component.tree.reader.XPathPushParser.parse(XPathPushParser.java:84)
    at org.jetel.component.TreeReader$StreamConvertingXPathProcessor$PipeParser.work(TreeReader.java:827)
    at org.jetel.graph.runtime.CloverWorker.run(CloverWorker.java:87)
    ... 1 more

There are 3 best solutions below

0
On

This looks like the second case: this error is caused by insufficient memory for your task.

The error occurred while evaluating one of your JSONReader components (JSONREADER1 in the log).

The JSON input seems to be very large, so you should consider splitting this task into smaller ones if possible.
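One way to split the input before it reaches the graph is sketched below in Python. This is only an illustration under assumptions: it assumes the file is a single top-level JSON array (the question does not say what shape the data has), and the names `iter_array_items` and `split_json_array`, as well as the `part0.json`, `part1.json`, ... naming, are made up for this example. It uses only the standard `json` module and keeps just a small read buffer in memory.

```python
import json


def iter_array_items(fp, chunk_size=65536):
    """Yield the items of a top-level JSON array one by one.

    Only the current read buffer is held in memory, not the whole file.
    """
    decoder = json.JSONDecoder()
    buf = fp.read(chunk_size).lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]
    at_eof = False
    while True:
        # Skip whitespace and the commas that separate items.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            if at_eof:
                raise  # genuinely malformed or truncated input
            end = None
        # Accept an item only if a separator follows it; otherwise the
        # value (for example a number) may be cut off at the buffer edge.
        if end is not None and (buf[end:].lstrip()[:1] in (",", "]") or at_eof):
            yield item
            buf = buf[end:]
            continue
        more = fp.read(chunk_size)
        at_eof = not more
        buf += more


def split_json_array(src_path, out_prefix, items_per_file=1000, chunk_size=65536):
    """Write consecutive batches of items to <out_prefix>0.json, <out_prefix>1.json, ..."""
    paths = []

    def flush(batch):
        path = f"{out_prefix}{len(paths)}.json"
        with open(path, "w", encoding="utf-8") as out:
            json.dump(batch, out)
        paths.append(path)

    batch = []
    with open(src_path, encoding="utf-8") as fp:
        for item in iter_array_items(fp, chunk_size):
            batch.append(item)
            if len(batch) == items_per_file:
                flush(batch)
                batch = []
    if batch:
        flush(batch)
    return paths
```

Each resulting part is a valid JSON array on its own, so it could be fed to the reader separately. Whether that fits your graph depends on how the input is structured.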

Did you run your transformation locally or on the GoodData server?

It is hard to give more specific advice without knowing the details.

0
On

Try using JSONExtract instead of JSONReader: it also reads JSON files, but it uses less memory.

0
On

From the respective help documents:

JSONReader uses DOM, so the whole input is stored in memory and therefore the component can be memory-greedy.

JSONExtract uses SAX instead of DOM, so it uses less memory than JSONReader.
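To make the DOM/SAX difference concrete, here is a small generic sketch using Python's standard XML parsers. It is not CloudConnect's implementation. It only illustrates the two parsing styles, which is relevant here because the stack trace in the question shows JSONReader building a Saxon tree from SAX events. The element names (`items`, `item`, `name`) and the helper names are hypothetical.

```python
import io
import xml.sax
from xml.dom import minidom

XML = (
    "<items>"
    "<item><id>1</id><name>a</name></item>"
    "<item><id>2</id><name>b</name></item>"
    "</items>"
)


def names_dom(xml_text):
    """DOM style: build the whole document tree in memory, then query it."""
    doc = minidom.parseString(xml_text)
    return [node.firstChild.data for node in doc.getElementsByTagName("name")]


class NameHandler(xml.sax.ContentHandler):
    """SAX style: react to parse events and keep only the current element's text."""

    def __init__(self, on_name):
        super().__init__()
        self._on_name = on_name
        self._buf = None  # text parts of the <name> currently being read

    def startElement(self, name, attrs):
        if name == "name":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "name":
            # Hand the value over immediately; nothing else is retained.
            self._on_name("".join(self._buf))
            self._buf = None


def names_sax(stream, on_name):
    """Stream through the input, calling on_name for each <name> value."""
    xml.sax.parse(stream, NameHandler(on_name))


if __name__ == "__main__":
    print(names_dom(XML))
    names_sax(io.BytesIO(XML.encode("utf-8")), print)
```

With the DOM version, memory use grows with the size of the document. With the SAX version, it stays roughly proportional to the largest single value, because each value is passed on (for example, written to a CSV row) as soon as its closing tag is seen.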