XML Join step in Pentaho Kettle performs very poorly


My requirement is somewhat tricky: a complex XML document has to be generated from text file input. I have done that successfully, but the transformation involves 11 XML Join steps, so performance is very poor when the data set is large.

How can I generate complex XML with good performance?

Any suggestions are welcome.

After some suggestions, I removed all the joins and resorted to Stream Lookup steps (as shown in the Kettle samples) and a 'Modified Java Script Value' step for the final join.

But it fails at the final join step. Below is the join that fails with an out-of-memory error:

var request = new XML()

request = <newbiz xsi:schemaLocation="http://www.crsoftwareinc.com/xml/ns/titanium/common/v1_0 newbiz.xsd" xmlns="http://www.crsoftwareinc.com/xml/ns/titanium/common/v1_0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <newbiz-header>
        <total-consumers>1</total-consumers>
        <creditor-name>DefCrdtr</creditor-name>
        <total-principal>1</total-principal>
        <total-charge>1</total-charge>
        <total-interest>1</total-interest>
        <total-balance>1</total-balance>
    </newbiz-header>
    <consumers>{xmlConsumerNewFinal}</consumers>
</newbiz>

var xmlconsumers_final=request.toXMLString();

Any suggestions?


There is 1 best solution below


Yes, the XML Join step is really slow. I removed all of those steps and replaced them with a script step that uses E4X (ECMAScript for XML). E4X is supported by the Rhino JavaScript engine included in Pentaho Kettle and is designed specifically for working with XML.

For instance, you can write var foo = new XML(); which most other JavaScript engines do not support.

Building your XML with this technique will be faster and simpler. Take a look at the documentation: http://wso2.com/project/mashup/0.2/docs/e4xquickstart.html
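If the final wrapping step still runs out of memory, one thing that might be worth trying is to assemble the outer envelope as a plain string instead of an E4X object, so that no XML object tree has to be built around the consumer data. The sketch below is only an illustration of that idea, not E4X and not a confirmed fix: the helper names escapeXml and buildNewbiz, the sample values, and the <consumer> and <name> elements are all made up for the example, and it assumes the consumer fragments arrive already serialized as well-formed XML strings.

```javascript
// Escape text so it is safe as XML element content (hypothetical helper).
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap already-serialized consumer fragments in the newbiz envelope.
// header: object keyed by element name; consumerFragments: array of XML strings.
function buildNewbiz(header, consumerFragments) {
  var ns = "http://www.crsoftwareinc.com/xml/ns/titanium/common/v1_0";
  var headerTags = ["total-consumers", "creditor-name", "total-principal",
                    "total-charge", "total-interest", "total-balance"];
  var parts = [];
  parts.push('<newbiz xsi:schemaLocation="' + ns + ' newbiz.xsd"' +
             ' xmlns="' + ns + '"' +
             ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">');
  parts.push("<newbiz-header>");
  for (var i = 0; i < headerTags.length; i++) {
    var tag = headerTags[i];
    parts.push("<" + tag + ">" + escapeXml(header[tag]) + "</" + tag + ">");
  }
  parts.push("</newbiz-header>");
  parts.push("<consumers>");
  for (var j = 0; j < consumerFragments.length; j++) {
    // Assumed to be well-formed XML already, so it is appended unescaped.
    parts.push(consumerFragments[j]);
  }
  parts.push("</consumers>");
  parts.push("</newbiz>");
  // Join once at the end rather than growing one large string step by step.
  return parts.join("\n");
}

// Sample data standing in for the incoming Kettle fields (illustrative only).
var sampleHeader = {
  "total-consumers": 1,
  "creditor-name": "DefCrdtr",
  "total-principal": 1,
  "total-charge": 1,
  "total-interest": 1,
  "total-balance": 1
};
var sampleConsumers = ["<consumer><name>A &amp; B</name></consumer>"];
var xmlconsumers_final = buildNewbiz(sampleHeader, sampleConsumers);
```

In a script step, the sample variables would be replaced by the incoming fields. The code sticks to var and plain functions so that it should also run on an older Rhino engine. Whether it helps depends on where the memory is actually going, so treat it as something to measure rather than a guaranteed improvement.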