MLCP Copy command with redaction getting timed out


ML version used: 9.0-10.4

I am running the MLCP COPY command with redaction on a large data set (39753201 docs). The command fails with the error below.

2020-07-29 20:38:09 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-07-29 20:38:09 INFO  ContentPump:227 - Job name: local_1071163736_1
2020-07-29 20:38:10 INFO  MarkLogicInputFormat:420 - Fetched 6 forest splits.
2020-07-29 20:38:10 INFO  MarkLogicInputFormat:551 - Made 39757 split(s).
2020-07-29 20:38:11 INFO  LocalJobRunner:519 -  completed 0%
2020-07-29 20:48:10 ERROR DatabaseContentReader:286 - QueryException:com.marklogic.xcc.exceptions.XQueryException: XDMP-EXTIME: for $doc in $documents -- Time limit exceeded
 [Session: user=admin, cb=#17742233824102065206 [ContentSource: user=admin, cb=cndb [provider: address=localhost/127.0.0.1:8000, pool=0/64]]]
 [Client: XCC/9.0-10, Server: XDBC/9.0-10.4]
in /MarkLogic/redaction.xqy, on line 78
expr: for $doc in $documents,
in rdt:redact((fn:doc("doc-1.xml"), fn:doc("doc-2.xml"), fn:doc("doc-3.xml"), ...), ("numeric-rules", "rule-2", "binary-rules", ...))
in /eval, on line 9
expr: for $doc in $documents
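One detail from the log worth checking: the XDMP-EXTIME error is logged exactly ten minutes after the splits were made. The sketch below only does that timestamp arithmetic. Reading the result as a 600-second request time limit on the app server is an assumption; the log itself does not state the configured limit.

```python
from datetime import datetime

# Timestamps copied from the log above.
FMT = "%Y-%m-%d %H:%M:%S"
splits_made = datetime.strptime("2020-07-29 20:38:10", FMT)
error_logged = datetime.strptime("2020-07-29 20:48:10", FMT)

# Seconds between the splits being made and the first XDMP-EXTIME error.
elapsed_seconds = (error_logged - splits_made).total_seconds()
print(elapsed_seconds)
```

If the limit really is the cause, the request time limit configured on the app server at port 8000 would be the first setting to compare against this number.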

Split parameters used:

max_split_size = 1000
thread_count = 12
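As a rough sanity check on these settings, the sketch below compares the number of splits expected from max_split_size with the 39757 splits reported in the log. It assumes splits are sized by document count; the small difference would be consistent with each of the 6 forests being split separately, but that is a guess, not something the log confirms.

```python
# Figures taken from the question and the log.
total_docs = 39753201
max_split_size = 1000
reported_splits = 39757

# Ceiling division: the minimum number of splits if every split were full.
expected_splits = (total_docs + max_split_size - 1) // max_split_size

# Extra splits beyond the minimum (possibly per-forest rounding; an assumption).
extra_splits = reported_splits - expected_splits
print(expected_splits, extra_splits)
```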

I am not sure why I am getting the timeout error. Running the same redaction on 2000 docs in QConsole takes only 10-15 seconds.
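To put that measurement next to the split size, here is a back-of-the-envelope estimate. It assumes redaction time scales linearly with document count and that one split of 1000 docs corresponds to one server request; neither is confirmed by the log.

```python
# QConsole measurement from the question: 2000 docs in 10-15 seconds.
docs_measured = 2000
seconds_low, seconds_high = 10, 15

# Split size passed to MLCP.
max_split_size = 1000

# Estimated redaction time for one full split, assuming linear scaling.
per_split_low = seconds_low * max_split_size / docs_measured
per_split_high = seconds_high * max_split_size / docs_measured
print(per_split_low, per_split_high)
```

Under these assumptions a single split should finish in well under ten seconds, which is why a ten-minute timeout is surprising.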

Note: I modified the error log above to hide sensitive information (real document names are replaced with placeholders such as doc-1.xml).
