Strategy for updating document state in MarkLogic Data Hub processing


Is there a best practice/strategy for updating document state when processing docs through MarkLogic Data Hub? I want to manage state in order to prevent re-harmonizing docs. Here's what I'm doing currently:

  1. ingest doc to STAGING with state: ingested (right now I'm using a collection 'dh.state.ingested')
  2. harmonize STAGING doc to FINAL canonical model using custom step
  3. copy the STAGING doc to the same URI in STAGING, with an interceptor that removes 'dh.state.ingested' from the doc's collections and adds 'dh.state.harmonized' to its collections (using xdmp.documentRemoveCollections and xdmp.documentAddCollections)
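
For reference, the collection swap in step 3 can be sketched roughly as below. This is a minimal sketch, not Data Hub's own mechanism: `transitionState`, `stateCollection`, and `makeFakeDb` are hypothetical names, and `db` stands for any object exposing `documentRemoveCollections` and `documentAddCollections`. Inside MarkLogic that would be the built-in `xdmp` object, called from an update transaction; here an in-memory stand-in is used so the logic can be run outside the server.

```javascript
// Assumption: every state collection shares the 'dh.state.' prefix used above.
const STATE_PREFIX = 'dh.state.';

// Hypothetical helper: build the collection name for a given state.
function stateCollection(state) {
  return STATE_PREFIX + state;
}

// Hypothetical helper: move a document from one state collection to another.
// `db` must expose documentRemoveCollections(uri, collections) and
// documentAddCollections(uri, collections); in MarkLogic, pass `xdmp`.
function transitionState(db, uri, fromState, toState) {
  db.documentRemoveCollections(uri, [stateCollection(fromState)]);
  db.documentAddCollections(uri, [stateCollection(toState)]);
}

// In-memory stand-in for the database, for trying the logic outside MarkLogic.
function makeFakeDb(initial) {
  const store = new Map(
    Object.entries(initial).map(([uri, cols]) => [uri, new Set(cols)])
  );
  return {
    documentRemoveCollections(uri, cols) {
      cols.forEach((c) => store.get(uri).delete(c));
    },
    documentAddCollections(uri, cols) {
      cols.forEach((c) => store.get(uri).add(c));
    },
    collectionsOf(uri) {
      return [...store.get(uri)].sort();
    },
  };
}

// Usage: the URI and the 'customer' collection are made up for illustration.
const db = makeFakeDb({ '/staging/doc1.json': ['customer', 'dh.state.ingested'] });
transitionState(db, '/staging/doc1.json', 'ingested', 'harmonized');
console.log(db.collectionsOf('/staging/doc1.json'));
```

Keeping the swap in one small function means the same call could be made from wherever the STAGING URI is already known, which is the part the no-op step currently exists to provide.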

This third step is basically a no-op step I added just so that I'd have the correct context and content.uri to work with when calling the xdmp functions. I'm using an interceptor because the MarkLogic documentation says hooks are deprecated.

This seems to be an inordinately complicated way to manage state changes.
