I am trying to find a way to prevent truncation of resume token

I am trying to find a way to prevent truncation of the resume token when an error is thrown by the MongoDB server for the 16 MB size limit. I am reading the resume token to make sure documents over 16 MB are handled properly, but the script is not able to read the resume token: it is truncated and can't be found in the DB. What is the way out?

Can this be fixed with a logging correction, or does it need a change in the Debezium Mongo connector or the Java driver? Please help here. The token string is truncated as highlighted below.

org.apache.kafka.connect.errors.RetriableException: An exception occurred in the change event producer. This connector will be restarted.
    at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:49)
    at io.debezium.connector.mongodb.MongoDbStreamingChangeEventSource.streamChangesForReplicaSet(MongoDbStreamingChangeEventSource.java:122)
    at 
    at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.kafka.connect.errors.ConnectException: Error while attempting to read from change stream on 'gng1/mongo-lentra-los-cd-txn-node-01.aws.serviceurl.internal:10085,mongo-lentra-los-cd-txn-node-02.aws.serviceurl.internal:10085,mongo-lentra-los-cd-txn-node-03.aws.serviceurl.internal:10085'
    at io.debezium.connector.mongodb.MongoDbStreamingChangeEventSource.lambda$establishConnection$3(MongoDbStreamingChangeEventSource.java:170)
    at io.debezium.connector.mongodb.ConnectionContext$MongoPreferredNode.execute(ConnectionContext.java:381)
    at io.debezium.connector.mongodb.MongoDbStreamingChangeEventSource.streamChangesForReplicaSet(MongoDbStreamingChangeEventSource.java:115)
    ... 10 more
Caused by: com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'BSONObj size: 19821386 (0x12E734A) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "**826582F765000000EE2B022C01E5A1004C94025F689C1431EBF7CA2720270CD87463C5F6964003C3330323343443030313535303368425835427270325548413957483667666A6C50...**" }' on server mongo-lentra-los-cd-txn-node-03.aws.serviceurl.internal:10085. The full response is {"operationTime": {"$timestamp": {"t": 1703095214, "i": 20}}, "ok": 0.0, "errmsg": "BSONObj size: 19821386 (0x12E734A) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"**826582F765000000EE2B022C0100296E5A1004C94025F689C1431EBF7CA2720270CD87463C5F6964003C3330323343443030313535303368425835427270325548413957483667666A6C50...**\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1703095214, "i": 24}}, "signature": {"hash": {"$binary": {"base64": "qN+qTLJyb8//ZnqQ32GLbl2C83A=", "subType": "00"}}, "keyId": 7277445925541249101}}}
    at com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:198)
    at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:413)
There is 1 solution below.


The server is rejecting something because it is larger than 16MB. I would guess that the problem arises due to something in the pipeline passed when the change stream was started, but that was not provided in the question so I can't be sure.
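The numbers in the error are consistent with this: MongoDB's user-facing BSON document cap is 16 MiB (16777216 bytes), and the 16793600 figure quoted in the error appears to be that cap plus a small internal allowance. A quick sketch of the arithmetic, with the sizes taken from the error message above:

```java
public class BsonSizeLimit {
    // 16793600 from the error looks like 16 MiB plus a 16 KiB internal allowance
    static int internalLimit() {
        return 16 * 1024 * 1024 + 16 * 1024; // 16793600
    }

    public static void main(String[] args) {
        int failingEventSize = 19_821_386; // "BSONObj size" from the error
        int overage = failingEventSize - internalLimit();
        // The change event is roughly 3 MB over the internal limit, so
        // something has to shrink the event itself, e.g. projecting away
        // large fields in the change stream pipeline.
        System.out.println("limit=" + internalLimit() + " overage=" + overage);
    }
}
```

The point of the arithmetic is that the server is measuring the whole change event (including `fullDocument`), not just your stored document, which is why a document under 16 MB can still produce an oversized event.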

The resume token in that error message is either:

  • the resume token you included in the request.
    In this case, use the resume token you already have to resume from the same place.
  • the resume token that marks the new position after the document that caused the error.
    In this case, you haven't received the document that would have corresponded with that resume token, so to keep the change stream logically continuous you should resume with the same token used in the request, i.e. the one you already have.
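Either way, the token printed in the server's error message is cut off by the error formatter (note the trailing `...`), so it can never be fed back into a resume. Only a token you captured from a received change event, or the one you sent in the request, is usable. A minimal, hypothetical sketch of guarding against this (the helper name is mine, not part of Debezium or the driver):

```java
public class ResumeTokenCheck {
    // Hypothetical helper: a usable resume-token _data value is a
    // complete hex string; one copied out of a log line that the
    // server truncated (trailing "...") is not resumable.
    static boolean isUsableTokenData(String data) {
        if (data == null || data.endsWith("...")) {
            return false; // truncated in the log output
        }
        // complete _data values are even-length hex strings
        return data.length() % 2 == 0 && data.matches("[0-9A-Fa-f]+");
    }
}
```

With the Java driver, a usable token would then be wrapped as `new BsonDocument("_data", new BsonString(data))` and passed to `ChangeStreamIterable.resumeAfter(...)`; with Debezium, the authoritative copy of that token is the one stored in the connector's offsets, not the one printed in the log.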