I'm using Apache Flink 1.2.0. According to the Production Readiness Checklist (https://ci.apache.org/projects/flink/flink-docs-release-1.2/ops/production_ready.html), it is recommended to set UIDs for operators to ensure savepoint compatibility.
I couldn't find a setUid() method for a flatMap, but I found uid() and setUidHash(), which the docs describe as follows:
uid
"Sets an ID for this operator.
The specified ID is used to assign the same operator ID across job submissions (for example when starting a job from a savepoint)."
uidHash
"Sets an user provided hash for this operator. This will be used AS IS the create the JobVertexID.
The user provided hash is an alternative to the generated hashes, that is considered when identifying an operator through the default hash mechanics fails (e.g. because of changes between Flink versions)."
Which one should actually be set on, for example, a flatMap: uid() or setUidHash()? Or both?
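For reference, here is a minimal sketch of the first option, calling uid() on a flatMap. It assumes the Flink streaming Java API is on the classpath. The class name UidExample, the ID string "line-splitter", the job name, and the sample input are illustrative assumptions, not taken from the Flink documentation.

```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

// Hypothetical example class; only uid() itself comes from the question.
public class UidExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> words = env
                .fromElements("to be", "or not to be") // sample input (assumption)
                .flatMap(new FlatMapFunction<String, String>() {
                    @Override
                    public void flatMap(String line, Collector<String> out) {
                        // Emit one record per whitespace-separated word.
                        for (String word : line.split(" ")) {
                            out.collect(word);
                        }
                    }
                })
                // Explicit operator ID; keep this string unchanged across
                // job submissions so a savepoint can be matched to the operator.
                .uid("line-splitter");

        words.print();
        env.execute("uid-example");
    }
}
```

The ID string is arbitrary, but it should be unique within the job and stay the same between submissions, since it is what a savepoint is matched against.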
The uid() method is the one recommended in this case. setUidHash() should be used only as a workaround to fix up jobs that were created with default UIDs instead of user-defined ones. This is stated in the Javadoc: