Can someone explain how Samza generates the samza.container.id / SAMZA_CONTAINER_ID when the application is deployed on YARN? I looked through the Samza code base but was not able to locate the logic that generates the samza.container.id.
How does Samza generate the container.id when the application is deployed on YARN?
Asked by tuk
In a YARN environment, Samza takes the container IDs that YARN generates and passes them to each container process as an environment variable, which sets that process's samza.container.id. That is, when the Samza AM process requests containers, the YARN RM replies with a set of allocated container objects of class org.apache.hadoop.yarn.api.records.Container. This is the resource class that uniquely identifies a container in YARN, and Container#getId().toString() is the container ID string that gets set as samza.container.id.
The code that reads the container ID from the YARN RM's response is in YarnClusterResourceManager#onContainersAllocated().
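To make the flow described above easier to picture, here is a minimal sketch in plain Java with no Hadoop or Samza dependency. It is not Samza's actual code: `FakeContainer`, `buildContainerEnv`, and the `onContainersAllocated` method shown here are hypothetical stand-ins for illustration only, and the ID format simply mimics the usual shape of what YARN's container ID prints as a string. The only name taken from the question is the SAMZA_CONTAINER_ID variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.yarn.api.records.Container.
// The real class comes from the Hadoop YARN libraries.
class FakeContainer {
    private final long clusterTimestamp;
    private final int appId;
    private final int attemptId;
    private final long containerNum;

    FakeContainer(long clusterTimestamp, int appId, int attemptId, long containerNum) {
        this.clusterTimestamp = clusterTimestamp;
        this.appId = appId;
        this.attemptId = attemptId;
        this.containerNum = containerNum;
    }

    // Assumption: mimics the typical string form of a YARN container ID,
    // container_<clusterTimestamp>_<appId>_<attemptId>_<containerNum>.
    String getIdString() {
        return String.format(Locale.ROOT, "container_%d_%04d_%02d_%06d",
                clusterTimestamp, appId, attemptId, containerNum);
    }
}

class ContainerIdSketch {
    // Environment variable name taken from the question.
    static final String ENV_CONTAINER_ID = "SAMZA_CONTAINER_ID";

    // Hypothetical helper: builds the environment for one container process,
    // using the ID that YARN assigned rather than generating a new one.
    static Map<String, String> buildContainerEnv(FakeContainer container) {
        Map<String, String> env = new HashMap<>();
        env.put(ENV_CONTAINER_ID, container.getIdString());
        return env;
    }

    // Hypothetical callback modelled on the idea of an "allocated containers"
    // handler: one environment map per container the RM handed back.
    static List<Map<String, String>> onContainersAllocated(List<FakeContainer> containers) {
        List<Map<String, String>> envs = new ArrayList<>();
        for (FakeContainer c : containers) {
            envs.add(buildContainerEnv(c));
        }
        return envs;
    }

    public static void main(String[] args) {
        List<FakeContainer> allocated = new ArrayList<>();
        allocated.add(new FakeContainer(1410901177871L, 1, 1, 5));
        for (Map<String, String> env : onContainersAllocated(allocated)) {
            // prints container_1410901177871_0001_01_000005
            System.out.println(env.get(ENV_CONTAINER_ID));
        }
    }
}
```

The point of the sketch is that the ID is never computed on the Samza side; it is read off the allocated container object and handed to the launched process through its environment.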