I have a Hadoop cluster running Apache Hadoop 2.7.1. It is highly available and consists of five nodes:
mn1, mn2, dn1, dn2, dn3
If we access WebHDFS from a browser to open a file called myfile, which has replication factor = 3 and exists on dn1, dn2, and dn3, we issue the following request from the browser:
http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN
mn1 then redirects this request to dn1, dn2, or dn3, and we get the file.
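The redirect is also easy to observe from the command line (a sketch using curl against the same URL; -i prints the HTTP 307 redirect whose Location header points at one of the DataNodes):

# Show the 307 redirect the NameNode returns for an OPEN request
curl -i "http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN"

# Or follow the redirect and print the file contents directly
curl -L "http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN"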
We can also get the file from Hadoop itself with the following command:
hdfs dfs -cat /hadoophome/myfile
But in the case of DataNode failure (suppose dn1 and dn3 are down now), if we issue the command
hdfs dfs -cat /hadoophome/myfile
we can still retrieve the file.
But if we issue the WebHDFS request from the browser while in this state:
http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN
mn1 will sometimes redirect the request to dn1 or dn3, which are dead, and sometimes it redirects the request to dn2 and I can retrieve the file.
Shouldn't mn1 redirect WebHDFS requests to live DataNodes only? How should this error be handled? Should it be handled by the application?
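(One blind workaround on the application side would be to simply retry the request until the NameNode happens to pick the live DataNode; a sketch with curl, where the retry count and sleep are arbitrary:

# -f makes curl fail when the redirect target is unreachable,
# so we retry; break out of the loop on the first success.
for i in 1 2 3 4 5; do
    curl -s -L -f "http://mn1:50070/webhdfs/v1/hadoophome/myfile/?user.name=root&op=OPEN" && break
    sleep 2
done
)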
Edit hdfs-site.xml and lower dfs.namenode.heartbeat.recheck-interval.
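Assuming the standard Hadoop 2.x property name, the entry looks like this (10000 ms matches the 50-second timeout computed below):

<property>
    <name>dfs.namenode.heartbeat.recheck-interval</name>
    <value>10000</value>
</property>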
This property is measured in milliseconds, and with the value above you will get a timeout of 50 seconds before a DataNode is considered dead, because the default value of dfs.heartbeat.interval is 3 seconds and the timeout to consider a DataNode dead is

timeout = 2 * heartbeat.recheck.interval + 10 * heartbeat.interval

so timeout = 2 * (10 seconds) + 10 * (3 seconds) = 50 seconds. Once the NameNode marks dn1 and dn3 as dead, it stops redirecting WebHDFS clients to them.
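After restarting the NameNodes so the change takes effect, you can watch a stopped DataNode being declared dead on the new schedule (hdfs dfsadmin -report lists live and dead DataNodes):

# Stop one DataNode, then run this repeatedly; the node should move
# from the live list to the dead list after roughly 50 seconds.
hdfs dfsadmin -report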