Hadoop distributed mode


This question may seem very trivial, but I'm new to Hadoop and currently confused by one question.

When starting the daemons, how can the appropriate files be located on the slave nodes?

I know you specify the masters and the slaves in the appropriate files, but how does it know the location on the file system of those nodes (the path to where Hadoop is installed)?

Should I perhaps set up something like HADOOP_HOME (or HADOOP_PREFIX) for that?

Also, I read that in the masters file you specify only the Secondary Name Node, while the Name Node and Job Tracker are assumed to be on the node from which you are calling start-all.sh. But what happens if you're not logged on to that node, but on another, client node? Maybe I didn't understand that part well...
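For context, this is the convention I mean: conf/masters lists the Secondary Name Node host, and conf/slaves lists the DataNode/TaskTracker hosts. Something like this (the host names are just placeholders):

```
# conf/masters -- host that runs the Secondary Name Node
snn-host

# conf/slaves -- hosts that run the DataNode / TaskTracker daemons
slave1
slave2
slave3
```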


BEST ANSWER

The Hadoop installation on your different nodes should be almost identical, and for that reason you must set HADOOP_HOME (I also set HADOOP_PREFIX to the same location) pointing to your Hadoop installation on every node of your cluster.
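For example, something like this in ~/.bashrc (or conf/hadoop-env.sh) on every node. Here /opt/hadoop is just an assumed install path; substitute wherever Hadoop is actually unpacked on your machines:

```shell
# /opt/hadoop is a placeholder -- use the real install path,
# and keep it the same on every node of the cluster.
export HADOOP_HOME=/opt/hadoop
export HADOOP_PREFIX="$HADOOP_HOME"   # some scripts/versions read HADOOP_PREFIX instead
export PATH="$PATH:$HADOOP_HOME/bin"  # so the hadoop command is found
```

With this in place, the start scripts launched over SSH on each slave can resolve the binaries and configuration on that node.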

Every one of your nodes should be able to connect to the others over passwordless SSH, so I believe the last part of your question doesn't make much sense ;)
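A minimal sketch of that key setup, run on the node you start the cluster from (user and slave1 below are placeholders):

```shell
# Create ~/.ssh and a passphrase-less key pair, if one doesn't exist yet.
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa" -q

# Then push the public key to each host listed in conf/slaves, e.g.:
#   ssh-copy-id user@slave1
# After that, "ssh slave1" logs in without a password prompt, which is
# what start-all.sh relies on to launch the daemons on the remote nodes.
```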