I run one datanode with `./bin/hdfs datanode -conf ./etc/hadoop/datanode1.xml`, and that single instance works. When I try to start a second one, I get: "datanode is running as process. Stop it first and ensure /tmp/hadoop-user-datanode.pid file is empty before retry."
Hadoop 3.3.5
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/user/code/hdfs/namenode/</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/user/code/hdfs/datanode1,/home/user/code/hdfs/datanode2,/home/user/code/hdfs/datanode3</value>
  </property>
</configuration>
datanode1.xml:
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:9011</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:9076</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:9021</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/user/code/hdfs/datanode1/</value>
  </property>
</configuration>
datanode2.xml:
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:9012</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:9077</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:9022</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/user/code/hdfs/datanode2/</value>
  </property>
</configuration>
datanode3.xml:
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:9013</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:9078</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:9023</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/user/code/hdfs/datanode3/</value>
  </property>
</configuration>
There is probably a way to relocate the PID file, but you really should not run three datanodes on one machine anyway, especially if they all share one physical hard drive.
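If you want to try it regardless: Hadoop derives the PID file location from the `HADOOP_PID_DIR` environment variable (normally set in `hadoop-env.sh`), so a sketch like the following should keep the instances from tripping over each other's PID files. The directory names here are my own invention; the flags and variable are from the standard Hadoop shell scripts, but I have not verified this exact setup on 3.3.5.

```shell
# Assumption: HADOOP_PID_DIR (see hadoop-env.sh) controls where the
# launcher writes and checks its .pid file. Giving each datanode its
# own directory avoids the "running as process ... Stop it first" check.
# The /tmp/hadoop-dnN paths are arbitrary examples.
mkdir -p /tmp/hadoop-dn1 /tmp/hadoop-dn2 /tmp/hadoop-dn3

HADOOP_PID_DIR=/tmp/hadoop-dn1 ./bin/hdfs datanode -conf ./etc/hadoop/datanode1.xml &
HADOOP_PID_DIR=/tmp/hadoop-dn2 ./bin/hdfs datanode -conf ./etc/hadoop/datanode2.xml &
HADOOP_PID_DIR=/tmp/hadoop-dn3 ./bin/hdfs datanode -conf ./etc/hadoop/datanode3.xml &
```

You will likely also want a separate `HADOOP_LOG_DIR` per instance so the log files do not collide.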
If you want to simulate a Hadoop cluster, use Docker or VMs. Otherwise, use a managed service such as EMR or Dataproc.
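With Docker, each datanode gets its own hostname, filesystem, and PID namespace, so none of the port or PID-file juggling above is needed. A rough sketch, assuming the `apache/hadoop` image from Docker Hub (image name, tag, and entrypoint arguments are assumptions; check the image's documentation for the required configuration environment):

```shell
# Hypothetical sketch: apache/hadoop:3 and the "hdfs namenode"/"hdfs datanode"
# commands are assumptions taken from the image's published examples; the
# image also expects HDFS configuration passed via environment or mounted
# config files, omitted here for brevity.
docker network create hadoop
docker run -d --net hadoop --name namenode  apache/hadoop:3 hdfs namenode
docker run -d --net hadoop --name datanode1 apache/hadoop:3 hdfs datanode
docker run -d --net hadoop --name datanode2 apache/hadoop:3 hdfs datanode
docker run -d --net hadoop --name datanode3 apache/hadoop:3 hdfs datanode
```

Each container can bind the default datanode ports, since they are isolated per container.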