How to run a cluster of 3 datanodes at the same time?


I run one datanode with: ./bin/hdfs datanode -conf ./etc/hadoop/datanode1.xml. Only one works; when I try to run a second one, it fails with: "datanode is running as process. Stop it first and ensure /tmp/hadoop-user-datanode.pid file is empty before retry."

Hadoop 3.3.5

core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

hdfs-site.xml:

<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/user/code/hdfs/namenode/</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/user/code/hdfs/datanode1,/home/user/code/hdfs/datanode2,/home/user/code/hdfs/datanode3</value>
    </property>
</configuration>

datanode1.xml:

<configuration>
    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:9011</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:9076</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:9021</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/user/code/hdfs/datanode1/</value>
    </property>
</configuration>

datanode2.xml:

<configuration>
    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:9012</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:9077</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:9022</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/user/code/hdfs/datanode2/</value>
    </property>
</configuration>

datanode3.xml:

<configuration>
    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:9013</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:9078</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:9023</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/user/code/hdfs/datanode3/</value>
    </property>
</configuration>

There is 1 answer below.

OneCricketeer answered:

The PID file location is controlled by the HADOOP_PID_DIR environment variable (by default every instance writes /tmp/hadoop-<username>-datanode.pid, so the second datanode sees the first one's PID file and refuses to start), but you really should not run three datanodes on one machine anyway, especially if they all share one physical disk.
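A minimal sketch of the workaround, assuming your launcher honors HADOOP_PID_DIR (it is a standard setting in hadoop-env.sh, though the exact behavior of a foreground hdfs datanode may vary by version): give each instance its own PID directory so the files no longer collide.

# Hypothetical launch sequence: one PID directory per datanode instance,
# so each writes its own hadoop-user-datanode.pid instead of sharing /tmp.
mkdir -p /tmp/dn1 /tmp/dn2 /tmp/dn3

HADOOP_PID_DIR=/tmp/dn1 ./bin/hdfs datanode -conf ./etc/hadoop/datanode1.xml &
HADOOP_PID_DIR=/tmp/dn2 ./bin/hdfs datanode -conf ./etc/hadoop/datanode2.xml &
HADOOP_PID_DIR=/tmp/dn3 ./bin/hdfs datanode -conf ./etc/hadoop/datanode3.xml &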

If you want to simulate a Hadoop cluster, use Docker or VMs. Otherwise, use a managed service like Amazon EMR or Google Dataproc.
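For example, a rough sketch of the Docker route; the image names below are placeholders, not real published images, so substitute whatever Hadoop images you actually use and follow their docs for configuration:

# Hypothetical: run three datanode containers against one namenode container.
# "example/hadoop-namenode" and "example/hadoop-datanode" are placeholder names.
docker network create hadoop

docker run -d --net hadoop --name namenode  example/hadoop-namenode
docker run -d --net hadoop --name datanode1 example/hadoop-datanode
docker run -d --net hadoop --name datanode2 example/hadoop-datanode
docker run -d --net hadoop --name datanode3 example/hadoop-datanode

Because each container gets its own network namespace and filesystem, the three datanodes no longer fight over ports or the PID file.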