Hadoop Replication mechanism

Asked by Shehryar Mallick on 04 October 2022 at 10:11

In HDFS, the block placement policy puts one replica in the same rack as the writer and the other two replicas on different nodes of a different rack.

But why doesn't it place one of the other two replicas in the same rack as the original block? Wouldn't that be more efficient, since it would avoid spending bandwidth on writing both remaining replicas to the other rack?


Alatau On 26 October 2022 at 06:21

Data replication is performed as follows:

The NameNode selects the DataNodes that will host the replicas: it balances data placement across nodes and compiles a list of nodes for replication

The 1st replica is placed on the first node from the list

The 2nd replica is copied to another node in the same server rack

The 3rd replica is written to an arbitrary node in another server rack

The rest of the replicas are placed arbitrarily
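
The steps listed above can be sketched in a few lines of Python. This is only a toy model of the order described in this answer, not HDFS's actual placement code: the function name `place_replicas`, the `racks` dictionary and the node names are all hypothetical, and real placement also considers node load, free space and the writer's location.

```python
import random
from typing import Dict, List, Optional


def place_replicas(
    racks: Dict[str, List[str]],
    first_node: str,
    replication: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick nodes for the replicas in the order described in the answer.

    `racks` maps a rack name to the nodes it contains (hypothetical topology).
    """
    rng = rng or random.Random()
    # Reverse lookup: node name -> rack name.
    node_rack = {node: rack for rack, nodes in racks.items() for node in nodes}
    local_rack = node_rack[first_node]

    # 1st replica: the first node from the list.
    chosen = [first_node]

    # 2nd replica: another node in the same rack, if one exists.
    if replication >= 2:
        same_rack = [n for n in racks[local_rack] if n != first_node]
        if same_rack:
            chosen.append(rng.choice(same_rack))

    # 3rd replica: an arbitrary node in another rack, if one exists.
    if replication >= 3:
        other_racks = [n for n, r in node_rack.items() if r != local_rack]
        if other_racks:
            chosen.append(rng.choice(other_racks))

    # Remaining replicas: arbitrary nodes that are not used yet.
    remaining = [n for n in node_rack if n not in chosen]
    rng.shuffle(remaining)
    while len(chosen) < replication and remaining:
        chosen.append(remaining.pop())
    return chosen
```

For example, with `racks = {"rack1": ["a", "b"], "rack2": ["c", "d"]}` and `first_node="a"`, the second replica lands on `b` and the third on either `c` or `d`.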
