When our cluster restarted, the DataNode scanned its configured volumes and logged this warning:
2024-03-27 11:08:57,183 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Can't replicate block BP-1203938771-192.168.0.181-1606981736566:blk_1348086923_274386670 because the block file doesn't exist, or is not accessible
It looks like the DataNode can't find the block file. But when I search on the Linux server, both the block file and its meta file still exist:
/mnt/vdk/hadoop-2.6.5/data/datanode/current/BP-1203938771-192.168.0.181-1606981736566/current/finalized/subdir26/subdir12/blk_1348086923_274386670.meta
/mnt/vdk/hadoop-2.6.5/data/datanode/current/BP-1203938771-192.168.0.181-1606981736566/current/finalized/subdir26/subdir12/blk_1348086923
I then looked at the source code that logs this message:
try {
  data.checkBlock(block, block.getNumBytes(), ReplicaState.FINALIZED);
} catch (ReplicaNotFoundException e) {
  replicaNotExist = true;
} catch (UnexpectedReplicaStateException e) {
  replicaStateNotFinalized = true;
} catch (FileNotFoundException e) {
  blockFileNotExist = true;
} catch (EOFException e) {
  lengthTooShort = true;
} catch (IOException e) {
  // The IOException indicates not being able to access block file,
  // treat it the same here as blockFileNotExist, to trigger
  // reporting it as a bad block
  blockFileNotExist = true;
}
if (replicaNotExist || replicaStateNotFinalized) {
  String errStr = "Can't send invalid block " + block;
  LOG.info(errStr);
  bpos.trySendErrorReport(DatanodeProtocol.INVALID_BLOCK, errStr);
  return;
}
if (blockFileNotExist) {
  // Report back to NN bad block caused by non-existent block file.
  reportBadBlock(bpos, block, "Can't replicate block " + block
      + " because the block file doesn't exist, or is not accessible");
  return;
}
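To make the mapping from exception to log message easier to follow, here is a minimal stand-alone sketch of the same catch chain. It is not the Hadoop code: `TransferOutcomeSketch`, the `Check` interface, the two nested exception classes, and the returned labels are hypothetical stand-ins introduced only for illustration. The point it shows is that, in the snippet above, a replica that is missing from the DataNode's bookkeeping leads to "Can't send invalid block", while the "block file doesn't exist, or is not accessible" message is reached only through a `FileNotFoundException` or some other `IOException`.

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;

class TransferOutcomeSketch {
    // Hypothetical stand-ins for the Hadoop exception types; NOT the real classes.
    static class ReplicaNotFoundException extends IOException {}
    static class UnexpectedReplicaStateException extends IOException {}

    /** Hypothetical stand-in for data.checkBlock(...): it only throws or returns. */
    interface Check { void run() throws IOException; }

    /** Mirrors the catch chain and the two if-blocks from the quoted snippet. */
    static String outcome(Check check) {
        boolean replicaNotExist = false, replicaStateNotFinalized = false;
        boolean blockFileNotExist = false, lengthTooShort = false;
        try {
            check.run();
        } catch (ReplicaNotFoundException e) {
            replicaNotExist = true;
        } catch (UnexpectedReplicaStateException e) {
            replicaStateNotFinalized = true;
        } catch (FileNotFoundException e) {
            blockFileNotExist = true;
        } catch (EOFException e) {
            lengthTooShort = true;
        } catch (IOException e) {
            // Any other I/O failure is treated like a missing block file.
            blockFileNotExist = true;
        }
        if (replicaNotExist || replicaStateNotFinalized) {
            return "INVALID_BLOCK";      // "Can't send invalid block ..."
        }
        if (blockFileNotExist) {
            return "BAD_BLOCK";          // "... block file doesn't exist, or is not accessible"
        }
        // How lengthTooShort is handled is not shown in the quoted snippet.
        return lengthTooShort ? "LENGTH_TOO_SHORT" : "OK";
    }

    public static void main(String[] args) {
        System.out.println(outcome(() -> { throw new ReplicaNotFoundException(); }));
        System.out.println(outcome(() -> { throw new FileNotFoundException("blk"); }));
        System.out.println(outcome(() -> { throw new IOException("permission denied"); }));
    }
}
```

The order of the catch clauses matters here because `FileNotFoundException` and `EOFException` are both subclasses of `IOException`, so the generic clause has to come last.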
The code shows that blockFileNotExist was set. I then looked at the method data.checkBlock and found this:
public void checkBlock(ExtendedBlock b, long minLength, ReplicaState state)
    throws ReplicaNotFoundException, UnexpectedReplicaStateException,
    FileNotFoundException, EOFException, IOException {
  final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
      b.getLocalBlock());
  if (replicaInfo == null) {
    throw new ReplicaNotFoundException(b);
  }
  if (replicaInfo.getState() != state) {
    throw new UnexpectedReplicaStateException(b,state);
  }
  if (!replicaInfo.getBlockFile().exists()) {
    throw new FileNotFoundException(replicaInfo.getBlockFile().getPath());
  }
  long onDiskLength = getLength(b);
  if (onDiskLength < minLength) {
    throw new EOFException(b + "'s on-disk length " + onDiskLength
        + " is shorter than minLength " + minLength);
  }
}
The method takes the ReplicaInfo from volumeMap, which means the map stores the metadata that the DataNode collected when it scanned its volumes. If this warning is logged, it seems to be because the ReplicaInfo object in the map doesn't hold a usable path for the block ID.

This looks contradictory: our disk contains the file, but the DataNode's scan can't find the block's path. I suspect that one of the steps goes wrong, so that the map's ReplicaInfo doesn't have the block's metadata, but I can't tell which step. Any help would be appreciated.
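One thing that may be worth ruling out is that the file exists on disk but is not visible to the DataNode process. `java.io.File.exists()` also returns false when the JVM's OS user is not allowed to traverse a parent directory, so a file found as root can still look missing to the DataNode. Below is a minimal sketch of a check that could be run as the same OS user as the DataNode; `BlockFileCheck` is a hypothetical helper and not part of Hadoop, and it only reports what `java.io.File` sees for the paths given on the command line.

```java
import java.io.File;

class BlockFileCheck {
    /** Reports what java.io.File sees for this path under the current OS user. */
    static String describe(File f) {
        return f.getPath()
            + " exists=" + f.exists()     // same call that checkBlock makes on the block file
            + " canRead=" + f.canRead()
            + " length=" + f.length();    // 0 if the file is missing or not accessible
    }

    public static void main(String[] args) {
        // Pass the block file and meta file paths as arguments.
        for (String path : args) {
            System.out.println(describe(new File(path)));
        }
    }
}
```

If this prints `exists=false` for a path that `ls` shows as root, the mismatch is more likely about permissions or mounts than about the DataNode's scan.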