Simulate a faulty block device with read errors?


I'm looking for an easier way to test my application against faulty block devices that generate I/O read errors when certain blocks are read. Using a physical hard drive with known bad blocks is a pain, and I would like to find a software solution if one exists.

I did find the Linux Disk Failure Simulation Driver which allows creating an interface that can be configured to generate errors when certain ranges of blocks are read, but it is for the 2.4 Linux Kernel and hasn't been updated for 2.6.

What would be perfect is a losetup and loop driver that can also be configured to return read errors when something attempts to read from a given set of blocks.




It's not a loopback device you're looking for, but rather device-mapper.

Use dmsetup to create a device backed by the "error" target. It will show up in /dev/mapper/<name>.

Page 7 of the Device mapper presentation (PDF) has exactly what you're looking for:

dmsetup create bad_disk << EOF
  0 8       linear /dev/sdb1 0
  8 1       error
  9 204791 linear /dev/sdb1 9
EOF

Or leave out the sdb1 parts and use the "error" target for the other block ranges as well (instead of linear mappings onto sdb1) to make a pure error disk.
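If you need more than one bad sector, writing the table by hand gets tedious. Here is a minimal Python sketch that generates the same kind of table; the function name build_error_table and its parameters are my own invention, and it assumes 512-byte sectors as dmsetup does.

```python
def build_error_table(device, total_sectors, bad_sectors):
    """Return dmsetup table text: linear onto `device`, except listed bad sectors."""
    lines = []
    pos = 0  # first sector not yet covered by a table line
    for bad in sorted(set(bad_sectors)):
        if not 0 <= bad < total_sectors:
            raise ValueError(f"sector {bad} is outside the device")
        if bad > pos:
            # Healthy run before the bad sector maps straight through.
            lines.append(f"{pos} {bad - pos} linear {device} {pos}")
        lines.append(f"{bad} 1 error")
        pos = bad + 1
    if pos < total_sectors:
        # Healthy tail after the last bad sector.
        lines.append(f"{pos} {total_sectors - pos} linear {device} {pos}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the three-line table shown above.
    print(build_error_table("/dev/sdb1", 204800, [8]), end="")
```

The output can be piped into dmsetup create bad_disk in place of the heredoc.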

See also The Device Mapper appendix from "RHEL 5 Logical Volume Manager Administration".

There's also a flakey target, a combination of linear and error that alternates between working normally and failing I/O. There is also a delay target that introduces intentional delays for testing.
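For reference, a sketch of what table lines for those two targets look like, based on my reading of the kernel's device-mapper documentation (flakey takes a device, an offset, and up/down intervals in seconds; delay takes a device, an offset, and a delay in milliseconds). The helper names are hypothetical, and you should check the documentation for your kernel version before relying on the argument order.

```python
def flakey_line(start, length, device, offset, up_seconds, down_seconds):
    """Table line: device works for up_seconds, then fails I/O for down_seconds."""
    return f"{start} {length} flakey {device} {offset} {up_seconds} {down_seconds}"


def delay_line(start, length, device, offset, delay_ms):
    """Table line: every I/O to the range is held back by delay_ms milliseconds."""
    return f"{start} {length} delay {device} {offset} {delay_ms}"


if __name__ == "__main__":
    # Whole example device: healthy for 10 s, failing for 5 s, repeating.
    print(flakey_line(0, 204800, "/dev/sdb1", 0, 10, 5))
    # Whole example device with a 500 ms delay on each request.
    print(delay_line(0, 204800, "/dev/sdb1", 0, 500))
```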


The easiest way to play with block devices is using nbd.

Download the userland sources from git:// and modify nbd-server.c to fail at reading or writing on whichever areas you want it to fail on, or to fail in a controllably random pattern, or basically anything you want.
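To illustrate the kind of check you would add, here is the logic sketched in Python rather than C. This is not nbd-server's actual code; faulty_read and bad_ranges are hypothetical names, and the real change would go wherever nbd-server.c services a read request.

```python
import errno
import io


def overlaps(offset, length, bad_ranges):
    """True if [offset, offset+length) touches any (start, end) byte range."""
    return any(offset < end and start < offset + length for start, end in bad_ranges)


def faulty_read(backing, offset, length, bad_ranges):
    """Read from a seekable file object, failing with EIO on the bad ranges."""
    if overlaps(offset, length, bad_ranges):
        raise OSError(errno.EIO, "simulated read error")
    backing.seek(offset)
    return backing.read(length)


if __name__ == "__main__":
    disk = io.BytesIO(bytes(4096))   # stand-in for the exported image
    bad = [(1024, 1536)]             # bytes 1024..1535 are "bad"
    print(len(faulty_read(disk, 0, 512, bad)))
    try:
        faulty_read(disk, 1000, 100, bad)
    except OSError as exc:
        print("failed with errno", exc.errno)
```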


It seems like Linux's built-in fault injection capabilities would be a good idea to use.
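As a rough sketch of what that involves: as far as I understand the kernel's fault-injection documentation, a kernel built with CONFIG_FAIL_MAKE_REQUEST exposes knobs under debugfs in fail_make_request/ plus a per-disk make-it-fail switch in sysfs. The Python helper below (a hypothetical name) just writes those files; the root directories are parameters so it can be tried against a scratch directory. Note that this fails I/O requests on the whole device rather than on specific blocks.

```python
from pathlib import Path


def enable_fail_make_request(disk, probability=100, interval=1, times=-1,
                             debugfs="/sys/kernel/debug", sysfs="/sys"):
    """Turn on block-layer fault injection for `disk` (e.g. "sdb"). Needs root."""
    knobs = Path(debugfs) / "fail_make_request"
    # Percent chance that an eligible request fails.
    (knobs / "probability").write_text(f"{probability}\n")
    # Consider every `interval`-th request for failure.
    (knobs / "interval").write_text(f"{interval}\n")
    # Maximum number of failures; -1 means no limit.
    (knobs / "times").write_text(f"{times}\n")
    # Opt this particular disk in.
    (Path(sysfs) / "block" / disk / "make-it-fail").write_text("1\n")


# Usage (as root, on a kernel with CONFIG_FAIL_MAKE_REQUEST and debugfs mounted):
#     enable_fail_make_request("sdb", probability=10)
```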



I would like to elaborate on Peter Cordes' answer.

In bash, set up an image on a loopback device with ext4, then write a file named binary.bin to it.


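imageName=faulty.img          ## image file; the python step below opens this name
mountDir=/path/to/your/mount  ## mount point; adjust to suit

sudo umount $mountDir ## make sure nothing is mounted here

dd if=/dev/zero of=$imageName bs=1M count=10
mkfs.ext4 $imageName
loopdev=$(sudo losetup -P -f --show $imageName); echo $loopdev
mkdir -p $mountDir
sudo mount $loopdev $mountDir
sudo chown -R $USER:$USER $mountDir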

echo "2ed99f0039724cd194858869e9debac4" | xxd -r -p > $mountDir/binary.bin

sudo umount $mountDir

In python3 (since bash struggles to deal with binary data), search the image for the magic binary data that was written to binary.bin:

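import binascii

# Read the whole image; at 10 MiB it fits comfortably in memory.
with open("faulty.img", "rb") as fd:
    s = fd.read()

search = binascii.unhexlify("2ed99f0039724cd194858869e9debac4")

# Byte offset of the first match (-1 if absent); rerun the find line to step to the next match.
beg = 0
find = s.find(search, beg); beg = find+1; print(find)

# 512-byte sector that holds the start of the match.
start_sector = find//512; print(start_sector)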
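If the pattern might occur more than once, or might straddle a sector boundary, it helps to collect every affected sector in one go. A small sketch along those lines; sectors_containing is a name I made up, and it assumes 512-byte sectors like the snippet above.

```python
def sectors_containing(image, pattern, sector_size=512):
    """Return the sorted sector numbers touched by any occurrence of pattern."""
    sectors = set()
    pos = image.find(pattern)
    while pos != -1:
        first = pos // sector_size
        last = (pos + len(pattern) - 1) // sector_size
        # A match that crosses a boundary touches more than one sector.
        sectors.update(range(first, last + 1))
        pos = image.find(pattern, pos + 1)
    return sorted(sectors)


if __name__ == "__main__":
    magic = bytes.fromhex("2ed99f0039724cd194858869e9debac4")
    # Synthetic image: the 16-byte pattern starts at byte offset 1000.
    image = bytes(1000) + magic + bytes(1000)
    print(sectors_containing(image, magic))
```

Each returned sector would then get its own "error" line in the device-mapper table.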

Then, back in bash, build the faulty block device and mount it:

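start_sector=    ## fill in the value of start_sector printed by python
size=$(($(wc -c $imageName|cut -d ' ' -f1)/512))  ## image size in 512-byte sectors
next_sector=$((start_sector+1))                   ## first sector after the bad one
len=$((size-next_sector))                         ## sectors left after the bad one

echo -e "0\t$start_sector\tlinear\t$loopdev\t0" > fault_config
echo -e "$start_sector\t1\terror" >> fault_config
echo -e "$next_sector\t$len\tlinear\t$loopdev\t$next_sector" >> fault_config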

cat fault_config | sudo dmsetup create bad_drive
sudo mount /dev/mapper/bad_drive $mountDir

Finally, we can test the faulty block device by reading the file:

cat $mountDir/binary.bin

which produces the error:

cat: /path/to/your/mount/binary.bin: Input/output error
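Since the goal is to test an application, the same check can be done from code. A minimal Python sketch (the function name is hypothetical) that reads a file and reports whether the read failed with EIO, which is the errno behind the "Input/output error" message:

```python
import errno


def read_hits_io_error(path):
    """Return True if reading `path` fails with EIO, False if it reads cleanly."""
    try:
        with open(path, "rb") as fd:
            fd.read()
    except OSError as exc:
        if exc.errno == errno.EIO:
            return True
        raise  # any other failure (missing file, permissions) is a real problem
    return False


# Usage, with the faulty device mounted as above:
#     read_hits_io_error("/path/to/your/mount/binary.bin")
```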

Clean up when you're done testing:

sudo umount $mountDir
sudo dmsetup remove bad_drive
sudo losetup -d $loopdev
rm fault_config $imageName