Why can't dd handle sparse files in shell scripts?

I have the following sparse file that I want to flash to an SD card:

647M -rw-------  1 root     root     4.2G Sep 21 16:53 make_sd_card.sh.xNws4e

As you can see, it takes ~647M on disk for an apparent size of 4.2G. If I flash it directly with dd, in my shell, it's really fast, ~6s:

$ time (sudo /bin/dd if=make_sd_card.sh.xNws4e of=/dev/mmcblkp0 conv=sparse; sync)
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 6.20815 s, 709 MB/s

real    0m6.284s
user    0m1.920s
sys     0m4.336s

But when I run the very same commands from a shell script, it behaves as if it were copying all the zeroes and takes much longer (~2m10s):

$ time sudo ./plop.sh ./make_sd_card.sh.xNws4e
+ dd if=./make_sd_card.sh.xNws4e of=/dev/mmcblk0 conv=sparse
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 127.984 s, 34.4 MB/s
+ sync

real    2m9.885s
user    0m3.520s
sys     0m15.560s

If I watch the Dirty counter in /proc/meminfo, I can see that it climbs much higher when running dd from the shell script than directly from the shell.

My shell is bash and, for the record, the script is:

#!/bin/bash
set -xeu
dd if=$1 of=/dev/mmcblk0 conv=sparse bs=512
sync

[EDIT] I'm resurrecting this topic because a developer I work with has found two commands, bmap_create and bmap_copy, which seem to do exactly what I was clumsily trying to achieve with dd. In Debian, they are part of the bmap-tools package. With them, it takes 1m2s to flash a 4.1GB sparse SD image with a real size of 674MB, whereas it takes 6m26s with dd or cp.

Best answer

This difference is caused by a typo in the non-scripted invocation, which did not actually write to your memory card. There is no difference in dd behavior between scripted and interactive invocation.


Keep in mind what a sparse file is: it's a file on a filesystem that stores metadata tracking which blocks hold data at all, so blocks that contain only zeroes are never allocated any storage on disk whatsoever.

This concept -- of a sparse file -- is specific to files. You can't have a sparse block device.
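To see the gap between apparent size and allocated size for yourself, here is a minimal sketch. It assumes GNU coreutils (truncate, and stat with -c) and a filesystem that supports sparse files; the temporary directory and the name sparse.img are made up for illustration.

```shell
#!/bin/bash
# Create a file with a 64 MiB apparent size without writing any data blocks.
tmpdir=$(mktemp -d)
sparse="$tmpdir/sparse.img"   # hypothetical file name, for illustration only
truncate -s 64M "$sparse"

# %s = apparent size in bytes; %b = blocks allocated; %B = size of each %b block
apparent=$(stat -c %s "$sparse")
allocated=$(( $(stat -c %b "$sparse") * $(stat -c %B "$sparse") ))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

rm -rf "$tmpdir"
```

On a filesystem with sparse-file support, the allocated figure should come out far smaller than the apparent one, which is the same effect as the 647M versus 4.2G in the question's listing.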


The distinction between your two commands is that one of them (the fast one) has a typo (mmcblkp0 instead of mmcblk0), so it refers to a block device name that doesn't exist. Thus, dd creates a regular file at that path. Files can be sparse, so it creates a sparse file, and creating a sparse file is fast.

The other one, without the typo, writes to the block device. Block devices can't be sparse. Thus, it always takes the full execution time to run.
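One way to catch this kind of typo before dd runs is to check that the target really is a block device. The following is only a sketch: the function name require_block_device is hypothetical, and the dd line is left commented out so that nothing is written.

```shell
#!/bin/bash
# Refuse to continue unless the target exists and is a block device.
# require_block_device is a hypothetical helper, not part of any package.
require_block_device() {
    local target=$1
    if [ ! -b "$target" ]; then
        echo "error: $target is not a block device" >&2
        return 1
    fi
}

# Intended use in a flashing script (commented out to avoid side effects):
# require_block_device /dev/mmcblk0 && dd if="$1" of=/dev/mmcblk0 conv=sparse
```

With the mistyped name, the check fails whether the path does not exist yet or is a stray regular file left behind by an earlier run, so the script stops instead of silently writing a file under /dev. If such a file was created, it is worth deleting.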