tl;dr: I want to see/get the filenames that the reported block numbers on dmesg are part of, when vm.block_dump=1
example dmesg: bash(13515): READ block 5434824 on xvda3 (32 sectors)
When eg. sudo sysctl -w vm.block_dump=1
or eg. echo '1' | sudo tee /proc/sys/vm/block_dump
then "Linux reports all disk read and write operations that take place, and all block dirtyings done to files. [...] The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg". When you use block_dump and your kernel logging level also includes kernel debugging messages, you probably want to turn off klogd, otherwise the output of block_dump will be logged, causing disk activity that is not normally there." (quote from here)
The "block dirtyings" are not an issue, eg.
[ 3140.559675] systemd-journal(291): dirtied inode 399135 (system.journal) on xvda3
I can see its filename like this:
$ echo -e 'open /dev/xvda3\n ncheck 399135' | sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: ncheck 399135
Inode Pathname
399135 /var/log/journal/12c5e521101c444594b96b53751551a8/system.journal
The problem is with "Linux reports all disk read and write operations that take place" (to quote from the above) as they are reported in blocks, eg.
[ 3140.376827] kworker/u24:3(21616): WRITE block 11037768 on xvda3 (8 sectors)
[ 3140.724725] bash(13515): READ block 5434824 on xvda3 (32 sectors)
[ 3140.725483] date(13515): READ block 5434896 on xvda3 (160 sectors)
[ 3140.728946] sed(13519): READ block 5143680 on xvda3 (32 sectors)
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
[ 3140.804803] qubes-gui(522): READ block 5179952 on xvda3 (16 sectors)
[ 3140.806519] Xorg(599): READ block 7420192 on xvda3 (176 sectors)
[ 3140.810348] InputThread(613): READ block 7418560 on xvda3 (112 sectors)
[ 3140.815866] at-spi2-registr(812): READ block 5654512 on xvda3 (8 sectors)
[ 3140.816860] xdg-desktop-por(888): READ block 5795168 on xvda3 (8 sectors)
[ 3140.818716] gnome-terminal-(865): READ block 5804672 on xvda3 (16 sectors)
[ 3141.064524] sed(13531): READ block 3446048 on xvda3 (16 sectors)
[ 3141.130808] systemd(571): READ block 4744136 on xvda3 (184 sectors)
The kernel code responsible for showing these kind of messages can be seen here: https://elixir.bootlin.com/linux/v4.18.5/source/block/blk-core.c#L2542
None of those blocks yield any inode numbers, using this:
$ echo -e 'open /dev/xvda3\n icheck 11037768' |sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 11037768
Block Inode number
11037768 <block not found>
Instead of <block not found>
above I should get an inode number which I can then use with the previous echo to see the file name.
What I tried: just in case the block number is sector size(512 bytes) multiple and I know that ext4 block size is 4096 bytes(sudo blockdev --getbsz /dev/xvda
), I also tried(using the result of multiplication for): 11037768*2, 11037768*4 and 11037768*8 with the same result: <block not found>
What am I missing here? Are those blocks pointing to directory entries or something that's not a file name? Even so, shouldn't debugfs still yield an inode number?
Is there a better way to get to the file name?
EDIT: The block number that icheck
(inside debugfs
) expects are numbers like what debugfs's stat
reports under EXTENTS
, for example any number in range 2172716-2172721
seen below as:
$ sudo debugfs -R "stat /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py" /dev/xvda3
debugfs 1.44.2 (14-May-2018)
Inode: 550529 Type: regular Mode: 0644 Flags: 0x80000
Generation: 1781055959 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 22179
File ACL: 0
Links: 1 Blockcount: 48
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5b6ec29d:1b2e0200 -- Sat Aug 11 13:03:57 2018
atime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
mtime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
crtime: 0x5b6ec29d:1af0f900 -- Sat Aug 11 13:03:57 2018
Size of extra inode fields: 32
EXTENTS:
(0-5):2172716-2172721
Another way:
$ filefrag -s -v /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py
Filesystem type is: ef53
File size of /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py is 22179 (6 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 5: 2172716.. 2172721: 6: last,eof
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py: 1 extent found
Now the question remains: How do those device block numbers (the block numbers reported for xvda3 in dmesg) relate to these physical offsets ?
EDIT2: I've just confirmed that these physical offset numbers are the same as block device numbers (just, apparently not the same as the ones reported on dmesg); the following shows me the last block of the above file and I can confirm it being the same as when viewing the file with vim
:
$ sudo dd bs=4096 skip=2172721 count=1 if=/dev/xvda3 | hexdump -C
I've tested this under a 4.18.5 kernel inside a Qubes OS R4.0 Fedora 28 AppVM. (I can recompile a custom kernel with custom .config
/patches if necessary - suggestions welcome)
The block numbers reported on dmesg when
vm.block_dump=1
are sector numbers -512
bytes per block - but the block numbers that thedebugfs
command expects for itsicheck
command are usually4096
bytes per block for ext4, so all that is required is a division by8
.While block size is usually 4096 bytes, you have to be sure, so get the block size like this:
While this value is usually 4096, it can be different in certain cases like eg.
mkfs.ext4
with-b
can create an ext4 filesystem with a different block size; vfat has 512 byte block size.To get the divider for the division, from the block size, all in one go (in bash):
Let's use this line as input:
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
Get the block number for
icheck
by dividing the above by8
:I used
bc -l
to make sure that I didn't typo the number(if it end with.00000000000000000000
I most likely didn't)Get the path name by using
debugfs
:I am unsure if the block numbers reported in dmesg can ever be multiples of values other than 512 bytes per sector. For example, are they still 512 bytes per block if the underlying disk has 4096 bytes per sector physically? If I were to guess, I'd say it's safe to assume they're always 512 bytes per block.
Here's the script version of the above:
Sample usage:
Here's a script to process an entire dmesg log:
Sample output: