I want to know each symbol's size in an ELF executable or dynamic library, and I assume the total symbol size and the size of everything else add up to the file size.
From the size command I can see every section's size, but they don't add up to the file size, and I need to know what's missing here.
file libfoo.so size:
-rwxr-x---. 1 root root 20080 Dec 2 11:32 libfoo.so
size -A -d libfoo.so
.note.gnu.build-id 36 728
.gnu.hash 36 768
.dynsym 192 808
.dynstr 233 1000
.gnu.version 16 1234
.gnu.version_r 32 1256
.rela.dyn 168 1288
.rela.plt 72 1456
.init 27 4096
.plt 64 4128
.text 232 4192
.fini 13 4424
.rodata 16 8192
.eh_frame_hdr 28 8208
.eh_frame 100 8240
.init_array 8 15784
.fini_array 8 15792
.data.rel.ro 8 15800
.dynamic 544 15808
.got 32 16352
.got.plt 48 16384
.bss 8 16432
.comment 46 0
.GCC.command.line 101 0
.gnu.build.attributes 288 24632
.debug_aranges 48 0
.debug_info 1744 0
.debug_abbrev 388 0
.debug_line 161 0
.debug_str 873 0
.debug_line_str 348 0
Total 5966
After identifying all the sections, I can analyze the main ones, such as .text and .data, with nm.
Your headline question is:
You tried something and it didn't work, so your knock-on question is:
The short answer to the headline question is that you can't decompose an ELF file size into the sizes of the sections or symbols, because:-
There's more to an ELF file than sections and symbols (and the symbols are in sections anyway).
Most frequently, the size of a section as specified in an ELF file is the size of the section's runtime memory requirement. In that case the section's size may be less than, equal to or greater than the number of bytes of file storage used for the section.
Those points will be fleshed out as we address the knock-on question.
What's missing? (1)
Some of what's missing is the stuff that size -A shouldn't count, because it isn't sections.
An ELF executable or shared library does not consist simply of sections. In addition to sections it contains:-
An ELF File Header, a structure defining global properties of the file.
A Program Header Table, an array of Program Header structures each of which defines the properties of one memory segment into which one or more sections are mapped.
A Section Header Table, an array of Section Header structures each of which defines the properties of one linkage section of the file.
The file may also contain padding bytes that are not part of any section or header.
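As a rough illustration of where these header structures sit, here is a minimal Python sketch that unpacks a 64-bit little-endian ELF file header and reports the file regions occupied by the three kinds of header. The helper name header_regions is hypothetical, and the sample header is synthetic: its counts (11 program headers, 30 section headers) are assumptions chosen for illustration, not values read from a real file.

```python
import struct

# Elf64_Ehdr as laid out in <elf.h>, assuming a little-endian 64-bit file.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")


def header_regions(data):
    """Return {region: (file offset, size in bytes)} for the ELF headers.

    header_regions is a hypothetical helper name used only in this sketch.
    """
    (ident, _type, _machine, _version, _entry, e_phoff, e_shoff, _flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     _shstrndx) = EHDR.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    return {
        "file header": (0, e_ehsize),
        "program header table": (e_phoff, e_phentsize * e_phnum),
        "section header table": (e_shoff, e_shentsize * e_shnum),
    }


# Synthetic header for illustration only: 11 program headers of 56 bytes
# right after the 64-byte file header, 30 section headers of 64 bytes at
# offset 14192. The counts are assumptions, not read from a real file.
sample = EHDR.pack(b"\x7fELF\x02\x01\x01" + bytes(9), 3, 62, 1, 0,
                   64, 14192, 0, 64, 56, 11, 64, 30, 29)
regions = header_regions(sample)
for name, (offset, size) in regions.items():
    print(f"{name}: offset {offset}, {size} bytes")
```

On a real file one would pass the first 64 bytes read in binary mode. A 32-bit or big-endian file would need a different format string.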
The file, program and section header structures are respectively defined in <elf.h> by ElfN_Ehdr, (Elf32_Phdr|Elf64_Phdr) and (Elf32_Shdr|Elf64_Shdr). These structures are laid out in the file on boundaries that satisfy their natural alignment as per __alignof__(ElfN_Ehdr) etc.
What's missing? (2)
The rest of what's missing is the sections that size -A should count, but doesn't.
The size command, like objdump, is not a wholly dependable parser of modern ELF files. They do not consider all the sections that may actually exist in an input ELF, and size -A makes the same omissions as objdump -h. This is because both of these utilities (and others in binutils) rely on libbfd, the GNU Binary File Descriptor Library, to parse ELF files, and libbfd is not a wholly dependable parser of ELF files [1]. You can count on size or objdump to recognize sections in an ELF binary that will be memory-mapped. (At least I haven't yet seen the contrary.) But for sections that have no memory footprint and just support the work of tools, your mileage may vary. Moral: To investigate ELF binaries, prefer readelf over libbfd-based tools.
And in any case...
The size of a section is not the measure of its file storage.
A section has an alignment property: either no alignment or 2, 4, 8 ... byte alignment. A segment also has an alignment property, which may again be none, 2, 4, 8 ... or page alignment for loadable segments. In the file layout, a section N will occupy a region that is padded as necessary to make section N + 1 begin at a correctly aligned byte. That alignment constraint may stem either from the alignment of section N + 1 within the same segment or from the alignment of the next segment, in which section N + 1 comes first. Since some segments are page-aligned, this means that the region occupied by a section might be padded with up to ($ getconf PAGE_SIZE) bytes - typically 4K.
The size of a section may even be larger than the size of the file region it occupies, or indeed larger than the whole file. The section header defining a section may specify that its size = N, that memory is to be allocated for it at runtime, and that it uses 0 bytes on disk. The .bss section - for uninitialized static symbols - is like that: static char arr[1024] contributes 1K to the size of .bss but nothing to the size of files into which it is compiled.
Optional reading: How to account for the size of a shared library using readelf, and how size -A fails
Let's make a shared library libfoo.so:
This example is contrived to have a large amount of uninitialized static data.
Here's the size of libfoo.so:
Let's see how size -A fares:
At 52217 bytes, size -A makes the sum of the sections more than 3 times the size of the file. So it should be, because of those 50032 bytes assigned to .bss that take no space in the file. Let's subtract them then. 52217 - 50032 = 2185. But that is 16112 - 2185 = 13927 bytes smaller than the actual file. This is your puzzle.
Note that every section listed is memory-mapped (it has a non-0 addr), except the .comment section. How many sections are listed? -
Turning to readelf
Here's the ELF file header of libfoo.so:
This tells us that:
Sandwiched between the program headers and the section headers we have 14192 - 680 = 13512 bytes still to account for. That should be for the sections.
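The same bookkeeping can be spelled out as a small Python sketch. The three byte figures are the ones quoted in this answer; the 64-byte entry size is sizeof(Elf64_Shdr), and the count of 30 section headers is what readelf reports for this file. The variable names are only for illustration.

```python
FILE_SIZE = 16112        # bytes, the size of libfoo.so on disk
PHDRS_END = 680          # first byte after the program header table
SHDRS_START = 14192      # file offset of the section header table
SHDR_ENTRY_SIZE = 64     # sizeof(Elf64_Shdr)
SHDR_COUNT = 30          # section headers, including the null one

# Region between the two header tables: section contents plus padding.
section_region = SHDRS_START - PHDRS_END

# The section header table itself runs to the end of the file.
shdr_table_size = SHDR_COUNT * SHDR_ENTRY_SIZE
accounted = PHDRS_END + section_region + shdr_table_size

print(section_region, shdr_table_size, accounted == FILE_SIZE)
# prints: 13512 1920 True
```

So the headers and the region between them cover the whole file with nothing left over; what remains is to explain the 13512 bytes in the middle.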
Here are the program header details:
We don't need to make much of these details, but they serve to explain some of the padding we'll find later amongst the sections.
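For a feel of how alignment turns into padding, here is a minimal Python sketch of the rounding involved. It assumes the next item must begin at a file offset that is a multiple of a given alignment, which is a simplification of what a linker does; align_up and padding_after are hypothetical helper names. The two calls use figures from the section listing discussed further down: .fini (offset 0x117c, size 0xd) is followed by a page-aligned section, and the last section ends at byte 13917 + 269 ahead of the 8-byte-aligned section header table.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (0 or 1 = none)."""
    if alignment <= 1:
        return offset
    return (offset + alignment - 1) // alignment * alignment


def padding_after(end_offset, next_alignment):
    """Bytes of padding between end_offset and the next aligned offset."""
    return align_up(end_offset, next_alignment) - end_offset


PAGE_SIZE = 4096  # assumed here; check with `getconf PAGE_SIZE`

fini_end = 0x117c + 0xd                        # .fini offset + size
print(padding_after(fini_end, PAGE_SIZE))      # 3703
print(padding_after(13917 + 269, 8))           # 6
```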
And here are the section details:
Note that readelf lists 30 sections, not the 26 reported by size. Per the section headers, section 0 is of type NULL - no name, no address, no size. This is mandatory. Reasonably enough, the null section was ignored by size -A. Section 1, .note.gnu.property, is the first non-null section and was reported by size. The final sections 27 - 29, .symtab, .strtab and .shstrtab, were also ignored by size. They have no memory footprint - Address = 0 - but they do occupy respectively 0x330, 0x2ed and 0x10d bytes in the file. That's 1834 bytes of non-null sections that size -A disregarded.
Section 1 starts at offset 0x2a8 = byte 680 = end of program headers, so the sections start right after the program headers.
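To see every section, including those with no address, one can walk the section header table directly instead of going through libbfd. Below is a minimal Python sketch for little-endian ELF64 files only. The helper name section_file_footprints is hypothetical, and the handling is deliberately simplified: it does not resolve section names or cope with extended section numbering. It reports 0 on-disk bytes for NOBITS sections, whatever their nominal size.

```python
import struct

# Elf64_Ehdr and Elf64_Shdr as laid out in <elf.h> (little-endian assumed).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
SHDR = struct.Struct("<IIQQQQIIQQ")
SHT_NOBITS = 8  # section type that occupies no bytes in the file


def section_file_footprints(data):
    """Return one (index, addr, offset, size, bytes_on_disk) per section.

    `data` is the whole file as bytes. Hypothetical helper for this sketch.
    """
    fields = EHDR.unpack_from(data, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]

    rows = []
    for index in range(e_shnum):
        (_name, sh_type, _flags, sh_addr, sh_offset, sh_size,
         _link, _info, _addralign, _entsize) = SHDR.unpack_from(
            data, e_shoff + index * e_shentsize)
        on_disk = 0 if sh_type == SHT_NOBITS else sh_size
        rows.append((index, sh_addr, sh_offset, sh_size, on_disk))
    return rows


# Usage sketch (the path is a placeholder):
# with open("libfoo.so", "rb") as f:
#     for row in section_file_footprints(f.read()):
#         print(row)
```

Summing the last column over all rows gives the bytes of section content actually stored in the file; the difference from the region between the header tables is padding.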
The last section, 29, starts at offset 0x365d = byte 13917. It is 0x10d = 269 bytes in size, so it continues until byte 13917 + 269 = 14186. Then 6 bytes of padding take us to byte 14192 = start of section headers, which requires 8-byte alignment (__alignof__(Elf64_Shdr) == 8 on my machine). It's easily observed that various other sections are followed by some padding. That's either to satisfy the alignment requirement of the next section, as specified in the Al column - e.g. the .rodata section - or else to let the next one be aligned at the start of the next memory segment, per the program header details (Section to Segment mapping) - e.g. the .fini section. That .fini section starts at offset 0x117c and has size 0xd. The next section, .rodata, is at page-aligned offset 0x2000; so there are 3703 bytes of padding in .fini's chunk of the file.
Note the .bss section. It starts at offset 0x3018 and has size 0xc370 = 50032. That's our static char a[50000];. It has section type NOBITS, meaning no file data, and accordingly the next section, .comment, also starts at 0x3018.
So, the 16112 bytes of libfoo decompose into its ELF components as follows:
Further Reading
[1] libbfd "is a package which allows applications to use the same routines to operate on object files whatever the object file format. A new object file format can be supported simply by creating a new BFD back end and adding it to the library." A weakness of this architecture is that the BFD front-end embodies an abstraction of all the supported back-end formats that elides ill-fitting features of some of them.