I run a CI server which I use to build a custom Linux kernel. The CI server is not powerful and has a time limit of 3h per build. To work within this limit, I had the idea to cache kernel builds using ccache. My hope was that I could create a cache once for every minor version release and reuse it for the patch releases, e.g. I have a cache I made for 4.18 which I want to use for all 4.18.x kernels.
After removing the build timestamps, this works great for the exact kernel version I am building for. For the 4.18 kernel referenced above, building that on the CI gives the following statistics:
$ ccache -s
cache directory
primary config
secondary config (readonly) /etc/ccache.conf
stats zero time Thu Aug 16 14:36:22 2018
cache hit (direct) 17812
cache hit (preprocessed) 38
cache miss 0
cache hit rate 100.00 %
called for link 3
called for preprocessing 29039
unsupported code directive 4
no input file 2207
cleanups performed 0
files in cache 53652
cache size 1.4 GB
max cache size 5.0 GB
A cache hit rate of 100% and an hour to complete the build: fantastic stats, and as expected.
Unfortunately, when I try to build 4.18.1, I get:
cache directory
primary config
secondary config (readonly) /etc/ccache.conf
stats zero time Thu Aug 16 10:36:22 2018
cache hit (direct) 0
cache hit (preprocessed) 233
cache miss 17658
cache hit rate 1.30 %
called for link 3
called for preprocessing 29039
unsupported code directive 4
no input file 2207
cleanups performed 0
files in cache 90418
cache size 2.4 GB
max cache size 5.0 GB
That's a 1.30% hit rate, and the build time reflects this poor performance. And that is from only a single patch version change.
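As a sanity check on the two reports, the hit rate can be recomputed from the raw counters. This is a minimal sketch that assumes the rate is hits divided by hits plus misses, which is consistent with both outputs shown here; `hit_rate` is a hypothetical helper, not part of ccache.

```python
def hit_rate(direct_hits, preprocessed_hits, misses):
    """Return the hit rate in percent, or 0.0 if nothing was compiled.

    Assumption: rate = hits / (hits + misses), which matches the numbers
    printed by `ccache -s` in the two reports above.
    """
    hits = direct_hits + preprocessed_hits
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# Counters copied from the 4.18 and 4.18.1 builds.
print(f"4.18:   {hit_rate(17812, 38, 0):.2f} %")      # 100.00 %
print(f"4.18.1: {hit_rate(0, 233, 17658):.2f} %")     # 1.30 %
```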
I would have expected the caching performance to degrade over time but not to this extent, so my only thought is that there is more non-determinism than simply the timestamp. For example, are most/all of the source files including the full kernel version string? My understanding is that something like that would break the caching completely. Is there a way to make the caching work as I'd like it to or is it impossible?
There is an include/generated/uapi/linux/version.h header (generated in the top Makefile, https://elixir.bootlin.com/linux/v4.16.18/source/Makefile) which includes the exact kernel version as a macro.
So, version.h for Linux 4.16.18 will be generated with LINUX_VERSION_CODE set to 266258, which is (4 << 16) + (16 << 8) + 18 = 0x41012.
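The arithmetic above can be checked with a short sketch. It also shows that 4.18 and 4.18.1 produce different version codes, so the generated header differs between the two trees. The helper names and the exact layout of the rendered header are assumptions for illustration; only the formula and the LINUX_VERSION_CODE name come from the text above.

```python
def kernel_version(a, b, c):
    """Pack version, patchlevel and sublevel as (a << 16) + (b << 8) + c."""
    return (a << 16) + (b << 8) + c


def render_version_h(a, b, c):
    """Render an approximation of the generated version.h.

    The exact text the kernel Makefile emits is an assumption here; what
    matters is that the numeric value changes with every sublevel.
    """
    return (
        f"#define LINUX_VERSION_CODE {kernel_version(a, b, c)}\n"
        "#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))\n"
    )


print(kernel_version(4, 16, 18), hex(kernel_version(4, 16, 18)))  # 266258 0x41012
print(kernel_version(4, 18, 0), kernel_version(4, 18, 1))         # 266752 266753
print(render_version_h(4, 18, 0) == render_version_h(4, 18, 1))   # False
```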
Later, for example when building modules, there has to be a way to read the LINUX_VERSION_CODE macro value; see https://www.tldp.org/LDP/lkmpg/2.4/html/lkmpg.html (section 4.1.6, Writing Modules for Multiple Kernel Versions).
How is version.h included? The sample module includes <linux/kernel.h>, <linux/module.h> and <linux/modversions.h>, and one of these files probably includes the global version.h indirectly. Most or even all kernel sources will include version.h.
While your build timestamps were still being compared, version.h may have been regenerated, which defeats ccache. With timestamps ignored, LINUX_VERSION_CODE is the same only for exactly the same Linux kernel version, and it changes with the next patchlevel.
Update: Check the gcc -H output of some kernel object compilation; there will be another header with a full kernel version macro definition, for example include/generated/utsrelease.h (the UTS_RELEASE macro) or include/generated/autoconf.h (CONFIG_VERSION_SIGNATURE).
Or even run gcc -E preprocessing of the same kernel object compilation for two patchlevels and compare the generated text. With the simplest Linux module I have -include ./include/linux/kconfig.h directly on the gcc command line, and it includes include/generated/autoconf.h (but this is not visible in the -H output; is that a bug or a feature of gcc?).
https://patchwork.kernel.org/patch/9326051/
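The gcc -E comparison suggested above could be scripted roughly as follows. This is only a sketch: the two sample texts are made up to stand in for real preprocessed output, the helper names are hypothetical, and SHA-256 is used purely to show that any content hash changes when one expanded macro changes (it is not ccache's actual hash or key layout).

```python
import difflib
import hashlib


def strip_linemarkers(preprocessed):
    """Drop '# <line> "<file>"' linemarkers and blank lines from gcc -E text."""
    return [line for line in preprocessed.splitlines()
            if line.strip() and not line.startswith("# ")]


def changed_lines(old, new):
    """Return removed ('-') and added ('+') lines between two preprocessed texts."""
    a, b = strip_linemarkers(old), strip_linemarkers(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            out += ["-" + line for line in a[i1:i2]]
            out += ["+" + line for line in b[j1:j2]]
    return out


# Made-up stand-ins for `gcc -E` output of one object in two kernel trees.
OLD = '''# 1 "demo.c"
# 1 "./include/generated/utsrelease.h" 1
const char *release = "4.18.0";
int answer(void) { return 42; }
'''
NEW = OLD.replace('"4.18.0"', '"4.18.1"')

for line in changed_lines(OLD, NEW):
    print(line)

# A single changed string literal is enough to change a content hash.
print(hashlib.sha256(OLD.encode()).hexdigest()
      == hashlib.sha256(NEW.encode()).hexdigest())  # False
```

On a real tree, the two inputs would be the outputs of the same compile command run with -E in each source directory; an empty result from `changed_lines` means that object's preprocessed text did not change between the patchlevels.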
It actually does: https://elixir.bootlin.com/linux/v4.16.18/source/Makefile
LINUXINCLUDE is exported to the environment and used in source/scripts/Makefile.lib to define compiler flags: https://elixir.bootlin.com/linux/v4.16.18/source/scripts/Makefile.lib