I have a C++ program that I'm building with Clang 3.9's profile-guided optimization feature. Here's what's supposed to happen:
- I build the program with instrumentation enabled.
- I run that program, creating a file with profile data: `prof.raw`.
- I use `llvm-profdata` to convert `prof.raw` to a new file, `prof.data`.
- I create a new build of that same program, with a few changes:
  - When compiling each .cpp file to a .o file, I use the compiler flag `-fprofile-use=prof.data`.
  - When linking the executable, I also specify `-fprofile-use`.
I have a GNU Makefile for this, and it works great. My problem arises now that I'm trying to port that Makefile to CMake (3.7, but I could upgrade). I need the solution to work with (at least) the Unix Makefiles generator, but ideally it would work for all generators.
In CMake, I've defined two executable targets: `foo-gen` and `foo-use`:
- When `foo-gen` is executed, it creates the `prof.raw` file.
- I use `add_custom_command` to create a rule to create `prof.data` from `prof.raw`.
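A minimal sketch of that setup might look like the following. The source list, the file locations, and the exact instrumentation flag (`-fprofile-instr-generate`) are assumptions made for illustration; the description above only says that instrumentation is enabled.

```cmake
cmake_minimum_required(VERSION 3.7)
project(foo CXX)

# Hypothetical source list and paths; the real project's files are not shown.
set(FOO_SOURCES main.cpp)
set(PROF_RAW  "${CMAKE_BINARY_DIR}/prof.raw")
set(PROF_DATA "${CMAKE_BINARY_DIR}/prof.data")

# Instrumented build. The flag spelling is an assumption.
add_executable(foo-gen ${FOO_SOURCES})
target_compile_options(foo-gen PRIVATE "-fprofile-instr-generate=${PROF_RAW}")
set_target_properties(foo-gen PROPERTIES LINK_FLAGS "-fprofile-instr-generate")

# Running foo-gen writes prof.raw.
add_custom_command(
  OUTPUT "${PROF_RAW}"
  COMMAND foo-gen
  DEPENDS foo-gen
  COMMENT "Running foo-gen to collect prof.raw")

# llvm-profdata converts prof.raw into prof.data.
add_custom_command(
  OUTPUT "${PROF_DATA}"
  COMMAND llvm-profdata merge -output=${PROF_DATA} ${PROF_RAW}
  DEPENDS "${PROF_RAW}"
  COMMENT "Merging prof.raw into prof.data")

# Optimized build that consumes prof.data.
add_executable(foo-use ${FOO_SOURCES})
target_compile_options(foo-use PRIVATE "-fprofile-use=${PROF_DATA}")
set_target_properties(foo-use PROPERTIES LINK_FLAGS "-fprofile-use")

# Missing piece: nothing here tells CMake that foo-use's object files
# depend on prof.data, so the two custom commands are never triggered.
```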
My problem is that I can't figure out how to tell CMake that each of the object files depended upon by `foo-use` has a dependency on the file `prof.data`.
The most-promising idea I had was to (1) find a way to enumerate all of the `.o` files upon which `foo-use` depends, and then (2) iterate over each of those `.o` files, calling `add_dependencies` for each one.

The problem with this approach is I can't find an idiomatic way, in my CMakeLists.txt file, to enumerate the list of object files upon which an executable depends. This might be an open problem with CMake.
I also considered using `set_source_files_properties` to set the `OBJECT_DEPENDS` property on each of my `.cpp` files used by `foo-use`, adding `prof.data` to that property's list.

The problem with this (AFAICT) is that each of my `.cpp` files is used to create two different `.o` files: one for `foo-gen` and one for `foo-use`. I want the `.o` files that get linked into `foo-use` to have this compile-time dependency on `prof.data`; but the `.o` files that get linked into `foo-gen` must not have a compile-time dependency on `prof.data`.

And AFAIK, `set_source_files_properties` doesn't let me set the `OBJECT_DEPENDS` property to have one of two values, contingent on whether `foo-gen` or `foo-use` is the current target of interest.
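To make the conflict concrete, here is a small sketch of the `OBJECT_DEPENDS` attempt. The source list and the `prof.data` location are hypothetical placeholders, and both targets are assumed to be defined in the same directory.

```cmake
cmake_minimum_required(VERSION 3.7)
project(foo CXX)

# Hypothetical source list and profile location.
set(FOO_SOURCES main.cpp)
set(PROF_DATA "${CMAKE_BINARY_DIR}/prof.data")

# Both executables compile the same .cpp files.
add_executable(foo-gen ${FOO_SOURCES})
add_executable(foo-use ${FOO_SOURCES})

# OBJECT_DEPENDS is attached to the source file, not to a
# (source, target) pair, so this line affects the objects of
# foo-gen as well as those of foo-use.
set_source_files_properties(${FOO_SOURCES} PROPERTIES
  OBJECT_DEPENDS "${PROF_DATA}")
```

Because `prof.data` is itself produced by running `foo-gen`, making `foo-gen`'s objects depend on it would create a circular dependency, which is why the property cannot simply be set for both targets.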
Any suggestions for a clean(ish) way to make this work?
The standard way of doing this is to go through the configure+build workflow twice.
The first time, you create the initial build tree and build with both `CMAKE_C_FLAGS` and `CMAKE_CXX_FLAGS` set to a value containing `-fprofile-generate`. Then you run the unit tests you're interested in running. For instance, you might have a CTest label named `performance`.

Then the second time, you reconfigure in place with `-fprofile-use` instead. Then you can build, run all of your tests and install or package.

This can all be orchestrated without too much hassle using CMake presets, which I'll show in a minute.
First off, here's the `CMakeLists.txt`.

This is the easy part. My `main.cpp` is just a basic "Hello, world!", provided here for ease of copy/paste.

Now we can write the `CMakePresets.json`. I'd consult the documentation before reading this.

Basically what happens here is that we create two configure presets, `pgo-gen` and `pgo-use`, both of which have a common `base` they inherit from. This is just to show a technique for incrementally changing parts of cache variables using the environment as a side-channel.

Then we create two corresponding build presets and a test preset for `pgo-gen` that only runs the `performance`-labeled test cases.

Finally, we create two workflows to drive this. The end-to-end commands are:
And if I delete the build directory and try `pgo-use` without `gen`:

Then I'm missing the profile data as expected.
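Because the PGO flags arrive through `CMAKE_C_FLAGS` and `CMAKE_CXX_FLAGS`, a `CMakeLists.txt` for this approach needs nothing PGO-specific. What follows is only a sketch under that assumption: the project name, the `hello` target, and the `hello-perf` test are hypothetical names, not taken from the original listing.

```cmake
# Workflow presets require CMake 3.25 or newer.
cmake_minimum_required(VERSION 3.25)
project(pgo-example LANGUAGES CXX)

enable_testing()

# Hypothetical target built from the "Hello, world!" main.cpp.
add_executable(hello main.cpp)

# A test carrying the `performance` label, so a test preset (or
# `ctest -L performance`) can select it during the profile-generating pass.
add_test(NAME hello-perf COMMAND hello)
set_tests_properties(hello-perf PROPERTIES LABELS performance)
```

Keeping the flags out of the project file means the same source tree serves both passes; only the cache values change between the `pgo-gen` and `pgo-use` configurations.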
I initially wanted to write a single `pgo` workflow that consisted of all five steps, but sadly CMake currently doesn't allow two `configure` steps in a workflow. Why? I don't know. I'm going to open a bug report.