My goal is to record the number of processor instructions executed by a given binary program through the duration of its run. While it's easy to get the actual machine code from the source code (through gdb or any other disassembler), this does not take into account function calls and branches within the program that cause instructions to be executed more than once or skipped altogether.
Is there a straightforward solution to this?
This is very hardware specific, but most processors offer a facility that counts the exact number of machine instructions (and other events) that have flowed through them. That's how profilers work to capture things like cache misses: by querying these internal registers.
The PAPI library provides calls to query this data on a variety of major processors. If you're on Linux+x86, PerfSuite gives you some more high-level tools which may be easier to start with.
Intel has a monitor app you can use to watch the chip's internal counters in realtime, and their Performance Analysis Guide describes the various Performance Monitoring Units on the chip and how to read them.