embedded C - using "volatile" to assert consistency


Consider the following code:

// In the interrupt handler file:
volatile uint32_t gSampleIndex = 0; // declared 'extern'
void HandleSomeIrq()
{
    gSampleIndex++;
}

// In some other file
void Process()
{
    uint32_t localSampleIndex = gSampleIndex;   // will this be optimized away?
    PrevSample      = RawSamples[(localSampleIndex + 0) % NUM_RAW_SAMPLE_BUFFERS];
    CurrentSample   = RawSamples[(localSampleIndex + 1) % NUM_RAW_SAMPLE_BUFFERS];
    NextSample      = RawSamples[(localSampleIndex + 2) % NUM_RAW_SAMPLE_BUFFERS];
}

My intention is that PrevSample, CurrentSample and NextSample are consistent, even if gSampleIndex is updated during the call to Process().

Will the assignment to the localSampleIndex do the trick, or is there any chance it will be optimized away even though gSampleIndex is volatile?


There are 2 best solutions below

Best answer

Your function accesses a volatile variable exactly once, and it is the only volatile object in that function, so you don't need to worry about the compiler reorganizing the code (which is what volatile constrains). Here is what the C standard says about these optimizations, in §5.1.2.3:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

Note the last part: "...no needed side effects are produced (...accessing a volatile object)".

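Put simply, volatile prevents the compiler from optimizing the accesses to that object. To mention a few consequences: accesses are not reordered with respect to accesses to other volatile objects, no access is removed, the value is not cached between accesses, and no value is propagated across functions. Accesses to non-volatile objects may still be moved around a volatile access.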

By the way, I doubt any compiler would break your code (with or without volatile). The local stack variable may be elided, but the value will then be kept in a register (with volatile, the memory location certainly won't be read repeatedly, because every volatile access counts as a side effect). What you need volatile for is value visibility.
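To make the single-read argument concrete, here is a minimal sketch of the question's pattern in which the interrupt is simulated by a plain function call in the middle of Process(). The buffer size, the element type, the SampleWindow struct and the explicit HandleSomeIrq() calls inside Process() are assumptions made for illustration; none of them come from the original code.

```c
#include <stdint.h>

#define NUM_RAW_SAMPLE_BUFFERS 8u            /* assumed size */

volatile uint32_t gSampleIndex = 0;
uint32_t RawSamples[NUM_RAW_SAMPLE_BUFFERS]; /* assumed element type */

/* Hypothetical container for the three samples read by Process(). */
typedef struct {
    uint32_t prev;
    uint32_t current;
    uint32_t next;
} SampleWindow;

void HandleSomeIrq(void)
{
    gSampleIndex++;
}

SampleWindow Process(void)
{
    /* The only volatile access in this function: it is performed exactly once. */
    uint32_t localSampleIndex = gSampleIndex;
    SampleWindow w;

    w.prev = RawSamples[(localSampleIndex + 0u) % NUM_RAW_SAMPLE_BUFFERS];
    HandleSomeIrq();   /* simulated interrupt: changes gSampleIndex, not the local copy */
    w.current = RawSamples[(localSampleIndex + 1u) % NUM_RAW_SAMPLE_BUFFERS];
    HandleSomeIrq();   /* simulated interrupt */
    w.next = RawSamples[(localSampleIndex + 2u) % NUM_RAW_SAMPLE_BUFFERS];
    return w;
}
```

If RawSamples[i] holds i * 10 and gSampleIndex is 3 on entry, the three samples come from slots 3, 4 and 5 even though gSampleIndex has moved on to 5 by the time Process() returns.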

EDIT

I think some clarification is needed.

Let me assume you know what you're doing (you're working with interrupt handlers, so this shouldn't be your first C program): the CPU word size matches your variable's type and the variable is properly aligned in memory.

Let me also assume your interrupt is not reentrant (thanks to cli/sti or whatever your CPU uses for this), unless you're planning some hard debugging and tuning sessions.

If these assumptions hold, you don't need atomic operations. Why? Because localSampleIndex = gSampleIndex is, in practice, a single load (the variable is properly aligned, its size matches the word size, and volatile forces the access to happen exactly once), and ++gSampleIndex has no race condition (HandleSomeIrq won't be called again while it is still executing). Atomics here are more than useless: they're wrong.

One may think: "OK, I may not need atomics, but why can't I use them anyway? Even if those assumptions hold, this is an *extra* and it achieves the same goal." No, it doesn't. Atomics do not have the same semantics as volatile variables (and volatile is, and should be, seldom used outside memory-mapped I/O and signal handling). Volatile is usually useless in combination with atomics (unless a specific architecture says otherwise), but there is one big difference: visibility. When you update gSampleIndex in HandleSomeIrq, volatile requires the store to actually be performed at that point in the program, so on a single core the new value is what Process reads next (what other threads or devices see is up to the platform, not the standard). With atomic_uint, the standard only recommends that the store become visible within a reasonable amount of time.

To make it short and clear: volatile and atomic are not the same thing. Atomic operations are useful for concurrency; volatile is useful for lower-level work (interrupts, devices). If you're still thinking "hey, they do *exactly* what I need", please read a few useful links picked from the comments: cache coherency and a nice article about atomics.

To summarize:
In your case you could use an atomic variable with a lock (to have both atomic access and value visibility), but no one on this earth would put a lock inside an interrupt handler (unless it is absolutely, unquestionably needed, and from the code you posted that's not your case).

Second answer

In principle, volatile is not enough to guarantee that Process only sees consistent values of gSampleIndex. In practice, however, you should not run into any issues if uint32_t is directly supported by the hardware. The proper solution would be to use atomic accesses.

The problem

Suppose that you are running on a 16-bit architecture, so that the instruction

localSampleIndex = gSampleIndex;

gets compiled into two instructions (loading the upper half, loading the lower half). Then the interrupt might be called between the two instructions, and you'll get half of the old value combined with half of the new value.
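The torn read described above can be reproduced on any machine with a small simulation. This is only a sketch: the counter is split into two 16-bit halves by hand to mimic what a 16-bit CPU would do, and the names counter_lo, counter_hi, set_counter, simulated_irq and torn_read are all hypothetical. The "interrupt" is an ordinary function call placed between the two half-loads.

```c
#include <stdint.h>

/* The 32-bit counter as a 16-bit CPU would handle it: two separate halves. */
volatile uint16_t counter_lo = 0;
volatile uint16_t counter_hi = 0;

void set_counter(uint32_t value)
{
    counter_lo = (uint16_t)(value & 0xFFFFu);
    counter_hi = (uint16_t)(value >> 16);
}

/* Stand-in for the interrupt handler: a 32-bit increment done in 16-bit steps. */
void simulated_irq(void)
{
    uint16_t lo = (uint16_t)(counter_lo + 1u);
    counter_lo = lo;
    if (lo == 0u) {
        /* The low half wrapped around, so carry into the high half. */
        counter_hi = (uint16_t)(counter_hi + 1u);
    }
}

/* A two-step load with the interrupt arriving between the two steps. */
uint32_t torn_read(void)
{
    uint32_t hi = counter_hi;   /* upper half of the old value */
    simulated_irq();            /* interrupt fires here */
    uint32_t lo = counter_lo;   /* lower half of the new value */
    return (hi << 16) | lo;
}
```

Starting from 0x0000FFFF, the interrupt advances the counter to 0x00010000, but torn_read() combines the old upper half (0x0000) with the new lower half (0x0000) and returns 0x00000000, which is neither the old nor the new value.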

The solution

The solution is to access gSampleIndex using atomic operations only. I know of three ways of doing that.

C11 atomics

In C11 (supported since GCC 4.9), you declare your variable as atomic:

#include <stdatomic.h>

atomic_uint gSampleIndex;

You then take care to only ever access the variable using the documented atomic interfaces. In the IRQ handler:

atomic_fetch_add(&gSampleIndex, 1);

and in the Process function:

localSampleIndex = atomic_load(&gSampleIndex);   // atomic_load takes a pointer to the atomic object

Do not bother with the _explicit variants of the atomic functions unless you're trying to get your program to scale across large numbers of cores.
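Put together, the C11 fragments above might look like the following sketch. The SnapshotSampleIndex helper and the _Static_assert are additions for illustration, not part of the original answer. The assertion encodes one extra assumption: an interrupt handler should only use atomics that are always lock-free on the target, which is what ATOMIC_INT_LOCK_FREE == 2 means.

```c
#include <stdatomic.h>

/* Assumption: refuse to build if atomic_uint could fall back to a lock,
   since taking a lock inside an interrupt handler is not safe. */
_Static_assert(ATOMIC_INT_LOCK_FREE == 2,
               "atomic_uint is not always lock-free on this target");

/* Static storage duration: zero-initialized to a valid atomic state. */
atomic_uint gSampleIndex;

void HandleSomeIrq(void)
{
    atomic_fetch_add(&gSampleIndex, 1u);
}

/* Hypothetical helper: one atomic load, to be stored in a local variable. */
unsigned int SnapshotSampleIndex(void)
{
    return atomic_load(&gSampleIndex);
}
```

Process() would call SnapshotSampleIndex() once and derive all three buffer indices from the returned value.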

GCC atomics

Even if your compiler does not support C11 yet, it probably has some support for atomic operations. For example, in GCC you can say:

volatile uint32_t gSampleIndex;   // same type as localSampleIndex, as __atomic_load requires
...
__atomic_add_fetch(&gSampleIndex, 1, __ATOMIC_SEQ_CST);
...
__atomic_load(&gSampleIndex, &localSampleIndex, __ATOMIC_SEQ_CST);

As above, do not bother with weak consistency unless you're trying to achieve good scaling behaviour.
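As a rough sketch of how the builtins fit the question's code, the example below takes one atomic snapshot and derives the three buffer slots from it. It uses __atomic_load_n, the value-returning form of the load builtin, instead of the pointer-based __atomic_load shown above. The SampleSlots helper and the buffer size of 8 are assumptions for illustration, and the code requires GCC or a compatible compiler such as Clang.

```c
#include <stdint.h>

#define NUM_RAW_SAMPLE_BUFFERS 8u   /* assumed size */

volatile uint32_t gSampleIndex = 0;

void HandleSomeIrq(void)
{
    __atomic_add_fetch(&gSampleIndex, 1u, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: computes the three buffer slots from a single snapshot. */
void SampleSlots(uint32_t slots[3])
{
    /* One atomic load; every slot below is derived from this local copy. */
    uint32_t localSampleIndex = __atomic_load_n(&gSampleIndex, __ATOMIC_SEQ_CST);

    for (uint32_t i = 0; i < 3u; i++) {
        slots[i] = (localSampleIndex + i) % NUM_RAW_SAMPLE_BUFFERS;
    }
}
```

After seven interrupts the snapshot is 7, so the slots are 7, 0 and 1: the modulo wraps the window around the end of the buffer.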

Implementing atomic operations yourself

Since you're not trying to protect against concurrent access from multiple cores, just race conditions with an interrupt handler, it is possible to implement a consistency protocol using standard C primitives only. Dekker's algorithm is the oldest known such protocol.
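Dekker's algorithm itself is not shown here. As a simpler illustration of a hand-rolled protocol, the sketch below uses a different technique, a re-read loop: the reader loads the upper half, then the lower half, then the upper half again, and retries if the upper half changed in between. It assumes a counter stored as two 16-bit halves (as on the 16-bit architecture discussed earlier), an interrupt handler that only ever increments the counter and runs to completion, and hypothetical names throughout.

```c
#include <stdint.h>

/* Assumed layout: a 32-bit counter kept as two 16-bit halves,
   both written only by the interrupt handler. */
volatile uint16_t counter_lo = 0;
volatile uint16_t counter_hi = 0;

/* Hypothetical reader: retries until the upper half is stable around the
   read of the lower half, so the two halves belong to the same value. */
uint32_t read_counter_consistent(void)
{
    uint16_t hi, lo, hi_again;

    do {
        hi = counter_hi;
        lo = counter_lo;
        hi_again = counter_hi;
    } while (hi != hi_again);   /* a carry happened in between: try again */

    return ((uint32_t)hi << 16) | lo;
}
```

The loop needs no lock and never blocks the interrupt handler, at the cost of an occasional retry in the reader. Briefly disabling interrupts around the read is the other common option, but the way to do that is platform-specific.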