sunfishcode.github.io

Follow me on GitHub

How should C-style volatile be defined?

C’s volatile, and closely related features in several other languages, is esoteric, and the standards all use vague language to describe it.

A common modern assumption is that volatile is useless for memory shared between threads, however this isn’t entirely clear from the standards themselves.

An idea

What if the we could decouple the memory access from the I/O in the abstract machine?

Specifically, define a read of a volatile glvalue

  x = *p;

to be as if it were:

  auto t = *static_cast<remove_volatile<T>::type>(p);
  x = __record_volatile_load(p, sizeof(T), t);

And a write of a volatile glvalue

  *p = x;

to be as if:

  *static_cast<remove_volatile<T>::type>(p) = x;
  __record_volatile_store(p, sizeof(T), x);

__record_volatile_load and __record_volatile_store would be functions with implementation-defined meaning, except that they wouldn’t do any loads or stores. They’d just do I/O. __record_volatile_load would return a value from the outside world, or t, or some function of both.

Beyond simple loads and stores

For accesses which are both volatile and atomic, it may be necessary to add similar rules for atomics, using functions with names like __record_volatile_atomic_*, which would have extra parameters to describe the specifics of atomic operations.

Implementations supporting extensions such as non-temporal memory accesses may conceptually need to extend the design sketch here to pass such information through to the I/O routines.

Conceptually, anything a memory-mapped I/O device might be sensitive to should be passed into the I/O routines.

setjmp and sig_atomic_t

Besides memory-mapped I/O, there are two other standard uses for volatile. Local variables declared volatile have defined values after a setjmp, and signal handlers invoked asynchronously can read and write variables declared as volatile sig_atomic_t.

In most compilers, disabling optimization around volatile makes these use cases work, however we’re looking for a definition here that doesn’t depend on vague terms like “optimization”.

A possible solution is to make one extra provision: Declared volatile variables that aren’t attached to actual memory-mapped I/O devices (through implementation-specific means) are conceptually attached to an I/O “device” which is essentially an otherwise unaddressable shadow memory which __record_volatile_load loads from and __record_volatile_store stores to. This part is admittedly a little awkward, but it does preserve a clean separation of the program’s main address space and memory semantics from the I/O.

One could argue the underlying problem is that the volatile design in C89 tied memory-mapped I/O to setjmp/sig_atomic_t in the first place, however the goal here is to define rules describing the existing volatile concept, not reshape it into something different.

That’s it!

This way, the memory model doesn’t need to think about volatile, and we don’t have to change much in practice.

In most compilers, we wouldn’t need to make any changes. We’d simply say that it’s not possible to split the I/O from the memory access unless one knows how to implement __record_volatile_load and __record_volatile_store, which would of course be platform-dependent. Compilers wouldn’t do this until (conceptually) the very last phase of codegen, at which point it’s ok.

Notes

This idea came up in the context of WebAssembly, however it’s not specific to WebAssembly. It essentially outlines a less vague (though still abstract) model for what compilers are already doing with volatile.