A plain variable is insuffecient for inter-thread communication. Both the
compiler and the processor may reorder accesses. The compiler could even
cache dataLoaded with the result that STATE_FINISHED becomes unreachable.
Fix this by using C11 atomic_bool, which guarantees sequential consistency.
This fixes#827.