Correct our implementation of Thread.VolatileRead ()/VolatileWrite ().
The previous implementation had several issues:
* All of the functions assumed that doing volatile loads/stores was
enough, when load-acquire, store-release semantics were actually
needed. This may have worked before purely by chance. We now use
proper memory barriers so that we don't have to hope for the
compiler and CPU to do the right thing.
* A hack was in place for 64-bit quantities on 32-bit systems. It
has been removed, as it is no longer needed now that we use
explicit memory barriers in these functions. Also, these functions
are not supposed to do atomic reads/writes in the first place -
they're purely about the memory barrier semantics.
* The VolatileWrite (object&, object) overload was not using the
volatile qualifier at all, thus not getting volatile semantics
as per: http://gcc.gnu.org/onlinedocs/gcc/Volatiles.html
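
For illustration only, the load-acquire/store-release pattern described
above can be sketched roughly as follows. This is not the code in this
change: the sketch_* names are hypothetical, and it assumes a C11
toolchain where atomic_thread_fence () also acts as a compiler barrier
around the plain volatile accesses (formally, C11 fences are only
specified relative to atomic operations).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helpers, not the runtime's actual functions. */

/* Load-acquire: read through a volatile pointer, then issue an acquire
 * fence so that later memory accesses are not reordered before the load.
 * The 64-bit read itself is not guaranteed to be atomic on 32-bit
 * targets; only the ordering is addressed here. */
static inline int64_t
sketch_volatile_read_i64 (volatile int64_t *ptr)
{
	int64_t value = *ptr;
	atomic_thread_fence (memory_order_acquire);
	return value;
}

/* Store-release: issue a release fence so that earlier memory accesses
 * are not reordered after the store, then write through the volatile
 * pointer. */
static inline void
sketch_volatile_write_i64 (volatile int64_t *ptr, int64_t value)
{
	atomic_thread_fence (memory_order_release);
	*ptr = value;
}
```

Note that no special casing is needed for 64-bit values on 32-bit
systems in this sketch, since the barriers provide the ordering and
atomicity is not part of the contract.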