Jeff Squyres, on Fri 12 Mar 2010 08:05:04 -0800, wrote:
> On Mar 12, 2010, at 7:51 AM, Samuel Thibault wrote:
> > > To support that, do we need to make internal variables and fields be volatile?
> > ?! I fail to see why we would need that.
> > If some thread uses a function that modifies a topology object, no
> > other thread should be reading it of course, since the reader may
> > read inconsistent data. A volatile qualifier cannot fix that,
> > only mutexes (or transactional memory :) ) can.
> Right -- that's not what I'm asking about.
> Even in this scenario:
> 1. thread A calls hwloc_topology_init(&a)
> 2. thread A calls hwloc_topology_load(a)
> 3. thread A launches thread B
> 4. thread B calls hwloc_topology_get_*(a...)
> 5. threads A and B synchronize
> 6. thread A calls hwloc_topology_load(a)
> 7. thread B calls hwloc_topology_get_*(a...)
> If the topology struct is not marked volatile (or the fields or whatever), then the compiler *might* assume that all the data in cache/registers from step 4 may still be valid in step 7.
This is the same as with any libc structure that you pass between
threads. If step 5 (synchronization) does not perform a compiler and
hardware barrier, you may get inconsistencies, yes.
> volatile effectively forces cache misses so that step 7 is guaranteed to read from memory again,
That is not the proper way to deal with it. A full memory barrier at
step 5 is sufficient and much more efficient.
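
To illustrate, here is a minimal sketch of a barrier-based hand-off using
C11 atomics and pthreads. It is not hwloc code: struct demo_topology,
demo_topo, reloaded, demo_reader and demo_publish_and_read are hypothetical
names made up for the example, and the busy-wait is only acceptable in a
toy program. The point is that the data fields stay plain (no volatile)
and the fences on each side of the flag provide the ordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a topology; NOT the real hwloc structure. */
struct demo_topology { int nb_cores; int nb_nodes; };

static struct demo_topology demo_topo;  /* plain fields, no volatile */
static atomic_int reloaded;             /* flag used only to order the two threads */

/* Thread B: wait for the flag, issue a full barrier, then read the data. */
static void *demo_reader(void *out)
{
    while (atomic_load_explicit(&reloaded, memory_order_relaxed) == 0)
        ;  /* busy-wait, acceptable only in a toy example */
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier, reader side */
    *(int *)out = demo_topo.nb_cores + demo_topo.nb_nodes;
    return NULL;
}

/* Thread A: fill the structure, issue a full barrier, then raise the flag.
 * Returns the sum that thread B observed. */
int demo_publish_and_read(void)
{
    pthread_t b;
    int seen = -1;
    int rc;

    atomic_store(&reloaded, 0);
    rc = pthread_create(&b, NULL, demo_reader, &seen);
    assert(rc == 0);

    demo_topo.nb_cores = 8;                     /* the "load" */
    demo_topo.nb_nodes = 2;
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier, writer side */
    atomic_store_explicit(&reloaded, 1, memory_order_relaxed);

    pthread_join(b, NULL);
    return seen;
}
```

The two fences pair up through the flag, so every write made before the
writer-side fence is visible after the reader-side fence.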
> > > If we say that applications need to provide their own synchronization
> > > between readers and writers, atomic writes shouldn't be an issue,
> > > right?
> > I do not understand this either.
> Since writes back to memory may be delayed, it could be possible that a write of a value in a topology struct only gets partially written before a read for that same value comes in from another thread (even if the threads *think* they have synchronized, such as above).
Ok, same answer: use a memory barrier in the application (semaphores,
mutexes and spinlocks already do that for you actually).
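
As a sketch of the seven-step scenario above with ordinary pthread
primitives: the structure below is a hypothetical stand-in (fake_topology,
phase, wait_for_phase, set_phase and run_scenario are names invented for
this example, not hwloc API), and plain assignments stand in for
hwloc_topology_load() and hwloc_topology_get_*(). Note that the sketch
synchronizes on both sides of the reload, so that thread B neither reads
during step 6 nor starts step 7 before step 6 has finished.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a loaded topology; NOT the real hwloc struct. */
struct fake_topology { int nb_cores; };

static struct fake_topology topo;   /* deliberately not volatile */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int phase;                   /* protected by lock */
static int first_read, second_read;

/* Block until phase reaches at least `wanted`. */
static void wait_for_phase(int wanted)
{
    pthread_mutex_lock(&lock);
    while (phase < wanted)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
}

/* Advance phase and wake up any waiter. */
static void set_phase(int p)
{
    pthread_mutex_lock(&lock);
    phase = p;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
}

static void *thread_b(void *arg)
{
    (void)arg;
    first_read = topo.nb_cores;     /* step 4 */
    set_phase(1);                   /* step 5: B is done reading */
    wait_for_phase(2);              /* wait until the reload is complete */
    second_read = topo.nb_cores;    /* step 7 */
    return NULL;
}

/* Runs steps 1-7 and returns the value thread B saw at step 7. */
int run_scenario(void)
{
    pthread_t b;
    int rc;

    phase = 0;
    topo.nb_cores = 4;              /* steps 1-2: first "load" */
    rc = pthread_create(&b, NULL, thread_b, NULL);  /* step 3 */
    assert(rc == 0);

    wait_for_phase(1);              /* step 5: wait for B's first read */
    topo.nb_cores = 8;              /* step 6: "reload" */
    set_phase(2);                   /* publish the reload */

    pthread_join(b, NULL);
    assert(first_read == 4);
    return second_read;
}
```

The mutex lock/unlock pairs are what make the new value visible to
thread B; no field needs to be volatile or atomic.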
> If we say that applications need to provide their own synchronization
> between readers and writers, atomic writes **could still** be an issue,
With a full memory barrier, you do not have any issue.