Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] thread safety
From: Samuel Thibault (samuel.thibault_at_[hidden])
Date: 2010-03-12 11:16:23


Jeff Squyres, on Fri 12 Mar 2010 08:05:04 -0800, wrote:
> On Mar 12, 2010, at 7:51 AM, Samuel Thibault wrote:
> > > To support that, do we need to make internal variables and fields be volatile?
> >
> > ?! I fail to see why we would need that.
> > If some thread uses a function that modifies a topology object, no
> > other thread should be reading it of course, since the reader will
> > possibly read incoherent data. A volatile qualifier cannot fix that,
> > only mutexes (or transactional memory :) ) can.
>
> Right -- that's not what I'm asking about.
>
> Even in this scenario:
>
> 1. thread A calls hwloc_topology_init(&a)
> 2. thread A calls hwloc_topology_load(a)
> 3. thread A launches thread B
> 4. thread B calls hwloc_topology_get_*(a...)
> 5. threads A and B synchronize
> 6. thread A calls hwloc_topology_load(a)
> 7. thread B calls hwloc_topology_get_*(a...)
>
> If the topology struct is not marked volatile (or the fields or whatever), then the compiler *might* assume that all the data in cache/registers from step 4 may still be valid in step 7.

This is the same as with any libc structure that you pass between
threads. If step 5 (synchronization) does not perform a compiler and
hardware barrier, you may see inconsistent data, yes.

> volatile effectively forces cache misses so that step 7 is guaranteed to read from memory again,

That is not the proper way to deal with it. A full memory barrier at
step 5 is enough and much more efficient.
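
To illustrate the barrier approach, here is a minimal sketch in C11, not
hwloc code. The names payload, ready, consumer and publish_and_read are
hypothetical; the data itself is a plain int with no volatile, and the
only ordering comes from a full fence on each side of a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain shared data, deliberately not volatile */
static atomic_int ready;   /* flag used only to signal "data is published" */

/* Reader side: wait for the flag, issue a full fence, then read the data. */
static void *consumer(void *out)
{
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;  /* spin until the writer sets the flag */
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier */
    *(int *)out = payload;
    return NULL;
}

/* Writer side: write the data, issue a full fence, then set the flag.
 * Returns the value the reader observed, or -1 if no thread was created. */
int publish_and_read(int value)
{
    pthread_t t;
    int seen = -1;
    int rc;

    atomic_store(&ready, 0);
    if (pthread_create(&t, NULL, consumer, &seen) != 0)
        return -1;

    payload = value;
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);

    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
    return seen;
}
```

The fence is paid once per synchronization point, whereas volatile would
force every access to the data to go to memory.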

> > > If we say that applications need to provide their own synchronization
> > > between readers and writers, atomic writes shouldn't be an issue,
> > > right?
> >
> > I do not understand this either.
>
> Since writes back to memory may be delayed, it could be possible that a write of a value in a topology struct only gets partially written before a read for that same value comes in from another thread (even if the threads *think* they have synchronized, such as above).

Ok, same answer: use a memory barrier in the application (semaphores,
mutexes and spinlocks already do that for you actually).
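
The seven-step scenario quoted earlier can be sketched with a mutex and a
condition variable, which supply the barrier implicitly. This is only an
illustration under stated assumptions: struct fake_topology and its
nb_objects field stand in for a real topology, plain assignments stand in
for hwloc_topology_load(), and the second hand-off (phase 2) is an
assumption added here so that step 7 is ordered after step 6.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a topology object: plain field, no volatile. */
struct fake_topology { int nb_objects; };

static struct fake_topology topo;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int phase;                    /* protected by lock */
static int first_read, second_read;  /* what thread B observed */

/* Announce that phase p has been reached. */
static void advance_to(int p)
{
    pthread_mutex_lock(&lock);
    phase = p;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
}

/* Block until phase p has been reached. */
static void wait_for(int p)
{
    pthread_mutex_lock(&lock);
    while (phase < p)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
}

/* Thread B. */
static void *reader(void *arg)
{
    struct fake_topology *t = arg;
    first_read = t->nb_objects;   /* step 4: first read */
    advance_to(1);                /* step 5: B's side of the synchronization */
    wait_for(2);                  /* wait until the reload is finished */
    second_read = t->nb_objects;  /* step 7: read after the reload */
    return NULL;
}

/* Thread A. Returns 0 if B saw the old value, then the new one. */
int run_scenario(void)
{
    pthread_t b;
    int rc;

    phase = 0;
    topo.nb_objects = 4;          /* steps 1-2: stand-in for init + load */
    if (pthread_create(&b, NULL, reader, &topo) != 0)  /* step 3 */
        return -1;
    wait_for(1);                  /* step 5: A's side of the synchronization */
    topo.nb_objects = 8;          /* step 6: stand-in for the second load */
    advance_to(2);                /* tell B the reload is complete */

    rc = pthread_join(b, NULL);
    assert(rc == 0);
    (void)rc;
    return (first_read == 4 && second_read == 8) ? 0 : 1;
}
```

No field is volatile here; the lock and unlock calls are what make the
reload visible to thread B.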

> If we say that applications need to provide their own synchronization
> between readers and writers, atomic writes **could still** be an issue,
> right?

With a full memory barrier, you do not have any issue.

Samuel