Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] -DUSE_TSD_DATA_HACK problem in openmpi's ptmalloc2
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-09-26 05:01:19


Thanks for bringing this to our attention.

Brian just committed a fix on the trunk (https://svn.open-mpi.org/trac/ompi/changeset/27371). We'll let that soak for a day or three and then bring it over to v1.6 and v1.7.

On Sep 20, 2012, at 8:25 AM, <tmishima_at_[hidden]> <tmishima_at_[hidden]> wrote:

>
> Hello, I found a problem in openmpi's ptmalloc2. The problem is that
> TSD (thread-specific data) does not work properly, which may cause
> performance loss and segfaults. In my case, applications that allocate
> memory heavily sometimes segfault.
>
> Please see opal/mca/memory/linux/sysdeps/pthread/malloc-machine.h.
>
> When USE_TSD_DATA_HACK is defined, which is the default in openmpi, the
> hacked TSD is used as shown below.
>
> #if defined(__sgi) || defined(USE_TSD_DATA_HACK)
> typedef void *tsd_key_t[256];
> #define tsd_key_create(key, destr) do { \
> int i; \
> for(i=0; i<256; i++) (*key)[i] = 0; \
> } while(0)
> #define tsd_setspecific(key, data) \
> (key[(unsigned)pthread_self() % 256] = (data))
> #define tsd_getspecific(key, vptr) \
> (vptr = key[(unsigned)pthread_self() % 256])
>
> On the other hand, the thread ID (= pthread_self()) generated by pthread
> is not a sequential number, at least in my environment.
>
> An example of threads created by t-test1 included in ptmalloc2:
> [mishima_at_manage ptmalloc2]$ ./t-test1 4 4
> Using posix threads.
> total=4 threads=4 i_max=10000 size=10000 bins=200
> Created thread 41cb4940.
> Created thread 41eb5940.
> Created thread 420b6940.
> Created thread 422b7940.
>
> Since the interval between thread IDs is much larger than 256, threads
> may share the same key-array slot. In most cases pthread_self() % 256 is
> 64, as for all four threads shown above, which means that the hacked TSD
> does not function at all.
>
> I think -DUSE_TSD_DATA_HACK=1 should be removed from openmpi's
> configuration. As far as I checked, when I use pthread's TSD via
> "#undef USE_TSD_DATA_HACK", the problem goes away.
>
> One more request concerns the PGI compiler. The PGI compiler does not
> predefine the macro __GNUC__. Therefore, PGI does not use the fast inline
> mutex_lock written in malloc-machine.h. Please consider adding the
> following 4 lines near the top of malloc.c.
>
> --- opal/mca/memory/linux/malloc.c.org 2012-08-30 16:15:19.000000000 +0900
> +++ opal/mca/memory/linux/malloc.c 2012-08-31 07:57:16.000000000 +0900
> @@ -43,6 +43,11 @@
> #define MORECORE opal_memory_linux_free_ptmalloc2_sbrk
> #define munmap(a,b) opal_memory_linux_free_ptmalloc2_munmap(a,b,1)
>
> +/* For PGI compiler to activate inline mutex_lock */
> +#if defined(__PGI)
> +#define __GNUC__ 1
> +#endif
> +
> /* make some non-GCC compilers happy */
> #ifndef __GNUC__
> #define __const const
>
> P.S.
> Since the GNU and Intel compilers use the inline mutex_lock, mutex
> initialization is very fast and the hacked TSD problem does not cause a
> segfault; only a performance loss can result. The reason is a very long
> story, so please let me omit it today.
>
> Best regards,
> Tetsuya Mishima
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/