Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mpi_leave_pinned=1 is thread safe?
From: tmishima_at_[hidden]
Date: 2012-08-02 02:31:22


Dear Open MPI developers,

Unfortunately, I have not received a reply, but I have continued running tests.
As far as I can tell, there is no segfault with mpi_leave_pinned=0 or with
OMP_NUM_THREADS < 4. On the other hand, when I set mpi_leave_pinned=1
(and OMP_NUM_THREADS >= 4), I often get a segfault.

Because most of the segfaults occur in opal_memory_ptmalloc2, as the backtrace
quoted below shows, I suspect that the cause is Open MPI's memory management
under multithreading. Could you please investigate this problem again?
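
A reduced test case along the following lines might help isolate the problem.
This is only a sketch under assumptions: run_alloc_stress is a hypothetical
helper (not part of the application, Open MPI, or IPhreeqc), and it is not
confirmed to reproduce the segfault. It simply repeats new and delete of a
std::ostringstream in each OpenMP thread, similar to the pattern in frame #5
of the backtrace.

```cpp
#include <sstream>
#include <string>

// Hypothetical stress helper: every OpenMP thread repeatedly allocates and
// frees a std::ostringstream, so free() is called concurrently from several
// threads. Returns the number of allocate/free cycles that completed.
long run_alloc_stress(int nthreads, int iters_per_thread)
{
    long completed = 0;
    // If the compiler is run without OpenMP support, this pragma is ignored
    // and the loop runs serially with the same result.
    #pragma omp parallel for num_threads(nthreads) reduction(+:completed)
    for (int t = 0; t < nthreads; ++t) {
        for (int i = 0; i < iters_per_thread; ++i) {
            std::ostringstream* oss = new std::ostringstream;
            *oss << "thread " << t << " iteration " << i;
            std::string s = oss->str();  // copy the buffered text out
            delete oss;                  // releases the stream and its buffer
            if (!s.empty()) {
                ++completed;
            }
        }
    }
    return completed;
}
```

The idea would be to call this from a program that first calls
MPI_Init_thread with MPI_THREAD_FUNNELED, with OMP_NUM_THREADS >= 4, and to
run it once with "--mca mpi_leave_pinned 1" and once with
"--mca mpi_leave_pinned 0" to compare the behavior.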

Best regards,
Tetsuya Mishima

> Dear Open MPI developers,
> I have been developing our hybrid (MPI+OpenMP) application using Open MPI
> for five years.
>
> This time, I tried to add a new feature: a C++-based multi-threaded
> library that repeatedly creates and destroys objects with new and delete
> in each thread.
>
> Our application uses the so-called "MPI_THREAD_FUNNELED" model, and
> openmpi-1.6 is built using --with-tm --with-openib --disable-ipv6.
>
> My trouble is that it works very well with "--mca mpi_leave_pinned 0",
> but when mpi_leave_pinned is enabled, it often causes a segfault like the
> one below.
>
> I note that it works fine on a multi-threaded Windows platform combined
> with mpich2. Furthermore, the multi-threaded (non-MPI) version also works
> fine even in a Linux environment.
>
> #0 0x00002b36f1ab35fa in malloc_consolidate (av=0x2aaab0c00020)
> at ./malloc.c:4556
> #1 0x00002b36f1ab34d9 in opal_memory_ptmalloc2_int_free
> (av=0x2aaab0c00020, mem=0x2aaab0c00a70) at ./malloc.c:4453
> #2 0x00002b36f1ab1ce2 in opal_memory_ptmalloc2_free (mem=0x2aaab0c00a70)
> at ./malloc.c:3511
> #3 0x00002b36f1ab0ca9 in opal_memory_linux_free_hook
> (__ptr=0x2aaab0c00a70, caller=0xa075c8) at ./hooks.c:705
> #4 0x00000037b4a758a7 in free () from /lib64/libc.so.6
> #5 0x0000000000a075c8 in CErrorReporter<std::basic_ostringstream<char,
> std::char_traits<char>, std::allocator<char> > >
> ::Clear ()
> #6 0x0000000000a01eec in IPhreeqc::AccumulateLine ()
> #7 0x0000000000a01180 in AccumulateLine ()
> #8 0x0000000000a0078e in accumulatelinef_ ()
> #9 0x0000000000576ce6 in initial_conditions_ () at ./PHREEQC-model.f:307
> #10 0x0000000000577b3a in iphreeqc_main_ () at ./PHREEQC-model.f:505
> #11 0x0000000000577fa1 in basicphreeqc_ () at ./PHREEQC-model.f:944
> #12 0x00000000004b492a in phrqbl_ () at ./MULTI-COM.f:8371
> #13 0x00000000004aa6e9 in smxmknp:qois_ () at ./MULTI-COM.f:5112
> #14 0x00000000004a2c5e in solvenpois_ () at ./MULTI-COM.f:4276
> #15 0x000000000049e731 in solducom_ () at ./MULTI-COM.f:3782
> #16 0x000000000048b60c in MAIN () at ./MULTI-COM.f:1208
> #17 0x0000000000481350 in main ()
> #18 0x00000037b4a1d974 in __libc_start_main () from /lib64/libc.so.6
> #19 0x0000000000481259 in _start ()
>
> Best regards,
> Tetsuya Mishima