Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_COMM_DUP freeze with OpenMPI 1.4.1
From: George Bosilca (bosilca_at_[hidden])
Date: 2011-05-10 14:11:41


On May 10, 2011, at 08:10, Tim Prince wrote:

> On 5/10/2011 6:43 AM, francoise.roch_at_[hidden] wrote:
>>
>> Hi,
>>
>> I compile a parallel program with OpenMPI 1.4.1 (built with the Intel
>> compilers 12 from the Composer XE package). The program is linked against the
>> MUMPS library 4.9.2, which was compiled with the same compilers and linked
>> with Intel MKL. The OS is Debian Linux.
>> There is no error while compiling or running the job, but the program freezes
>> inside a call to the "zmumps" routine, when the slave processes call the
>> MPI_COMM_DUP routine.
>>
>> The program is executed on 2 nodes of 12 cores each (Westmere
>> processors) with the following command:
>>
>> mpirun -np 24 --machinefile $OAR_NODE_FILE -mca plm_rsh_agent "oarsh"
>> --mca btl self,openib -x LD_LIBRARY_PATH ./prog
>>
>> We have 12 processes running on each node. We submit the job with the OAR
>> batch scheduler (the $OAR_NODE_FILE variable and the "oarsh" command are
>> specific to this scheduler and usually work well with OpenMPI).
>>
>> Via gdb, on the slaves, we can see that they are blocked in MPI_COMM_DUP:

Francoise,

Based on your traces, the workers and the master are not executing the same MPI call. The workers are blocked in an MPI_Comm_dup at sub_pbdirect_init.f90:44, while the master is blocked in an MPI_Barrier at sub_pbdirect_init.f90:62. Can you verify that the slaves and the master call MPI_Barrier and MPI_Comm_dup in the same logical order?
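
To make the failure mode concrete, here is a minimal sketch (illustrative only, not your actual sub_pbdirect_init code). Both MPI_Comm_dup and MPI_Barrier are collective over the communicator, so if rank 0 reaches the barrier while the other ranks are still inside the dup, everybody blocks with exactly the stacks you posted:

  program dup_barrier_mismatch
    ! Illustrative sketch only: this program deadlocks on purpose.
    ! Both collectives are over MPI_COMM_WORLD, but rank 0 and the
    ! other ranks reach them in a different order.
    implicit none
    include 'mpif.h'
    integer :: ierr, rank, newcomm

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

    if (rank /= 0) then
       ! "workers": the first collective they reach is the dup
       ! (as inside zmumps in your trace) -- they block here
       call MPI_COMM_DUP(MPI_COMM_WORLD, newcomm, ierr)
       call MPI_COMM_FREE(newcomm, ierr)
    end if

    ! "master": rank 0 skips the dup and goes straight to the
    ! barrier -- it blocks here, and nobody can make progress
    call MPI_BARRIER(MPI_COMM_WORLD, ierr)

    call MPI_FINALIZE(ierr)
  end program dup_barrier_mismatch

Either every rank of the communicator has to enter the dup (in the same order everywhere relative to the barrier), or the dup has to be made on a communicator that does not include the master.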

  george.

>>
>> (gdb) where
>> #0 0x00002b32c1533113 in poll () from /lib/libc.so.6
>> #1 0x0000000000adf52c in poll_dispatch ()
>> #2 0x0000000000adcea3 in opal_event_loop ()
>> #3 0x0000000000ad69f9 in opal_progress ()
>> #4 0x0000000000a34b4e in mca_pml_ob1_recv ()
>> #5 0x00000000009b0768 in
>> ompi_coll_tuned_allreduce_intra_recursivedoubling ()
>> #6 0x00000000009ac829 in ompi_coll_tuned_allreduce_intra_dec_fixed ()
>> #7 0x000000000097e271 in ompi_comm_allreduce_intra ()
>> #8 0x000000000097dd06 in ompi_comm_nextcid ()
>> #9 0x000000000097be01 in ompi_comm_dup ()
>> #10 0x00000000009a0785 in PMPI_Comm_dup ()
>> #11 0x000000000097931d in pmpi_comm_dup__ ()
>> #12 0x0000000000644251 in zmumps (id=...) at zmumps_part1.F:144
>> #13 0x00000000004c0d03 in sub_pbdirect_init (id=..., matrix_build=...)
>> at sub_pbdirect_init.f90:44
>> #14 0x0000000000628706 in fwt2d_elas_v2 () at fwt2d_elas.f90:1048
>>
>>
>> The master waits further on:
>>
>> (gdb) where
>> #0 0x00002b9dc9f3e113 in poll () from /lib/libc.so.6
>> #1 0x0000000000adf52c in poll_dispatch ()
>> #2 0x0000000000adcea3 in opal_event_loop ()
>> #3 0x0000000000ad69f9 in opal_progress ()
>> #4 0x000000000098f294 in ompi_request_default_wait_all ()
>> #5 0x0000000000a06e56 in ompi_coll_tuned_sendrecv_actual ()
>> #6 0x00000000009ab8e3 in ompi_coll_tuned_barrier_intra_bruck ()
>> #7 0x00000000009ac926 in ompi_coll_tuned_barrier_intra_dec_fixed ()
>> #8 0x00000000009a0b20 in PMPI_Barrier ()
>> #9 0x0000000000978c93 in pmpi_barrier__ ()
>> #10 0x00000000004c0dc4 in sub_pbdirect_init (id=..., matrix_build=...)
>> at sub_pbdirect_init.f90:62
>> #11 0x0000000000628706 in fwt2d_elas_v2 () at fwt2d_elas.f90:1048
>>
>>
>> Remark:
>> The same code compiles and runs well with the Intel MPI library from the
>> same Intel package, on the same nodes.
>>
> Did you try compiling with equivalent options in each compiler? For example (supposing you had gcc 4.6),
> gcc -O3 -funroll-loops --param max-unroll-times=2 -march=corei7
> would be equivalent (as closely as I know) to
> icc -fp-model source -msse4.2 -ansi-alias
>
> As you should be aware, default settings in icc are more closely equivalent to
> gcc -O3 -ffast-math -fno-cx-limited-range -funroll-loops --param max-unroll-times=2 -fno-strict-aliasing
>
> The options I suggest as an upper limit are probably more aggressive than most people have used successfully with OpenMPI.
>
> As to run-time MPI options, Intel MPI has affinity with Westmere awareness turned on by default. I suppose testing without affinity settings, particularly when banging against all hyperthreads, is a more severe test of your application. Don't you get better results at 1 rank per core?
> --
> Tim Prince
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
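
On Tim's affinity remark: unlike Intel MPI, Open MPI 1.4.x does not bind processes to cores by default. If you want to rule affinity out as a variable, one possibility (assuming the 1.4-series options; please double-check on your install) is to add the processor-affinity MCA parameter to your existing command line:

  mpirun -np 24 --machinefile $OAR_NODE_FILE -mca plm_rsh_agent "oarsh" \
         --mca btl self,openib -mca mpi_paffinity_alone 1 \
         -x LD_LIBRARY_PATH ./prog

This only affects placement and performance; it does not change the MPI_Comm_dup / MPI_Barrier ordering question above.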

"To preserve the freedom of the human mind then and freedom of the press, every spirit should be ready to devote itself to martyrdom; for as long as we may think as we will, and speak as we think, the condition of man will proceed in improvement."
  -- Thomas Jefferson, 1799