Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Begginers question: why does this program hangs?
From: Andreas Schäfer (gentryx_at_[hidden])
Date: 2008-03-18 07:48:28


OK, this is strange. I've rerun the test and got it to block, too.
Repeated runs show that the hang is intermittent: most of the time the
program runs smoothly without blocking, but in about 30% of the cases
it hangs just like you said.

On 08:11 Tue 18 Mar , Giovani Faccin wrote:
> I'm using openmpi-1.2.5. It was installed using my distro's (Gentoo) default package:
>
> sys-cluster/openmpi-1.2.5 USE="fortran ipv6 -debug -heterogeneous -nocxx -pbs -romio -smp -threads"

Same setup here.

I've attached gdb to all three processes. On rank 0 I get the
following backtrace:

(gdb) bt
#0 0x00002ada849b3f16 in mca_btl_sm_component_progress ()
   from /usr/lib64/openmpi/mca_btl_sm.so
#1 0x00002ada845a71da in mca_bml_r2_progress () from /usr/lib64/openmpi/mca_bml_r2.so
#2 0x00002ada7e6fbbea in opal_progress () from /usr/lib64/libopen-pal.so.0
#3 0x00002ada8439a9a5 in mca_pml_ob1_recv () from /usr/lib64/openmpi/mca_pml_ob1.so
#4 0x00002ada7e2570a8 in PMPI_Recv () from /usr/lib64/libmpi.so.0
#5 0x000000000040c9ae in MPI::Comm::Recv ()
#6 0x0000000000409607 in main ()

On rank 1:

(gdb) bt
#0 0x00002baa6869bcc0 in mca_btl_sm_send () from /usr/lib64/openmpi/mca_btl_sm.so
#1 0x00002baa6808a93d in mca_pml_ob1_send_request_start_copy ()
   from /usr/lib64/openmpi/mca_pml_ob1.so
#2 0x00002baa680855f6 in mca_pml_ob1_send () from /usr/lib64/openmpi/mca_pml_ob1.so
#3 0x00002baa61f43182 in PMPI_Send () from /usr/lib64/libmpi.so.0
#4 0x000000000040ca04 in MPI::Comm::Send ()
#5 0x0000000000409700 in main ()

On rank 2:

(gdb) bt
#0 0x00002b933d555ac7 in sched_yield () from /lib/libc.so.6
#1 0x00002b9341efe775 in mca_pml_ob1_send () from /usr/lib64/openmpi/mca_pml_ob1.so
#2 0x00002b933bdbc182 in PMPI_Send () from /usr/lib64/libmpi.so.0
#3 0x000000000040ca04 in MPI::Comm::Send ()
#4 0x0000000000409700 in main ()
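
In short: rank 0 is waiting inside a receive, while ranks 1 and 2 are both still inside a send. As a rough sketch (this is not part of the test program; the helper names frames and blocking_mpi_call are made up for illustration), the following Python snippet pulls the MPI call each rank is stuck in out of gdb output like the above. It assumes the usual "#N 0xADDR in function ()" frame layout that gdb prints for "bt".

```python
import re

# Matches a gdb frame line such as
#   "#3 0x00002ada8439a9a5 in mca_pml_ob1_recv () from ..."
# The address part is optional because gdb may omit it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def frames(backtrace):
    """Return the function names of a gdb `bt` listing, innermost first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def blocking_mpi_call(backtrace):
    """Return the innermost MPI_/PMPI_ frame, or None if there is none."""
    for name in frames(backtrace):
        if name.startswith(("PMPI_", "MPI_")):
            return name
    return None


# Backtraces copied from this mail (continuation lines kept as printed).
RANK0 = """\
#0 0x00002ada849b3f16 in mca_btl_sm_component_progress ()
   from /usr/lib64/openmpi/mca_btl_sm.so
#1 0x00002ada845a71da in mca_bml_r2_progress () from /usr/lib64/openmpi/mca_bml_r2.so
#2 0x00002ada7e6fbbea in opal_progress () from /usr/lib64/libopen-pal.so.0
#3 0x00002ada8439a9a5 in mca_pml_ob1_recv () from /usr/lib64/openmpi/mca_pml_ob1.so
#4 0x00002ada7e2570a8 in PMPI_Recv () from /usr/lib64/libmpi.so.0
#5 0x000000000040c9ae in MPI::Comm::Recv ()
#6 0x0000000000409607 in main ()
"""

RANK1 = """\
#0 0x00002baa6869bcc0 in mca_btl_sm_send () from /usr/lib64/openmpi/mca_btl_sm.so
#1 0x00002baa6808a93d in mca_pml_ob1_send_request_start_copy ()
   from /usr/lib64/openmpi/mca_pml_ob1.so
#2 0x00002baa680855f6 in mca_pml_ob1_send () from /usr/lib64/openmpi/mca_pml_ob1.so
#3 0x00002baa61f43182 in PMPI_Send () from /usr/lib64/libmpi.so.0
#4 0x000000000040ca04 in MPI::Comm::Send ()
#5 0x0000000000409700 in main ()
"""

RANK2 = """\
#0 0x00002b933d555ac7 in sched_yield () from /lib/libc.so.6
#1 0x00002b9341efe775 in mca_pml_ob1_send () from /usr/lib64/openmpi/mca_pml_ob1.so
#2 0x00002b933bdbc182 in PMPI_Send () from /usr/lib64/libmpi.so.0
#3 0x000000000040ca04 in MPI::Comm::Send ()
#4 0x0000000000409700 in main ()
"""

for rank, bt in enumerate([RANK0, RANK1, RANK2]):
    print(f"rank {rank}: blocked in {blocking_mpi_call(bt)}")
```

This only labels where each process sits at the moment gdb was attached; it does not say why the send and the receive fail to complete.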

Anyone got a clue?

-- 
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================
(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your 
signature to help him gain world domination!

