Open MPI User's Mailing List Archives

From: Cezary Sliwa (sliwa_at_[hidden])
Date: 2006-03-10 06:01:06


Jeff Squyres wrote:
> Please note that I replied to your original post:
>
> http://www.open-mpi.org/community/lists/users/2006/02/0712.php
>
> Was that not sufficient? If not, please provide more details on what
> you are attempting to do and what is occurring. Thanks.
>
>
I have a simple program in which the rank 0 task dispatches compute
tasks to the other processes. It works fine on one 4-way SMP machine, but
when I try to run it on two nodes, the processes on the other machine
seem to spin in a loop inside MPI_SEND (the message is never delivered).
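
The structure is roughly the following; this is only a C sketch of the
dispatch pattern (the real program is in Fortran, per main.f in the
backtrace below, and the integer payload, the tags, and the squaring
stand in for the actual work):

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (rank == 0) {
        /* rank 0 dispatches one task to every other rank ... */
        for (i = 1; i < nprocs; ++i) {
            int task = i;                /* placeholder payload */
            MPI_Send(&task, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
        }
        /* ... then collects the results */
        for (i = 1; i < nprocs; ++i) {
            int result;
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, 2,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    } else {
        /* each worker receives a task, computes, and returns a result */
        int task, result;
        MPI_Recv(&task, 1, MPI_INT, 0, 1, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        result = task * task;            /* stand-in for the real computation */
        MPI_Send(&result, 1, MPI_INT, 0, 2, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}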

Cezary Sliwa

>
> On Mar 7, 2006, at 2:36 PM, Cezary Sliwa wrote:
>
>
>> Hello again,
>>
>> The problem is that MPI_SEND blocks forever (the message is still
>> not delivered after many hours).
>>
>> Cezary Sliwa
>>
>>
>> From: Cezary Sliwa <sliwa_at_[hidden]>
>> Date: February 22, 2006 10:07:04 AM EST
>> To: users_at_[hidden]
>> Subject: MPI_SEND blocks when crossing node boundary
>>
>>
>>
>> My program runs fine with openmpi-1.0.1 when run from the command
>> line (5 processes with an empty hostfile), but when I schedule it
>> with qsub to run on 2 nodes it blocks in MPI_SEND.
>>
>> (gdb) info stack
>> #0 0x00000034db30c441 in __libc_sigaction () from /lib64/tls/libpthread.so.0
>> #1 0x0000000000573002 in opal_evsignal_recalc ()
>> #2 0x0000000000582a3c in poll_dispatch ()
>> #3 0x00000000005729f2 in opal_event_loop ()
>> #4 0x0000000000577e68 in opal_progress ()
>> #5 0x00000000004eed4a in mca_pml_ob1_send ()
>> #6 0x000000000049abdd in PMPI_Send ()
>> #7 0x0000000000499dc0 in pmpi_send__ ()
>> #8 0x000000000042d5d8 in MAIN__ () at main.f:90
>> #9 0x00000000005877de in main (argc=Variable "argc" is not available.
>> )
>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
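
For reference, a bare two-rank ping test exercises the same cross-node
MPI_Send path as the backtrace above. This is only a sketch (the file
name ping.c, the tag, and the integer payload are made up); run the same
two ways, from the command line and under qsub on 2 nodes, it shows
whether any point-to-point message crosses the node boundary:

/* ping.c: rank 0 sends one integer to rank 1, rank 1 echoes it back */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 42;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 0 got the reply: %d\n", value);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

If this test also hangs only when the two ranks land on different nodes,
the hang is independent of the application code.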