Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_PROC_NULL
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-02-13 08:39:12


What is your PATH / LD_LIBRARY_PATH when you rsh/ssh to other nodes?

ssh othernode which mpirun
ssh othernode env | grep PATH
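
A non-interactive ssh shell often reads different startup files than your
interactive login shell (e.g. ~/.bashrc vs. ~/.bash_profile for bash), so
the two can easily end up with different PATHs. A quick side-by-side
comparison (just a sketch; "othernode" is a placeholder, as above):

which mpirun; echo $PATH
ssh othernode 'which mpirun; echo $PATH'

If the two outputs differ, whichever startup file the non-interactive
shell reads is probably not prepending /opt/openmpi/bin.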

On Feb 13, 2009, at 5:11 AM, jody wrote:

> Well, all I do seems to verify that only one version is running:
>
> [jody_at_localhost 3D]$ ls -ld /opt/openmp*
> lrwxrwxrwx 1 root root   26 2009-02-13 14:09 /opt/openmpi -> /opt/openmpi-1.3.1a0r20534
> drwxr-xr-x 7 root root 4096 2009-02-12 22:19 /opt/openmpi-1.3.1a0r20432
> drwxr-xr-x 7 root root 4096 2009-02-12 21:58 /opt/openmpi-1.3.1a0r20520
> drwxr-xr-x 7 root root 4096 2009-02-13 13:46 /opt/openmpi-1.3.1a0r20534
> drwxr-xr-x 7 root root 4096 2009-02-12 22:41 /opt/openmpi-1.4a1r20525
> [jody_at_localhost 3D]$ echo $PATH
> /opt/openmpi/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/usr/kerberos/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jody/bin:/home/jody/utils
> [jody_at_localhost 3D]$ which mpirun
> /opt/openmpi/bin/mpirun
> [jody_at_localhost 3D]$ mpirun --version
> mpirun (Open MPI) 1.3.1a0r20534
>
> Report bugs to http://www.open-mpi.org/community/help/
> [jody_at_localhost 3D]$ /opt/openmpi-1.3.1a0r20534/bin/mpirun --version
> mpirun (Open MPI) 1.3.1a0r20534
>
> Report bugs to http://www.open-mpi.org/community/help/
> [jody_at_localhost 3D]$
>
> BTW the same strange misbehaviour happens with the other versions.
>
> Jody
>
>
> On Fri, Feb 13, 2009 at 1:54 PM, jody <jody.xha_at_[hidden]> wrote:
>> Forgot to add:
>> I have /opt/openmpi/bin in my $PATH.
>>
>> I experimented some more and found that it
>> also works without errors if I use
>> /opt/openmpi/bin/mpirun -np 2 ./sr
>>
>> I don't understand this, because 'mpirun' alone should be the same
>> thing:
>> [jody_at_localhost 3D]$ which mpirun
>> /opt/openmpi/bin/mpirun
>>
>> Thank you for an explanation.
>>
>> Jody
>>
>> On Fri, Feb 13, 2009 at 1:39 PM, jody <jody.xha_at_[hidden]> wrote:
>>> Yes, it was doing no sensible work -
>>> It was only intended to show the error message.
>>>
>>> I now downloaded the latest nightly tarball and installed it,
>>> and used your version of the test program. It works -
>>> *if* I use the entire path to mpirun:
>>>
>>> [jody_at_localhost 3D]$ /opt/openmpi-1.3.1a0r20534/bin/mpirun -np
>>> 2 ./sr
>>>
>>> but if I use the name alone, I get the error:
>>>
>>> [jody_at_localhost 3D]$ mpirun -np 2 ./sr
>>> [localhost.localdomain:29285] *** An error occurred in MPI_Sendrecv
>>> [localhost.localdomain:29285] *** on communicator MPI_COMM_WORLD
>>> [localhost.localdomain:29285] *** MPI_ERR_RANK: invalid rank
>>> [localhost.localdomain:29285] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>> [localhost.localdomain:29286] *** An error occurred in MPI_Sendrecv
>>> [localhost.localdomain:29286] *** on communicator MPI_COMM_WORLD
>>> [localhost.localdomain:29286] *** MPI_ERR_RANK: invalid rank
>>> [localhost.localdomain:29286] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>
>>> Interestingly, it seems to be the same version:
>>> [jody_at_localhost 3D]$ mpirun --version
>>> mpirun (Open MPI) 1.3.1a0r20534
>>>
>>> i.e. the version is ok.
>>>
>>> I have my Open MPI versions installed in directories
>>> /opt/openmpi-1.xxx
>>> and create a link:
>>> ln -s /opt/openmpi-1.xxx /opt/openmpi
>>> I do it like this so I can easily switch between different versions.
>>>
>>> Could the different behaviour of mpirun and
>>> /opt/openmpi-1.3.1a0r20534/bin/mpirun
>>> have its cause in this setup?
>>>
>>> Thank you
>>> Jody
>>>
>>> On Fri, Feb 13, 2009 at 1:18 AM, Jeff Squyres <jsquyres_at_[hidden]>
>>> wrote:
>>>> On Feb 12, 2009, at 2:00 PM, jody wrote:
>>>>
>>>>> In my application I use MPI_PROC_NULL
>>>>> as an argument in MPI_Sendrecv to simplify the
>>>>> program (i.e. no special cases for borders).
>>>>> With 1.3 it works, but under 1.3.1a0r20520
>>>>> I get the following error:
>>>>> [jody_at_localhost 3D]$ mpirun -np 2 ./sr
>>>>> [localhost.localdomain:29253] *** An error occurred in MPI_Sendrecv
>>>>> [localhost.localdomain:29253] *** on communicator MPI_COMM_WORLD
>>>>> [localhost.localdomain:29253] *** MPI_ERR_RANK: invalid rank
>>>>> [localhost.localdomain:29253] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>> [localhost.localdomain:29252] *** An error occurred in MPI_Sendrecv
>>>>> [localhost.localdomain:29252] *** on communicator MPI_COMM_WORLD
>>>>> [localhost.localdomain:29252] *** MPI_ERR_RANK: invalid rank
>>>>> [localhost.localdomain:29252] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>
>>>> Your program as written should hang, right? You're trying to receive
>>>> from MCW rank 1 and no process is sending.
>>>>
>>>> I slightly modified your code:
>>>>
>>>> #include <stdio.h>
>>>> #include "mpi.h"
>>>>
>>>> int main() {
>>>>     int iRank;
>>>>     int iSize;
>>>>     MPI_Status st;
>>>>
>>>>     MPI_Init(NULL, NULL);
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &iSize);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &iRank);
>>>>
>>>>     if (1 == iRank) {
>>>>         MPI_Send(&iSize, 1, MPI_INT, 0, 77, MPI_COMM_WORLD);
>>>>     } else if (0 == iRank) {
>>>>         MPI_Sendrecv(&iRank, 1, MPI_INT, MPI_PROC_NULL, 77,
>>>>                      &iSize, 1, MPI_INT, 1, 77, MPI_COMM_WORLD, &st);
>>>>     }
>>>>
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }
>>>>
>>>> And that works fine for me at the head of the v1.3 branch:
>>>>
>>>> [16:17] svbu-mpi:~/svn/ompi-1.3 % svnversion .
>>>> 20538
>>>>
>>>> We did have a few bad commits on the v1.3 branch recently; could you
>>>> try with a tarball from tonight, perchance?
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>
>>
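
For reference, the "no special cases for borders" pattern that MPI_PROC_NULL
enables usually looks something like the sketch below: every rank issues the
same Sendrecv with its left and right neighbors, and the ranks at either end
pass MPI_PROC_NULL, which turns those halves of the exchange into no-ops.
This is a minimal illustration, not jody's actual program; all names here
are made up.

#include "mpi.h"

/* Minimal sketch of a 1-D border exchange.  The first and last ranks have
 * no neighbor on one side; MPI_PROC_NULL makes those send/receive halves
 * no-ops, so the borders need no special-casing. */
int main(int argc, char **argv) {
    int rank, size, sendval, recvval = -1;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Neighbors; MPI_PROC_NULL at the two ends. */
    int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    /* Shift one int to the right; every rank makes the identical call. */
    sendval = rank;
    MPI_Sendrecv(&sendval, 1, MPI_INT, right, 0,
                 &recvval, 1, MPI_INT, left,  0, MPI_COMM_WORLD, &st);

    MPI_Finalize();
    return 0;
}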

-- 
Jeff Squyres
Cisco Systems