Open MPI User's Mailing List Archives

From: Tim Prins (tprins_at_[hidden])
Date: 2007-10-24 20:23:15


Glad you found the problem.

Don't worry about the '--num_procs 3'. This does not refer to the number
of application processes, but rather to the number of 'daemon' processes
plus 1 for mpirun. However, this is an internal interface that changes
between versions of Open MPI, so this explanation is subject to
change :)
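
To make the arithmetic concrete, here is a minimal Python sketch. The function name is hypothetical and is not part of Open MPI. It assumes the v1.1-era behaviour described above (daemons plus one for mpirun) and assumes, for illustration only, that two daemons were started in the run discussed in this thread.

```python
def expected_num_procs(daemon_count):
    """Value passed to orted as --num_procs under the rule described above.

    Hypothetical helper: number of daemon processes plus 1 for mpirun.
    This is an internal detail and may differ between Open MPI versions.
    """
    if daemon_count < 0:
        raise ValueError("daemon_count must be non-negative")
    return daemon_count + 1


if __name__ == "__main__":
    # Assumption: two daemons were launched for the 'mpirun -np 2' run.
    print(expected_num_procs(2))
```

Under that assumption the result matches the 3 seen in the rsh log, independent of the value given to -np.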

Tim

Jorge Parra wrote:
> Hi Tim,
>
> Thank you for your reply.
>
> You are right, my Open MPI version is rather old. However, I am stuck with
> it until I can compile v1.2.4. I have had some problems with that (I already
> opened a case on Oct 15th).
>
> You were also right about my hostname. uname -n reported (none), and the
> "hostname" command did not exist on the nodes of my cluster. I have now
> added it to the nodes and modified the /etc/hosts file. The error went
> away, and now I can see that orted runs on the remote node. It is strange
> to me that orted runs with --num_procs 3 when mpirun was executed with -np
> 2. Does this sound correct to you? I might open a new case for it
> though...
>
>
> Thank you for your help,
>
> Jorge
>
> On Mon, 22 Oct 2007, Tim Prins wrote:
>
>> Sorry to reply to my own mail.
>>
>> Just browsing through the logs you sent, and I see that 'hostname' should be
>> working fine. However, you are using v1.1.5 which is very old. I would
>> strongly suggest upgrading to v1.2.4. It is a huge improvement over the old
>> v1.1 series (which is not being maintained anymore).
>>
>> Tim
>>
>> On Monday 22 October 2007 08:41:30 pm Tim Prins wrote:
>>> Hi Jorge,
>>>
>>> This is interesting. The problem is the universe name:
>>> root@(none):default-universe
>>>
>>> The "(none)" part is supposed to be the hostname where mpirun is executed.
>>> Try running:
>>> hostname
>>>
>>> and:
>>> uname -n
>>>
>>> These should both return valid hostnames for your machine.
>>>
>>> Open MPI pretty much assumes that all nodes have a valid (preferably
>>> unique) hostname. If the above commands don't work, you probably need to
>>> fix your cluster.
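
A minimal Python sketch of the same check, for anyone who wants to script it across nodes. The helper names and the set of rejected values are illustrative assumptions, not anything Open MPI defines. socket.gethostname() reads the same name the hostname command prints, and os.uname().nodename is what uname -n reports.

```python
import os
import socket

# Values treated as "no usable hostname". "(none)" is what showed up in the
# universe name in this thread; the set itself is an illustrative assumption.
INVALID_NAMES = {"", "(none)"}


def is_valid_hostname(name):
    """Return True if name looks like a usable node name."""
    return name.strip() not in INVALID_NAMES


def check_local_hostname():
    """Map each lookup to (reported name, whether it looks valid)."""
    names = {
        "socket.gethostname()": socket.gethostname(),
        "os.uname().nodename": os.uname().nodename,
    }
    return {src: (name, is_valid_hostname(name)) for src, name in names.items()}


if __name__ == "__main__":
    for src, (name, ok) in check_local_hostname().items():
        print(f"{src}: {name!r} -> {'ok' if ok else 'INVALID'}")
```

This only checks that a name is set on the local node. It does not verify that names are unique across the cluster or that they resolve through /etc/hosts.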
>>>
>>> Let me know if this does not work.
>>>
>>> Thanks,
>>>
>>> Tim
>>>
>>> On Thursday 18 October 2007 09:22:09 pm Jorge Parra wrote:
>>>> Hi,
>>>>
>>>> When trying to execute an application that spawns to another node, I
>>>> obtain the following message:
>>>>
>>>> # ./mpirun --hostfile /root/hostfile -np 2 greetings
>>>> Syntax error: "(" unexpected (expecting ")")
>>>> --------------------------------------------------------------------------
>>>> Could not execute the executable
>>>> "/opt/OpenMPI/OpenMPI-1.1.5b/exec/bin/greetings
>>>> ": Exec format error
>>>>
>>>> This could mean that your PATH or executable name is wrong, or that you
>>>> do not have the necessary permissions. Please ensure that the executable
>>>> is able to be found and executed.
>>>> --------------------------------------------------------------------------
>>>>
>>>> and in the remote node:
>>>>
>>>> # pam_rhosts_auth[183]: user root has a `+' user entry
>>>> pam_rhosts_auth[183]: allowed to root@192.168.1.102 as root
>>>> PAM_unix[183]: (rsh) session opened for user root by (uid=0)
>>>> in.rshd[184]: root@192.168.1.102 as root: cmd='( ! [ -e ./.profile ] || . ./.profile; orted --bootproxy 1 --name 0.0.1 --num_procs 3 --vpid_start 0 --nodename 192.168.1.103 --universe root@(none):default-universe --nsreplica "0.0.0;tcp://192.168.1.102:32774" --gprreplica "0.0.0;tcp://192.168.1.102:32774" --mpi-call-yield 0 )'
>>>> PAM_unix[183]: (rsh) session closed for user root
>>>>
>>>> I suspect that the command rsh is trying to execute on the remote node
>>>> fails. It seems to me that the first parenthesis in cmd='( ! is not
>>>> interpreted correctly, which causes the syntax error. This might prevent
>>>> .profile from running and setting PATH correctly, so "greetings" is not
>>>> found.
>>>>
>>>> I am attaching to this email the appropriate configuration files for my
>>>> system and for Open MPI on it. This is a system on an isolated network,
>>>> so I am not too concerned about security. Therefore, I am using rsh on it.
>>>>
>>>> I would really appreciate any suggestions to correct this problem.
>>>>
>>>> Thank you,
>>>>
>>>> Jorge
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users