Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-04-28 16:19:38


We figured out that when you provide both the full path to mpirun and the -prefix option, we ignore the latter. :-/

I'm working on a patch to at least warn you we are ignoring it.
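
To illustrate the behavior described above, here is a minimal shell sketch. It is not the patch and not Open MPI code: effective_prefix is a hypothetical helper, and it assumes the usual <prefix>/bin/mpirun layout, so the effective prefix is taken to be the directory above bin/.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): given the mpirun path used on
# the command line and the value passed to --prefix, print the installation
# prefix that would take effect under the behavior described in this thread.
# Assumption: mpirun is installed as <prefix>/bin/mpirun.
effective_prefix() {
    mpirun_path=$1
    prefix_opt=$2
    case $mpirun_path in
        /*)
            # Absolute path: the prefix is derived from the path itself.
            derived=$(dirname "$(dirname "$mpirun_path")")
            if [ -n "$prefix_opt" ] && [ "$prefix_opt" != "$derived" ]; then
                echo "warning: --prefix $prefix_opt ignored; using $derived" >&2
            fi
            echo "$derived"
            ;;
        *)
            # Bare name found via PATH: --prefix is used as given.
            echo "$prefix_opt"
            ;;
    esac
}
```

With the values from the command line quoted below, `effective_prefix /opt/openmpi/i386/bin/mpirun /usr/lib/openmpi/1.2.8-gcc/bin` prints /opt/openmpi/i386 and writes the warning to stderr.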

On Apr 28, 2011, at 2:03 PM, Sindhi, Waris PW wrote:

> The --prefix directory is a typo and no longer exists on our system.
>
> We are running version 1.4-4 of OpenMPI.
>
> % /opt/openmpi/x86_64/bin/ompi_info
>
> Package: Open MPI mockbuild_at_[hidden] Distribution
> Open MPI: 1.4
> Open MPI SVN revision: r22285
> Open MPI release date: Dec 08, 2009
> Open RTE: 1.4
>
>
> Sincerely,
>
> Waris Sindhi
> High Performance Computing, TechApps
> Pratt & Whitney, UTC
> (860)-565-8486
>
> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph Castain
> Sent: Thursday, April 28, 2011 9:02 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded
>
>
> On Apr 28, 2011, at 6:49 AM, Jeff Squyres wrote:
>
>> On Apr 28, 2011, at 8:45 AM, Ralph Castain wrote:
>>
>>> What led you to conclude 1.2.8?
>>>
>>>>>>> /opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp
>>>>>>> --mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix
>>>>>>> /usr/lib/openmpi/1.2.8-gcc/bin -np 239 --app procgroup
>>
>> His command line has "1.2.8" in it.
>
> Actually, that isn't totally correct and may point to the problem. The
> mpirun cmd itself points to a version of OMPI located in /opt/openmpi.
> The error messages are clearly from a 1.3+ version - they look totally
> different in 1.2.
>
> However, the prefix passed to the backend nodes points to /usr/lib, and
> indeed looks like a 1.2.8 version.
>
> Waris: is this a mistype? Are these two versions actually the same?
>
> If not, that would explain the problem - you can't mix OMPI versions. As
> written, the cmd line has the potential to mix one version of mpirun
> with another version of the daemons.
>
>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users