Open MPI User's Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded
From: Sindhi, Waris PW (Waris.Sindhi_at_[hidden])
Date: 2011-04-28 16:03:18


The --prefix directory is a typo and no longer exists on our system.

We are running version 1.4-4 of OpenMPI:

% /opt/openmpi/x86_64/bin/ompi_info

                 Package: Open MPI mockbuild_at_[hidden] Distribution
                Open MPI: 1.4
   Open MPI SVN revision: r22285
   Open MPI release date: Dec 08, 2009
                Open RTE: 1.4
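
To compare installations across nodes, the version fields in output like the above can be pulled out programmatically. The following is a minimal Python sketch, not part of Open MPI; the function name `parse_ompi_info` and the `SAMPLE` text are illustrative assumptions, and it only handles the simple `key: value` lines shown here.

```python
def parse_ompi_info(text):
    """Return a dict built from the 'key: value' lines of ompi_info-style output."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; lines without a colon are skipped.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        info[key.strip()] = value.strip()
    return info


# Sample copied from the output quoted in this message.
SAMPLE = """\
                Open MPI: 1.4
   Open MPI SVN revision: r22285
   Open MPI release date: Dec 08, 2009
                Open RTE: 1.4
"""

if __name__ == "__main__":
    info = parse_ompi_info(SAMPLE)
    print(info["Open MPI"], info["Open RTE"])
```

Running the same parse on the output collected from each node and comparing the `Open MPI` fields would show whether every node reports the same version.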

Sincerely,

Waris Sindhi
High Performance Computing, TechApps
Pratt & Whitney, UTC
(860)-565-8486

-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph Castain
Sent: Thursday, April 28, 2011 9:02 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded

On Apr 28, 2011, at 6:49 AM, Jeff Squyres wrote:

> On Apr 28, 2011, at 8:45 AM, Ralph Castain wrote:
>
>> What led you to conclude 1.2.8?
>>
>>>>>> /opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp
>>>>>> --mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix
>>>>>> /usr/lib/openmpi/1.2.8-gcc/bin -np 239 --app procgroup
>
> His command line has "1.2.8" in it.

Actually, that isn't totally correct and may point to the problem. The
mpirun cmd itself points to a version of OMPI located in /opt/openmpi.
The error messages are clearly from a 1.3+ version; they look totally
different for 1.2.

However, the prefix passed to the backend nodes points to /usr/lib, and
indeed looks like a 1.2.8 version.

Waris: is this a mistype? Are these two versions actually the same?

If not, that would explain the problem - you can't mix OMPI versions. As
written, the cmd line has the potential to mix one version of mpirun
with another version of the daemons.
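
The mismatch described above can be spotted mechanically. The following is a rough Python sketch under stated assumptions: it treats the directory two levels above the `mpirun` executable as the installation root and compares that with the `--prefix` argument. The helper names `install_root` and `prefix_mismatch` are hypothetical, and the strict string comparison is only a heuristic (symlinked or relocated installs would need a different check).

```python
import posixpath
import shlex


def install_root(mpirun_path):
    """Assume <root>/bin/mpirun and return <root>."""
    return posixpath.dirname(posixpath.dirname(posixpath.normpath(mpirun_path)))


def prefix_mismatch(cmdline):
    """Return (mpirun_root, prefix) if they differ, otherwise None."""
    args = shlex.split(cmdline)
    if not args or "--prefix" not in args:
        return None
    idx = args.index("--prefix")
    if idx + 1 >= len(args):
        return None  # --prefix given without a value
    root = install_root(args[0])
    prefix = posixpath.normpath(args[idx + 1])
    return None if prefix == root else (root, prefix)


# Command line as quoted earlier in this thread.
CMD = (
    "/opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp "
    "--mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix "
    "/usr/lib/openmpi/1.2.8-gcc/bin -np 239 --app procgroup"
)

if __name__ == "__main__":
    print(prefix_mismatch(CMD))
```

For the quoted command, the two paths differ (`/opt/openmpi/i386` versus `/usr/lib/openmpi/1.2.8-gcc/bin`), which is the situation where mpirun and the remote daemons could come from different installations.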

>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
