Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem moving from 1.4 to 1.6
From: Jeffrey A Cummings (Jeffrey.A.Cummings_at_[hidden])
Date: 2014-06-27 14:11:56


Once again, you guys are assuming (incorrectly) that all your users are
working in an environment where everyone is free (based on corporate IT
policies) to do things like that. As an aside, you're also assuming that
all your users are Unix/Linux experts. I've been following this list for
several years and couldn't even begin to count the number of questions
from the non-experts who are struggling with something which is trivial
for you but not for them.

Jeffrey A. Cummings
Engineering Specialist
Performance Modeling and Analysis Department
Systems Analysis and Simulation Subdivision
Systems Engineering Division
Engineering and Technology Group
The Aerospace Corporation
571-307-4220
jeffrey.a.cummings_at_[hidden]

From: Reuti <reuti_at_[hidden]>
To: Open MPI Users <users_at_[hidden]>,
Date: 06/27/2014 02:03 PM
Subject: Re: [OMPI users] Problem moving from 1.4 to 1.6
Sent by: "users" <users-bounces_at_[hidden]>

Hi,

On 27.06.2014 at 19:56, Jeffrey A Cummings wrote:

> I appreciate your response and I understand the logic behind your
> suggestion, but you and the other regular expert contributors to this list
> are frequently working under a misapprehension. Many of your openMPI
> users don't have any control over what version of openMPI is available on
> their system. I'm stuck with whatever version my IT people choose to
> bless, which in general is the (possibly old and/or moldy) version that is
> bundled with some larger package (e.g., Rocks, Linux). The fact that I'm
> only now seeing this 1.4 to 1.6 problem illustrates the situation I'm in.
> I really need someone to dig into their memory archives to see if they can
> come up with a clue for me.

You can freely download the Open MPI source and install it, for example, in
your personal ~/local/openmpi-1.8 or the like. Pointing your $PATH and
$LD_LIBRARY_PATH to your own version will make it take precedence over the
system-installed one.
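
As a rough sketch of what that could look like (the prefix below is only an example location, and the build steps are left as comments because they assume you have already unpacked the tarball and have a working compiler):

```shell
# Hypothetical install prefix; any directory you can write to will do.
PREFIX="$HOME/local/openmpi-1.8"

# Build steps, run once from the unpacked source tree:
#   ./configure --prefix="$PREFIX"
#   make all install

# Put the private install ahead of the system one for this shell session.
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show which mpirun the shell now resolves, if any is installed.
command -v mpirun || echo "mpirun not found in PATH yet"
```

The same two export lines would need to be in effect on every node that runs the job, for instance via your shell startup file. No root access is needed for any of this.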

-- Reuti

> Jeffrey A. Cummings
> Engineering Specialist
> Performance Modeling and Analysis Department
> Systems Analysis and Simulation Subdivision
> Systems Engineering Division
> Engineering and Technology Group
> The Aerospace Corporation
> 571-307-4220
> jeffrey.a.cummings_at_[hidden]
>
>
>
> From: Gus Correa <gus_at_[hidden]>
> To: Open MPI Users <users_at_[hidden]>,
> Date: 06/27/2014 01:45 PM
> Subject: Re: [OMPI users] Problem moving from 1.4 to 1.6
> Sent by: "users" <users-bounces_at_[hidden]>
>
>
>
> It may be easier to install the latest OMPI from the tarball,
> rather than trying to sort out the error.
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> The packaged build of (somewhat old) OMPI 1.6.2 that came with
> Linux may not have been built against the same IB libraries, hardware,
> and configuration you have.
> [The error message reference to udapl is ominous.]
>
> > The mpirun command line contains the argument '--mca btl ^openib', which
> > I thought told mpi to not look for the ib interface.
>
> As you said, the mca parameter above tells OMPI not to use openib,
> although openib may not be the only cause of the problem.
> If you want to use openib, switch to
> --mca btl openib,sm,self
>
> Another thing to check is whether there is a mixup of environment
> variables, with PATH and LD_LIBRARY_PATH perhaps pointing to the old OMPI
> version you may have installed.
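
One thing that might be worth trying, given that the help message in the original report comes from help-mpi-btl-udapl.txt: the warning appears to be raised by the udapl BTL rather than by openib, so excluding openib alone would not silence it. The following is only a sketch under that assumption; the application name and process count in the comment are placeholders.

```shell
# Exclude both the openib and udapl BTLs; a leading ^ negates the whole list.
export OMPI_MCA_btl='^openib,udapl'

# Equivalent on the command line (./my_app and -np 4 are placeholders):
#   mpirun --mca btl ^openib,udapl -np 4 ./my_app

# List the BTL components this build contains, if ompi_info is on PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i btl || true
fi
```

If udapl does not show up in the ompi_info output, this guess is wrong and the exclusion list should be left as it was.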
>
> My two cents,
> Gus Correa
>
> On 06/27/2014 12:53 PM, Jeffrey A Cummings wrote:
> > We have recently upgraded our cluster to a version of Linux which comes
> > with openMPI version 1.6.2.
> >
> > An application which ran previously (using some version of 1.4) now
> > errors out with the following messages:
> >
> > librdmacm: Fatal: no RDMA devices found
> > librdmacm: Fatal: no RDMA devices found
> > librdmacm: Fatal: no RDMA devices found
> >
> >
> > --------------------------------------------------------------------------
> > WARNING: Failed to open "OpenIB-cma" [DAT_INTERNAL_ERROR:].
> > This may be a real error or it may be an invalid entry in the uDAPL
> > Registry which is contained in the dat.conf file. Contact your local
> > System Administrator to confirm the availability of the interfaces in
> > the dat.conf file.
> > --------------------------------------------------------------------------
> > [tupile:25363] 2 more processes have sent help message help-mpi-btl-udapl.txt / dat_ia_open fail
> > [tupile:25363] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> >
> > The mpirun command line contains the argument '--mca btl ^openib', which
> > I thought told mpi to not look for the ib interface.
> >
> > Can anyone suggest what the problem might be? Did the relevant syntax
> > change between versions 1.4 and 1.6?
> >
> >
> > Jeffrey A. Cummings
> > Engineering Specialist
> > Performance Modeling and Analysis Department
> > Systems Analysis and Simulation Subdivision
> > Systems Engineering Division
> > Engineering and Technology Group
> > The Aerospace Corporation
> > 571-307-4220
> > jeffrey.a.cummings_at_[hidden]
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: http://www.open-mpi.org/community/lists/users/2014/06/24721.php
> >