Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Mac OS X Static PGI
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-03-01 21:20:21


On Mar 1, 2011, at 1:34 PM, David Robertson wrote:

> Hi,
>
> > Error means OMPI didn't find a network interface - do you have your
> > networks turned off? Sometimes people travel with Airport turned off.
> > If you have no wire connected, then no interfaces exist.
>
> I am logged in to the machine remotely through the wired interface. The Airport is always off. I have Open MPI built and running fine with gcc/ifort and gcc/gfortran using shared libraries. I have compiled and run successfully with both shared and static libraries with gcc/ifort. I have not tried the static libraries with gfortran/gcc.
>
> ifconfig gives me:
>
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
> inet6 ::1 prefixlen 128
> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
> inet 127.0.0.1 netmask 0xff000000
> gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
> stf0: flags=0<> mtu 1280
> en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> ether 10:9a:dd:55:bb:52
> inet6 fe80::129a:ddff:fe55:bb52%en0 prefixlen 64 scopeid 0x4
> inet 192.168.30.13 netmask 0xffffc000 broadcast 192.168.63.255
> media: autoselect (1000baseT <full-duplex>)
> status: active
> fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
> lladdr 70:cd:60:ff:fe:2f:01:8e
> media: autoselect <full-duplex>
> status: inactive
> en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> ether c8:bc:c8:c9:fc:a9
> media: autoselect (<unknown type>)
> status: inactive
> vnic0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> ether 00:1c:42:00:00:08
> inet 10.211.55.2 netmask 0xffffff00 broadcast 10.211.55.255
> media: autoselect
> status: active
> vnic1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> ether 00:1c:42:00:00:09
> inet 10.37.129.2 netmask 0xffffff00 broadcast 10.37.129.255
> media: autoselect
> status: active
> vboxnet0: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> ether 0a:00:27:00:00:00
>
> Are you saying that Open MPI is only looking for the Airport (en1) card and not en0?

No, it isn't. However, the error message says what I indicated: it is failing because it gets an error when trying to open a port on an available network. I can't debug your network to find out why. I know that Mac OS X doesn't really like (nor does Apple really support) static builds, and it has been a long time since I built Open MPI that way on my Mac. Looking at my old static config file, I don't see anything special in it.
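If you want to rule out interface selection entirely, you could try pinning both the runtime (OOB) and MPI (BTL) TCP traffic to the wired interface. These are the 1.4-series MCA parameter names; consider this a diagnostic sketch, not a fix:

```shell
# Restrict Open MPI's runtime and TCP transport to en0 explicitly.
# Parameter names are the 1.4-series ones; adjust for other versions.
mpirun -np 4 \
    -mca oob_tcp_if_include en0 \
    -mca btl_tcp_if_include en0 \
    oceanG ocean_upwelling.in
```

If it still fails the same way with the interface named explicitly, the problem is in the socket calls themselves rather than in interface discovery.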

That said, I know we had some early problems with static builds on the Mac (like I said, Apple doesn't really support it). Those were solved, though, and none of those problems had this symptom.

Could be something strange about PGI and socket libs when running static, but I wouldn't know - I don't use PGI.

Sorry I can't be of more help. I suggest asking PGI about socket support issues with their compiler on the Mac - or, given Apple's lack of support for static builds, not using PGI at all if static is the only mode they support (it seems bizarre that PGI would demand it).

> Why would it do that for PGI only?

It doesn't, nor does it care what compiler is used.

>
> Thanks,
> Dave
>
>
> On Mar 1, 2011, at 11:50 AM, David Robertson <robertson_at_[hidden]> wrote:
>
> > Hi all,
> >
> > I am having trouble with PGI on Mac OS X 10.6.6. PGI's support staff has informed me that PGI does not "support 64-bit shared library creation" on the Mac. Therefore, I have built Open MPI in static only mode (--disable-shared --enable-static).
> >
> > I have to do some manipulation to get my application through the final linking stage (more on that at the bottom), but I get an immediate crash at runtime:
> >
> >
> > <<<<<<<<<<<<<<<<<<<<<<<< start of output
> > bash-3.2$ mpirun -np 4 oceanG ocean_upwelling.in
> > [flask.marine.rutgers.edu:14186] opal_ifinit: unable to find network interfaces.
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 181
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_rml_base_select failed
> > --> Returned value Error (-1) instead of ORTE_SUCCESS
> > --------------------------------------------------------------------------
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 132
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_ess_set_name failed
> > --> Returned value Error (-1) instead of ORTE_SUCCESS
> > --------------------------------------------------------------------------
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file orterun.c at line 543
> > >>>>>>>>>>>>>>>>>>>>>>>> end of output
> >
> >
> > When I google for this error, the only result I find is a patch for version 1.1.2, which doesn't even resemble the current state of the Open MPI code.
> >
> > iMac info:
> >
> > ProductName: Mac OS X
> > ProductVersion: 10.6.6
> > BuildVersion: 10J567
> >
> > Has anyone seen this before or have an idea what to try?
> >
> > Thanks,
> > Dave
> >
> > P.S. I get the same results with Open MPI configured with:
> >
> > ./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp F77=pgf77 FC=pgf90 --enable-mpirun-prefix-by-default --disable-shared --enable-static --without-memory-manager --without-libnuma --disable-ipv6 --disable-io-romio --disable-heterogeneous --enable-mpi-f77 --enable-mpi-f90 --enable-mpi-profile
> >
> > and
> >
> > ./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp F77=pgf77 FC=pgf90 --disable-shared --enable-static
> >
> >
> >
> > P.P.S. Linking workarounds:
> >
> > Snow Leopard ships with Open MPI libraries that interfere when linking programs built with my compiled mpif90. The problem is that 'ld' searches every directory in the search path for shared objects before it will look for static archives. That means a line like:
> >
> > pgf90 x.o -o a.out -L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi
> >
> > will use the .a files in /opt/openmpi/lib for the Fortran bindings (Snow Leopard doesn't ship those), but when it gets to -lmpi it picks up libmpi.dylib from /usr/lib instead, causing undefined references. Note that the line above was obtained with the -show:link option to mpif90.
> >
> > I have found two workarounds. One is to edit the share/openmpi/mpif90-wrapper-data.txt file to use full paths to the static libraries (this is what the PGI-shipped version of Open MPI does). The other is to add the line:
> >
> > switch -search_paths_first is replace(-search_paths_first) positional(linker);
> >
> > to the /path/to/pgi/bin/siterc file and set LDFLAGS to -search_paths_first in my application.
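Spelled out, the first workaround makes the wrapper emit the archives by full path, so ld never gets a chance to prefer the system dylib. A sketch of the resulting link line (paths and support archives are illustrative for a 1.4-series static install):

```shell
# Link against the static archives by full path instead of -l flags,
# so /usr/lib/libmpi.dylib is never considered (illustrative paths).
pgf90 x.o -o a.out \
    /opt/openmpi/lib/libmpi_f90.a \
    /opt/openmpi/lib/libmpi_f77.a \
    /opt/openmpi/lib/libmpi.a \
    /opt/openmpi/lib/libopen-rte.a \
    /opt/openmpi/lib/libopen-pal.a
```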
> >
> > from the ld manpage:
> >
> > -search_paths_first
> > By default the -lx and -weak-lx options first search for a file
> > of the form `libx.dylib' in each directory in the library search
> > path, then a file of the form `libx.a' is searched for in the
> > library search paths. This option changes it so that in each
> > path `libx.dylib' is searched for then `libx.a' before the next
> > path in the library search path is searched.
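The two search strategies in that manpage excerpt can be modeled in a few lines. This is a toy model of the resolution order only, not real ld behavior; directory contents below mirror the situation in this thread:

```python
# Toy model of ld's two library-search strategies for -lx.
# find_library returns the path "ld" would pick, given a map of
# directory -> set of filenames present there (illustrative, not real ld).

def find_library(name, search_path, contents, search_paths_first=False):
    dylib = f"lib{name}.dylib"
    archive = f"lib{name}.a"
    if search_paths_first:
        # -search_paths_first: in each directory, try the dylib,
        # then the archive, before moving to the next directory
        for d in search_path:
            for f in (dylib, archive):
                if f in contents.get(d, set()):
                    return f"{d}/{f}"
    else:
        # default: scan every directory for the dylib first, then
        # scan every directory again for the static archive
        for f in (dylib, archive):
            for d in search_path:
                if f in contents.get(d, set()):
                    return f"{d}/{f}"
    return None

# The situation from this thread: /opt/openmpi/lib has only libmpi.a,
# while Snow Leopard ships a libmpi.dylib in /usr/lib.
contents = {
    "/opt/openmpi/lib": {"libmpi.a"},
    "/usr/lib": {"libmpi.dylib"},
}
path = ["/opt/openmpi/lib", "/usr/lib"]

print(find_library("mpi", path, contents))        # -> /usr/lib/libmpi.dylib
print(find_library("mpi", path, contents, True))  # -> /opt/openmpi/lib/libmpi.a
```

This makes the failure mode concrete: by default the system dylib wins even though it appears later in the search path, while -search_paths_first lets the static archive in the earlier directory win.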
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users