
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Problem of running 'mpirun' on a cross-compiled openmpi-1.6.5 for armv7
From: Allan Wu (allwu_at_[hidden])
Date: 2014-04-08 18:58:15


Thank you, Ralph and Jeff!

After I downloaded the new version (1.8) and recompiled it based on your
suggestions, it finally works for me, at least for a few helloworld-like
applications. For future reference, here is the configuration script I
used:

./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
--disable-mpi-fortran \
--disable-mpi-cxx \
--prefix=`pwd`/install \
--enable-static \
--disable-dlopen
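
For anyone reproducing this build, the steps around that configure line might look roughly like the sketch below. The toolchain prefix `arm-linux-gnueabi-`, the `CROSS_PREFIX` and `DRY_RUN` variables, and the `run`/`build_ompi` helpers are assumptions made for illustration and do not come from the thread; substitute whatever prefix your vendor's cross-compiler actually uses.

```shell
#!/bin/sh
# Sketch of a cross-build sequence. CROSS_PREFIX is a hypothetical
# toolchain prefix; replace it with the one your cross-compiler uses.
CROSS_PREFIX="${CROSS_PREFIX:-arm-linux-gnueabi-}"

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_ompi() {
    # Point configure at the cross-compiler. Only CC is set here because
    # the C++ bindings are disabled in this configuration.
    CC="${CROSS_PREFIX}gcc"
    export CC
    run ./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
        --disable-mpi-fortran \
        --disable-mpi-cxx \
        --prefix="$(pwd)/install" \
        --enable-static \
        --disable-dlopen &&
    run make &&
    run make install
}
```

Calling `build_ompi` with `DRY_RUN=1` set only prints the three commands, which is a cheap way to review the options before starting a long build.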

As I mentioned in my original post, I disabled the shmem components because
I suspected the problem could be related to them. I thought since I do not
need them I can just disable them to see if that helps. I guess the
previous problem is more related to the specific ARM device than the shared
memory support.
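
Since the original failure was in opal_shmem_base_select, one quick sanity check is to see which shmem components a given build actually contains. The sketch below filters `ompi_info` output for them. It assumes component lines have the form "MCA shmem: <name> (...)", which may vary between releases, and the helper names are made up for this example.

```shell
#!/bin/sh
# List the shmem components named in `ompi_info` output read from stdin.
# Assumes component lines look like "MCA shmem: mmap (MCA v2.0, ...)".
list_shmem_components() {
    sed -n 's/^[[:space:]]*MCA shmem:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Succeed only if at least one shmem component is present on stdin.
has_shmem_component() {
    [ -n "$(list_shmem_components)" ]
}
```

On the target, this could be used as `ompi_info | list_shmem_components`. An empty result would be consistent with the situation Ralph describes, where the runtime refuses to start without any shmem component.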

Thanks again for the help. I will update if more problems come up.

Regards,
Allan

On Apr 7, 2014, at 10:29 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:

Also -- could you upgrade to Open MPI 1.8? It was released last week, and
should be much more ARM-friendly than Open MPI 1.6.x.

Also, I see some extra configure options. I suggest the following:

# I assume your --build and --host options are correct
# --disable-mpi-f77/f90 changed to --disable-mpi-fortran in 1.8
# You don't need --with-devel-headers; it's for OMPI developers only
# You don't need --enable-binaries; it's the default (and always will be)
# Do you really need *both* enable-static and enable-shared? Usually one
# is sufficient
# --enable-static will automatically invoke --disable-dlopen
./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
--disable-mpi-fortran \
--disable-mpi-cxx --prefix=`pwd`/install \
--enable-shared --enable-static \
--disable-mmap-shmem --disable-posix-shmem --disable-sysv-shmem \
--disable-dlopen
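
If the same build script has to serve both the 1.6 and 1.8 series, the Fortran option rename mentioned in the comments above could be handled with a small helper like the sketch below. The function name is hypothetical, and only the two series discussed in this thread are covered; any other version is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Print the configure flag(s) that disable the Fortran bindings for a
# given Open MPI version string. Only the 1.6 and 1.8 series are handled.
fortran_disable_flags() {
    case "$1" in
        1.6|1.6.*) echo "--disable-mpi-f77 --disable-mpi-f90" ;;
        1.8|1.8.*) echo "--disable-mpi-fortran" ;;
        *) echo "unknown Open MPI version: $1" >&2; return 1 ;;
    esac
}
```

For example, `./configure $(fortran_disable_flags 1.8) --disable-mpi-cxx ...` would expand to the 1.8 spelling of the option.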

On Apr 4, 2014, at 12:02 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> I'm afraid the current system will refuse to run without any shmem
> components. Even if you remove the error check for that situation, you may
> hit other problems where the system is expecting that framework to perform
> some function - not having an active module could cause an issue at that
> point.
>
> Since you aren't going to use it anyway, does it really matter that it
> exists? Or is the problem that none of the shmem components can build or
> run in that environment? If so, then we can take a look at what might be
> involved in completely disabling it.
>
> I'm afraid that hwloc isn't relevant here - doesn't really have anything
> to do with the shmem situation.
>
> On Apr 1, 2014, at 2:52 PM, Allan Wu <allwu_at_[hidden]> wrote:
>
>> Hello everyone,
>>
>> I am trying to run OpenMPI-1.6.5 on Linux on a system based on the ARM
>> Cortex-A9. The Linux system and the hardware are provided by Xilinx Inc.;
>> for those who may have related experience, the system is called Zynq,
>> an embedded SoC with ARM cores and FPGA fabric. Xilinx provides a
>> cross-compiler for the system, which I used to compile openmpi, and the
>> compilation succeeded. Here is the configuration script I used for the
>> compilation:
>> ./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
>> --disable-mpi-f77 --disable-mpi-f90 \
>> --disable-mpi-cxx --prefix=`pwd`/install \
>> --with-devel-headers --enable-binaries \
>> --enable-shared --enable-static \
>> --disable-mmap-shmem --disable-posix-shmem --disable-sysv-shmem \
>> --disable-dlopen
>>
>> For the cross-compiler, I have set the environment variables "CC" and
>> "CXX".
>>
>> When I launch 'mpirun' on the ARM Linux system, I get an error like this:
>>
>> It looks like opal_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during opal_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> opal_shmem_base_select failed
>> --> Returned value -1 instead of OPAL_SUCCESS
>> --------------------------------------------------------------------------
>> [ZC702:01353] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
>> [ZC702:01353] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694
>>
>> I have attached the compressed output of 'ompi_info --all'.
>>
>> For some more related information, I have been tuning the configuration
>> settings for a while, and I am afraid some of them may not be quite
>> appropriate. My general goal is to enable message passing in my system of
>> several such chips connected via Ethernet, so I will not launch more than
>> one process on any single machine. That is why I wanted to disable the
>> shared memory support, although doing so does not change the outcome for me.
>> I also got a lot of error messages about MCA failing to find components,
>> which is why I tried disabling dlopen.
>>
>> I am also looking for suggestions. Basically I want to compile a "clean"
>> version of OpenMPI with only the core message passing support, which may
>> automatically get rid of a lot of the cross-compilation headaches.
>> When I searched through the documentation, I came across Portable
>> Hardware Locality (hwloc); however, the idea is completely new to me,
>> so I do not know whether it would be relevant to my case.
>>
>> Thank you in advance for your suggestions! Please tell me if I need to
>> provide further information about my system.
>>
>> Regards,
>> --
>> Di Wu (Allan)
>> VAST Laboratory (http://vast.cs.ucla.edu/),
>> Department of Computer Science, UC Los Angeles
>> Email: allwu_at_[hidden]
>>
>> <log.tar.gz>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/04/14440.php
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/04/14459.php

--
Di Wu (Allan)
PhD student, VAST Laboratory <http://vast.cs.ucla.edu/>,
Department of Computer Science, UC Los Angeles
Email: allwu_at_[hidden]