Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Problem of running 'mpirun' on a cross-compiled openmpi-1.6.5 for armv7
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-04-04 00:02:11


I'm afraid the current system will refuse to run without any shmem components. Even if you remove the error check for that situation, you may hit other problems where the system is expecting that framework to perform some function - not having an active module could cause an issue at that point.

Since you aren't going to use it anyway, does it really matter that it exists? Or is the problem that none of the shmem components can build or run in that environment? If so, then we can take a look at what might be involved in completely disabling it.
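If the problem is that none of the shmem components can build for that target, one thing worth trying (a sketch only, not tested against 1.6.5 on ARM) is excluding those components at configure time with `--enable-mca-no-build`, which takes a comma-separated list of framework-component pairs. Whether the runtime then tolerates an empty shmem framework is exactly the open question above:

```shell
# Sketch: skip building the individual shmem components entirely.
# Flags other than --enable-mca-no-build are copied from the original
# configure line in this thread; adjust to taste.
./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
    --enable-mca-no-build=shmem-mmap,shmem-posix,shmem-sysv \
    --disable-dlopen --prefix=`pwd`/install
```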

I'm afraid that hwloc isn't relevant here - doesn't really have anything to do with the shmem situation.
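As a quick diagnostic on the target board (install path assumed from the configure prefix in the original mail), `ompi_info` will show which shmem components actually made it into the build:

```shell
# List the built MCA components; if no "MCA shmem" lines appear,
# the shmem framework has nothing to select from at runtime.
./install/bin/ompi_info | grep "MCA shmem"
```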

On Apr 1, 2014, at 2:52 PM, Allan Wu <allwu_at_[hidden]> wrote:

> Hello everyone,
>
> I am trying to run OpenMPI-1.6.5 on a Linux system based on the ARM Cortex-A9. The Linux system and the hardware are provided by Xilinx Inc., and for those who may have related experience, the system is called Zynq, an embedded SoC with ARM cores and FPGA fabric. Xilinx provides a cross-compiler for the system, which I used to compile Open MPI, and the compilation was successful. Here is the configuration script I used for the compilation:
> ./configure --build=arm-linux-gnueabi --host=armv7-linux-gnueabi \
> --disable-mpi-f77 --disable-mpi-f90 \
> --disable-mpi-cxx --prefix=`pwd`/install \
> --with-devel-headers --enable-binaries \
> --enable-shared --enable-static \
> --disable-mmap-shmem --disable-posix-shmem --disable-sysv-shmem \
> --disable-dlopen
>
> For the cross-compiler, I have set the environment variables "CC" and "CXX".
>
> When I launch 'mpirun' on the ARM Linux system, I get an error like this:
>
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> opal_shmem_base_select failed
> --> Returned value -1 instead of OPAL_SUCCESS
> --------------------------------------------------------------------------
> [ZC702:01353] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
> [ZC702:01353] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694
>
> I have attached the compressed output of 'ompi_info --all'.
>
> For some more related information: I have been tuning the configuration settings for a while, and I am afraid some of them may not be appropriate. My general goal is to enable message passing in a system of several such chips connected via Ethernet, so I will not launch more than one process on any single machine. That is why I wanted to disable the shared memory support, although doing so does not change the outcome for me.
> I also got a lot of error messages about the MCA failing to find components, which is why I tried disabling dlopen.
>
> I am also looking for suggestions. Basically, I want to compile a "clean" version of Open MPI with only the core message-passing support, which might avoid a lot of the cross-compilation headaches.
> While searching through the documentation, I came across Portable Hardware Locality (hwloc); however, the idea is completely new to me, so I do not know whether it is relevant to my case.
>
> Thank you in advance for your suggestions! Please tell me if I need to provide further information about my system.
>
> Regards,
> --
> Di Wu (Allan)
> VAST Laboratory (http://vast.cs.ucla.edu/),
> Department of Computer Science, UC Los Angeles
> Email: allwu_at_[hidden]
>
> <log.tar.gz>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/04/14440.php