Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Cuda Aware MPI Problem
From: Özgür Pekçağlıyan (ozgur.pekcagliyan_at_[hidden])
Date: 2013-12-13 08:02:44


Hello again,

I have compiled openmpi-1.9a1r29873 from the nightly trunk build, and so far
everything looks all right. But I have not tested the CUDA support yet.
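
Before running any CUDA code, a quick first check is to ask the installation
whether CUDA support was compiled in. This is a minimal sketch, assuming that
`ompi_info` from this build is first in PATH and that it reports the
`mpi_built_with_cuda_support` parameter described in the Open MPI FAQ (not
confirmed for this nightly). The helper name `cuda_flag` is hypothetical.

```shell
#!/bin/sh
# cuda_flag (hypothetical helper): read `ompi_info --parsable --all` output
# on stdin and print the value of mpi_built_with_cuda_support, or "unknown"
# when the parameter is not reported at all.
cuda_flag() {
    awk -F: '
        /mpi_built_with_cuda_support:value:/ && !found { print $NF; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Only query the installation when ompi_info is actually available.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --parsable --all | cuda_flag
fi
```

A result of "true" only says the library was built with CUDA support; it does
not exercise sending or receiving from device buffers.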

On Fri, Dec 13, 2013 at 2:38 PM, Özgür Pekçağlıyan <
ozgur.pekcagliyan_at_[hidden]> wrote:

> Hello,
>
> I am having difficulty compiling Open MPI with CUDA support. I followed
> this FAQ entry (http://www.open-mpi.org/faq/?category=building#build-cuda),
> as shown below:
>
> $ cd openmpi-1.7.3/
> $ ./configure --with-cuda=/usr/local/cuda-5.5
> $ make all install
>
> Everything goes fine during compilation, but when I try to run the
> simplest MPI hello world application, I get the following error:
>
> $ mpicc hello.c -o hello
> $ mpirun -np 2 hello
>
> hello: symbol lookup error: /usr/local/lib/openmpi/mca_pml_ob1.so:
> undefined symbol: progress_one_cuda_htod_event
> hello: symbol lookup error: /usr/local/lib/openmpi/mca_pml_ob1.so:
> undefined symbol: progress_one_cuda_htod_event
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 30329 on
> node cudalab1 exiting improperly. There are three reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> You can avoid this message by specifying -quiet on the mpirun command line.
>
> --------------------------------------------------------------------------
>
> $ mpirun -np 1 hello
>
> hello: symbol lookup error: /usr/local/lib/openmpi/mca_pml_ob1.so:
> undefined symbol: progress_one_cuda_htod_event
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 30327 on
> node cudalab1 exiting improperly. There are three reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> You can avoid this message by specifying -quiet on the mpirun command line.
>
> --------------------------------------------------------------------------
>
>
> Any suggestions?
> I have two PCs with Intel Core i3 CPUs and GeForce GTX 480 GPUs.
>
>
> And here is the hello.c file:
>
> #include <stdio.h>
> #include <mpi.h>
>
>
> int main (int argc, char **argv)
> {
>     int rank, size;
>
>     MPI_Init (&argc, &argv); /* starts MPI */
>     MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
>     MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>     printf( "Hello world from process %d of %d\n", rank, size );
>     MPI_Finalize();
>     return 0;
> }
>
>
>
>
> --
> Özgür Pekçağlıyan
> B.Sc. in Computer Engineering
> M.Sc. in Computer Engineering
>
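
For the symbol lookup error quoted above, one way to narrow it down is to see
which installed objects reference the missing symbol and whether any of them
defines it. This is a rough sketch, assuming binutils `nm` is installed and
that the install prefix is the `/usr/local/lib` shown in the error message.
The helper name `symbol_state` is hypothetical.

```shell
#!/bin/sh
# symbol_state (hypothetical helper): read `nm -D` output on stdin and report
# whether the symbol named in $1 is "undefined" (type U), "defined" (any
# other type), or "absent" (not listed at all).
symbol_state() {
    awk -v sym="$1" '
        $NF == sym { state = ($(NF-1) == "U") ? "undefined" : "defined" }
        END { print (state == "" ? "absent" : state) }
    '
}

sym=progress_one_cuda_htod_event
# Walk the shared libraries and MCA plugins under the assumed prefix.
for lib in /usr/local/lib/*.so /usr/local/lib/openmpi/*.so; do
    [ -f "$lib" ] || continue
    state=$(nm -D "$lib" 2>/dev/null | symbol_state "$sym")
    [ "$state" = "absent" ] || echo "$lib: $state"
done
```

If every line printed ends in "undefined", no object under that prefix
provides the symbol. The script only locates references and definitions; it
does not explain why a definition is missing.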

-- 
Özgür Pekçağlıyan
B.Sc. in Computer Engineering
M.Sc. in Computer Engineering