Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] shared memory (sm) module not working properly?
From: Nicolas Bock (nicolasbock_at_[hidden])
Date: 2010-01-15 17:48:18


Sorry, I forgot to give more details on what versions I am using:

OpenMPI 1.4
Ubuntu 9.10, kernel 2.6.31-16-generic #53-Ubuntu
gcc (Ubuntu 4.4.1-4ubuntu8) 4.4.1

On Fri, Jan 15, 2010 at 15:47, Nicolas Bock <nicolasbock_at_[hidden]> wrote:

> Hello list,
>
> I am running a job on a machine with four quad-core AMD Opteron
> processors. It has 16 cores, which I verified by looking at
> /proc/cpuinfo. However, when I run the job with
>
> mpirun -np 16 -mca btl self,sm job
>
> I get this error:
>
> --------------------------------------------------------------------------
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[56972,2],0]) is on host: rust
> Process 2 ([[56972,1],0]) is on host: rust
> BTLs attempted: self sm
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
>
> If I add the tcp BTL, the job runs. I don't understand why Open MPI
> claims that a pair of processes cannot reach each other; after all, every
> processor core should have access to all of the memory. Do I need to set
> some other BTL limit?
>
> nick
>
>
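
For reference, a minimal sketch of the invocation that works, as described in the quoted message. The exact command is not given there, so this assumes that tcp was simply appended to the original BTL list; `job` is the placeholder program name from the message, and the variable names are only illustrative.

```shell
#!/bin/sh
# Process count from the quoted message (one per core on the 16-core host).
# The core count can be checked with: grep -c '^processor' /proc/cpuinfo
np=16

# Failing BTL list from the message:  self,sm
# Assumed working list (tcp appended; the message only says tcp was added):
btls="self,sm,tcp"

# Build the command line; "job" stands in for the real executable.
cmd="mpirun -np $np -mca btl $btls job"

# Print the command instead of running it, so the sketch works without MPI.
echo "$cmd"
```

To actually launch the job, run the printed command directly instead of echoing it.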