
Subject: [OMPI users] Communications problems w/OpenMPI
From: deadchicken_at_[hidden]
Date: 2008-12-18 03:15:13


I've been trying to get OpenMPI to work on Amazon's EC2, but I've been
running into a communications problem. Here is the source (a typical
Hello, World program):

> #include <stdio.h>
> #include "mpi.h"
>
> int main(int argc, char *argv[])
> {
> int myid, numprocs;
>
> MPI_Init(&argc,&argv);
> MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
> MPI_Comm_rank(MPI_COMM_WORLD,&myid);
>
> printf ("%d of %d: Hello world!\n", myid, numprocs);
>
> MPI_Finalize();
> return 0;
> }
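
For reference, I built it with the Open MPI wrapper compiler along
these lines (the source file name is just what I happen to call it
locally; the output path matches the binary used in the run command
below):

mpicc mpihw.c -o /mnt/mpihw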

After compiling it, I copied it over to the other machine and tried
running it with:

mpirun -v --mca btl self,tcp -np 4 --machinefile machines /mnt/mpihw

which produces:

--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

   PML add procs failed
   --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
--------------------------------------------------------------------------
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

   PML add procs failed
   --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[domU-12-31-39-02-F5-13:03965] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv:
readv failed: Connection reset by peer (104)
[domU-12-31-39-02-F5-13:03965] [0,0,0]-[0,1,2] mca_oob_tcp_msg_recv:
readv failed: Connection reset by peer (104)
mpirun noticed that job rank 0 with PID 3653 on node
domU-12-31-39-00-B2-23 exited on signal 15 (Terminated).
1 additional process aborted (not shown)
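
For reference, the machines file is nothing special: just the
hostnames of the instances, one per line, along the lines of:

domU-12-31-39-00-B2-23
domU-12-31-39-02-F5-13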

As far as I can tell, the machines are able to communicate with each
other on any port, just not via MPI. Any idea what's wrong?