Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI 1.3.1 svn Debian trouble
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-03-30 21:29:10


Can you supply all the information listed here:

   http://www.open-mpi.org/community/help/

We need to know exactly how you are invoking mpirun, what MCA
parameters have been set, etc.
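
For example, something along these lines (the hostfile name below is
just a placeholder) is the kind of detail we're after:

   # the exact command line you ran
   mpirun -np 5 --hostfile myhosts ./phello

   # any MCA parameters set in the environment or in config files
   env | grep OMPI_MCA
   cat ~/.openmpi/mca-params.conf

   # the TCP BTL parameters that Open MPI actually sees
   ompi_info --param btl tcp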

On Mar 28, 2009, at 12:37 PM, Jerome BENOIT wrote:

> Hello List,
>
> I have just tried the current SVN Debian package:
> it does not work, even with the firewall off.
>
> Please find attached my test files and the associated outputs.
>
> hth,
> Jerome
>
> Manuel Prinz wrote:
> > On Friday, 2009-03-27 at 20:34 +0800, Jerome BENOIT wrote:
> >> I have just tried the Sid package (1.3-2), but it does not work
> >> properly (when the firewalls are off).
> >
> > Though this should work, the version in Sid is broken in other
> > respects. I do not recommend using it.
> >
> >> I have just read that the current stable version of Open MPI is
> >> now 1.3.1: I will give it a try once it is packaged in Sid.
> >
> > I'm the Open MPI maintainer in Debian and am planning to upload a
> > fixed version soon, possibly around the middle of next week. (It has
> > to be coordinated with the release team.) There is a working version
> > available in SVN (try "debcheckout openmpi"). You can either build
> > it yourself or I could build it for you. You can mail me in private
> > if you'd like me to do so. Please note that installing the new
> > version will break other MPI-related Debian packages. I can explain
> > the details to you in private mail since I think this is off-topic
> > for the list.
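
For anyone who wants to try the self-build route Manuel describes, the
usual Debian source-package workflow looks roughly like this. Treat it
as a sketch rather than official instructions; depending on how the
packaging repository is laid out you may also need the upstream
tarball, and Manuel's own directions take precedence.

   debcheckout openmpi              # check the packaging out of SVN
   cd openmpi
   sudo apt-get build-dep openmpi   # install the build dependencies
   dpkg-buildpackage -us -uc -b     # build unsigned binary packages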
> >
> > Best regards
> > Manuel
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
> --
> Jerome BENOIT
> jgmbenoit_at_mailsnare_dot_net
>
> // `phello.c' MPI C file
> //
> // last major modification 2003/01/05
> // last minor modification 2009/03/27
> //
>
> // mpicc -o phello phello.c
> // mpirun -np 5 phello
>
> #include <unistd.h>
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int narg, char *args[]) {
>     int rank, size;
>     char ProcessorName[MPI_MAX_PROCESSOR_NAME];
>     int ProcessorNameLength;
>
>     MPI_Init(&narg, &args);
>
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     MPI_Get_processor_name(ProcessorName, &ProcessorNameLength);
>     sleep(11);  // sleep for 11 seconds before printing
>     fprintf(stdout,
>             "Hello world! I am %d of %d and my name is `%s'\n",
>             rank, size, ProcessorName);
>
>     MPI_Finalize();
>
>     return 0;
> }
>
> //
> // End of file `phello.c'.
>
> <phello.sh>
> [green][[7042,1],6][../../../../../../ompi/mca/btl/tcp/btl_tcp_component.c:596:mca_btl_tcp_component_create_listen] bind() failed: Permission denied (13)
> [green][[7042,1],5][../../../../../../ompi/mca/btl/tcp/btl_tcp_component.c:596:mca_btl_tcp_component_create_listen] bind() failed: Permission denied (13)
> --------------------------------------------------------------------------
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[7042,1],2]) is on host: violet
> Process 2 ([[7042,1],5]) is on host: green
> BTLs attempted: self sm tcp
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [violet:12941] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [green:13026] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 2 with PID 12941 on
> node violet exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [blue:15300] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [indigo:12605] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [red:12874] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [orange:14888] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** before MPI was initialized
> *** An error occurred in MPI_Init
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> *** before MPI was initialized
> [yellow:11441] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [orange:14887] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> [yellow:11440] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
> [violet:12932] 8 more processes have sent help message help-mca-bml-r2.txt / unreachable proc
> [violet:12932] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [violet:12932] 8 more processes have sent help message help-mpi-runtime / mpi_init:startup:internal-failure
>
> /local/benoit
> /scratch/benoit
> /local/benoit
> /local/benoit/FAKEROOT
>
> <ATT8008991.txt>
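
One thing that stands out in the attached output: bind() is failing
with EACCES (Permission denied). That normally means the TCP BTL tried
to bind a port it is not allowed to use, e.g. a port below 1024 forced
via an MCA parameter, or a port blocked by a security framework such as
SELinux. That is exactly why we need to see the mpirun command line and
any MCA settings. For reference, a sketch of how the BTL list and the
TCP port range are usually checked and overridden (the values are only
examples, and the port parameters may not exist in every build):

   # run with an explicit BTL list, including "self"
   mpirun --mca btl self,sm,tcp -np 5 ./phello

   # see which TCP port parameters your build exposes
   ompi_info --param btl tcp | grep -i port

   # force an unprivileged port range, if the parameter is available
   mpirun --mca btl_tcp_port_min_v4 10000 -np 5 ./phello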

-- 
Jeff Squyres
Cisco Systems