Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] trunk segfault
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-03-27 08:07:00


Lenny --

Did this get fixed? We were mucking with some MCA parameter code on the
trunk yesterday; I'm not sure whether it was related to this failure.

On Mar 26, 2008, at 10:34 AM, Lenny Verkhovsky wrote:
> Hi, all
>
> I compiled and built the source from trunk,
> and it causes a segfault:
>
> /home/USERS/lenny/OMPI_ORTE_NEW/bin/mpirun -np 1 -H witch17 /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW -t lt
>
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> mca_mpi_register_params() failed
> --> Returned "Error" (-1) instead of "Success" (0)
> --------------------------------------------------------------------------
> [witch17:01220] *** Process received signal ***
> [witch17:01220] Signal: Segmentation fault (11)
> [witch17:01220] Signal code: (128)
> [witch17:01220] Failing at address: (nil)
> [witch17:01220] [ 0] /lib64/libpthread.so.0 [0x2aadf7072c10]
> [witch17:01220] [ 1] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-pal.so.0(free+0x56) [0x2aadf6acb6d6]
> [witch17:01220] [ 2] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-pal.so.0(opal_argv_free+0x25) [0x2aadf6ab9635]
> [witch17:01220] [ 3] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.0 [0x2aadf67f4206]
> [witch17:01220] [ 4] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.0(MPI_Init+0xf0) [0x2aadf68117c0]
> [witch17:01220] [ 5] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW(main+0xef) [0x40109f]
> [witch17:01220] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2aadf7199154]
> [witch17:01220] [ 7] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW [0x400ee9]
> [witch17:01220] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 1220 on node witch17
> exited on signal 11 (Segmentation fault).

-- 
Jeff Squyres
Cisco Systems