Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation Fault--libc.so.6(__libc_start_main...
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-09-22 11:05:15


Aurelien's advice is good -- check and see exactly what the debugger
is telling you. You might want to look at the corefile in the
debugger and see exactly where it failed -- it may or may not be an
MPI issue.

Also -- Aurelien didn't directly say it, but don't worry about the
OMPI_DECLSPEC stuff. You'll see earlier in mpi.h that OMPI_DECLSPEC
is #define'd to be empty (it's for Windows compatibility).
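
As a small illustration of why an empty macro in front of a prototype is harmless, here is a sketch in plain C (no MPI needed). DEMO_DECLSPEC and demo_comm_size are hypothetical stand-ins, not names from mpi.h, and the function body is a toy that only exists so the file compiles on its own.

```c
#include <stdio.h>

/* Hypothetical stand-in for OMPI_DECLSPEC: the macro expands to nothing,
 * so it has no effect on the declaration that follows it. */
#define DEMO_DECLSPEC

/* After preprocessing these two prototypes are identical, so declaring
 * the function both ways in one file is legal C. */
DEMO_DECLSPEC int demo_comm_size(int comm, int *size);
int demo_comm_size(int comm, int *size);

/* Toy body (not an MPI call): reports a fixed size of 4 for any handle. */
int demo_comm_size(int comm, int *size)
{
    (void)comm;   /* the handle is unused in this toy */
    *size = 4;
    return 0;
}
```

Calling demo_comm_size(0, &n) behaves the same whether or not the prototype carried the macro, which is the point: the prefix alone cannot be the cause of a crash.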

Keep in mind that although different MPI implementations provide
source code compatibility for MPI applications, they are not binary-
portable.

So if you compile an MPI application with MPICH's wrapper compilers,
it will not run properly under Open MPI's mpirun (and vice versa).
You must entirely compile your application with Open MPI's wrapper
compilers and then run it with Open MPI's mpirun.
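
To make the binary-incompatibility point concrete, here is a toy sketch in plain C (no MPI needed). It assumes, as Aurelien notes in the quoted message below, that one implementation represents a communicator as an integer and the other as a pointer. Every name here (a_comm_t, b_comm_t, b_world, and so on) and the handle value used in the usage note are hypothetical; this is not how either library is actually written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical implementation A: a communicator handle is a small integer. */
typedef int a_comm_t;

/* Hypothetical implementation B: a communicator handle is a pointer. */
struct b_comm { int size; };
typedef struct b_comm *b_comm_t;

/* The only communicator that B's "library" has created. */
static struct b_comm b_world = { 4 };

/* B's library routine dereferences whatever handle it is given. */
int b_comm_size(b_comm_t comm, int *size)
{
    *size = comm->size;
    return 0;
}

/* What B's library receives when code compiled against A's header
 * passes an integer handle: the integer reinterpreted as an address. */
b_comm_t a_handle_seen_by_b(a_comm_t handle)
{
    return (b_comm_t)(uintptr_t)handle;
}

/* Toy check: is this one of the objects B really created? */
int b_handle_is_valid(b_comm_t comm)
{
    return comm == &b_world;
}
```

With an integer handle such as 91, a_handle_seen_by_b(91) yields a pointer that does not refer to any object B created, so passing it to b_comm_size would dereference a bogus low address. That is the kind of "Address not mapped" failure a header/library mismatch can produce.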

On Sep 21, 2008, at 12:35 PM, Aurélien Bouteiller wrote:

> Are you sure that you have matching versions of the MPI library and
> the mpi.h file? Open MPI and MPICH use different internal types for
> the opaque MPI objects (such as MPI_Comm). If your mpi.h and MPI
> library do not match, you'll pass those objects as integers to a
> library that is expecting pointers. This will obviously segfault
> very badly. Please make sure that you actually use the mpi.h from
> Open MPI (by using Open MPI's mpicc) to compile your program when
> using Open MPI. Also make sure that you don't have another version
> of libmpi in your LD_LIBRARY_PATH that could be picked up instead of
> the one you compiled against.
>
> Aurelien
>
> On Sep 21, 2008, at 04:38, Shafagh Jafer wrote:
>
>>
>> Ok. I noticed that whenever my code uses an MPI function that has
>> "OMPI_DECLSPEC" in front of it in mpi.h, I get this segfault
>> error. Could someone please tell me what "OMPI_DECLSPEC" is? Is
>> it a macro that I need to enable?
>> For example, in MPICH the declaration of MPI_Comm_size in mpi.h
>> looks like the following:
>>
>> int MPI_Comm_size(MPI_Comm, int *);
>>
>> but the same function in OMPI appears as follows:
>> OMPI_DECLSPEC int MPI_Comm_size(MPI_Comm comm, int *size);
>>
>> --- On Sat, 9/20/08, Shafagh Jafer <barfy27_at_[hidden]> wrote:
>> From: Shafagh Jafer <barfy27_at_[hidden]>
>> Subject: Re: [OMPI users] Segmentation Fault--libc.so.6(__libc_start_main...
>> To: "Open MPI Users" <users_at_[hidden]>
>> Date: Saturday, September 20, 2008, 9:50 PM
>>
>> My code was working perfectly with MPICH; now I have replaced
>> that with OMPI. Could that be the problem? Do I need to change
>> any part of my source code if I migrate from MPICH-1.2.6 to
>> OpenMPI-1.2.7? Please let me know.
>>
>> --- On Sat, 9/20/08, Aurélien Bouteiller <bouteill_at_[hidden]>
>> wrote:
>> From: Aurélien Bouteiller <bouteill_at_[hidden]>
>> Subject: Re: [OMPI users] Segmentation Fault--libc.so.6(__libc_start_main...
>> To: "Open MPI Users" <users_at_[hidden]>
>> Date: Saturday, September 20, 2008, 6:54 AM
>>
>> Shafagh,
>>
>> You have a segfault in your own code. Because Open MPI detects it, it
>> forwards the error to you and pretty-prints it, but Open MPI is not the
>> source of the bug. From the stack trace, I suggest you use gdb to debug
>> the physicalGetId function.
>>
>> Aurelien
>>
>> On Sep 19, 2008, at 22:22, Shafagh Jafer wrote:
>>
>> > Hi everyone,
>> > I need urgent help, please :-(
>> > I am getting the following error when I run my program. The Open MPI
>> > compilation went fine, but I don't understand the source of this
>> > error:
>> > ============================================
>> > [node01:29264] *** Process received signal ***
>> > [node01:29264] Signal: Segmentation fault (11)
>> > [node01:29264] Signal code: Address not mapped (1)
>> > [node01:29264] Failing at address: 0xcf
>> > [node01:29264] [ 0] /lib/tls/libpthread.so.0 [0x7ccf80]
>> > [node01:29264] [ 1] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(physicalGetId__C10CommPhyMPI+0x14) [0x8305880]
>> > [node01:29264] [ 2] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(physicalCommGetId__Fv+0x43) [0x82ff81b]
>> > [node01:29264] [ 3] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(openComm__16StandAloneLoader+0x1f) [0x80fdf43]
>> > [node01:29264] [ 4] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(run__21ParallelMainSimulator+0x1640) [0x81ea53c]
>> > [node01:29264] [ 5] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(main+0xde) [0x80a58ce]
>> > [node01:29264] [ 6] /lib/tls/libc.so.6(__libc_start_main+0xda) [0xe3d79a]
>> > [node01:29264] [ 7] /nfs/sjafer/phd/openMPI/latest_cd++_timewarp/cd++(sinh+0x4d) [0x80a2221]
>> > [node01:29264] *** End of error message ***
>> > mpirun noticed that job rank 0 with PID 29264 on node node01 exited
>> > on signal 11 (Segmentation fault).
>> > ===========================================
>> >
>> > _______________________________________________
>> > users mailing list
>> > users_at_[hidden]
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> --
>> * Dr. Aurélien Bouteiller
>> * Sr. Research Associate at Innovative Computing Laboratory
>> * University of Tennessee
>> * 1122 Volunteer Boulevard, suite 350
>> * Knoxville, TN 37996
>> * 865 974 6321
>>

-- 
Jeff Squyres
Cisco Systems