
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Segmentation fault - Address not mapped
From: Catalin David (catalindavid2003_at_[hidden])
Date: 2009-07-07 08:08:49


Thank you very much for the help and assistance :)

After compiling with -isystem /users/cluster/cdavid/local/include, the
program now runs fine (it picks up the correct mpi.h).

Thank you again,

Catalin
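
A note for readers who hit the same symptom: the version-macro test quoted below fails to compile when the wrong mpi.h is found, which is a useful signal but not a very descriptive one. What follows is a minimal sketch of a friendlier check, not something posted in this thread. It assumes that Open MPI's mpi.h defines OPEN_MPI alongside the OMPI_*_VERSION macros, and that MPICH-family headers define MPICH_VERSION as a string literal. The function name mpi_header_identity is made up for illustration. The __has_include guard (C++17) is only there so the sketch still builds on a machine with no MPI installed.

```cpp
// mpi.h is included before any other header, as in the original test.
// Pull it in only if the compiler can find one on the include path.
#if defined(__has_include)
#  if __has_include(<mpi.h>)
#    include <mpi.h>
#  endif
#endif

#include <string>

// Hypothetical helper: describes which mpi.h the compiler picked up,
// based on macros the header is assumed to define.
std::string mpi_header_identity() {
#if defined(OPEN_MPI)
    // Assumption: Open MPI's header defines OPEN_MPI and these version macros.
    return "Open MPI " + std::to_string(OMPI_MAJOR_VERSION) + "." +
           std::to_string(OMPI_MINOR_VERSION) + "." +
           std::to_string(OMPI_RELEASE_VERSION);
#elif defined(MPICH_VERSION)
    // Assumption: MPICH-family headers define MPICH_VERSION as a string.
    return std::string("MPICH ") + MPICH_VERSION;
#elif defined(MPI_VERSION)
    // An mpi.h was found, but it matches neither assumption above.
    return "some other MPI implementation";
#else
    return "no mpi.h found on the include path";
#endif
}
```

Printing the result of mpi_header_identity() from main, once with and once without the -isystem flag, should show whether the include path change took effect.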

On Tue, Jul 7, 2009 at 12:29 PM, Catalin David<catalindavid2003_at_[hidden]> wrote:
>  #include <mpi.h>
>  #include <stdio.h>
>  int main(int argc, char *argv[])
>  {
>   printf("%d %d %d\n", OMPI_MAJOR_VERSION,
>          OMPI_MINOR_VERSION, OMPI_RELEASE_VERSION);
>   return 0;
>  }
>
> returns:
>
> test.cpp: In function ‘int main(int, char**)’:
> test.cpp:11: error: ‘OMPI_MAJOR_VERSION’ was not declared in this scope
> test.cpp:11: error: ‘OMPI_MINOR_VERSION’ was not declared in this scope
> test.cpp:11: error: ‘OMPI_RELEASE_VERSION’ was not declared in this scope
>
> So, I am definitely using another library (mpich).
>
> Thanks one more time!!! I will try to fix it and come back with results.
>
> Catalin
>
> On Tue, Jul 7, 2009 at 12:23 PM, Dorian Krause<doriankrause_at_[hidden]> wrote:
>> Catalin David wrote:
>>>
>>> Hello, all!
>>>
>>> Just installed Valgrind (since this seems like a memory issue) and got
>>> this interesting output (when running the test program):
>>>
>>> ==4616== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
>>> ==4616==    at 0x43656BD: syscall (in /lib/tls/libc-2.3.2.so)
>>> ==4616==    by 0x4236A75: opal_paffinity_linux_plpa_init (plpa_runtime.c:37)
>>> ==4616==    by 0x423779B: opal_paffinity_linux_plpa_have_topology_information (plpa_map.c:501)
>>> ==4616==    by 0x4235FEE: linux_module_init (paffinity_linux_module.c:119)
>>> ==4616==    by 0x447F114: opal_paffinity_base_select (paffinity_base_select.c:64)
>>> ==4616==    by 0x444CD71: opal_init (opal_init.c:292)
>>> ==4616==    by 0x43CE7E6: orte_init (orte_init.c:76)
>>> ==4616==    by 0x4067A50: ompi_mpi_init (ompi_mpi_init.c:342)
>>> ==4616==    by 0x40A3444: PMPI_Init (pinit.c:80)
>>> ==4616==    by 0x804875C: main (test.cpp:17)
>>> ==4616==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
>>> ==4616==
>>> ==4616== Invalid read of size 4
>>> ==4616==    at 0x4095772: ompi_comm_invalid (communicator.h:261)
>>> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
>>> ==4616==    by 0x8048770: main (test.cpp:18)
>>> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>>> [denali:04616] *** Process received signal ***
>>> [denali:04616] Signal: Segmentation fault (11)
>>> [denali:04616] Signal code: Address not mapped (1)
>>> [denali:04616] Failing at address: 0x440000a0
>>> [denali:04616] [ 0] /lib/tls/libc.so.6 [0x42b4de0]
>>> [denali:04616] [ 1] /users/cluster/cdavid/local/lib/libmpi.so.0(MPI_Comm_size+0x6f) [0x409581f]
>>> [denali:04616] [ 2] ./test(__gxx_personality_v0+0x12d) [0x8048771]
>>> [denali:04616] [ 3] /lib/tls/libc.so.6(__libc_start_main+0xf8) [0x42a2768]
>>> [denali:04616] [ 4] ./test(__gxx_personality_v0+0x3d) [0x8048681]
>>> [denali:04616] *** End of error message ***
>>> ==4616==
>>> ==4616== Invalid read of size 4
>>> ==4616==    at 0x4095782: ompi_comm_invalid (communicator.h:261)
>>> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
>>> ==4616==    by 0x8048770: main (test.cpp:18)
>>> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>>>
>>>
>>> The problem is that, now, I don't know where the issue comes from (is
>>> it libc that is too old and incompatible with g++ 4.4/OpenMPI? is libc
>>> broken?).
>>>
>>
>> Looking at the code for ompi_comm_invalid:
>>
>> static inline int ompi_comm_invalid(ompi_communicator_t* comm)
>> {
>>   if ((NULL == comm) || (MPI_COMM_NULL == comm) ||
>>       (OMPI_COMM_IS_FREED(comm)) || (OMPI_COMM_IS_INVALID(comm)) )
>>       return true;
>>   else
>>       return false;
>> }
>>
>>
>> The interesting point is that (MPI_COMM_NULL == comm) evaluates to false;
>> otherwise, because || short-circuits, the following macros (where the
>> invalid read occurs) would not be evaluated.
>>
>> The only idea that comes to my mind is that you are mixing MPI versions, but
>> as you said, your PATH is fine?!
>>
>> Regards,
>> Dorian
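
One plausible reading of the Valgrind trace, offered as a hedged sketch and not confirmed anywhere in the thread: if test.cpp was compiled against an MPICH-family mpi.h, MPI_COMM_WORLD would be an integer handle, not a pointer. The sketch assumes that handle is 0x44000000, which is how MPICH-family headers define it to my knowledge. Handed to Open MPI's library, that value is treated as a pointer to a communicator struct. It is neither NULL nor the address of Open MPI's null communicator, so the first two tests in ompi_comm_invalid pass and a struct field gets read. The arithmetic below shows that the failing address from the log fits this picture. All names in it are hypothetical.

```cpp
#include <cstdint>

// Assumption: MPICH-family headers define MPI_COMM_WORLD as this integer.
constexpr std::uintptr_t kMpichCommWorldHandle = 0x44000000u;

// Taken from the log above: "Failing at address: 0x440000a0".
constexpr std::uintptr_t kFaultAddress = 0x440000a0u;

// If the handle is misread as a struct pointer, the faulting read lands at
// (handle + offset of the field being checked). This recovers that offset.
constexpr std::uintptr_t implied_field_offset(std::uintptr_t handle,
                                              std::uintptr_t fault) {
    return fault - handle;
}

// Mirrors only the first two tests of ompi_comm_invalid: a value that is
// neither NULL nor the null communicator's address falls through to the
// macros that dereference it. comm_null is a stand-in address here.
constexpr bool reaches_dereference(std::uintptr_t comm,
                                   std::uintptr_t comm_null) {
    return comm != 0 && comm != comm_null;
}
```

Under these assumptions the implied offset is 0xa0, a small, struct-sized distance from the handle value. That fits a foreign handle being dereferenced better than it fits a broken libc.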
>>
>>
>>
>>> Any help would be highly appreciated.
>>>
>>> Thanks,
>>> Catalin
>>>
>>>
>>> On Mon, Jul 6, 2009 at 3:36 PM, Catalin David<catalindavid2003_at_[hidden]> wrote:
>>>
>>>>
>>>> On Mon, Jul 6, 2009 at 3:26 PM, jody<jody.xha_at_[hidden]> wrote:
>>>>
>>>>>
>>>>> Hi
>>>>> Are you also sure that you have the same version of Open-MPI
>>>>> on every machine of your cluster, and that it is the mpicxx of this
>>>>> version that is called when you run your program?
>>>>> I ask because you mentioned that there was an old version of Open-MPI
>>>>> present... did you remove it?
>>>>>
>>>>> Jody
>>>>>
>>>>
>>>> Hi
>>>>
>>>> I have just logged in to a few other boxes and they all mount my home
>>>> folder. When running `echo $LD_LIBRARY_PATH` and other commands, I get
>>>> what I expect to get, but this might be because I have set these
>>>> variables in the .bashrc file. So, I tried compiling and running with
>>>> explicit paths: ~/local/bin/mpicxx [stuff] and
>>>> ~/local/bin/mpirun -np 4 ray-trace, but I get the same errors.
>>>>
>>>> As for the previous version, I don't have root access, so I was not
>>>> able to remove it. I was just trying to take precedence over it by
>>>> setting the $PATH variable to point first at my local installation.
>>>>
>>>>
>>>> Catalin
>>>>
>>>>
>>>> --
>>>>
>>>> ******************************
>>>> Catalin David
>>>> B.Sc. Computer Science 2010
>>>> Jacobs University Bremen
>>>>
>>>> Phone: +49-(0)1577-49-38-667
>>>>
>>>> College Ring 4, #343
>>>> Bremen, 28759
>>>> Germany
>>>> ******************************
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>
