Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault - Address not mapped
From: Catalin David (catalindavid2003_at_[hidden])
Date: 2009-07-07 07:29:07


 #include <mpi.h>
 #include <stdio.h>
 int main(int argc, char *argv[])
 {
   printf("%d %d %d\n", OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION,
          OMPI_RELEASE_VERSION);
   return 0;
 }

fails to compile with:

test.cpp: In function ‘int main(int, char**)’:
test.cpp:11: error: ‘OMPI_MAJOR_VERSION’ was not declared in this scope
test.cpp:11: error: ‘OMPI_MINOR_VERSION’ was not declared in this scope
test.cpp:11: error: ‘OMPI_RELEASE_VERSION’ was not declared in this scope

So the mpi.h I am compiling against is definitely not Open MPI's; I am using another library (MPICH).

Thanks one more time!!! I will try to fix it and come back with results.

Catalin

On Tue, Jul 7, 2009 at 12:23 PM, Dorian Krause<doriankrause_at_[hidden]> wrote:
> Catalin David wrote:
>>
>> Hello, all!
>>
>> Just installed Valgrind (since this seems like a memory issue) and got
>> this interesting output (when running the test program):
>>
>> ==4616== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
>> ==4616==    at 0x43656BD: syscall (in /lib/tls/libc-2.3.2.so)
>> ==4616==    by 0x4236A75: opal_paffinity_linux_plpa_init (plpa_runtime.c:37)
>> ==4616==    by 0x423779B: opal_paffinity_linux_plpa_have_topology_information (plpa_map.c:501)
>> ==4616==    by 0x4235FEE: linux_module_init (paffinity_linux_module.c:119)
>> ==4616==    by 0x447F114: opal_paffinity_base_select (paffinity_base_select.c:64)
>> ==4616==    by 0x444CD71: opal_init (opal_init.c:292)
>> ==4616==    by 0x43CE7E6: orte_init (orte_init.c:76)
>> ==4616==    by 0x4067A50: ompi_mpi_init (ompi_mpi_init.c:342)
>> ==4616==    by 0x40A3444: PMPI_Init (pinit.c:80)
>> ==4616==    by 0x804875C: main (test.cpp:17)
>> ==4616==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
>> ==4616==
>> ==4616== Invalid read of size 4
>> ==4616==    at 0x4095772: ompi_comm_invalid (communicator.h:261)
>> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
>> ==4616==    by 0x8048770: main (test.cpp:18)
>> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>> [denali:04616] *** Process received signal ***
>> [denali:04616] Signal: Segmentation fault (11)
>> [denali:04616] Signal code: Address not mapped (1)
>> [denali:04616] Failing at address: 0x440000a0
>> [denali:04616] [ 0] /lib/tls/libc.so.6 [0x42b4de0]
>> [denali:04616] [ 1] /users/cluster/cdavid/local/lib/libmpi.so.0(MPI_Comm_size+0x6f) [0x409581f]
>> [denali:04616] [ 2] ./test(__gxx_personality_v0+0x12d) [0x8048771]
>> [denali:04616] [ 3] /lib/tls/libc.so.6(__libc_start_main+0xf8) [0x42a2768]
>> [denali:04616] [ 4] ./test(__gxx_personality_v0+0x3d) [0x8048681]
>> [denali:04616] *** End of error message ***
>> ==4616==
>> ==4616== Invalid read of size 4
>> ==4616==    at 0x4095782: ompi_comm_invalid (communicator.h:261)
>> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
>> ==4616==    by 0x8048770: main (test.cpp:18)
>> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>>
>>
>> The problem is that, now, I don't know where the issue comes from (is
>> it libc that is too old and incompatible with g++ 4.4/OpenMPI? is libc
>> broken?).
>>
>
> Looking at the code for ompi_comm_invalid:
>
> static inline int ompi_comm_invalid(ompi_communicator_t* comm)
> {
>   if ((NULL == comm) || (MPI_COMM_NULL == comm) ||
>       (OMPI_COMM_IS_FREED(comm)) || (OMPI_COMM_IS_INVALID(comm)) )
>       return true;
>   else
>       return false;
> }
>
>
> the interesting point is that (MPI_COMM_NULL == comm) evaluates to false,
> otherwise the following macros (where the invalid read occurs) would not be
> evaluated.
>
> The only idea that comes to my mind is that you are mixing MPI versions, but
> as you said your PATH is fine ?!
>
> Regards,
> Dorian
>
>
>
>> Any help would be highly appreciated.
>>
>> Thanks,
>> Catalin
>>
>>
>> On Mon, Jul 6, 2009 at 3:36 PM, Catalin David<catalindavid2003_at_[hidden]> wrote:
>>
>>>
>>> On Mon, Jul 6, 2009 at 3:26 PM, jody<jody.xha_at_[hidden]> wrote:
>>>
>>>>
>>>> Hi
>>>> Are you also sure that you have the same version of Open-MPI
>>>> on every machine of your cluster, and that it is the mpicxx of this
>>>> version that is called when you run your program?
>>>> I ask because you mentioned that there was an old version of Open-MPI
>>>> present... did you remove this?
>>>>
>>>> Jody
>>>>
>>>
>>> Hi
>>>
>>> I have just logged in to a few other boxes and they all mount my home
>>> folder. When running `echo $LD_LIBRARY_PATH` and other commands, I get
>>> what I expect to get, but this might be because I have set these
>>> variables in the .bashrc file. So, I tried compiling/running like this
>>>  ~/local/bin/mpicxx [stuff] and ~/local/bin/mpirun -np 4 ray-trace,
>>> but I get the same errors.
>>>
>>> As for the previous version, I don't have root access, therefore I was
>>> not able to remove it. I was just trying to outrun it by setting the
>>> $PATH variable to point first at my local installation.
>>>
>>>
>>> Catalin
>>>
>>>
>>> --
>>>
>>> ******************************
>>> Catalin David
>>> B.Sc. Computer Science 2010
>>> Jacobs University Bremen
>>>
>>> Phone: +49-(0)1577-49-38-667
>>>
>>> College Ring 4, #343
>>> Bremen, 28759
>>> Germany
>>> ******************************
>>>
>>>
>>
>>
>>
>>
>
>

-- 
******************************
Catalin David
B.Sc. Computer Science 2010
Jacobs University Bremen
Phone: +49-(0)1577-49-38-667
College Ring 4, #343
Bremen, 28759
Germany
******************************