Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault - Address not mapped
From: Ashley Pittman (ashley_at_[hidden])
Date: 2009-07-07 07:14:11


This is the error you get when an invalid communicator handle is passed
to an MPI function: the handle is dereferenced, so you may or may not
get a SEGV from it depending on the value you pass.

The 0x440000a0 address is an offset from 0x44000000, the value of
MPI_COMM_WORLD in MPICH2; my guess would be that you are picking up
either an MPICH2 mpi.h or the MPICH2 mpicc.

Ashley,

On Tue, 2009-07-07 at 11:05 +0100, Catalin David wrote:
> Hello, all!
>
> Just installed Valgrind (since this seems like a memory issue) and got
> this interesting output (when running the test program):
>
> ==4616== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
> ==4616== at 0x43656BD: syscall (in /lib/tls/libc-2.3.2.so)
> ==4616== by 0x4236A75: opal_paffinity_linux_plpa_init (plpa_runtime.c:37)
> ==4616== by 0x423779B:
> opal_paffinity_linux_plpa_have_topology_information (plpa_map.c:501)
> ==4616== by 0x4235FEE: linux_module_init (paffinity_linux_module.c:119)
> ==4616== by 0x447F114: opal_paffinity_base_select
> (paffinity_base_select.c:64)
> ==4616== by 0x444CD71: opal_init (opal_init.c:292)
> ==4616== by 0x43CE7E6: orte_init (orte_init.c:76)
> ==4616== by 0x4067A50: ompi_mpi_init (ompi_mpi_init.c:342)
> ==4616== by 0x40A3444: PMPI_Init (pinit.c:80)
> ==4616== by 0x804875C: main (test.cpp:17)
> ==4616== Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==4616==
> ==4616== Invalid read of size 4
> ==4616== at 0x4095772: ompi_comm_invalid (communicator.h:261)
> ==4616== by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
> ==4616== by 0x8048770: main (test.cpp:18)
> ==4616== Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
> [denali:04616] *** Process received signal ***
> [denali:04616] Signal: Segmentation fault (11)
> [denali:04616] Signal code: Address not mapped (1)
> [denali:04616] Failing at address: 0x440000a0
> [denali:04616] [ 0] /lib/tls/libc.so.6 [0x42b4de0]
> [denali:04616] [ 1]
> /users/cluster/cdavid/local/lib/libmpi.so.0(MPI_Comm_size+0x6f)
> [0x409581f]
> [denali:04616] [ 2] ./test(__gxx_personality_v0+0x12d) [0x8048771]
> [denali:04616] [ 3] /lib/tls/libc.so.6(__libc_start_main+0xf8) [0x42a2768]
> [denali:04616] [ 4] ./test(__gxx_personality_v0+0x3d) [0x8048681]
> [denali:04616] *** End of error message ***
> ==4616==
> ==4616== Invalid read of size 4
> ==4616== at 0x4095782: ompi_comm_invalid (communicator.h:261)
> ==4616== by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
> ==4616== by 0x8048770: main (test.cpp:18)
> ==4616== Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>
>
> The problem is that now I don't know where the issue comes from (is
> it libc that is too old and incompatible with g++ 4.4/Open MPI? Is
> libc broken?).
>
> Any help would be highly appreciated.
>
> Thanks,
> Catalin
>
>
> On Mon, Jul 6, 2009 at 3:36 PM, Catalin David<catalindavid2003_at_[hidden]> wrote:
> > On Mon, Jul 6, 2009 at 3:26 PM, jody<jody.xha_at_[hidden]> wrote:
> >> Hi
> >> Are you also sure that you have the same version of Open MPI
> >> on every machine of your cluster, and that it is the mpicxx of this
> >> version that is called when you run your program?
> >> I ask because you mentioned that there was an old version of Open MPI
> >> present... did you remove it?
> >>
> >> Jody
> >
> > Hi
> >
> > I have just logged in to a few other boxes and they all mount my home
> > folder. When running `echo $LD_LIBRARY_PATH` and other commands, I get
> > what I expect to get, but this might be because I have set these
> > variables in the .bashrc file. So, I tried compiling/running with the
> > full paths, ~/local/bin/mpicxx [stuff] and ~/local/bin/mpirun -np 4 ray-trace,
> > but I get the same errors.
> >
> > As for the previous version, I don't have root access, so I was not
> > able to remove it. I was just trying to shadow it by setting the
> > $PATH variable to point first at my local installation.
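
[Editor's note: the PATH-shadowing approach described above can be made explicit. This sketch assumes the local Open MPI build lives under ~/local, as in the wrapper paths quoted above; for it to fully shadow a system MPI, the runtime linker path must be prepended too, not just PATH.]

```shell
# Prepend the local Open MPI installation so its compiler wrappers and
# shared libraries shadow the system-wide ones for this shell session.
export PATH="$HOME/local/bin:$PATH"
export LD_LIBRARY_PATH="$HOME/local/lib:${LD_LIBRARY_PATH:-}"

# The first PATH entry should now be the local bin directory, so a bare
# `mpicxx` or `mpirun` resolves there before the system installation.
echo "${PATH%%:*}"
```

Note that none of this protects against compiling with a wrong mpi.h if an include path elsewhere (e.g. in a Makefile) still points at the old installation.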
> >
> >
> > Catalin
> >
> >
> > --
> >
> > ******************************
> > Catalin David
> > B.Sc. Computer Science 2010
> > Jacobs University Bremen
> >
> > Phone: +49-(0)1577-49-38-667
> >
> > College Ring 4, #343
> > Bremen, 28759
> > Germany
> > ******************************
> >
>
>
>

-- 
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk