Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Possible bug in finalize, OpenMPI v1.5, head revision
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-01-18 15:05:15


Jumping in pretty late in this thread here...

I see that it's failing in opal_hwloc_base_close(). That's a little worrisome.

I do see an odd path through the hwloc initialization that *could* cause an error during finalization -- but it would involve you setting an invalid value for an MCA parameter. Are you setting hwloc_base_mem_bind_failure_action or hwloc_base_mem_alloc_policy, perchance?
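
A quick way to check is sketched below. It is only a sketch, and it assumes the two usual places these parameters get set outside the mpirun command line: environment variables named OMPI_MCA_<param>, and a per-user file at $HOME/.openmpi/mca-params.conf. The function name find_hwloc_mca_settings is made up for this example, and it does not look at a system-wide params file under your install prefix.

```shell
#!/bin/sh
# Report whether either hwloc MCA parameter is set in the environment or
# in an MCA params file. Hypothetical helper, not part of Open MPI.
# Assumptions: env vars are named OMPI_MCA_<param>; the params file
# defaults to $HOME/.openmpi/mca-params.conf unless a path is given.

find_hwloc_mca_settings() {
    params_file="${1:-$HOME/.openmpi/mca-params.conf}"
    for name in hwloc_base_mem_bind_failure_action hwloc_base_mem_alloc_policy; do
        # printenv exits non-zero when the variable is not set at all
        value=$(printenv "OMPI_MCA_$name") && echo "env: $name=$value"
        # Print any "name = value" line for this parameter from the file
        if [ -r "$params_file" ]; then
            grep -E "^[[:space:]]*$name[[:space:]]*=" "$params_file" | sed 's/^/file: /'
        fi
    done
    return 0
}

find_hwloc_mca_settings "$@"
```

No output means neither parameter was found in those two places. Running `ompi_info --param hwloc all` from the same install should also list the current values, assuming the hwloc framework is part of that build.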

On Jan 16, 2012, at 1:56 PM, Andrew Senin wrote:

> Hi,
>
> I think I've found a bug in the head revision of the OpenMPI 1.5
> branch. If it is configured with --disable-debug it crashes in
> finalize on the hello_c.c example. Did I miss something?
>
> Configure options:
> ./configure --with-pmi=/usr/ --with-slurm=/usr/ --without-psm
> --disable-debug --enable-mpirun-prefix-by-default
> --prefix=/hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install
>
> Runtime command and output:
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:../lib ./mpirun --mca btl openib,self
> --npernode 1 --host mir1,mir2 ./hello
>
> Hello, world, I am 0 of 2
> Hello, world, I am 1 of 2
> [mir1:05542] *** Process received signal ***
> [mir1:05542] Signal: Segmentation fault (11)
> [mir1:05542] Signal code: Address not mapped (1)
> [mir1:05542] Failing at address: 0xe8
> [mir2:10218] *** Process received signal ***
> [mir2:10218] Signal: Segmentation fault (11)
> [mir2:10218] Signal code: Address not mapped (1)
> [mir2:10218] Failing at address: 0xe8
> [mir1:05542] [ 0] /lib64/libpthread.so.0() [0x390d20f4c0]
> [mir1:05542] [ 1]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8)
> [0x7f4588cee6a8]
> [mir1:05542] [ 2]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32)
> [0x7f4588cee700]
> [mir1:05542] [ 3]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73)
> [0x7f4588d1beb2]
> [mir1:05542] [ 4]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe)
> [0x7f4588c81eb5]
> [mir1:05542] [ 5]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a)
> [0x7f4588c217c3]
> [mir1:05542] [ 6]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59)
> [0x7f4588c39959]
> [mir1:05542] [ 7] ./hello(main+0x69) [0x4008fd]
> [mir1:05542] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x390ca1ec5d]
> [mir1:05542] [ 9] ./hello() [0x4007d9]
> [mir1:05542] *** End of error message ***
> [mir2:10218] [ 0] /lib64/libpthread.so.0() [0x3a6dc0f4c0]
> [mir2:10218] [ 1]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8)
> [0x7f409f31d6a8]
> [mir2:10218] [ 2]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32)
> [0x7f409f31d700]
> [mir2:10218] [ 3]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73)
> [0x7f409f34aeb2]
> [mir2:10218] [ 4]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe)
> [0x7f409f2b0eb5]
> [mir2:10218] [ 5]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a)
> [0x7f409f2507c3]
> [mir2:10218] [ 6]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59)
> [0x7f409f268959]
> [mir2:10218] [ 7] ./hello(main+0x69) [0x4008fd]
> [mir2:10218] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a6d41ec5d]
> [mir2:10218] [ 9] ./hello() [0x4007d9]
> [mir2:10218] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 5542 on node mir1 exited
> on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
>
> Thanks,
> Andrew Senin

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/