Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Possible bug in finalize, OpenMPI v1.5, head revision
From: Andrew Senin (andrew.senin_at_[hidden])
Date: 2012-01-17 10:19:16


slurm 2.3.2

-Andrew

On Tue, Jan 17, 2012 at 6:05 PM, Ralph Castain <rhc.openmpi_at_[hidden]> wrote:
> What version of slurm?
>
>
> Sent from my iPad
>
> On Jan 17, 2012, at 4:36 AM, Andrew Senin <andrew.senin_at_[hidden]> wrote:
>
>> Hi Ralph,
>>
>> If you want, Mike can provide access to the lab with RHEL 6.0, where we
>> see the problem.
>>
>> Thanks,
>> Andrew Senin
>>
>> On Tue, Jan 17, 2012 at 9:59 AM, Mike Dubman <mike.ompi_at_[hidden]> wrote:
>>> It happens for us on RHEL 6.0
>>>
>>>
>>> On Tue, Jan 17, 2012 at 3:46 AM, Ralph Castain <rhc.openmpi_at_[hidden]>
>>> wrote:
>>>>
>>>> Well, I'm afraid I can't replicate your report. It runs fine for me.
>>>>
>>>> Sent from my iPad
>>>>
>>>> On Jan 16, 2012, at 4:25 PM, Ralph Castain <rhc.openmpi_at_[hidden]> wrote:
>>>>
>>>>> Hmmmm....probably a bug. I haven't tested that branch yet. Will take a
>>>>> look.
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On Jan 16, 2012, at 11:56 AM, Andrew Senin <andrew.senin_at_[hidden]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think I've found a bug in the head revision of the OpenMPI 1.5
>>>>>> branch. If it is configured with --disable-debug, it crashes in
>>>>>> finalize on the hello_c.c example. Did I miss something?
>>>>>>
>>>>>> Configure options:
>>>>>> ./configure --with-pmi=/usr/ --with-slurm=/usr/ --without-psm
>>>>>> --disable-debug --enable-mpirun-prefix-by-default
>>>>>> --prefix=/hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install
>>>>>>
>>>>>> Runtime command and output:
>>>>>> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:../lib ./mpirun --mca btl openib,self
>>>>>> --npernode 1 --host mir1,mir2 ./hello
>>>>>>
>>>>>> Hello, world, I am 0 of 2
>>>>>> Hello, world, I am 1 of 2
>>>>>> [mir1:05542] *** Process received signal ***
>>>>>> [mir1:05542] Signal: Segmentation fault (11)
>>>>>> [mir1:05542] Signal code: Address not mapped (1)
>>>>>> [mir1:05542] Failing at address: 0xe8
>>>>>> [mir2:10218] *** Process received signal ***
>>>>>> [mir2:10218] Signal: Segmentation fault (11)
>>>>>> [mir2:10218] Signal code: Address not mapped (1)
>>>>>> [mir2:10218] Failing at address: 0xe8
>>>>>> [mir1:05542] [ 0] /lib64/libpthread.so.0() [0x390d20f4c0]
>>>>>> [mir1:05542] [ 1] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8) [0x7f4588cee6a8]
>>>>>> [mir1:05542] [ 2] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32) [0x7f4588cee700]
>>>>>> [mir1:05542] [ 3] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73) [0x7f4588d1beb2]
>>>>>> [mir1:05542] [ 4] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe) [0x7f4588c81eb5]
>>>>>> [mir1:05542] [ 5] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a) [0x7f4588c217c3]
>>>>>> [mir1:05542] [ 6] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59) [0x7f4588c39959]
>>>>>> [mir1:05542] [ 7] ./hello(main+0x69) [0x4008fd]
>>>>>> [mir1:05542] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x390ca1ec5d]
>>>>>> [mir1:05542] [ 9] ./hello() [0x4007d9]
>>>>>> [mir1:05542] *** End of error message ***
>>>>>> [mir2:10218] [ 0] /lib64/libpthread.so.0() [0x3a6dc0f4c0]
>>>>>> [mir2:10218] [ 1] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8) [0x7f409f31d6a8]
>>>>>> [mir2:10218] [ 2] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32) [0x7f409f31d700]
>>>>>> [mir2:10218] [ 3] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73) [0x7f409f34aeb2]
>>>>>> [mir2:10218] [ 4] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe) [0x7f409f2b0eb5]
>>>>>> [mir2:10218] [ 5] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a) [0x7f409f2507c3]
>>>>>> [mir2:10218] [ 6] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59) [0x7f409f268959]
>>>>>> [mir2:10218] [ 7] ./hello(main+0x69) [0x4008fd]
>>>>>> [mir2:10218] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a6d41ec5d]
>>>>>> [mir2:10218] [ 9] ./hello() [0x4007d9]
>>>>>> [mir2:10218] *** End of error message ***
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun noticed that process rank 0 with PID 5542 on node mir1 exited
>>>>>> on signal 11 (Segmentation fault).
>>>>>> --------------------------------------------------------------------------
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew Senin
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> users_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users