Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.3.1 fails with GM
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2009-03-20 18:38:25


I've just tested OMPI 1.3.1 with GM 2.0.19, using the "ring" test from
LAM's examples directory. I too see an early death; the backtrace is
shown below.

-Paul

$ mpirun -mca btl gm,self -H pcp-i-1,pcp-i-2 ./ring
Rank 0 starting message around the ring -- 1st of 5
[pcp-i-1:27587] *** Process received signal ***
[pcp-i-1:27587] Signal: Segmentation fault (11)
[pcp-i-1:27587] Signal code: Address not mapped (1)
[pcp-i-1:27587] Failing at address: 0x5
[pcp-i-1:27587] [ 0] /lib/tls/libpthread.so.0 [0x401a38f0]
[pcp-i-1:27587] [ 1]
/usr/local/pkg/gm-2.0.19-2.4.20-8smp//lib/libgm.so.0(gm_handle_sent_tokens+0x67)
[0x404c6fbb]
[pcp-i-1:27587] [ 2]
/usr/local/pkg/gm-2.0.19-2.4.20-8smp//lib/libgm.so.0(_gm_unknown+0x42f)
[0x404cbf53]
[pcp-i-1:27587] [ 3]
/usr/local/pkg/gm-2.0.19-2.4.20-8smp//lib/libgm.so.0(gm_unknown+0x20)
[0x404cc068]
[pcp-i-1:27587] [ 4]
/opt/pcp-i/usr/local/pkg/openmpi-1.3.1/lib/openmpi/mca_btl_gm.so(mca_btl_gm_component_progress+0xc2)
[0x404b9c92]
[pcp-i-1:27587] [ 5]
/opt/pcp-i/usr/local/pkg/openmpi-1.3.1/lib/libopen-pal.so.0(opal_progress+0x79)
[0x40100c79]
[pcp-i-1:27587] [ 6]
/opt/pcp-i/usr/local/pkg/openmpi-1.3.1/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x225)
[0x4049d9b5]
[pcp-i-1:27587] [ 7]
/opt/pcp-i/usr/local/pkg/openmpi-1.3.1/lib/libmpi.so.0(PMPI_Recv+0x159)
[0x40074ab9]
[pcp-i-1:27587] [ 8] ./ring(main+0xda) [0x8048876]
[pcp-i-1:27587] [ 9] /lib/tls/libc.so.6(__libc_start_main+0xe4) [0x42015704]
[pcp-i-1:27587] [10] ./ring(printf+0x31) [0x804870d]
[pcp-i-1:27587] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 27587 on node pcp-i-1 exited
on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
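
For reference, "-mca btl gm,self" restricts Open MPI's point-to-point
transports to the GM BTL (plus the self loopback component), so the
failure above is isolated to the GM path: the GM BTL's progress
function (frame [4]) faults inside libgm's gm_handle_sent_tokens
(frames [1]-[3]) while servicing the blocking receive (frames [6]-[7]).
A minimal sketch of the kind of ring test involved (along the lines of
LAM's examples/ring, not the exact source; the tag value is arbitrary
and it assumes at least two ranks) is:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, next, prev, message;
    const int tag = 201;      /* arbitrary tag (assumption) */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    next = (rank + 1) % size;
    prev = (rank + size - 1) % size;

    if (rank == 0) {
        message = 5;          /* number of trips around the ring */
        printf("Rank 0 starting message around the ring -- 1st of %d\n",
               message);
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    }

    /* Pass the token around; rank 0 decrements it on each full trip. */
    while (1) {
        MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        if (rank == 0)
            message--;
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
        if (message == 0)
            break;
    }

    /* Rank 0 absorbs the final zero-valued token. */
    if (rank == 0)
        MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}

Built with mpicc and launched as in the transcript above, rank 0 prints
the "starting message" line and then blocks in MPI_Recv waiting for the
token to come back around, which is consistent with where the backtrace
shows the process dying.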

Patrick Geoffray wrote:
> Hi Christian,
>
> Christian Siebert wrote:
>> I just gave the new release 1.3.1 a go. While Ethernet and InfiniBand
>> seem to work properly, I noticed that Myrinet/GM compiles fine but
>> gives a segmentation violation on the first attempt to communicate
>> (MPI_Send in a simple "hello world" application). Is GM no longer
>> supported, or is it just so old that nobody tested it?
>
> GM itself is supported and maintenance releases are tested (no more
> development releases), but Open-MPI/GM is not tested at the moment. GM
> does not run on Myri-10G NICs, so we have to use a smaller pool of
> machines with Myrinet 2000 NICs. Human usage and MTT runs for
> Open-MPI/MX have priority, and MTT for Open-MPI/GM has not run for a
> while :-(
>
> We will try to resume MTT testing with Open-MPI/GM when we have the
> resources. In the meantime, we'll look into the segfault.
>
> Patrick
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory