Open MPI User's Mailing List Archives

From: Tim Prins (tprins_at_[hidden])
Date: 2007-03-30 22:48:54


Hi Valmor,

What is happening here is that when Open MPI tries to create an MX endpoint
for communication, MX returns status code 20, which is MX_BUSY.

At this point we should gracefully move on, but there is a bug in Open MPI 1.2
which causes a segmentation fault in case of this type of error. This will be
fixed in 1.2.1, and the fix is available now in the 1.2 nightly tarballs.
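
To illustrate what "gracefully move on" could look like, here is a minimal C sketch. It is not Open MPI's actual code or the actual fix. Every name in it (demo_module_t, demo_open_endpoint, demo_init, demo_finalize) is hypothetical, and only the status value 20 is taken from this thread. The quoted backtrace shows mca_btl_mx_finalize being called from mca_btl_mx_component_init and failing at address 0x20, which is consistent with cleanup code touching state that was never set up. The sketch shows the general defensive pattern: cleanup tolerates a missing or half-initialized module, and init reports failure so the caller can pick another transport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes; only the value 20 (MX_BUSY) comes from the thread. */
enum { DEMO_SUCCESS = 0, DEMO_NO_MEMORY = 1, DEMO_BUSY = 20 };

/* Hypothetical stand-in for a transport module; not Open MPI's real struct. */
typedef struct {
    int *endpoint; /* NULL until an endpoint has been opened */
} demo_module_t;

/* Pretend only `capacity` endpoints exist; further opens report "busy". */
static int demo_open_endpoint(int *in_use, int capacity, int **endpoint_out)
{
    assert(in_use != NULL && endpoint_out != NULL);
    *endpoint_out = NULL;
    if (*in_use >= capacity) {
        return DEMO_BUSY;
    }
    *endpoint_out = malloc(sizeof **endpoint_out);
    if (*endpoint_out == NULL) {
        return DEMO_NO_MEMORY;
    }
    **endpoint_out = (*in_use)++; /* record which endpoint slot we took */
    return DEMO_SUCCESS;
}

/* Cleanup that tolerates a NULL module and a module with no endpoint. */
static void demo_finalize(demo_module_t *module)
{
    if (module == NULL) {
        return;
    }
    free(module->endpoint); /* free(NULL) is a no-op */
    module->endpoint = NULL;
    free(module);
}

/* Returns a usable module, or NULL so the caller can try another transport. */
static demo_module_t *demo_init(int *in_use, int capacity)
{
    demo_module_t *module = calloc(1, sizeof *module);
    if (module == NULL) {
        return NULL;
    }
    int status = demo_open_endpoint(in_use, capacity, &module->endpoint);
    if (status != DEMO_SUCCESS) {
        fprintf(stderr, "demo_init: open endpoint failed with status=%d\n", status);
        demo_finalize(module); /* safe: endpoint is still NULL */
        return NULL;
    }
    return module;
}
```

With a capacity of one endpoint, a first call to demo_init succeeds and a second returns NULL instead of crashing, which is the point at which a caller would fall back to a different transport.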

Hope this helps,

Tim

On Friday 30 March 2007 05:06 pm, de Almeida, Valmor F. wrote:
> Hello,
>
> I am getting this error any time the number of processes requested per
> machine is greater than the number of CPUs. I suspect I am missing
> something in the configuration of mx / ompi, since another machine I
> have without mx installed runs ompi correctly with oversubscription.
>
> Thanks for any help.
>
> --
> Valmor
>
>
> ->mpirun -np 3 --machinefile mymachines-1 a.out
> [x1:23624] mca_btl_mx_init: mx_open_endpoint() failed with status=20
> [x1:23624] *** Process received signal ***
> [x1:23624] Signal: Segmentation fault (11)
> [x1:23624] Signal code: Address not mapped (1)
> [x1:23624] Failing at address: 0x20
> [x1:23624] [ 0] [0xb7f7f440]
> [x1:23624] [ 1] /opt/openmpi-1.2/lib/openmpi/mca_btl_mx.so(mca_btl_mx_finalize+0x25) [0xb7aca825]
> [x1:23624] [ 2] /opt/openmpi-1.2/lib/openmpi/mca_btl_mx.so(mca_btl_mx_component_init+0x6f8) [0xb7acc658]
> [x1:23624] [ 3] /opt/ompi/lib/libmpi.so.0(mca_btl_base_select+0x1a0) [0xb7f41900]
> [x1:23624] [ 4] /opt/openmpi-1.2/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x26) [0xb7ad1006]
> [x1:23624] [ 5] /opt/ompi/lib/libmpi.so.0(mca_bml_base_init+0x78) [0xb7f41198]
> [x1:23624] [ 6] /opt/openmpi-1.2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_component_init+0x7d) [0xb7af866d]
> [x1:23624] [ 7] /opt/ompi/lib/libmpi.so.0(mca_pml_base_select+0x176) [0xb7f49b56]
> [x1:23624] [ 8] /opt/ompi/lib/libmpi.so.0(ompi_mpi_init+0x4cf) [0xb7f0fe2f]
> [x1:23624] [ 9] /opt/ompi/lib/libmpi.so.0(MPI_Init+0xab) [0xb7f3204b]
> [x1:23624] [10] a.out(_ZN3MPI4InitERiRPPc+0x18) [0x8052cbe]
> [x1:23624] [11] a.out(main+0x21) [0x804f4a7]
> [x1:23624] [12] /lib/libc.so.6(__libc_start_main+0xdc) [0xb7be9824]
>
> Contents of the mymachines-1 file:
>
> x1 max_slots=4