
Open MPI User's Mailing List Archives


From: Nicolas Niclausse (Nicolas.Niclausse_at_[hidden])
Date: 2007-03-21 11:45:32


Hello,

I'm trying to use NetPIPE with Open MPI on my system (RHEL 3, dual Opteron,
Myrinet 2G with MX drivers).

Everything is fine when I use a 64-bit binary, but it segfaults when I use a
32-bit binary:

nniclausse# mpirun -machinefile ./machines ./NPmpi
[helios38:15657] *** Process received signal ***
[helios38:15657] Signal: Segmentation fault (11)
[helios38:15657] Signal code: Address not mapped (1)
[helios38:15657] Failing at address: 0x215b
[helios38:15657] [ 0] /lib/libpthread.so.0 [0x40508688]
[helios38:15657] [ 1] /lib/libc.so.6 [0x40575160]
[helios38:15657] [ 2]
/opt/openmpi/1.2/lib/openmpi/mca_mtl_mx.so(ompi_mtl_mx_module_init+0x124)
[0x4084a0f4]
[helios38:15657] [ 3] /opt/openmpi/1.2/lib/openmpi/mca_mtl_mx.so [0x4084e108]
[helios38:15657] [ 4]
/opt/openmpi/1.2/lib/libmpi.so.0(ompi_mtl_base_select+0xe1) [0x402ddd11]
[helios38:15657] [ 5] /opt/openmpi/1.2/lib/openmpi/mca_pml_cm.so [0x407fd83f]
[helios38:15657] [ 6]
/opt/openmpi/1.2/lib/libmpi.so.0(mca_pml_base_select+0x209) [0x402ef569]
[helios38:15657] [ 7] /opt/openmpi/1.2/lib/libmpi.so.0(ompi_mpi_init+0x3cf)
[0x4006beef]
[helios38:15657] [ 8] /opt/openmpi/1.2/lib/libmpi.so.0(MPI_Init+0x109)
[0x401cb5d9]
[helios38:15657] [ 9] ./NPmpi(Init+0x22) [0x804adb2]
[helios38:15657] [10] ./NPmpi(main+0xb3) [0x80492d3]
[helios38:15657] [11] /lib/libc.so.6(__libc_start_main+0x9e) [0x40563bd2]
[helios38:15657] [12] ./NPmpi(free+0x45) [0x8049171]
[helios38:15657] *** End of error message ***
mpirun noticed that job rank 0 with PID 15657 on node helios38 exited on
signal 11 (Segmentation fault).
1 additional process aborted (not shown)

nniclausse# gdb ./NPmpi core.7834

#0 mx_decompose_endpoint_addr (endpoint_addr={stuff =
{4611686018427396095, 0}},
    nic_id=0x1fff, endpoint_id=0x40858490) at
../mx_decompose_endpoint_addr.c:32
32 *nic_id = x.partner->nic_id;
(gdb) bt
#0 mx_decompose_endpoint_addr (endpoint_addr={stuff =
{4611686018427396095, 0}},
    nic_id=0x1fff, endpoint_id=0x40858490) at
../mx_decompose_endpoint_addr.c:32
#1 0x4084a0f4 in ompi_mtl_mx_module_init () at mtl_mx.c:90
#2 0x4084e108 in ompi_mtl_mx_component_init (enable_progress_threads=0 '\0',
    enable_mpi_threads=0 '\0') at mtl_mx_component.c:124
#3 0x402ddd11 in ompi_mtl_base_select (enable_progress_threads=0 '\0',
    enable_mpi_threads=0 '\0') at base/mtl_base_component.c:104
#4 0x407fd83f in mca_pml_cm_component_init (priority=0xfffbb6d4,
    enable_progress_threads=0 '\0', enable_mpi_threads=0 '\0')
    at pml_cm_component.c:128
#5 0x402ef569 in mca_pml_base_select (enable_progress_threads=0 '\0',
    enable_mpi_threads=0 '\0') at base/pml_base_select.c:96
#6 0x4006beef in ompi_mpi_init (argc=1, argv=0xffffa4b4, requested=0,
    provided=0xfffbb7b8) at runtime/ompi_mpi_init.c:398
#7 0x401cb5d9 in PMPI_Init (argc=0xffffa460, argv=0xffffa464) at pinit.c:70
#8 0x0804adb2 in Init ()
#9 0x080492d3 in main ()

(Open MPI was compiled with PGI 6.0.)

nniclausse# /opt/mx/bin/mx_info | head
MX Version: 1.1.6
MX Build: nniclausse_at_helios38:/scratch/rpmbuild/BUILD/mx-1.1.6 Wed Mar 21
14:45:21 CET 2007
1 Myrinet board installed.
The MX driver is configured to support up to 4 instances and 1024 nodes.
===================================================================
Instance #0: 333.2 MHz LANai, 133.1 MHz PCI bus, 2 MB SRAM

Any ideas?

-- 
Nicolas NICLAUSSE                          Service DREAM
INRIA Sophia Antipolis                     http://www-sop.inria.fr/