Hi,

I'm observing a random segmentation fault during an internode parallel computation involving the openib btl and OpenMPI-1.4.2 (the same issue can be observed with OpenMPI-1.3.3).
  mpirun (Open MPI) 1.4.2
  Report bugs to http://www.open-mpi.org/community/help/
  [pbn08:02624] *** Process received signal ***
  [pbn08:02624] Signal: Segmentation fault (11)
  [pbn08:02624] Signal code: Address not mapped (1)
  [pbn08:02624] Failing at address: (nil)
  [pbn08:02624] [ 0] /lib64/libpthread.so.0 [0x349540e4c0]
  [pbn08:02624] *** End of error message ***
  sh: line 1:  2624 Segmentation fault      \/share\/hpc3\/actran_suite\/Actran_11\.0\.rc2\.41872\/RedHatEL\-5\/x86_64\/bin\/actranpy_mp '--apl=/share/hpc3/actran_suite/Actran_11.0.rc2.41872/RedHatEL-5/x86_64/Actran_11.0.rc2.41872' '--inputfile=/work/st25652/LSF_130073_0_47696_0/Case1_3Dreal_m4_n2.dat' '--scratch=/scratch/st25652/LSF_130073_0_47696_0/scratch' '--mem=3200' '--threads=1' '--errorlevel=FATAL' '--t_max=0.1' '--parallel=domain'

If I choose not to use the openib btl (by using --mca btl self,sm,tcp on the command line, for instance), I don't encounter any problem and the parallel computation runs flawlessly.

I would like to get some help to be able:
- to diagnose the issue I'm facing with the openib btl 
- understand why this issue is observed only when using the openib btl and not when using self,sm,tcp

Any help would be very much appreciated.

The outputs of ompi_info and the configure scripts of OpenMPI are enclosed to this email, and some information on the infiniband drivers as well.

Here is the command line used when launching a parallel computation using infiniband:
  path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca btl openib,sm,self,tcp  --display-map --verbose --version --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
and the command line used if not using infiniband:
  path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca btl self,sm,tcp  --display-map --verbose --version --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]

Thanks,
Eloi






-- 


Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626