Dear all,
I am trying to get openmpi-1.1.2 to work on AIX 5.3 / POWER5.
:: Compilation seems to have worked with the following sequence:
====================================================================
setenv OBJECT_MODE 64
setenv CC xlc
setenv CXX xlC
setenv F77 xlf
setenv FC xlf90
setenv CFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
setenv CXXFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
setenv FFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
setenv FCFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
setenv LDFLAGS "-Wl,-brtl"
./configure --prefix=/ist/openmpi-1.1.2 \
--disable-mpi-cxx \
--disable-mpi-cxx-seek \
--enable-mpi-threads \
--enable-progress-threads \
--enable-static \
--disable-shared \
--disable-io-romio
====================================================================
:: After compiling, I ran make check and all 11 tests passed.
:: Now I am trying to run the following command as a simple test:
# mpirun -hostfile /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc -np 2 /usr/bin/hostname
- The file /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc contains 4 hosts:
r1blade041 slots=1
r1blade042 slots=1
r1blade043 slots=1
r1blade044 slots=1
- The mpirun command eventually hangs with the following message:
[r1blade041:418014] poll failed with errno=25
[r1blade041:418014] opal_event_loop: ompi_evesel->dispatch() failed.
- In this state mpirun cannot be killed with <ctrl-c>; only kill -9 works.
- While mpirun is still hanging, I can see that "orted" has been launched on both requested hosts.
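:: The errno value in the poll message above can be translated into its symbolic name and message text. Below is a minimal sketch, assuming a Python interpreter is available on the node (it is not part of the Open MPI build described here; the helper name describe_errno is only illustrative). On the common Unix errno tables, 25 corresponds to ENOTTY, but the message text is platform specific, so it is worth running on the AIX host itself.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    # errno.errorcode maps numeric values to names such as 'ENOTTY';
    # fall back to a placeholder if the platform has no name for the value.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the value.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 25 is the value reported in the "poll failed with errno=25" line.
    name, message = describe_errno(25)
    print(f"errno=25 -> {name}: {message}")
```

If the name reported on AIX is ENOTTY, that would suggest poll() is being called on a descriptor that does not support the operation, but that is only a guess on my side.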
:: I turned on all debug options in openmpi-mca-params.conf. The output of the same mpirun call is in the file mpirun-debug.txt.gz.