What is happening is that on creation of the Queue Pair the max inline
data is reported as 0. Open MPI 1.0.1 did not check this and assumed that data
smaller than some threshold could be sent inline :-(. The Open MPI trunk does
check the max inline data QP attribute prior to using inline data for sending,
but if it sees that the inline data is 0 it will report a warning although the
application should still run to completion. Is your application completing
with the trunk snapshot, if so you can effectively ignore these warnings?
We saw this problem (0 max inline data) on one of our clusters, upgrading
the OpenIB stack corrected the issue. You are running a 2.6.9 kernel however
so to upgrade to the latest OpenIB stack you will need to get a backport which
is in the OpenIB SVN under gen2/branches/backport-to-2.6.9.
So to distill, the hang on 0 max inline data was corrected in Open MPI in
the trunk and the 1.0.2 branch. Upgrading the OpenIB stack resulted in a
positive max inline data, although I cannot explain why, there may have been
other changes to the cluster I am not aware of, Roland could probably explain
this better.
I'm trying OpenMPI with the 2.6.9-27.ELsmp
kernel and Red Hat OpenIB RPMs included with the beta.
I ran configure with only the --prefix
option.
OpenMPI 1.0.1 mpirun just hangs with no
output.
A 1.1 snapshot gets this:
/data/software/qa/MPI/openmpi-1.1a1r9094.rhel4-amd64-openib/bin/mpirun
--prefix
/data/software/qa/MPI/openmpi-1.1a1r9094.rhel4-amd64-openib
-np 2 -host 192.168.0.197,192.168.0.199
/data/home/releng/tmp/rhel4/openib/mpi_latency.amd64 100 1
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data[
0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
Do I need to use a newer OpenIB?
Scott
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users