Thanks for the info.  I'm going to try upgrading my OpenIB bits.
 
Scott


From: Galen Shipman [mailto:gshipman@lanl.gov]
Sent: Wednesday, March 01, 2006 7:53 AM
To: Open MPI Users
Cc: Scott Weitzenkamp (sweitzen); Roland Dreier (rdreier)
Subject: Re: [OMPI users] anyone have OpenMPI working with OpenIB on RHEL4 U3 beta?

Hi Scott,

What is happening is that on creation of the Queue Pair the max inline data is reported as 0. Open MPI 1.0.1 did not check this and assumed that data smaller than some threshold could be sent inline :-(. The Open MPI trunk does check the max inline data QP attribute prior to using inline data for sending, but if it sees that the inline data is 0 it will report a warning although the application should still run to completion. Is your application completing with the trunk snapshot, if so you can effectively ignore these warnings?

We saw this problem (0 max inline data) on one of our clusters, upgrading the OpenIB stack corrected the issue. You are running a 2.6.9 kernel however so to upgrade to the latest OpenIB stack you will need to get a backport which is in the OpenIB SVN under gen2/branches/backport-to-2.6.9.

So to distill, the hang on 0 max inline data was corrected in Open MPI in the trunk and the 1.0.2 branch. Upgrading the OpenIB stack resulted in a positive max inline data, although I cannot explain why, there may have been other changes to the cluster I am not aware of, Roland could probably explain this better.

- Galen

On Feb 28, 2006, at 1:36 PM, Scott Weitzenkamp ((sweitzen)) wrote:

I'm trying OpenMPI with the 2.6.9-27.ELsmp kernel and Red Hat OpenIB RPMs included with the beta.
I ran configure with only the --prefix option.
OpenMPI 1.0.1 mpirun just hangs with no output.
A 1.1 snapshot gets this:
/data/software/qa/MPI/openmpi-1.1a1r9094.rhel4-amd64-openib/bin/mpirun --prefix
/data/software/qa/MPI/openmpi-1.1a1r9094.rhel4-amd64-openib -np 2 -host 192.168.0.197,192.168.0.199 /data/home/releng/tmp/rhel4/openib/mpi_latency.amd64 100 1
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data[
0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,0][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:889:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
Do I need to use a newer OpenIB?
Scott
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users