Thanks for the patch. A few comments:
--- a/ompi/mca/btl/openib/btl_openib_endpoint.h Mon Aug 31 00:00:16
+++ b/ompi/mca/btl/openib/btl_openib_endpoint.h Thu Sep 17 18:23:58
@@ -246,6 +246,12 @@
/** Whether we've send out CTS to the peer or not (only used in
CTS protocol) */
+ uint32_t vendor_id;
+ uint32_t vendor_part_id;
+ uint32_t max_inline_data;
The vendor_id and vendor_part_id are actually the *remote* values of
this information. Shouldn't they go in endpoint.rem_info?
Is there a reason to put max_inline_data on the endpoint rather than
accessing it through endpoint->endpoint_btl->device->ib_dev?
(I'm a little confused about how it is used / assigned -- I could be
missing something here)
What testing has been done to verify that this works?
Has an equivalent patch been made for Pasha's ofacm work?
On Sep 23, 2009, at 5:05 AM, Vasily Filipov wrote:
> Some time ago Mellanox proposed a design that should improve the
> current support for heterogeneous clusters (see Design.txt). The
> design was accepted by IB vendors, and now we propose a patch that
> adds heterogeneous cluster support. The patch leaves one issue that
> we do not resolve completely: if 2 different procs have different QP
> configurations (P/S/X), we print a nice warning message explaining
> that such a configuration is not supported and proposing a way to
> resolve the issue. Theoretically it would be best to provide a
> solution that resolves the problem automatically, but that would
> require significant changes to the openib BTL that we don't want to
> introduce at this stage.
> Please comment.