X-Account-Key: account2
X-Mozilla-Keys: 
Delivered-To: pasha@dev.mellanox.co.il
Received: by 10.216.35.66 with SMTP id t44cs98667wea;
	Thu, 24 Sep 2009 08:56:57 -0700 (PDT)
Received: by 10.224.95.196 with SMTP id e4mr3396299qan.180.1253807816410;
	Thu, 24 Sep 2009 08:56:56 -0700 (PDT)
Return-Path: <jsquyres@cisco.com>
Received: from rtp-iport-1.cisco.com (rtp-iport-1.cisco.com [64.102.122.148])
	by mx.google.com with ESMTP id 17si3744230qyk.35.2009.09.24.08.56.54;
	Thu, 24 Sep 2009 08:56:55 -0700 (PDT)
Received-SPF: pass (google.com: domain of jsquyres@cisco.com designates
	64.102.122.148 as permitted sender) client-ip=64.102.122.148; 
Authentication-Results: mx.google.com;
	spf=pass (google.com: domain of jsquyres@cisco.com
	designates 64.102.122.148 as permitted sender)
	smtp.mail=jsquyres@cisco.com;
	dkim=pass (test mode) header.i=jsquyres@cisco.com
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApoEAKozu0pAZnme/2dsb2JhbAC/GIhQAY9zBYQb
X-IronPort-AV: E=Sophos;i="4.44,446,1249257600"; d="scan'208";a="59731120"
Received: from rtp-dkim-1.cisco.com ([64.102.121.158])
	by rtp-iport-1.cisco.com with ESMTP; 24 Sep 2009 15:56:51 +0000
Received: from rtp-core-1.cisco.com (rtp-core-1.cisco.com [64.102.124.12])
	by rtp-dkim-1.cisco.com (8.12.11/8.12.11) with ESMTP id n8OFupQ4002395; 
	Thu, 24 Sep 2009 11:56:51 -0400
Received: from rtp-jsquyres-8711.cisco.com (rtp-jsquyres-8711.cisco.com
	[10.116.19.194])
	by rtp-core-1.cisco.com (8.13.8/8.14.3) with ESMTP id n8OFun6t021951;
	Thu, 24 Sep 2009 15:56:50 GMT
Cc: pasha@dev.mellanox.co.il
Message-Id: <A5F43242-D906-42EC-8355-2AAC4B9DB2A9@cisco.com>
From: Jeff Squyres <jsquyres@cisco.com>
To: Open MPI Developers <devel@open-mpi.org>
In-Reply-To: <4AB9E4E2.10203@dev.mellanox.co.il>
Content-Type: text/plain; charset=WINDOWS-1252; format=flowed; delsp=yes
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Apple Message framework v936)
Subject: Re: [OMPI devel] [PATCH] Improving heterogeneous IB clusters support.
Date: Thu, 24 Sep 2009 11:56:49 -0400
References: <4AB9E4E2.10203@dev.mellanox.co.il>
X-Mailer: Apple Mail (2.936)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; l=2079; t=1253807811;
	x=1254671811; c=relaxed/simple; s=rtpdkim1001;
	h=Content-Type:From:Subject:Content-Transfer-Encoding:MIME-Version;
	d=cisco.com; i=jsquyres@cisco.com;
	z=From:=20Jeff=20Squyres=20<jsquyres@cisco.com>
	|Subject:=20Re=3A=20[OMPI=20devel]=20[PATCH]=20Improving=20
	heterogeneous=20IB=20clusters=20support. |Sender:=20
	|To:=20Open=20MPI=20Developers=20<devel@open-mpi.org>;
	bh=lS3Jr00/pjz7QmLbSkO5ENpu7oVetFtSjJkQs/AFRes=;
	b=gmPbvO2V498NShBNuzG8TUawUlJBn0eYn87uI7VqRkhMPPlH1xRO2nVVwM
	PmHCaD2MhY14nseGGhoucGVJaxE91605EgdZVOFvgJi7UgKu5O6lT66Wfs18
	hxKlCbtzGh;
Authentication-Results: rtp-dkim-1; header.From=jsquyres@cisco.com; dkim=pass (
	sig from cisco.com/rtpdkim1001 verified; ); 

Thanks for the patch.  A few comments:

--- a/ompi/mca/btl/openib/btl_openib_endpoint.h	Mon Aug 31 00:00:16 =20
2009 -0700
+++ b/ompi/mca/btl/openib/btl_openib_endpoint.h	Thu Sep 17 18:23:58 =20
2009 +0300
@@ -246,6 +246,12 @@
      /** Whether we've send out CTS to the peer or not (only used in
          CTS protocol) */
      bool endpoint_cts_sent;
+
+    uint32_t vendor_id;
+    uint32_t vendor_part_id;
+
+    uint32_t max_inline_data;
+
  };

The vendor_id and vendor_part_id actually the *remote* values of this =20=

information.  Shouldn't that go in endpoint.rem_info?

Is there a reason to put the max_inline_data on the endpoint rather =20
than accessing it through endpoint->endpoint_btl->device->ib_dev?  =20
(I'm a little confused on how it is used / assigned -- I could be =20
missing something here)

What testing has been done to see that this stuff works?

Has an equivalent patch been made for Pasha's ofacm work?



On Sep 23, 2009, at 5:05 AM, Vasily Filipov wrote:

> Hello,
>
> Some time ago Mellanox proposed design that should improve current =20
> support for heterogeneous clusters (see Design.txt).The design was =20
> accepted by IB vendors, and now we propose patch that adds a =20
> heterogeneous cluster support. The path leaves one issue that we do =20=

> not resolve completely. If 2 different procs have different QPs =20
> configuration (P/S/X) we print nice warning message that describes =20
> that such configuration is not supported and it propose way to =20
> resolve the issue.  Theoretically it will be best to provide =20
> solution that automatically will resolve the problem, but it will =20
> require significant changes on openib blt that we don=92t want to =20
> introduce in this stage.
>
> Please comment.
>
> Regards,
> Vasily
>
> <=20
> Design=20
> .txt><btl_openib.patch>_______________________________________________
> devel mailing list
> devel@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


--=20
Jeff Squyres
jsquyres@cisco.com




