Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] 1.3 PML default choice
From: Tim Mattox (timattox_at_[hidden])
Date: 2009-01-13 13:28:34


Hi Bogdan,
Sorry for such a late reply to your e-mail. Glad to hear that the
performance anomaly you mentioned below is now gone with 1.3rc3.
But I noticed that we either didn't explain something well enough, or
didn't explain it at all: the cm PML does not use BTLs, only MTLs, so
your suggested command line of:
  --mca pml cm --mca btl mx,sm,self
does not do what you think; the BTL selection is ignored.
Thus the above is equivalent to:
  --mca pml cm
And on a machine with MX as the high-speed interconnect, that
would be equivalent to:
  --mca pml cm --mca mtl mx

So, in short, the two PMLs (ob1 and cm) use distinct sets of lower-level
drivers: ob1 can use multiple BTLs at once, while cm uses
a single MTL module. This is partly explained in this FAQ entry:
http://www.open-mpi.org/faq/?category=myrinet#myri-btl-mx
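
To make the rule above concrete, here is a small toy model in Python. It is only a sketch of the behaviour described in this message, not Open MPI code: the function name effective_selection and the dictionary layout are made up for illustration, and it assumes that each PML simply ignores the other framework's selection.

```python
# Toy model of the rule described above; NOT part of Open MPI.
# Assumption: ob1 drives BTLs and cm drives a single MTL, so a selection
# made for the "other" framework has no effect on the chosen PML.
FRAMEWORK_USED_BY_PML = {"ob1": "btl", "cm": "mtl"}


def effective_selection(mca):
    """Return only the pml/btl/mtl settings that the chosen PML honors.

    `mca` maps MCA parameter names to values,
    e.g. {"pml": "cm", "btl": "mx,sm,self"}.
    """
    pml = mca.get("pml")
    if pml not in FRAMEWORK_USED_BY_PML:
        # No PML forced (or one this sketch does not model): leave as-is.
        return dict(mca)
    used = FRAMEWORK_USED_BY_PML[pml]
    ignored = "mtl" if used == "btl" else "btl"
    # Drop the setting for the framework this PML never consults.
    return {name: value for name, value in mca.items() if name != ignored}


if __name__ == "__main__":
    # The command line from the original report: the btl part drops out.
    print(effective_selection({"pml": "cm", "btl": "mx,sm,self"}))
    # With ob1 the BTL list is what matters.
    print(effective_selection({"pml": "ob1", "btl": "mx,sm,self"}))
```

In this model the first call keeps only the pml setting, which mirrors why "--mca pml cm --mca btl mx,sm,self" behaves like plain "--mca pml cm". The sketch deliberately says nothing about which PML is picked when none is forced, since that depends on the release and on which components are available.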

On Mon, Nov 17, 2008 at 12:39 PM, Bogdan Costescu
<Bogdan.Costescu_at_[hidden]> wrote:
>
> Hi!
>
> In testing the 1.3b2, I have encountered a rather strange behaviour.
> First the setup:
> dual-CPU dual-core x86_64 with Myrinet 10G card
> self compiled Linux kernel 2.6.22.18, MX 1.2.7(*)
> GCC-4.1.2 (from Debian etch), Torque 2.1.10
> OpenMPI 1.3b2 (tar.gz from download page)
> IMB 3.1
>
> (*) I'm actually tracking a problem together with Myricom people, so it's
> not a vanilla 1.2.7, but 1.2.7 with a tiny patch; I believe that this has no
> influence
>
> When starting an IMB run with the default settings, in all collective
> communication functions I see huge jumps around 32-1024 bytes and flat
> results around 1K-16K like:
>
> #----------------------------------------------------------------
> # Benchmarking Allgatherv
> # #processes = 64
> #----------------------------------------------------------------
>   #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>        0         1000         0.19         0.21         0.19
>        1         1000        35.29        35.30        35.29
>        2         1000        36.01        36.03        36.02
>        4         1000        38.97        38.98        38.98
>        8         1000        42.12        42.13        42.13
>       16         1000        45.76        45.77        45.77
>       32         1000     19991.83     20011.84     20005.29
>       64         1000     38561.52     38599.66     38587.74
>      128         1000     58263.81     58305.74     58293.48
>      256         1000     77382.83     77425.93     77412.49
>      512         1000     95981.97     96022.70     96010.73
>     1024         1000    480838.00    481214.78    481027.05
>     2048         1000    480522.97    480917.02    480727.98
>     4096         1000    480762.69    481134.49    480955.03
>     8192         1000    481136.70    481505.36    481334.86
>    16384         1000    483629.46    483889.28    483759.38
>    32768         1000     23809.47     23810.27     23809.62
>    65536          640      7085.58      7085.91      7085.69
>   131072          320     11928.29     11929.29     11928.72
>   262144          160     22174.66     22177.67     22175.94
>   524288           80     42270.91     42283.90     42277.55
>  1048576           40     82389.85     82461.10     82428.26
>  2097152           20    161347.04    161624.54    161485.84
>  4194304           10    321467.52    322562.79    322019.24
>
> This happens on various numbers of nodes and is reproducible - I have
> repeated the run 5 times on 8 nodes.
>
> I have not seen such results with the 1.2.x series with either OB1+BTL or
> CM+MTL; timing increases rather smoothly. Trying various options with 1.3b2:
>
> --mca pml cm --mca mtl mx works well
> --mca pml cm --mca btl mx,sm,self works well
> --mca pml ob1 --mca btl mx,sm,self jumps like above
>
> From what I know, the 1.2.x series defaulted to OB1+BTL; CM was only
> possible with an MTL which internally implemented sm and self, so the second
> test above would have failed (please correct me if I'm wrong).
>
> The README for 1.3b2 specifies that CM is now chosen if possible; in my
> trials, when I specify CM+BTL, it doesn't complain and works well.
> However either the default (no options) or OB1+BTL leads to the jumps
> mentioned above, which makes me believe that OB1+BTL is still chosen as
> default, contrary to what the README specifies.
>
> So there are 2 issues:
> - which is right, the README or the runtime behaviour that I see ?
> - is it normal for the OB1+BTL to behave so poorly with MX ?
>
> Thanks for any insight into these issues.
>
> --
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
> E-mail: bogdan.costescu_at_[hidden]
>

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmattox_at_[hidden] || timattox_at_[hidden]
    I'm a bright... http://www.the-brights.net/