Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] poor btl sm latency
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-02-16 10:50:55


On Feb 16, 2012, at 10:30 AM, Matthias Jurenz wrote:

> $ mpirun -np 2 --bind-to-core --cpus-per-proc 2 hwloc-bind --get
> 0x00000003
> 0x0000000c

That seems right. From your prior email, 0x3 (binary 11) maps to:

 Socket L#0 (16GB)
   NUMANode L#0 (P#0 8190MB) + L3 L#0 (6144KB)
     L2 L#0 (2048KB) + L1 L#0 (16KB) + Core L#0 + PU L#0 (P#0)
     L2 L#1 (2048KB) + L1 L#1 (16KB) + Core L#1 + PU L#1 (P#1)

And 0xc (binary 1100) maps to PUs P#2 and P#3 on the same socket:

 Socket L#0 (16GB)
   NUMANode L#0 (P#0 8190MB) + L3 L#0 (6144KB)
     L2 L#2 (2048KB) + L1 L#2 (16KB) + Core L#2 + PU L#2 (P#2)
     L2 L#3 (2048KB) + L1 L#3 (16KB) + Core L#3 + PU L#3 (P#3)
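The mask-to-PU mapping above can be sketched in a few lines (a minimal illustration, not hwloc's own code: each set bit in the cpuset mask corresponds to one PU's physical index):

```python
def cpuset_to_pus(mask: int) -> list[int]:
    """Return the PU (P#) indices whose bits are set in a hwloc cpuset mask."""
    pus = []
    bit = 0
    while mask:
        if mask & 1:
            pus.append(bit)
        mask >>= 1
        bit += 1
    return pus

print(cpuset_to_pus(0x3))  # first rank:  PUs P#0 and P#1
print(cpuset_to_pus(0xc))  # second rank: PUs P#2 and P#3
```

So the two ranks are bound to cores 0-1 and 2-3 on the same socket, which is what --bind-to-core --cpus-per-proc 2 should produce.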

Let me ask two more dumb questions:

1. Run "ompi_info | grep debug". All the debugging is set to "no", right?

2. Your /tmp is not a network filesystem, is it? (i.e., is OMPI putting the shared memory backing files on NFS?) I *think* you said in a prior mail that you tried all the shared memory methods and got the same results (i.e., not just mmap), right?
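A quick way to check both points (a sketch, assuming a Linux system; `stat -f` reports the filesystem type backing a path):

```shell
# 1. Confirm this is not a debug build: every "debug" line should say "no".
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i debug
fi

# 2. Check which filesystem backs /tmp; it should be local (e.g. ext4 or
#    tmpfs), not nfs, or the sm backing files will go over the network.
fstype=$(stat -f -c %T /tmp)
echo "/tmp filesystem: $fstype"
```

A debug build or NFS-backed session directory would each be enough to explain unusually high sm latency on its own.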

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/