Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI providing rank?
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-08-01 01:17:34


Yves Caniou wrote:
On Wednesday 28 July 2010 15:05:28, you wrote:
  
I am confused. I thought all you wanted to do is report out the binding of
the process - yes? Are you trying to set the affinity bindings yourself?

If the latter, then your script doesn't do anything that mpirun wouldn't
do, and doesn't do it as well. You would be far better off just adding
--bind-to-core to the mpirun cmd line.
    
"mpirun -h" says that it is the default, so there is not even something to do?
I don't even have to add "--mca mpi_paffinity_alone 1" ?
  
Wow.  I just tried "mpirun -h" and, yes, it claims that "--bind-to-core" is the default.  I believe this is wrong... or at least "misleading."  :^)  You should specify --bind-to-core explicitly.  It is the successor to paffinity.  Do add --report-bindings to check what you're getting.

On Jul 28, 2010, at 6:37 AM, Yves Caniou wrote:
    
On Wednesday 28 July 2010 11:34:13, Ralph Castain wrote:
      
On Jul 27, 2010, at 11:18 PM, Yves Caniou wrote:
        
On Wednesday 28 July 2010 06:03:21, Nysal Jan wrote:
          
OMPI_COMM_WORLD_RANK can be used to get the MPI rank.
            
Are processes assigned to nodes sequentially, so that I can get the
NODE number from $OMPI_COMM_WORLD_RANK modulo the number of processes per
node?
          
By default, yes. However, you can select alternative mapping methods.
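
To illustrate the arithmetic being discussed, here is a minimal sketch. It assumes ranks fill each host in order and that every host runs the same number of processes; PPN and the two helper functions are hypothetical names, not part of Open MPI. Note that integer division yields the index of the host, while the modulo yields the position of the process within its host.

```shell
#!/bin/sh
# Sketch only: assumes sequential mapping with the same number of
# processes on every host. PPN is a hypothetical setting, not an Open MPI one.
PPN=${PPN:-16}

# Index of the host a given rank lands on: $1 = rank, $2 = processes per host.
host_index() { echo $(( $1 / $2 )); }

# Position of the rank within its host: $1 = rank, $2 = processes per host.
local_rank() { echo $(( $1 % $2 )); }

# Open MPI exports OMPI_COMM_WORLD_RANK; default to 0 when run by hand.
RANK=${OMPI_COMM_WORLD_RANK:-0}
echo "rank $RANK: host $(host_index "$RANK" "$PPN"), local rank $(local_rank "$RANK" "$PPN")"
```

If a different mapping method is selected, this arithmetic no longer holds.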
        
It reports to stderr, so computing $OMPI_COMM_WORLD_RANK modulo the number of
processes per node seems more appropriate for what I need, right?

So is the following a valid way to set memory affinity?

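script.sh:
 #!/bin/sh
 # Rank of this process in MPI_COMM_WORLD, exported by Open MPI
 MYRANK=$OMPI_COMM_WORLD_RANK
 # Groups of 4 consecutive ranks share one NUMA node, 4 NUMA nodes per host
 MYVAL=$(expr $MYRANK / 4)
 NODE=$(expr $MYVAL % 4)
 # Bind CPU and memory to that NUMA node, then run the real command
 numactl --cpunodebind=$NODE --membind=$NODE "$@"

mpiexec -n 128 ./script.sh myappli myparam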
      
Another option is to use OMPI_COMM_WORLD_LOCAL_RANK.  This environment variable directly gives you the value you're looking for, regardless of how process ranks are mapped to the nodes.
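
As a rough sketch of that suggestion, the wrapper below derives the NUMA node from OMPI_COMM_WORLD_LOCAL_RANK instead of the global rank. CORES_PER_NUMA and the helper numa_node_for are hypothetical names; the value 4 is only an assumption taken from the script quoted above and must match the actual hardware.

```shell
#!/bin/sh
# Hypothetical wrapper: choose a NUMA node from the node-local rank.
# CORES_PER_NUMA is an assumption about the machine, not an Open MPI setting.
CORES_PER_NUMA=${CORES_PER_NUMA:-4}

# $1 = node-local rank, $2 = cores per NUMA node
numa_node_for() {
    echo $(( $1 / $2 ))
}

# Only bind when launched by Open MPI with a command to run.
if [ -n "${OMPI_COMM_WORLD_LOCAL_RANK:-}" ] && [ "$#" -gt 0 ]; then
    NODE=$(numa_node_for "$OMPI_COMM_WORLD_LOCAL_RANK" "$CORES_PER_NUMA")
    # Replace this shell with the application, bound to the chosen node.
    exec numactl --cpunodebind="$NODE" --membind="$NODE" "$@"
fi
```

It would be launched the same way as the original script, for example: mpiexec -n 128 ./script.sh myappli myparam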

      
Which is better: using this option, or the cmd line with numactl (if it
works)? What is the difference?

I don't know what's "better," but here are some potential issues:

*) Different MPI implementations use different mechanisms for specifying binding.  So, if you want your solution to be "portable"... well, if you want that, you're out of luck.  But, perhaps some mechanisms (command-line arguments, run-time scripts, etc.) might seem easier for you to adapt than others.

*) Some mechanisms bind processes at process launch time and some at MPI_Init time.  The former might be better.  Otherwise, a process might place some NUMA memory in a location before MPI_Init and then be moved away from that memory when MPI_Init is encountered.  I believe both the numactl and OMPI --bind-to-core mechanisms bind at process launch time.  (OMPI's older paffinity might not, but I don't remember for sure.)

Mostly, if you're going to use just OMPI, the --bind-to-core command-line argument might be the simplest.