1.1 --- a/orte/tools/orterun/orterun.1in Tue Feb 19 22:35:08 2013 +0000
1.2 +++ b/orte/tools/orterun/orterun.1in Tue Feb 19 22:36:41 2013 +0000
1.3 @@ -900,29 +900,61 @@
1.4 .
1.5 .SS Rankfiles
1.6 .
1.7 -Rankfiles provide a means for specifying detailed information about
1.8 -how process ranks should be mapped to nodes and how they should be bound.
1.9 -Consider the following:
1.10 +Rankfiles are text files that specify detailed information about how
1.11 +individual processes should be mapped to nodes, and to which
1.12 +processor(s) they should be bound. Each line of a rankfile specifies
1.13 +the location of one process (for MPI jobs, the process' "rank" refers
1.14 +to its rank in MPI_COMM_WORLD). The general form of each line in the
1.15 +rankfile is:
1.16 .
1.17
1.18 - cat myrankfile
1.19 + rank <N>=<hostname> slot=<slot list>
1.20 +.
1.21 +.PP
1.22 +For example:
1.23 +.
1.24 +
1.25 + $ cat myrankfile
1.26 rank 0=aa slot=1:0-2
1.27 rank 1=bb slot=0:0,1
1.28 rank 2=cc slot=1-2
1.29 - mpirun -H aa,bb,cc,dd -rf myrankfile ./a.out
1.30 + $ mpirun -H aa,bb,cc,dd -rf myrankfile ./a.out
1.31 .
1.32 .PP
1.33 -So that
1.34 +This means that
1.35 .
1.36 +
1.37 Rank 0 runs on node aa, bound to socket 1, cores 0-2.
1.38 Rank 1 runs on node bb, bound to socket 0, cores 0 and 1.
1.39 Rank 2 runs on node cc, bound to cores 1 and 2.
1.40 .
1.41 .PP
1.42 -Note that all slot locations are to be specified as
1.43 +The hostnames listed above are "absolute," meaning that actual
1.44 +resolvable hostnames are specified. However, hostnames can also be
1.45 +specified as "relative," meaning that they are specified in relation
1.46 +to an externally-specified list of hostnames (e.g., by mpirun's --host
1.47 +argument, a hostfile, or a job scheduler).
1.48 +.
1.49 +.PP
1.50 +The "relative" specification is of the form "+n<X>", where X is an
1.51 +integer specifying the Xth hostname in the set of all available
1.52 +hostnames, indexed from 0. For example:
1.53 +.
1.54 +
1.55 + $ cat myrankfile
1.56 + rank 0=+n0 slot=1:0-2
1.57 + rank 1=+n1 slot=0:0,1
1.58 + rank 2=+n2 slot=1-2
1.59 + $ mpirun -H aa,bb,cc,dd -rf myrankfile ./a.out
1.60 +.
1.61 +.PP
1.62 +Starting with Open MPI v1.7, all socket/core slot locations are
1.63 +specified as
1.64 +.I logical
1.65 +indexes (the Open MPI v1.6 series used
1.66 .I physical
1.67 -indexes. You can use tools such as HWLOC's "lstopo -v" to find the
1.68 -physical indexes of socket and cores.
1.69 +indexes). You can use tools such as HWLOC's "lstopo" to find the
1.70 +logical indexes of sockets and cores.
1.71 .
1.72 .
1.73 .SS Application Context or Executable Program?