Ralph Castain wrote:
> Thanks - yes, that helps. Can you add --display-map to your cmd
> line? That will tell us what mpirun thinks it is doing.
The output from --display-map is below. Note that I've sanitized a few
items, but nothing relevant to this issue:
[granite:29685] Map for job: 1 Generated by mapping mode: byslot
Starting vpid: 0 Vpid range: 16 Num app_contexts: 1
Data for app_context: index 0 app: /path/to/executable
Num procs: 16
Argv[0]: /path/to/executable
Env[0]: OMPI_MCA_rmaps_base_display_map=1
Env[1]: OMPI_MCA_orte_precondition_transports=e16b0004a956445e-0515b892592a4a02
Env[2]: OMPI_MCA_rds=proxy
Env[3]: OMPI_MCA_ras=proxy
Env[4]: OMPI_MCA_rmaps=proxy
Env[5]: OMPI_MCA_pls=proxy
Env[6]: OMPI_MCA_rmgr=proxy
Working dir: /home/user/case (user: 0)
Num maps: 0
Num elements in nodes list: 1
Mapped node:
Cell: 0 Nodename: granite Launch id: -1 Username: NULL
Daemon name:
Data type: ORTE_PROCESS_NAME Data Value: NULL
Oversubscribed: True Num elements in procs list: 16
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,0]
Proc Rank: 0 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,1]
Proc Rank: 1 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,2]
Proc Rank: 2 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,3]
Proc Rank: 3 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,4]
Proc Rank: 4 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,5]
Proc Rank: 5 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,6]
Proc Rank: 6 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,7]
Proc Rank: 7 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,8]
Proc Rank: 8 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,9]
Proc Rank: 9 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,10]
Proc Rank: 10 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,11]
Proc Rank: 11 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,12]
Proc Rank: 12 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,13]
Proc Rank: 13 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,14]
Proc Rank: 14 Proc PID: 0 App_context index: 0
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAME Data Value: [0,1,15]
Proc Rank: 15 Proc PID: 0 App_context index: 0
Note that technically this machine was not oversubscribed, though I
appreciate OMPI might not have any easy way to tell that without a
hostfile, etc.
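
For anyone who would rather not read the whole map by eye, here is a
rough Python sketch that condenses this kind of output into one line per
node. It is not part of Open MPI; the function name summarize_map and the
SAMPLE excerpt (trimmed from the map above to two procs) are my own
illustration, and the patterns assume the field labels shown in this
particular output, which may differ in other Open MPI versions.

```python
import re

# Field labels taken from the map output above; other versions may differ.
FIELDS = re.compile(
    r"Nodename:\s*(?P<node>\S+)"
    r"|Oversubscribed:\s*(?P<over>True|False)"
    r"|Proc Rank:\s*(?P<rank>\d+)"
)

def summarize_map(text):
    """Return one dict per mapped node: name, oversubscribed flag, ranks."""
    nodes = []
    current = None
    for m in FIELDS.finditer(text):
        if m.group("node") is not None:
            # A new "Mapped node" section starts here.
            current = {"node": m.group("node"), "oversubscribed": None, "ranks": []}
            nodes.append(current)
        elif current is None:
            # Ignore fields that appear before any node header.
            continue
        elif m.group("over") is not None:
            current["oversubscribed"] = m.group("over") == "True"
        else:
            current["ranks"].append(int(m.group("rank")))
    return nodes

# Hypothetical excerpt, shortened from the output above to two procs.
SAMPLE = """\
Mapped node:
Cell: 0 Nodename: granite Launch id: -1 Username:
NULL
Oversubscribed: True Num elements in procs list: 2
Mapped proc:
Proc Rank: 0 Proc PID: 0 App_context
index: 0
Mapped proc:
Proc Rank: 1 Proc PID: 0 App_context
index: 0
"""

for entry in summarize_map(SAMPLE):
    print(f"{entry['node']}: {len(entry['ranks'])} ranks, "
          f"oversubscribed={entry['oversubscribed']}")
```

Because the patterns match across the whole text rather than line by
line, the sketch should cope with the mail-wrapped lines as well as the
unwrapped form.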
--
V. Ram
v_r_959_at_[hidden]