Hello,
I have built Open MPI (1.2) with run-time environment support for the Torque
(2.1.6) resource manager. Initially I request 4 nodes (1 CPU each)
from Torque. Then, from inside my MPI code, I try to spawn more
processes on nodes outside the Torque-assigned nodes using
MPI_Comm_spawn, but this fails with the error below:
[wins04:13564] *** An error occurred in MPI_Comm_spawn
[wins04:13564] *** on communicator MPI_COMM_WORLD
[wins04:13564] *** MPI_ERR_ARG: invalid argument of some other kind
[wins04:13564] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 1 with PID 15070 on node wins03 exited on
signal 15 (Terminated).
2 additional processes aborted (not shown)
#################################
MPI_Info info;
MPI_Comm comm, *intercomm;
...
...
char *key, *value;
key = "host";
value = "wins08";
rc1 = MPI_Info_create(&info);
rc1 = MPI_Info_set(info, key, value);
rc1 = MPI_Comm_spawn(slave,MPI_ARGV_NULL, 1, info, 0,
MPI_COMM_WORLD, intercomm, arr);
...
}
###################################################
Would this work as it is or is something wrong with my assumption? Is
OpenRTE stopping me from spawning processes outside of the initially
allocated nodes through Torque?
Thanks,
Prakash