On 08/16/2011 12:30 PM, Ralph Castain wrote:
> What version are you using?
> On Aug 16, 2011, at 3:19 AM, Simone Pellegrini wrote:
>> Dear all,
>> I am developing a system to manage MPI tasks on top of MPI. The architecture is rather simple: I have a set of scheduler processes which manage the resources of a node. The idea is to have one (or more) of those schedulers allocated on each node of a cluster and then create new MPI processes on demand as computation is needed. Processes are allocated using MPI_Comm_spawn.
>> The system now works fine on a single node by allocating the main scheduler using the following mpi command:
>> mpirun --np 1 ./scheduler ...
>> Now when I scale to multiple nodes, problems with the default MPI behaviour start. For example, let's assume I have 2 nodes with 8 CPU cores each. I therefore set up a machine file in the following way:
>> s01 slots=1
>> s02 slots=1
>> and start the node schedulers in the following way:
>> mpirun --np 2 --machinefile machinefile ./scheduler ...
>> This allocates the scheduler processes correctly. The problem starts when I invoke MPI_Comm_spawn: the spawn also uses the information from the machinefile, so if 4 MPI processes are spawned, 2 are allocated on s01 and 2 on s02. What I want is to always allocate the spawned processes on the same node as the scheduler that spawned them.
>> I tried to do this by specifying an MPI_Info object which is then passed to the MPI_Comm_spawn routine. I set the "host" key to the hostname of the machine where the scheduler is running, but this didn't help.
>> Unfortunately there is very little documentation on this.
>> Thanks for the help,
>> users mailing list