That should work, then. When you set the "host" property, did you give the same name as was in your machine file?
Debug options that might help:
-mca plm_base_verbose 5 -mca rmaps_base_verbose 5
You'll need to configure --enable-debug to get the output, but that should help tell us what is happening.
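For reference, a minimal sketch of what the spawn side might look like. The worker executable name and process count here are placeholders, and the key assumption is that the string passed under the "host" key (taken here from MPI_Get_processor_name) exactly matches the entry in the machinefile -- e.g. "s01" vs. a fully-qualified name, which is what the question above is getting at:

    /* Sketch only: spawn all children on the node the scheduler runs on,
     * by passing the local hostname via the "host" info key.
     * "./worker" and the count of 4 are hypothetical placeholders. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        char host[MPI_MAX_PROCESSOR_NAME];
        int len;
        MPI_Get_processor_name(host, &len);  /* must match the machinefile entry */

        MPI_Info info;
        MPI_Info_create(&info);
        MPI_Info_set(info, "host", host);    /* request placement on this node */

        MPI_Comm intercomm;
        int errcodes[4];
        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, info,
                       0, MPI_COMM_SELF, &intercomm, errcodes);

        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }

If MPI_Get_processor_name returns a different form of the name than the machinefile uses, passing the machinefile entry verbatim instead should be the safer bet.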
On Aug 16, 2011, at 5:09 AM, Simone Pellegrini wrote:
> On 08/16/2011 12:30 PM, Ralph Castain wrote:
>> What version are you using?
> OpenMPI 1.4.3
>> On Aug 16, 2011, at 3:19 AM, Simone Pellegrini wrote:
>>> Dear all,
>>> I am developing a system to manage MPI tasks on top of MPI. The architecture is rather simple: I have a set of scheduler processes, each of which manages the resources of a node. The idea is to have one (or more) of these schedulers allocated on each node of a cluster and then create new MPI processes (on demand) as computation is needed. Allocation of processes is done using MPI_Comm_spawn.
>>> The system now works fine on a single node by allocating the main scheduler using the following mpi command:
>>> mpirun --np 1 ./scheduler ...
>>> Now when I scale to multiple nodes, problems with the default MPI behaviour start. For example, let's assume I have 2 nodes with 8 CPU cores each. I therefore set up a machine file in the following way:
>>> s01 slots=1
>>> s02 slots=1
>>> and start the node schedulers in the following way:
>>> mpirun --np 2 --machinefile machinefile ./scheduler ...
>>> This allocates the processes correctly; the problem starts when I invoke MPI_Comm_spawn. Basically, the spawn call also uses the information from the machinefile, so if 4 MPI processes are spawned, 2 are allocated on s01 and 2 on s02. What I want is to always allocate the spawned processes on the same node.
>>> I tried to do this by specifying an MPI_Info object, which is then passed to the MPI_Comm_spawn routine, setting its "host" property to the hostname of the machine where the scheduler is running, but this didn't help.
>>> Unfortunately there is very little documentation on this.
>>> Thanks for the help,