
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Fwd: [OMPI svn-full] svn:open-mpi r21686
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-07-15 16:30:30

I think I found a better solution (in r21688). Here is what I was
trying to do.

I have a more or less homogeneous cluster. All processors are
identical, except that some are quad-core and some are dual-core. Of
course I care how my processes are mapped on the quad-core nodes, but
not really on the dual-core ones.

My approach was to use the following configuration files.

In /home/bosilca/.openmpi/mca-params.conf I have:

rmaps_rank_file_path = /home/bosilca/.openmpi/rankfile
rmaps_rank_file_priority = 100

In /home/bosilca/.openmpi/machinefile I have the full description of
the cluster. As an example:
node01 slots=4
node02 slots=4
node03 slots=2
node04 slots=2

And in the /home/bosilca/.openmpi/rankfile file I have:
rank 0=+n0 slot=0
rank 1=+n0 slot=1
rank 2=+n1 slot=0
rank 3=+n1 slot=1
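
To make the intent of these two files concrete, here is a small
stand-alone sketch (not Open MPI code) that resolves the rankfile
entries against the machinefile. It assumes that +nK means "the K-th
node in machinefile order" and handles only the "rank N=+nK slot=S"
form shown above; the function names and the job size of 6 are made up
for illustration.

```python
import re

MACHINEFILE = """\
node01 slots=4
node02 slots=4
node03 slots=2
node04 slots=2
"""

RANKFILE = """\
rank 0=+n0 slot=0
rank 1=+n0 slot=1
rank 2=+n1 slot=0
rank 3=+n1 slot=1
"""


def parse_machinefile(text):
    """Return [(hostname, slots), ...] in file order."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        host, slots = line.split()
        nodes.append((host, int(slots.split("=")[1])))
    return nodes


# Simplification: only the "rank N=+nK slot=S" form is recognized.
RANK_LINE = re.compile(r"rank\s+(\d+)=\+n(\d+)\s+slot=(\d+)$")


def parse_rankfile(text, nodes):
    """Return {rank: (hostname, slot)} for the ranks the file pins."""
    pinned = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = RANK_LINE.match(line)
        if m is None:
            raise ValueError(f"unsupported rankfile line: {line!r}")
        rank, node_index, slot = (int(g) for g in m.groups())
        # Assumption: +nK indexes the machinefile order, starting at 0.
        pinned[rank] = (nodes[node_index][0], slot)
    return pinned


nodes = parse_machinefile(MACHINEFILE)
pinned = parse_rankfile(RANKFILE, nodes)

# Hypothetical job size: any rank not listed in the rankfile is one
# whose placement I do not care about.
requested = 6
unpinned = [rank for rank in range(requested) if rank not in pinned]

for rank in sorted(pinned):
    print(rank, pinned[rank])
print("left to the default mapper:", unpinned)
```

Under these assumptions, ranks 0 to 3 land on the two quad-core nodes
and every higher rank is left unpinned, which is the "partial mapping"
I am after.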

As long as I spawned jobs with fewer than 4 processes everything worked
fine. But when I used more than 4 processes, orterun segfaulted. After
debugging I found that the nodes, lrank and nrank arrays were
allocated based on jdata->num_procs, but then filled based on the
total number of processes in the jdata->nodes array. Since
jdata->num_procs appears to be modified based on the number of
entries in the rankfile, we end up writing outside the allocation and
then segfault. With the latest patch, we can cope with such a
scenario by packing only the known information (and thus not writing
outside the allocated arrays).
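
The mismatch can be shown with a toy model. This is not the nidmap
code and not the actual patch: the function names, the (rank, node)
tuples and the sizes are invented, and Python raises IndexError where
the C code would silently write past the end of the array.

```python
def pack_unchecked(num_procs, procs_on_nodes):
    """Size the array from num_procs, fill it from the node list."""
    node_of = [None] * num_procs
    for rank, node in procs_on_nodes:
        # Fails as soon as rank >= num_procs (an overrun in C).
        node_of[rank] = node
    return node_of


def pack_bounded(num_procs, procs_on_nodes):
    """Same loop, but only pack entries that fit the allocation."""
    node_of = [None] * num_procs
    for rank, node in procs_on_nodes:
        if rank < num_procs:
            node_of[rank] = node
    return node_of


# Hypothetical job: 6 processes actually placed on the nodes, while
# num_procs was reset to 4, the number of entries in the rankfile.
procs_on_nodes = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
num_procs = 4

try:
    pack_unchecked(num_procs, procs_on_nodes)
    overran = False
except IndexError:
    overran = True

print("unchecked fill overran the array:", overran)
print("bounded fill:", pack_bounded(num_procs, procs_on_nodes))
```

Note that the bounded version drops the entries it cannot store, so
the resulting map is incomplete whenever num_procs is too small.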

This might not be the best approach, but it is doing what I'm looking
for ...


On Jul 15, 2009, at 15:50 , Ralph Castain wrote:

> The routed comm system relies on each daemon having complete
> information as to where every process is located, so the expectation
> was that only full maps would ever be sent. Thus, the nidmap code is
> set up to always send a full map.
> I don't know how to even generate a "partial" map. I assume you are
> doing something offline? Is this to update changed info? If so,
> you'll also have to do something to update the daemon's maps or the
> comm system will break down.
> Ralph
> On Wed, Jul 15, 2009 at 1:40 PM, George Bosilca
> <bosilca_at_[hidden]> wrote:
> I have a question regarding the mapping. How can I declare a partial
> mapping? In fact I only care about how some of the processes are
> mapped on some specific nodes. Right now if the rmaps file doesn't
> contain information about all nodes, we give up (before this patch
> we segfaulted).
> Does it mean we always have to declare the whole mapping, or is it
> just that we overlooked this strange case?
> george.
> Begin forwarded message:
> Author: bosilca
> Date: 2009-07-15 15:36:53 EDT (Wed, 15 Jul 2009)
> New Revision: 21686
> URL:
> Log:
> Reorder the nidmap encoding function. Add a check to make sure we
> don't write outside the boundaries of the allocated array.
> However, the problem is still there. If we have rmaps file
> containing only partial information the num_procs get set to the
> wrong value (the number of hosts in the rmaps file instead of the
> number of processes requested on the command line).
> _______________________________________________
> devel mailing list
> devel_at_[hidden]