The routed comm system relies on each daemon having complete information as to where every process is located, so the expectation was that only full maps would ever be sent. Thus, the nidmap code is set up to always send a full map.
I don't know how to even generate a "partial" map. I assume you are doing something offline? Is this to update changed info? If so, you'll also have to do something to update the daemon's maps or the comm system will break down.
I have a question regarding the mapping. How can I declare a partial mapping? In fact, I only care about how some of the processes are mapped on some specific nodes. Right now, if the rmaps file doesn't contain information about all nodes, we give up (before this patch we segfaulted).
Does it mean we always have to declare the whole mapping, or is it just that we overlooked this strange case?
Begin forwarded message:
Date: 2009-07-15 15:36:53 EDT (Wed, 15 Jul 2009)
New Revision: 21686
Reorder the nidmap encoding function. Add a check to make sure we don't write
outside the boundaries of the allocated array.
However, the problem is still there. If we have an rmaps file containing only
partial information, num_procs gets set to the wrong value (the number of
hosts in the rmaps file instead of the number of processes requested on the
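To make the two failure modes in this thread concrete, here is a minimal sketch in C. It is not the actual nidmap code: `map_entry_t`, `encode_map`, `NODE_UNSET`, and the return codes are all hypothetical names chosen for illustration. It assumes the map is a flat array with one node index per process rank, rejects any entry whose rank would land outside the allocated array, and reports a partial map instead of silently sizing things from the number of entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NODE_UNSET (-1)  /* hypothetical marker: rank has no node assigned */

/* Hypothetical entry read from a mapping file: process rank -> node index. */
typedef struct {
    int32_t rank;
    int32_t node;
} map_entry_t;

/*
 * Fill nodes[0 .. num_procs-1] from the given entries.
 * num_procs is the number of processes requested, NOT the number of entries.
 *
 * Returns  0 if every rank received a node (full map),
 *         -1 if an entry's rank falls outside the allocated array,
 *         -2 if the map is partial (at least one rank has no node).
 */
int encode_map(const map_entry_t *entries, size_t num_entries,
               int32_t *nodes, size_t num_procs)
{
    assert(entries != NULL && nodes != NULL);

    /* Start with every rank unassigned so gaps can be detected later. */
    for (size_t i = 0; i < num_procs; i++) {
        nodes[i] = NODE_UNSET;
    }

    for (size_t i = 0; i < num_entries; i++) {
        int32_t rank = entries[i].rank;
        /* Bounds check: never write outside the allocated array. */
        if (rank < 0 || (size_t)rank >= num_procs) {
            return -1;
        }
        nodes[rank] = entries[i].node;
    }

    /* A full map is required: any unassigned rank makes the map partial. */
    for (size_t i = 0; i < num_procs; i++) {
        if (nodes[i] == NODE_UNSET) {
            return -2;
        }
    }
    return 0;
}
```

With three processes requested, a file listing ranks 0, 1, and 2 encodes cleanly, a file listing only ranks 0 and 2 is reported as partial, and an entry for rank 5 is rejected before any out-of-bounds write. Sizing the array from the requested process count rather than from the number of entries is the point of the sketch.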