On May 5, 2010, at 6:10 AM, Jeff Squyres wrote:
> On May 4, 2010, at 9:53 AM, Ashley Pittman wrote:
>>> Point noted. But actually -- can you give specific reasons as to
>>> why a user should care? Keep in mind that this would be a short-
>>> lived fork'ed process -- not "spawn" in the MPI sense of the word.
>> You might be running the job under Valgrind or another debugger,
>> BLCR has some issues with fork as I remember, and traditionally
>> there have been IB mapping issues here as well. I'm sure you could
>> make a case against any of those points if you wanted to but I
>> think the argument stands: doing this kind of run-time check
>> shouldn't be needed.
> Mmm; good points (especially Valgrind). BLCR and OpenFabrics verbs
> shouldn't be much of an issue here, but I can see that there might
> be unexpectedness if you're running under Valgrind or some other
>> It might be possible, however, to construct the code so that if it
>> failed to initialise it simply wasn't used, rather than aborting
>> the job. That would have much the same effect as a run-time test,
>> but without having to fork new processes and create short-lived
>> shared memory regions.
> That's how most of the network transports are in OMPI today -- if
> they fail to init, they are just skipped.
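For anyone following along, here is a rough sketch of that
"skip on failed init" pattern. This is not the real MCA interface:
component_t, select_components() and the two stand-in init functions
are hypothetical names made up purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical component descriptor; NOT the real MCA interface. */
typedef struct {
    const char *name;
    int (*init)(void);   /* returns 0 on success, nonzero on failure */
} component_t;

/* Try every candidate and record the ones that initialise in `usable`.
 * A failed init is reported and skipped; it never aborts the job.
 * Returns the number of usable components. */
static size_t select_components(const component_t *candidates, size_t n,
                                const component_t **usable)
{
    size_t kept = 0;

    assert(candidates != NULL && usable != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (candidates[i].init() == 0) {
            usable[kept++] = &candidates[i];
        } else {
            fprintf(stderr, "%s: init failed, skipping\n",
                    candidates[i].name);
        }
    }
    return kept;
}

/* Stand-in init functions, for illustration only. */
static int sysv_init_fails(void) { return -1; }
static int mmap_init_ok(void)    { return 0; }
```

With a candidate list of { sysv_init_fails, mmap_init_ok }, only the
second entry ends up in `usable`, and the caller carries on with
whatever is left.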
> The problem here is that you really need 2 processes to do this
> test. I suppose it could be done with local ranks 0 and 1 instead
> of forking a new process -- they would just need to communicate via
> RML to sync up, I suppose.
I need to think about it a little more, but I like this solution.
Samuel K. Gutierrez
Los Alamos National Laboratory
>> I should of course have said fork where I mentioned spawn above to
>> avoid any confusion; spawn has a specific meaning in the context of
>> MPI.
>> I still think a better understanding of the issue is required
>> before any decision here is made though, I'm surprised by Samuel's
>> description of the problem because it's not how I remember it and
>> from what Chris says it doesn't reflect what is in the Linux Git code
>> either. I'd like to see why there is an apparent difference in
>> behaviour before a decision is made to only support one.
> There's no intent to only support sysv or mmap. Samuel's work was
> to extend OMPI to support sysv in the case where it would be
> advantageous (e.g., guaranteed cleanup of the shmem segment). The
> mmap stuff is definitely not going to be removed.
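To make the "guaranteed cleanup" point concrete, here is a minimal
single-process sketch using the standard System V calls (shmget,
shmat, shmctl, shmdt). It assumes the usual behaviour that a segment
marked with IPC_RMID is destroyed once the last attached process
detaches. The function name and segment size are made up for the
example, and this is not the OMPI code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SEG_SIZE 4096

/* Returns 1 if the segment was gone after the last detach,
 * 0 if its id still resolved, -1 if SysV shm could not be set up. */
static int sysv_cleanup_demo(void)
{
    static const char msg[] = "still usable";
    struct shmid_ds ds;
    char *seg;
    int id;

    assert(sizeof msg <= SEG_SIZE);

    id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    seg = shmat(id, NULL, 0);
    if (seg == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);   /* do not leak the segment */
        return -1;
    }

    /* Mark the segment for removal while we are still attached. */
    if (shmctl(id, IPC_RMID, NULL) == -1) {
        shmdt(seg);
        return -1;
    }

    /* The existing mapping stays valid until we detach. */
    memcpy(seg, msg, sizeof msg);

    if (shmdt(seg) == -1)
        return -1;

    /* After the last detach the id should no longer name a segment. */
    if (shmctl(id, IPC_STAT, &ds) == -1 &&
        (errno == EINVAL || errno == EIDRM))
        return 1;
    return 0;
}
```

Note this only shows the cleanup side. Whether a second process can
still attach after IPC_RMID is platform-dependent, which is the part
that needs two processes to check.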
> Jeff Squyres