Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
From: Samuel K. Gutierrez (samuel_at_[hidden])
Date: 2010-05-05 09:53:24


On May 5, 2010, at 6:10 AM, Jeff Squyres wrote:

> On May 4, 2010, at 9:53 AM, Ashley Pittman wrote:
>
>>> Point noted. But actually -- can you give specific reasons as to
>>> why a user should care? Keep in mind that this would be a short-
>>> lived fork'ed process -- not "spawn" in the MPI sense of the word.
>>
>> You might be running the job under Valgrind or another debugger.
>> BLCR has some issues with fork, as I remember, and traditionally
>> there have been IB mapping issues here as well. I'm sure you could
>> make a case against any of those points if you wanted to, but I
>> think the argument stands: this kind of run-time check
>> shouldn't be needed.
>
> Mmm; good points (especially Valgrind). BLCR and OpenFabrics verbs
> shouldn't be much of an issue here, but I can see that there might
> be unexpectedness if you're running under Valgrind or some other
> debugger.
>
>> It might be possible, however, to construct the code so that if it
>> failed to initialise, it simply wasn't used rather than aborting the
>> job. That would have much the same effect as a run-time test, but
>> without having to fork new processes and create short-lived shared
>> memory regions.
>
> That's how most of the network transports are in OMPI today -- if
> they fail to init, they are just skipped.
>
> The problem here is that you really need 2 processes to do this
> test. I suppose it could be done with local ranks 0 and 1 instead
> of forking a new process -- they would just need to communicate via
> RML to sync up, I suppose.

I need to think about it a little more, but I like this solution.

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory
>
>> I should of course have said fork where I mentioned spawn above, to
>> avoid any confusion; spawn has a specific meaning in the context of MPI.
>>
>> I still think a better understanding of the issue is required
>> before any decision is made here, though. I'm surprised by Samuel's
>> description of the problem, because it's not how I remember it, and
>> from what Chris says it doesn't reflect what is in the Linux Git code
>> either. I'd like to see why there is an apparent difference in
>> behaviour before a decision is made to only support one.
>
> There's no intent to only support sysv or mmap.  Samuel's work was  
> to extend OMPI to support sysv in the case where it would be  
> advantageous (e.g., guaranteed cleanup of the shmem segment).  The  
> mmap stuff is definitely not going to be removed.
>
> -- 
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
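
The fork-based run-time check discussed above, combined with the "skip it if it fails to init" behaviour, can be sketched roughly as follows. This is only an illustration and not Open MPI code. It assumes the property being probed is whether a second process can still attach to a System V segment after it has been marked for removal (the thread does not spell out the exact test). The names sysv_attach_after_rmid_works, choose_shmem_backend, and the shmem_backend enum are hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not an Open MPI function.
 * Returns  1 if a second process can attach to a System V segment that
 *            has already been marked for removal,
 *          0 if it cannot,
 *         -1 if the probe itself could not be run. */
static int sysv_attach_after_rmid_works(void)
{
    /* Short-lived private segment used only for the probe. */
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (id < 0) {
        return -1;
    }

    void *addr = shmat(id, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }

    /* Mark for removal while still attached: the segment is destroyed
     * once the last attached process detaches or exits. */
    if (shmctl(id, IPC_RMID, NULL) < 0) {
        shmdt(addr);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        shmdt(addr);
        return -1;
    }
    if (pid == 0) {
        /* Child: drop the attachment inherited across fork, then try a
         * fresh attach by id, as an unrelated local process would. */
        shmdt(addr);
        void *p = shmat(id, NULL, 0);
        _exit(p == (void *)-1 ? 1 : 0);
    }

    int status = 0;
    pid_t w = waitpid(pid, &status, 0);
    shmdt(addr); /* last detach: the kernel can now free the segment */

    if (w < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}

/* Hypothetical backend selection: fall back to mmap instead of aborting. */
enum shmem_backend { SHMEM_MMAP, SHMEM_SYSV };

static enum shmem_backend choose_shmem_backend(void)
{
    return sysv_attach_after_rmid_works() == 1 ? SHMEM_SYSV : SHMEM_MMAP;
}
```

A caller would invoke choose_shmem_backend() once during initialisation. The variant suggested in the thread, using local ranks 0 and 1 synchronised over RML instead of a forked child, would replace the fork/waitpid pair but keep the same create, mark, attach sequence.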