
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-05-04 10:02:50


On May 4, 2010, at 7:56 AM, Terry Dontje wrote:

> Ralph Castain wrote:
>>
>>
>> On May 4, 2010, at 3:45 AM, Terry Dontje wrote:
>>
>>> Is a configure-time test good enough? For example, are all Linuxes the same in this regard? That is, if you built OMPI on RH and it configured in the new SysV SM, will those bits actually run correctly on other Linux systems? I think Jeff hinted at this when suggesting this may need to be a runtime test.
>>>
>>
>> I don't think we have ever enforced that requirement, nor am I sure the current code would meet it. We have a number of components that test for ability to build, but don't check again at run-time.
>>
>> Generally, the project has followed the philosophy of "build on the system you intend to run on".
>>
> There is at least one binary distribution that is built on one Linux distribution and can be installed on several others. That is the reason I bring up the above. The community can take the stance that this distribution does not matter for this case, or that it needs to handle it on its own. In the grand scheme of things it might not matter, but I wanted to at least stand up and be heard.

No problem - I would simply suggest that they not --enable-sysv or whatever Sam calls it. They don't -have- to support that mode, it's just an option.

Or Sam could include a --enable-runtime-sysv-check so they can offer it if they want, but recognize that it may significantly slow down process launch.

>
> --td
>>> --td
>>>
>>> Samuel K. Gutierrez wrote:
>>>>
>>>> Hi All,
>>>>
>>>> New configure-time test added - thanks for the suggestion, Jeff. Update and give it a whirl.
>>>>
>>>> Ethan - could you please try again? This time, I'm hoping sysv support will be disabled ;-).
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Samuel K. Gutierrez
>>>> Los Alamos National Laboratory
>>>>
>>>> On May 3, 2010, at 9:18 AM, Samuel K. Gutierrez wrote:
>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> Sounds like a plan :-).
>>>>>
>>>>> Thanks!
>>>>>
>>>>> --
>>>>> Samuel K. Gutierrez
>>>>> Los Alamos National Laboratory
>>>>>
>>>>> On May 3, 2010, at 9:12 AM, Jeff Squyres wrote:
>>>>>
>>>>>> It might well be that you need a configure test to determine whether this behavior occurs or not. Heck, it may even need to be a run-time test! Hrm.
>>>>>>
>>>>>> Write a small C program that does something like the following (this is off the top of my head):
>>>>>>
>>>>>> fork a child
>>>>>> child goes to sleep immediately
>>>>>> sysv alloc a segment
>>>>>> attach to it
>>>>>> ipc rm it
>>>>>> parent wakes up child
>>>>>> child tries to attach to segment
>>>>>>
>>>>>> If that succeeds, then all is good. If not, then don't use this stuff.
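
[The steps Jeff sketches might look something like the following in C. This is an illustrative version only, not Open MPI's actual configure test: the function name and error handling are my own, and a pipe stands in for the sleep/wake handshake. It returns 0 if a second process can still attach to a segment that has already been marked IPC_RMID.]

```c
/* Probe: can a child attach to a System V segment after the parent
 * has marked it IPC_RMID?  Returns 0 if yes, non-zero otherwise.
 * (Illustrative sketch, not Open MPI's actual configure test.) */
#include <stdlib.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

static int sysv_rmid_probe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {                        /* child: block until poked */
        int shmid;
        close(fds[1]);
        if (read(fds[0], &shmid, sizeof shmid) != sizeof shmid)
            _exit(2);
        /* try to attach to the segment that was already IPC_RMID'd */
        void *p = shmat(shmid, NULL, 0);
        _exit(p == (void *)-1 ? 1 : 0);
    }

    /* parent: alloc a segment, attach to it, then mark it for removal */
    close(fds[0]);
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    void *p = shmat(shmid, NULL, 0);
    if (p == (void *)-1)
        return -1;
    (void)shmctl(shmid, IPC_RMID, NULL);

    /* wake the child and tell it which segment to try */
    (void)write(fds[1], &shmid, sizeof shmid);

    int status = 0;
    (void)waitpid(child, &status, 0);
    shmdt(p);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

[A configure or runtime check would wrap this in a main() and key off the exit status. On Linux the attach succeeds, which is exactly the IPC_RMID behavior Sam is relying on; on systems where removal immediately invalidates the id, the probe fails and sysv support should be disabled.]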
>>>>>>
>>>>>>
>>>>>> On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Does anyone know of a relatively portable solution for querying a
>>>>>>> given system for the shmctl behavior that I am relying on, or is this
>>>>>>> going to be a nightmare? Because, if I am reading this thread
>>>>>>> correctly, the presence of shmget and Linux is not sufficient for
>>>>>>> determining an adequate level of sysv support.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> Samuel K. Gutierrez
>>>>>>> Los Alamos National Laboratory
>>>>>>>
>>>>>>> On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote:
>>>>>>>
>>>>>>>> On May 2 2010, Ashley Pittman wrote:
>>>>>>>>> On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote:
>>>>>>>>>
>>>>>>>>> As to performance there should be no difference in use between sys-
>>>>>>>>> V shared memory and file-backed shared memory, the instructions
>>>>>>>>> issued and the MMU flags for the page should both be the same so
>>>>>>>>> the performance should be identical.
>>>>>>>>
>>>>>>>> Not necessarily, and possibly not so even for far-future Linuces.
>>>>>>>> On at least one system I used, the poxious kernel wrote the complete
>>>>>>>> file to disk before returning - all right, it did that for System V
>>>>>>>> shared memory, too, just to a 'hidden' file! But, if I recall, on
>>>>>>>> another it did that only for file-backed shared memory - however, it's
>>>>>>>> a decade ago now and I may be misremembering.
>>>>>>>>
>>>>>>>> Of course, that's a serious issue mainly for large segments. I was
>>>>>>>> using multi-GB ones. I don't know how big the ones you need are.
>>>>>>>>
>>>>>>>>> The one area you do need to keep an eye on for performance is on
>>>>>>>>> numa machines where it's important which process on a node touches
>>>>>>>>> each page first, you can end up using different areas (pages, not
>>>>>>>>> regions) for communicating in different directions between the same
>>>>>>>>> pair of processes. I don't believe this is any different to mmap
>>>>>>>>> backed shared memory though.
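
[The first-touch issue Ashley describes can be illustrated with a short sketch; this is my own example, not Open MPI code. On a first-touch kernel, a page is placed on the NUMA node of whichever process faults it in first, so a common mitigation is to have each process write into the slice of the shared region it owns before anyone else touches it:]

```c
/* Illustrative first-touch sketch (not Open MPI code): after mapping
 * a shared region, each process zeros its own slice so those pages
 * are faulted in, and hence NUMA-placed, locally. */
#include <string.h>
#include <stddef.h>

static void first_touch_slice(char *region, size_t region_len,
                              int rank, int nranks)
{
    size_t slice = region_len / (size_t)nranks;
    char *mine = region + (size_t)rank * slice;
    /* the write faults the pages in from this process, so on a
     * first-touch kernel they land on this process's NUMA node */
    memset(mine, 0, slice);
}
```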
>>>>>>>>
>>>>>>>> On some systems it may be, but in bizarre, inconsistent, undocumented
>>>>>>>> and unpredictable ways :-( Also, there are usually several system (and
>>>>>>>> sometimes user) configuration options that change the behaviour, so you
>>>>>>>> have to allow for that. My experience of trying to use those is that
>>>>>>>> different uses have incompatible requirements, and most of the critical
>>>>>>>> configuration parameters apply to ALL uses!
>>>>>>>>
>>>>>>>> In my view, the configuration variability is the number one nightmare
>>>>>>>> for trying to write portable code that uses any form of shared memory.
>>>>>>>> ARMCI seem to agree.
>>>>>>>>
>>>>>>>>>> Because of this, sysv support may be limited to Linux systems -
>>>>>>>>>> that is, until we can get a better sense of which systems provide
>>>>>>>>>> the shmctl IPC_RMID behavior that I am relying on.
>>>>>>>>
>>>>>>>> And, I suggest, whether they have an evil gotcha on one of the areas
>>>>>>>> that Ashley Pittman noted.
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nick Maclaren.
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> devel_at_[hidden]
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jeff Squyres
>>>>>> jsquyres_at_[hidden]
>>>>>> For corporate legal information go to:
>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Terry D. Dontje | Principal Software Engineer
>>> Developer Tools Engineering | +1.650.633.7054
>>> Oracle - Performance Technologies
>>> 95 Network Drive, Burlington, MA 01803
>>> Email terry.dontje_at_[hidden]
>>>
>>
>>
>
>