Subject: Re: [OMPI devel] SM init failures
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2009-03-27 19:59:43


Quoting from a different manpage for ftruncate:
       [T]he POSIX standard allows two behaviours for ftruncate
       when length exceeds the file length [...]: either returning an
       error, or extending the file.
So, if that is to be trusted, it is not legal by POSIX to *silently* not
extend the file.
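
A minimal sketch of the defensive check this suggests (hypothetical names,
not taken from the Open MPI code): verify the reported size after ftruncate
instead of trusting its return value alone.

    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Grow the backing file to 'length' and verify the reported size.
     * Note: this only checks st_size; the extension may still be a
     * sparse hole, so disk blocks can be allocated lazily at first
     * write. */
    static int grow_and_verify(int fd, off_t length)
    {
        struct stat sb;

        if (0 != ftruncate(fd, length)) {
            return -1;   /* explicit error, e.g. EINVAL or EFBIG */
        }
        if (0 != fstat(fd, &sb) || sb.st_size < length) {
            return -1;   /* the file was not actually extended */
        }
        return 0;
    }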

-Paul

George Bosilca wrote:
> Talking with Aurelien here @ UT, we think we came up with a possible
> way to get such an error. Before explaining it, let me set out the
> basics.
>
> There are two critical functions used in setting up the shared memory
> file: ftruncate and mmap. Here are two snippets from these functions'
> documentation (with the interesting part between underscores).
>
> - ftruncate: If it was _previously shorter than length, it is
> unspecified whether the file is changed or its size increased_. If the
> file is extended, the extended area appears as if it were zero-filled.
>
> - mmap: _The range of bytes starting at off and continuing for len
> bytes shall be legitimate for the possible (not necessarily current)
> offsets in the file_, shared memory object, or [TYM] typed memory
> object represented by fildes.
>
> As you can see, ftruncate can succeed without increasing the size of
> the file to what we specified. Moreover, there is no way to know
> whether the size was really increased, as ftruncate returns zero in
> all cases (except the really fatal ones). On the other hand, mmap
> assumes that len is a legitimate length (and I guess it has no way to
> check that).
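
To make the sequence concrete, the pattern under discussion is roughly the
following (an illustrative sketch only, not the actual btl_sm code; fd and
seg_size are hypothetical names):

    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Size and map a shared-memory backing file.  Both calls can report
     * success even when the file system is full, so the failure only
     * shows up at the first real access. */
    static void *map_sm_file(int fd, size_t seg_size)
    {
        if (0 != ftruncate(fd, (off_t) seg_size)) {
            return NULL;   /* only the "really fatal" cases end up here */
        }
        /* mmap cannot verify that seg_size bytes of backing store exist. */
        void *base = mmap(NULL, seg_size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        return (MAP_FAILED == base) ? NULL : base;
    }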
>
> In our specific case, if the file system is full then ftruncate might
> not do what we expect it to do, and mmap will be perfectly happy to
> map the file into memory. Later on, when we actually access that
> memory ... guess what ... we fail miserably with a segfault, as there
> is nothing backing that address.
>
> We only see one way around this. It will not prevent us from
> segfaulting, but at least we will segfault in a known place, and we
> can put a message in the FAQ about it. The solution is to touch the
> last byte of the mmapped region, which forces the operating system to
> really allocate the whole memory region. If that cannot succeed then
> we segfault, and if it can then we're good for the remainder of the
> execution.
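
A minimal sketch of that workaround (hypothetical names; base and seg_size
would come from a successful mmap as above):

    #include <stddef.h>

    /* Touch the last byte of an already-mapped region so the operating
     * system has to commit backing store for it now.  If the file
     * system is full, the fault happens here, at a known place,
     * instead of at some arbitrary later access. */
    static void touch_last_byte(void *base, size_t seg_size)
    {
        volatile char *probe = (volatile char *) base;
        probe[seg_size - 1] = 0;
    }

Note that on file systems where the extension is sparse, writing only the
final byte may allocate just the block containing it; touching one byte per
page would be more thorough, at the cost of faulting in the whole region up
front.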
>
> george.
>
> On Mar 27, 2009, at 13:30 , Tim Mattox wrote:
>
>> Eugene,
>> I think I remember setting up the MTT tests on Sif so that tests
>> are run both with and without the coll_hierarch component selected.
>> The coll_hierarch component stresses code paths and potential
>> race conditions in its own way. So, if the problems are showing up
>> more frequently for the test runs with the coll_hierarch component
>> enabled, then I would check the communicator creation code paths.
>>
>> Now that I'm at SiCortex, I don't have time to look into these IU MTT
>> failures (not that I had a bunch of time while at IU ;-) ), but you
>> can get to a lot of information with some work in the MTT reporter
>> web page.
>> Also, hopefully Josh will have a little time to look into it.
>>
>> Good luck! -- Tim
>>
>> On Fri, Mar 27, 2009 at 10:15 AM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
>>> Josh Hursey wrote:
>>>
>>>> Sif is also running the coll_hierarch component on some of those
>>>> tests, which has caused some additional problems. I don't know if
>>>> that is related or not.
>>>
>>> Indeed. Many of the MTT stack traces (for both 1.3.1 and 1.3.2) that
>>> have seg faults and call out mca_btl_sm.so do involve collectives
>>> and/or have mca_coll_hierarch.so in their stack traces. I could well
>>> imagine this is the culprit, though I do not know for sure.
>>>
>>> Ralph Castain wrote:
>>>
>>>> Hmmm... Eugene, you need to be a tad less sensitive. Nobody was
>>>> attempting to indict you or in any way attack you or your code.
>>>
>>> Yes, I know, though thank you for saying so. I was overdoing the
>>> defensive rhetoric trying to be funny, but I confess it's nervous
>>> humor. There was stuff in the sm code that I couldn't convince
>>> myself was 100% robust. Nevertheless, I let that style remain in the
>>> code with my changes... perhaps even pushing it a little bit. My
>>> putbacks include a comment or two to that effect. E.g.,
>>> https://svn.open-mpi.org/source/xref/ompi-trunk/ompi/mca/btl/sm/btl_sm.c?r=20774#523
>>>
>>> I understand why the occasional failures that Jeff and Terry saw did
>>> not hold up 1.3.1, but I'd really like to understand them and fix
>>> them. But at a 0.01% failure rate (<0.001% for me... I've never seen
>>> it in 100Ks of runs), all I can do about etiology and fixes is
>>> speculate.
>>>
>>> Okay, joke overdone and nervousness no longer funny. Indeed,
>>> annoying. I'll stop.
>>>
>>>> Since we clearly see problems on sif, and Josh has indicated a
>>>> willingness to help with debugging, this might be a place to start
>>>> the investigation. If asked nicely, they might even be willing to
>>>> grant access to the machine, if that would help.
>>>
>>> Maybe a starting point would be running IU_Sif without coll_hierarch
>>> and seeing where we stand.
>>>
>>> And, again, my gut feel is that the failures are unrelated to the 0.01%
>>> failures that Jeff and Terry were seeing.
>>
>> --
>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>> timattox_at_[hidden] || timattox_at_[hidden]
>> I'm a bright... http://www.the-brights.net/
>>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory