Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] SM init failures
From: Christian Siebert (christian.siebert_at_[hidden])
Date: 2009-03-30 04:48:45


Hi,

as you have all noticed already, ftruncate() does NOT extend the size
of a file on all systems. Instead, the portable way to grow a file to
a specific size is to lseek() to the last byte of the desired size and
then write() a single byte (see e.g. [1]).
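
For those who want the concrete recipe, a minimal sketch (the function
and variable names are just for illustration, and error handling is
trimmed):

    #include <sys/types.h>
    #include <unistd.h>

    /* Grow the file behind "fd" to "size" bytes by seeking to the last
     * byte and writing it.  A full file system shows up here as a
     * failed write() (e.g. ENOSPC), instead of being silently ignored
     * as a too-short ftruncate() can be. */
    static int grow_file(int fd, off_t size)
    {
        if (lseek(fd, size - 1, SEEK_SET) == (off_t) -1) {
            return -1;
        }
        if (write(fd, "", 1) != 1) {   /* writes a single '\0' byte */
            return -1;
        }
        return 0;
    }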

Best regards,

    Christian

[1] Richard Stevens: Advanced Programming in the UNIX Environment

Quoting George Bosilca <bosilca_at_[hidden]>:

> Talking with Aurelien here @ UT, we think we came up with a possible
> way to get such an error. Before explaining it, let me lay out the
> basics.
>
> There are two critical functions used in setting up the shared-memory
> file: one is ftruncate(), the other is mmap(). Here are two snippets
> from their documentation (with the interesting part between
> underscores).
>
> - ftruncate: If it was _previously shorter than length, it is
> unspecified whether the file is changed or its size increased_. If
> the file is extended, the extended area appears as if it were
> zero-filled.
>
> - mmap: _The range of bytes starting at off and continuing for len
> bytes shall be legitimate for the possible (not necessarily current)
> offsets in the file_, shared memory object, or [TYM] typed memory
> object represented by fildes.
>
> As you can see, ftruncate can succeed without increasing the size of
> the file to what we specified. Moreover, there is no way to know
> whether the size was really increased, as ftruncate will return zero
> in all cases (except the really fatal ones). On the other hand, mmap
> assumes that the len is a legitimate length (and I guess it has no
> way to check that).
>
> In our specific case, if the file system is full then ftruncate
> might not do what we expect it to do, and mmap will be just happy to
> map the file into memory anyway. Later on, when we really access
> that memory ... guess what ... we fail lamentably with a segfault,
> since there is no backing storage behind the address.
>
> We see only one way around this. It will not prevent us from
> segfaulting, but at least we can segfault in a known place, and we
> can put a message in the FAQ about it. The solution is to touch the
> last byte in the mmap-ed region, which forces the operating system
> to really allocate the whole memory region (see the sketch below).
> If that cannot succeed then we segfault, and if it can then we're
> good for the remainder of the execution.
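>
> The whole problematic sequence, plus that probe of the last byte,
> would look roughly like this (an illustrative, untested sketch; the
> names are made up and error handling is trimmed):
>
>     #include <sys/types.h>
>     #include <sys/mman.h>
>     #include <unistd.h>
>
>     /* Map "file_size" bytes of "fd" and probe the last byte so that
>      * a missing backing store faults here, in a known place. */
>     static char *map_sm_file(int fd, size_t file_size)
>     {
>         /* ftruncate() may return 0 even if the file was not actually
>          * extended (e.g. when the file system is full). */
>         (void) ftruncate(fd, (off_t) file_size);
>
>         /* mmap() only requires the length to be a *possible* offset
>          * in the file, so it happily succeeds as well. */
>         char *base = mmap(NULL, file_size, PROT_READ | PROT_WRITE,
>                           MAP_SHARED, fd, 0);
>         if (MAP_FAILED == base) {
>             return NULL;
>         }
>
>         /* Touch the last byte: if there is no backing storage behind
>          * the mapping, we fault right here rather than at some
>          * random point later in the run. */
>         base[file_size - 1] = 0;
>
>         return base;
>     }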
>
> george.
>
> On Mar 27, 2009, at 13:30, Tim Mattox wrote:
>
>> Eugene,
>> I think I remember setting up the MTT tests on Sif so that tests
>> are run both with and without the coll_hierarch component selected.
>> The coll_hierarch component stresses code paths and potential
>> race conditions in its own way. So, if the problems are showing up
>> more frequently for the test runs with the coll_hierarch component
>> enabled, then I would check the communicator creation code paths.
>>
>> Now that I'm at SiCortex, I don't have time to look into these IU MTT
>> failures (not that I had a bunch of time while at IU ;-), but you can
>> get to a lot of information with some work in the MTT reporter web page.
>> Also, hopefully Josh will have a little time to look into it.
>>
>> Good luck! -- Tim
>>
>> On Fri, Mar 27, 2009 at 10:15 AM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
>>> Josh Hursey wrote:
>>>
>>>> Sif is also running the coll_hierarch component on some of those
>>>> tests, which has caused some additional problems. I don't know if
>>>> that is related or not.
>>>
>>> Indeed. Many of the MTT stack traces (for both 1.3.1 and 1.3.2) that
>>> have segfaults and call out mca_btl_sm.so do involve collectives and/or
>>> have mca_coll_hierarch.so in their stack traces. I could well imagine
>>> this is the culprit, though I do not know for sure.
>>>
>>> Ralph Castain wrote:
>>>
>>>> Hmmm...Eugene, you need to be a tad less sensitive. Nobody was attempting
>>>> to indict you or in any way attack you or your code.
>>>
>>> Yes, I know, though thank you for saying so. I was overdoing the defensive
>>> rhetoric trying to be funny, but I confess it's nervous humor. There was
>>> stuff in the sm code that I couldn't convince myself was 100% robust.
>>> Nevertheless, I let that style remain in the code with my changes...
>>> perhaps even pushing it a little bit. My putbacks include a comment or two
>>> to that effect; e.g.:
>>> https://svn.open-mpi.org/source/xref/ompi-trunk/ompi/mca/btl/sm/btl_sm.c?r=20774#523
>>> I understand why the occasional failures that Jeff and Terry saw did not
>>> hold up 1.3.1, but I'd really like to understand them and fix them. But at
>>> a 0.01% failure rate (<0.001% for me... I've never seen it in 100Ks of
>>> runs), all I can do about etiology and fixes is speculate.
>>>
>>> Okay, the joke is overdone and the nervousness is no longer funny.
>>> Indeed, annoying. I'll stop.
>>>
>>>> Since we clearly see problems on sif, and Josh has indicated a
>>>> willingness to help with debugging, this might be a place to start the
>>>> investigation. If asked nicely, they might even be willing to
>>>> grant access
>>>> to the machine, if that would help.
>>>
>>> Maybe a starting point would be running IU_Sif without coll_hierarch and
>>> seeing where we stand.
>>>
>>> And, again, my gut feel is that the failures are unrelated to the 0.01%
>>> failures that Jeff and Terry were seeing.
>>>
>>
>>
>>
>> --
>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>> timattox_at_[hidden] || timattox_at_[hidden]
>> I'm a bright... http://www.the-brights.net/
>>
>