
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Possible buffer overrun bug in opal_free_list_grow, called by MPI::Init
From: Stephan Kramer (stephan.kramer_at_[hidden])
Date: 2008-10-21 13:32:07


George Bosilca wrote:
> Stephan,
>
> The fix was committed to the trunk (revision 19778). Fixes for the
> 1.2 and 1.3 series are pending.
>
> Thanks for your help,
> george.
Ah, brilliant. Yes, modulo operations confuse me in Fortran as well (there are
two functions in the standard, MOD and MODULO, which differ precisely in the
sign of the result!). Glad that's sorted out now.

Cheers
Stephan

>
> On Oct 20, 2008, at 5:43 PM, George Bosilca wrote:
>
>> Stephan,
>>
>> I think you're completely right, and that I had a wrong understanding
>> of the modulus operator. From memory, I was pretty sure that the
>> modulus is ALWAYS positive. Now even Wikipedia seems to contradict
>> me :) They have a pretty good summary of how the result of % depends
>> on the programming language (http://en.wikipedia.org/wiki/Modulo_operation).
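>>
>> For example, C99 defines integer division as truncating toward zero,
>> so the sign of % follows the dividend. A quick standalone check:
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>     /* C99 6.5.5: (a/b)*b + a%b == a, with a/b truncated toward zero,
>>        so a negative dividend gives a negative remainder. */
>>     printf("-7 %% 3  = %d\n", -7 % 3);   /* prints -1 */
>>     printf(" 7 %% -3 = %d\n", 7 % -3);   /* prints  1 */
>>     return 0;
>> }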
>>
>> I will apply your patch to all places where we use modulus in Open
>> MPI. Thanks for your help on this issue.
>>
>> Thanks,
>> george.
>>
>> On Oct 19, 2008, at 1:43 PM, Stephan Kramer wrote:
>>
>>> George Bosilca wrote:
>>>> Stephan,
>>>>
>>>> You might be right. intptr_t is a signed type, which allows the
>>>> result of % to be potentially negative. On the other hand,
>>>> mod is declared as a size_t, which [based on my memory] is
>>>> definitely unsigned, as it represents a size.
>>>>
>>>> Did you try applying your patch to Open MPI? If so, does it
>>>> resolve the issue?
>>>>
>>>> george.
>>> Yes, I have applied the patch intptr_t -> uintptr_t and it does
>>> resolve the issue.
>>>
>>> I think the way this works (I'm not a C programmer myself) is:
>>> - the outcome of the % is a signed, negative number, say -x
>>> - this number gets wrapped in the assignment to the unsigned
>>> integer mod: UINT_MAX+1-x
>>> - in the subtraction CACHE_LINE_SIZE-mod, the result is wrapped
>>> around again, giving CACHE_LINE_SIZE+x
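>>>
>>> A standalone sketch of the same wrap-around (taking CACHE_LINE_SIZE
>>> as 128 and x as 40, matching the debug output quoted further down):
>>>
>>> #include <stddef.h>
>>> #include <stdio.h>
>>>
>>> int main(void)
>>> {
>>>     size_t mod = -40;            /* wraps to SIZE_MAX - 39 */
>>>     size_t adjust = 128 - mod;   /* wraps again: 128 + 40 */
>>>     printf("adjust = %zu\n", adjust);   /* prints 168, not 88 */
>>>     return 0;
>>> }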
>>>
>>> Cheers
>>> Stephan
>>>>
>>>> On Oct 16, 2008, at 7:29 PM, Stephan Kramer wrote:
>>>>
>>>>> George Bosilca wrote:
>>>>>> I did investigate this issue for about 3 hours yesterday. Neither
>>>>>> valgrind nor efence reports any errors on my cluster. I'm using
>>>>>> Debian unstable with gcc 4.1.2. Adding printfs doesn't show the
>>>>>> same output as yours; all addresses are in the correct range. I
>>>>>> went over the code manually, and to be honest I cannot imagine
>>>>>> how this could happen IF the compiler is doing what it is
>>>>>> supposed to do.
>>>>>>
>>>>>> I have run out of options on this one. If you can debug it and
>>>>>> figure out what the problem is, I'll be happy to hear.
>>>>>>
>>>>>> george.
>>>>> Hi George,
>>>>>
>>>>> Thanks a lot for looking into this. I think I've come a bit
>>>>> further with it. The reproducibility may in fact have to do with
>>>>> 32-bit/64-bit differences.
>>>>> I think the culprit is line 105 of opal_free_list.c:
>>>>>
>>>>> mod = (intptr_t)ptr % CACHE_LINE_SIZE;
>>>>> if(mod != 0) {
>>>>>     ptr += (CACHE_LINE_SIZE - mod);
>>>>> }
>>>>>
>>>>> As intptr_t is a signed type, for 32-bit addresses above
>>>>> 0x7fffffff the outcome of mod will be negative. Thus ptr will be
>>>>> advanced by more than CACHE_LINE_SIZE, which is not accounted for
>>>>> in the allocated buffer size in line 93, and a buffer overrun will
>>>>> occur in the subsequent element loop. This is confirmed by the
>>>>> output of some debugging statements I've pasted below. Also, I
>>>>> haven't come across the same bug on 64-bit machines.
>>>>>
>>>>> I guess this should be uintptr_t instead?
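>>>>>
>>>>> That is, something like the following sketch; the unsigned cast
>>>>> keeps mod in [0, CACHE_LINE_SIZE), so ptr never advances past the
>>>>> slack reserved in the allocation:
>>>>>
>>>>> mod = (uintptr_t)ptr % CACHE_LINE_SIZE;
>>>>> if(mod != 0) {
>>>>>     ptr += (CACHE_LINE_SIZE - mod);
>>>>> }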
>>>>>
>>>>> Cheers
>>>>> Stephan Kramer
>>>>>
>>>>> The debugging output:
>>>>>
>>>>> mpidebug: num_elements = 1, flist->fl_elem_size = 40
>>>>> mpidebug: sizeof(opal_list_item_t) = 16
>>>>> mpidebug: allocating 184
>>>>> mpidebug: allocated at memory address 0xb2d29f48
>>>>> mpidebug: mod = -40, CACHE_LINE_SIZE = 128
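>>>>>
>>>>> (These numbers are consistent, assuming ptr starts
>>>>> sizeof(opal_list_item_t) = 16 bytes past alloc_ptr:
>>>>> ptr = 0xb2d29f48 + 16 = 0xb2d29f58, and 0xb2d29f58 mod 128 = 88,
>>>>> so truncation-toward-zero division gives 88 - 128 = -40. The
>>>>> allocation size also checks out: 1 * 40 + 16 + 128 = 184.)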
>>>>>
>>>>> and at the point of the buffer overrun / efence segfault in gdb:
>>>>>
>>>>> (gdb) print item
>>>>> $23 = (opal_free_list_item_t *) 0xb2d2a000
>>>>>
>>>>> which is exactly one past the end of the buffer:
>>>>> 0xb2d2a000 = 0xb2d29f48 + 184
>>>>>
>>>>>>
>>>>>> On Oct 14, 2008, at 11:03 AM, Stephan Kramer wrote:
>>>>>>
>>>>>>> Would someone mind having another look at the bug reported
>>>>>>> below? I'm hitting exactly the same problem with Debian
>>>>>>> Unstable, openmpi 1.2.7~rc2. Both valgrind and efence are
>>>>>>> indispensable tools for any developer, and each may catch
>>>>>>> errors the other won't report. Electric Fence is especially good
>>>>>>> at catching buffer overruns, as it protects the beginning and end
>>>>>>> of each allocated buffer. The original bug report shows an
>>>>>>> undeniable buffer overrun in MPI::Init: the attached patch
>>>>>>> prints out exactly the address it's trying to access, which is
>>>>>>> past the end of the buffer. Any help would be much appreciated.
>>>>>>>
>>>>>>> Stephan Kramer
>>>>>>>
>>>>>>>
>>>>>>>> Patrick,
>>>>>>>>
>>>>>>>> I'm unable to reproduce the buffer overrun with the latest trunk.
>>>>>>>> I run valgrind (with the memchecker tool) on a regular basis on
>>>>>>>> the trunk, and I have never noticed anything like that. Moreover,
>>>>>>>> I went over the code, and I cannot imagine how we could overrun
>>>>>>>> the buffer in the code you pinpointed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> george.
>>>>>>>>
>>>>>>>> On Aug 23, 2008, at 7:57 PM, Patrick Farrell wrote:
>>>>>>>>
>>>>>>>> > Hi,
>>>>>>>> >
>>>>>>>> > I think I have found a buffer overrun in a function
>>>>>>>> > called by MPI::Init, though explanations of why I am
>>>>>>>> > wrong are welcome.
>>>>>>>> >
>>>>>>>> > I am using the openmpi package included in Ubuntu Hardy,
>>>>>>>> > version 1.2.5, though I have inspected the latest trunk by eye
>>>>>>>> > and I don't believe the relevant code has changed.
>>>>>>>> >
>>>>>>>> > I was trying to use Electric Fence, a memory-debugging library,
>>>>>>>> > to debug a suspected buffer overrun in my own program.
>>>>>>>> > Electric Fence works by replacing malloc/free in such a way
>>>>>>>> > that out-of-bounds accesses trigger an immediate segfault.
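>>>>>>>> >
>>>>>>>> > A minimal (hypothetical) example of the kind of overrun it
>>>>>>>> > catches:
>>>>>>>> >
>>>>>>>> > #include <stdlib.h>
>>>>>>>> >
>>>>>>>> > int main(void)
>>>>>>>> > {
>>>>>>>> >     char *buf = malloc(16);
>>>>>>>> >     buf[16] = 0;   /* one byte past the end: normally silent
>>>>>>>> >                       corruption, an immediate SIGSEGV under
>>>>>>>> >                       libefence */
>>>>>>>> >     free(buf);
>>>>>>>> >     return 0;
>>>>>>>> > }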
>>>>>>>> > While running my program under Electric Fence, I got a
>>>>>>>> > segfault at:
>>>>>>>> >
>>>>>>>> > 0xb5cdd334 in opal_free_list_grow (flist=0xb2b46a50, num_elements=1)
>>>>>>>> >     at class/opal_free_list.c:113
>>>>>>>> > 113         OBJ_CONSTRUCT_INTERNAL(item, flist->fl_elem_class);
>>>>>>>> > (gdb) bt
>>>>>>>> > #0  0xb5cdd334 in opal_free_list_grow (flist=0xb2b46a50,
>>>>>>>> >     num_elements=1) at class/opal_free_list.c:113
>>>>>>>> > #1  0xb5cdd479 in opal_free_list_init (flist=0xb2b46a50,
>>>>>>>> >     elem_size=56, elem_class=0xb2b46e20, num_elements_to_alloc=73,
>>>>>>>> >     max_elements_to_alloc=-1, num_elements_per_alloc=1)
>>>>>>>> >     at class/opal_free_list.c:78
>>>>>>>> > #2  0xb2b381aa in ompi_osc_pt2pt_component_init
>>>>>>>> >     (enable_progress_threads=false, enable_mpi_threads=false)
>>>>>>>> >     at osc_pt2pt_component.c:173
>>>>>>>> > #3  0xb792b67c in ompi_osc_base_find_available
>>>>>>>> >     (enable_progress_threads=false, enable_mpi_threads=false)
>>>>>>>> >     at base/osc_base_open.c:84
>>>>>>>> > #4  0xb78e6abe in ompi_mpi_init (argc=5, argv=0xbfd61f84,
>>>>>>>> >     requested=0, provided=0xbfd61e78) at runtime/ompi_mpi_init.c:411
>>>>>>>> > #5  0xb7911a87 in PMPI_Init (argc=0xbfd61f00, argv=0xbfd61f04)
>>>>>>>> >     at pinit.c:71
>>>>>>>> > #6  0x0811ca6c in MPI::Init ()
>>>>>>>> > #7  0x08118b8a in main ()
>>>>>>>> >
>>>>>>>> > To investigate further, I replaced the OBJ_CONSTRUCT_INTERNAL
>>>>>>>> > macro with its definition from opal/class/opal_object.h and
>>>>>>>> > ran it again. It appears that the invalid memory access happens
>>>>>>>> > on the instruction
>>>>>>>> >
>>>>>>>> > ((opal_object_t *) (item))->obj_class = (flist->fl_elem_class);
>>>>>>>> >
>>>>>>>> > Investigating further, I modified the source of opal_free_list
>>>>>>>> > with the attached patch. It adds a few debugging printfs to
>>>>>>>> > diagnose exactly what the code is doing. The output of the
>>>>>>>> > debugging statements is:
>>>>>>>> >
>>>>>>>> > mpidebug: allocating 216
>>>>>>>> > mpidebug: allocated at memory address 0xb62bdf28
>>>>>>>> > mpidebug: accessing address 0xb62be000
>>>>>>>> > [segfault]
>>>>>>>> >
>>>>>>>> > Now, 0xb62be000 - 0xb62bdf28 = 216, which is
>>>>>>>> > the size of the buffer allocated, and so I think
>>>>>>>> > this is a buffer overrun.
>>>>>>>> >
>>>>>>>> > Steps to reproduce:
>>>>>>>> >
>>>>>>>> > a) Install Electric Fence
>>>>>>>> > b) Compile the following program
>>>>>>>> >
>>>>>>>> > #include <stdlib.h>
>>>>>>>> > #include <unistd.h>
>>>>>>>> >
>>>>>>>> > #include <mpi.h>
>>>>>>>> >
>>>>>>>> > int main(int argc, char **argv)
>>>>>>>> > {
>>>>>>>> >     MPI::Init(argc, argv);
>>>>>>>> >     MPI::Finalize();
>>>>>>>> >
>>>>>>>> >     return 0;
>>>>>>>> > }
>>>>>>>> >
>>>>>>>> > with
>>>>>>>> >
>>>>>>>> > mpiCC -o test ./test.cpp
>>>>>>>> >
>>>>>>>> > c) gdb ./test
>>>>>>>> > d) set environment LD_PRELOAD /usr/lib/libefence.so.0.0
>>>>>>>> > e) run
>>>>>>>> >
>>>>>>>> > Hope this helps,
>>>>>>>> >
>>>>>>>> > Patrick Farrell
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Patrick Farrell
>>>>>>>> > PhD student
>>>>>>>> > Imperial College London
>>>>>>>> > --- openmpi-1.2.5/opal/class/opal_free_list.c  2008-08-23 18:35:03.000000000 +0100
>>>>>>>> > +++ openmpi-1.2.5-modified/opal/class/opal_free_list.c  2008-08-23 18:31:47.000000000 +0100
>>>>>>>> > @@ -90,9 +90,12 @@
>>>>>>>> >      if (flist->fl_max_to_alloc > 0 && flist->fl_num_allocated +
>>>>>>>> >          num_elements > flist->fl_max_to_alloc)
>>>>>>>> >          return OPAL_ERR_TEMP_OUT_OF_RESOURCE;
>>>>>>>> >
>>>>>>>> > +    fprintf(stderr, "mpidebug: allocating %d\n", (num_elements * flist->fl_elem_size) + sizeof(opal_list_item_t) + CACHE_LINE_SIZE);
>>>>>>>> >      alloc_ptr = (unsigned char *)malloc((num_elements * flist->fl_elem_size) +
>>>>>>>> >                                          sizeof(opal_list_item_t) +
>>>>>>>> >                                          CACHE_LINE_SIZE);
>>>>>>>> > +    fprintf(stderr, "mpidebug: allocated at memory address %p\n", alloc_ptr);
>>>>>>>> > +
>>>>>>>> >      if(NULL == alloc_ptr)
>>>>>>>> >          return OPAL_ERR_TEMP_OUT_OF_RESOURCE;
>>>>>>>> > @@ -110,7 +113,16 @@
>>>>>>>> >      for(i=0; i<num_elements; i++) {
>>>>>>>> >          opal_free_list_item_t* item = (opal_free_list_item_t*)ptr;
>>>>>>>> >          if (NULL != flist->fl_elem_class) {
>>>>>>>> > -            OBJ_CONSTRUCT_INTERNAL(item, flist->fl_elem_class);
>>>>>>>> > +            do {
>>>>>>>> > +                if (0 == (flist->fl_elem_class)->cls_initialized) {
>>>>>>>> > +                    opal_class_initialize((flist->fl_elem_class));
>>>>>>>> > +                }
>>>>>>>> > +                fprintf(stderr, "mpidebug: accessing address %p\n", &((opal_object_t *) (item))->obj_class);
>>>>>>>> > +                ((opal_object_t *) (item))->obj_class = (flist->fl_elem_class);
>>>>>>>> > +                fprintf(stderr, "mpidebug: accessing address %p\n", &((opal_object_t *) (item))->obj_reference_count);
>>>>>>>> > +                ((opal_object_t *) (item))->obj_reference_count = 1;
>>>>>>>> > +                opal_obj_run_constructors((opal_object_t *) (item));
>>>>>>>> > +            } while (0);
>>>>>>>> >          }
>>>>>>>> >          opal_list_append(&(flist->super), &(item->super));
>>>>>>>> >          ptr += flist->fl_elem_size;
>>>>>>>> > @@ -119,5 +131,3 @@
>>>>>>>> >      return OPAL_SUCCESS;
>>>>>>>> >  }
>>>>>>>> >
>>>>>>>> > -
>>>>>>>> > -