Subject: Re: [OMPI users] OpenMPI bug?
From: Gabriele Fatigati (g.fatigati_at_[hidden])
Date: 2008-06-13 05:13:44


I'm sorry.
The code block I posted previously was the 64-bit version, but the error
refers to the 32-bit one, not the 64-bit one. The correct code block is:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                        int32_t oldval, int32_t newval)
{
   unsigned char ret;
   __asm__ __volatile (
                       SMPLOCK "cmpxchgl %1,%2 \n\t"
                               "sete %0 \n\t"
                       : "=qm" (ret)
                       : "q"(newval), "m"(*(volatile long*)addr), "a"(oldval) //<<<<< HERE
                       : "memory");

   return (int)ret;
}
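
To illustrate why I suspect the cast, here is a minimal sketch. It is not Open MPI code and not a proposed patch. My assumption is that on amd64 Linux a long is 8 bytes, so the "m"(*(volatile long*)addr) operand describes an 8-byte access to an object that is only 4 bytes long, which would match the "Size: 4 bytes" and "Pointer value: 0x8, Size: 8" lines in the report. The helper name cmpset_32_sketch is hypothetical; it uses the GCC builtin __sync_bool_compare_and_swap, which works on the pointee's own type and so needs no cast:

```c
#include <stdint.h>

/* Hypothetical helper for illustration only (not part of Open MPI).
 * Atomically replaces *addr with newval if *addr == oldval.
 * Returns 1 if the swap happened, 0 otherwise.
 * The builtin takes the operand width from the pointer type (4 bytes here),
 * so no cast to long is involved. */
static inline int cmpset_32_sketch(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
    return (int)__sync_bool_compare_and_swap(addr, oldval, newval);
}
```

Called on a variable that holds 0, cmpset_32_sketch(&v, 0, 1) should return 1 and leave v equal to 1; a second call with oldval 0 should return 0 and leave v unchanged.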

2008/6/13 Gabriele Fatigati <g.fatigati_at_[hidden]>:

> I may have solved this bug by deleting the long cast.
> It now compiles cleanly, but at runtime there are other
> problems, such as this:
>
> ../../../opal/class/opal_object.h:428:Bounds error: pointer arithmetic would overrun the end of the object.
> ../../../opal/class/opal_object.h:428: Pointer value: 0x8, Size: 8
> ../../../opal/class/opal_object.h:428: Object `orte_system_info':
> ../../../opal/class/opal_object.h:428: Address in memory: 0x0 .. 0xf
> ../../../opal/class/opal_object.h:428: Size: 64 bytes
> ../../../opal/class/opal_object.h:428: Element size: 1 bytes
> ../../../opal/class/opal_object.h:428: Number of elements: 64
> ../../../opal/class/opal_object.h:428: Created at: util/sys_info.c, line 43
> ../../../opal/class/opal_object.h:428: Storage class: static
>
> There are many errors of this type, differing only in the line of
> /opal/class/opal_object.h that reports them. All of them point to the
> same line of code:
>
> util/sys_info.c, line 43
>
> The final status of the MPI job is always "Undefined".
>
> Is this another bug?
>
>
> 2008/6/12 Gabriele Fatigati <g.fatigati_at_[hidden]>:
>
>> I found that the error originates at this line of code:
>>
>> static opal_atomic_lock_t class_lock = { { OPAL_ATOMIC_UNLOCKED } };
>>
>> in class/opal_object.c, line 52
>>
>> and produces the bounds error in this code block:
>>
>> static inline int opal_atomic_cmpset_64( volatile int64_t *addr,
>>                                         int64_t oldval, int64_t newval)
>> {
>>    unsigned char ret;
>>    __asm__ __volatile (
>>                        SMPLOCK "cmpxchgq %1,%2 \n\t"
>>                                "sete %0 \n\t"
>>                        : "=qm" (ret)
>>                        : "q"(newval), "m"(*((volatile long*)addr)), "a"(oldval) //<<<<< HERE
>>                        : "memory");
>>
>>    return (int)ret;
>> }
>>
>> in /opal/include/opal/sys/amd64/atomic.h, at line 89
>>
>> The environment variable I mentioned previously is GCC_BOUNDS_OPTS.
>>
>> Thanks in advance.
>>
>>
>> 2008/6/12 Gabriele Fatigati <g.fatigati_at_[hidden]>:
>>
>>> Hi,
>>>
>>> I have installed OpenMPI 1.2.6, using gcc with bounds checking. However,
>>> when I compile an MPI program, I get the same error many times:
>>>
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Address in memory: 0x8 .. 0xb
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Size: 4 bytes
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Element size: 1 bytes
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Number of elements: 4
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Created at: class/opal_object.c, line 52
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Storage class: static
>>> ../opal/include/opal/sys/amd64/atomic.h:89:Bounds error: attempt to reference memory overrunning the end of an object.
>>> ../opal/include/opal/sys/amd64/atomic.h:89: Pointer value: 0x8, Size: 8
>>>
>>> With the environment variable set to "-never-fatal", the compile phase
>>> completes successfully. At runtime, however, I still get the error above
>>> many times, and the program fails with "undefined status".
>>>
>>> Is this an OpenMPI bug?
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Gabriele Fatigati
>>>
>>> CINECA Systems & Tecnologies Department
>>>
>>> Supercomputing Group
>>>
>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>
>>> www.cineca.it Tel: +39 051 6171722
>>>
>>> g.fatigati_at_[hidden]
>>>

-- 
Gabriele Fatigati
CINECA Systems & Tecnologies Department
Supercomputing Group
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati_at_[hidden]