Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI bug?
From: Gabriele Fatigati (g.fatigati_at_[hidden])
Date: 2008-06-13 05:09:43


I may have solved this bug by deleting the long cast.
Compilation now works well, but at runtime there are other
problems, like this:

../../../opal/class/opal_object.h:428:Bounds error: pointer arithmetic would overrun the end of the object.
../../../opal/class/opal_object.h:428: Pointer value: 0x8, Size: 8
../../../opal/class/opal_object.h:428: Object `orte_system_info':
../../../opal/class/opal_object.h:428: Address in memory: 0x0 .. 0xf
../../../opal/class/opal_object.h:428: Size: 64 bytes
../../../opal/class/opal_object.h:428: Element size: 1 bytes
../../../opal/class/opal_object.h:428: Number of elements: 64
../../../opal/class/opal_object.h:428: Created at: util/sys_info.c, line 43
../../../opal/class/opal_object.h:428: Storage class: static

There are a great many errors of this type, differing only in the line
number reported in /opal/class/opal_object.h. All of the errors point to
the same line of code:

util/sys_info.c, line 43

The final status of the MPI job is always "Undefined".

Another bug?
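
As a rough illustration of the size mismatch in the report quoted below: the
checker describes a 4-byte object ("Size: 4 bytes") that is referenced with an
8-byte access ("Pointer value: 0x8, Size: 8"), which is what a dereference
through a `volatile long*` produces on amd64 Linux, where long is 8 bytes. The
sketch below is only an assumption-laden example, not Open MPI code: the names
demo_lock_t, overrun_bytes and demo_cmpset_32 are hypothetical, and it uses
GCC's __sync_bool_compare_and_swap builtin in place of the inline assembly so
that the operand width always matches the object.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte lock word like the object the
 * bounds checker describes. This is NOT the Open MPI type. */
typedef struct {
    volatile int32_t value;
} demo_lock_t;

/* Bytes by which an access of access_size bytes, starting at the beginning
 * of an object of object_size bytes, runs past the end of that object. */
size_t overrun_bytes(size_t object_size, size_t access_size)
{
    return access_size > object_size ? access_size - object_size : 0;
}

/* 32-bit compare-and-set without any pointer cast: the builtin takes its
 * operand width from the pointee type, so only 4 bytes are touched.
 * Returns 1 if *addr equalled oldval and was replaced by newval, else 0. */
int demo_cmpset_32(volatile int32_t *addr, int32_t oldval, int32_t newval)
{
    return __sync_bool_compare_and_swap(addr, oldval, newval) ? 1 : 0;
}
```

With a demo_lock_t of 4 bytes, overrun_bytes(sizeof(demo_lock_t), 8) reports
4 bytes past the end, while an access of sizeof(int32_t) reports none.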

2008/6/12 Gabriele Fatigati <g.fatigati_at_[hidden]>:

> I found that the error starts at this line of code:
>
> static opal_atomic_lock_t class_lock = { { OPAL_ATOMIC_UNLOCKED } };
>
> in class/opal_object.c, line 52
>
> and generates the bounds error in this code block:
>
> static inline int opal_atomic_cmpset_64( volatile int64_t *addr,
>                                          int64_t oldval, int64_t newval)
> {
>     unsigned char ret;
>     __asm__ __volatile (
>         SMPLOCK "cmpxchgq %1,%2 \n\t"
>                 "sete %0 \n\t"
>         : "=qm" (ret)
>         : "q"(newval), "m"(*((volatile long*)addr)),
>           "a"(oldval) //<<<<< HERE
>         : "memory");
>
>     return (int)ret;
> }
>
> in /opal/include/opal/sys/amd64/atomic.h, at line 89
>
> The environment variable I mentioned previously is GCC_BOUNDS_OPTS.
>
> Thanks in advance.
>
>
> 2008/6/12 Gabriele Fatigati <g.fatigati_at_[hidden]>:
>
>> Hi,
>>
>> I have installed OpenMPI 1.2.6 using gcc with bounds checking. But when
>> I compile an MPI program, I get the same error many times:
>>
>> ../opal/include/opal/sys/amd64/atomic.h:89: Address in memory: 0x8 .. 0xb
>> ../opal/include/opal/sys/amd64/atomic.h:89: Size: 4 bytes
>> ../opal/include/opal/sys/amd64/atomic.h:89: Element size: 1 bytes
>> ../opal/include/opal/sys/amd64/atomic.h:89: Number of elements: 4
>> ../opal/include/opal/sys/amd64/atomic.h:89: Created at: class/opal_object.c, line 52
>> ../opal/include/opal/sys/amd64/atomic.h:89: Storage class: static
>> ../opal/include/opal/sys/amd64/atomic.h:89:Bounds error: attempt to reference memory overrunning the end of an object.
>> ../opal/include/opal/sys/amd64/atomic.h:89: Pointer value: 0x8, Size: 8
>>
>> After setting the environment variable to "-never-fatal", the compile
>> phase ends successfully. But at runtime I always get the error above,
>> many times over, and the program fails with "undefined status".
>>
>> Is this an OpenMPI bug?

-- 
Gabriele Fatigati
CINECA Systems & Tecnologies Department
Supercomputing Group
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati_at_[hidden]