I may have solved this bug by deleting the long cast.
It now works at compile time, but at runtime there are other problems, like this:

../../../opal/class/opal_object.h:428:Bounds error: pointer arithmetic would overrun the end of the object.
../../../opal/class/opal_object.h:428:  Pointer value: 0x8, Size: 8
../../../opal/class/opal_object.h:428:  Object `orte_system_info':
../../../opal/class/opal_object.h:428:    Address in memory:    0x0 .. 0xf
../../../opal/class/opal_object.h:428:    Size:                 64 bytes
../../../opal/class/opal_object.h:428:    Element size:         1 bytes
../../../opal/class/opal_object.h:428:    Number of elements:   64
../../../opal/class/opal_object.h:428:    Created at:           util/sys_info.c, line 43
../../../opal/class/opal_object.h:428:    Storage class:        static

There are many errors of this type, differing only in the line number reported in /opal/class/opal_object.h. All of them refer to the object created at the same line of code:

util/sys_info.c, line 43

The final status of the MPI job is always "Undefined".

Is this another bug?

2008/6/12 Gabriele Fatigati <g.fatigati@cineca.it>:
I found that the error starts at this line of code:

static opal_atomic_lock_t class_lock = { { OPAL_ATOMIC_UNLOCKED } };

in class/opal_object.c, line 52

and generates the bounds error in this code block:

static inline int opal_atomic_cmpset_64( volatile int64_t *addr,
                                         int64_t oldval, int64_t newval)
{
   unsigned char ret;
   __asm__ __volatile (
                       SMPLOCK "cmpxchgq %1,%2   \n\t"
                               "sete     %0      \n\t"
                       : "=qm" (ret)
                       : "q"(newval), "m"(*((volatile long*)addr)), "a"(oldval)   //<<<<< HERE
                       : "memory");

   return (int)ret;
}

in /opal/include/opal/sys/amd64/atomic.h, at line 89
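
One possible reading of the log, offered only as a sketch: the checker reports an object of 4 bytes (created at class/opal_object.c, line 52) and an access of size 8, which is what an 8-byte operand applied to a 4-byte object would look like. The small C example below only does the size arithmetic; demo_lock_t, overrun_bytes and lock_overrun_with_64bit_access are hypothetical names for illustration, not Open MPI definitions, and nothing here confirms that this is the real cause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte lock object (NOT the Open MPI type). */
typedef struct {
    volatile int32_t value;
} demo_lock_t;

/* The sketch assumes the lock is exactly 4 bytes, as the log reports. */
static_assert(sizeof(demo_lock_t) == 4, "demo lock is expected to be 4 bytes");

/* Returns how many bytes an access of access_size bytes, starting at
 * offset inside an object of object_size bytes, would extend past the
 * end of that object. A result of 0 means the access stays in bounds. */
static size_t overrun_bytes(size_t object_size, size_t offset, size_t access_size)
{
    size_t end = offset + access_size;
    return end > object_size ? end - object_size : 0;
}

/* Overrun produced by an int64_t-wide access at the start of demo_lock_t. */
static size_t lock_overrun_with_64bit_access(void)
{
    return overrun_bytes(sizeof(demo_lock_t), 0, sizeof(int64_t));
}
```

Under these assumptions, lock_overrun_with_64bit_access() reports 4 bytes past the end of the object, while a 4-byte access to the same object reports 0.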

The environment variable mentioned in my previous message is GCC_BOUNDS_OPTS.

Thanks in advance.


2008/6/12 Gabriele Fatigati <g.fatigati@cineca.it>:
Hi,

I have installed OpenMPI 1.2.6, using gcc with bounds checking. But when I compile an MPI program, I get the same error many times:

../opal/include/opal/sys/amd64/atomic.h:89:    Address in memory:    0x8 .. 0xb
../opal/include/opal/sys/amd64/atomic.h:89:    Size:                 4 bytes
../opal/include/opal/sys/amd64/atomic.h:89:    Element size:         1 bytes
../opal/include/opal/sys/amd64/atomic.h:89:    Number of elements:   4
../opal/include/opal/sys/amd64/atomic.h:89:    Created at:           class/opal_object.c, line 52
../opal/include/opal/sys/amd64/atomic.h:89:    Storage class:        static
../opal/include/opal/sys/amd64/atomic.h:89:Bounds error: attempt to reference memory overrunning the end of an object.
../opal/include/opal/sys/amd64/atomic.h:89:  Pointer value: 0x8, Size: 8

Setting the environment variable to "-never-fatal", the compile phase ends successfully. But at runtime I always get the error above, many times, and the program fails with "undefined status".

Is this an OpenMPI bug?





--
Gabriele Fatigati

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it Tel: +39 051 6171722

g.fatigati@cineca.it


