Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] assert in opal_datatype_is_contiguous_memory_layout
From: Eric Chamberland (Eric.Chamberland_at_[hidden])
Date: 2013-04-05 15:18:50


Hi again,

I have attached a very small example that raises the assertion.

The problem arises when a process has no elements to write to the file
(and hence none to describe in its MPI_File_set_view call)...
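
To make the suspected mechanism concrete, here is a minimal stand-alone
sketch of the check quoted further down in this message. It is not Open
MPI code: DatatypeModel, the two flag constants and would_trip_assert are
hypothetical names, and the flag values are made up. It also assumes
(unverified) that an indexed type with zero elements ends up with size 0,
lb == ub, the CONTIGUOUS flag set and the NO_GAPS flag clear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for OPAL_DATATYPE_FLAG_CONTIGUOUS and
// OPAL_DATATYPE_FLAG_NO_GAPS; the numeric values are invented.
constexpr std::uint32_t FLAG_CONTIGUOUS = 0x1;
constexpr std::uint32_t FLAG_NO_GAPS    = 0x2;

// Hypothetical, reduced model of the fields the quoted function reads.
struct DatatypeModel {
    std::uint32_t  flags;
    std::size_t    size;  // bytes of actual data in the type
    std::ptrdiff_t lb;    // lower bound
    std::ptrdiff_t ub;    // upper bound
};

// Returns true when the quoted function would reach its assert() with a
// false condition, i.e. when a debug build would abort.
bool would_trip_assert(const DatatypeModel& dt, std::int32_t count) {
    assert(count >= 0);  // precondition of this sketch only

    // First early return of the quoted function (returns 0).
    if (!(dt.flags & FLAG_CONTIGUOUS)) return false;

    // Second early return of the quoted function (returns 1).
    if (count == 1 || (dt.flags & FLAG_NO_GAPS)) return false;

    // The quoted assert requires size != ub - lb, so equality trips it.
    return static_cast<std::ptrdiff_t>(dt.size) == (dt.ub - dt.lb);
}
```

Under the assumption above, would_trip_assert(DatatypeModel{FLAG_CONTIGUOUS,
0, 0, 0}, 0) returns true, because size and ub - lb are both 0. A type
with size 8, lb 0 and ub 16 returns false. Whether Open MPI really builds
the empty type this way is exactly what I cannot confirm.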

You can see this "bug" with openmpi 1.6.3, 1.6.4 and 1.7.0 configured with:

./configure --enable-mem-debug --enable-mem-profile --enable-memchecker
  --with-mpi-param-check --enable-debug

Just compile the given example (idx_null.cc) as-is with

mpicxx -o idx_null idx_null.cc

and run with 3 processes:

mpirun -n 3 idx_null

You can modify the example by commenting out "#define WITH_ZERO_ELEMNT_BUG"
to see that everything works when all processes have something
to write.

There is no "bug" if you use openmpi 1.6.3 (and higher) without the
debugging options.

Also, all is working well with mpich-3.0.3 configured with:

./configure --enable-g=yes

So, is this a wrong "assert" in openmpi?

Is there a real problem with using this code in "release" mode?

Thanks,

Eric

On 04/05/2013 12:57 PM, Eric Chamberland wrote:
> Hi all,
>
> I have a well working (large) code which is using openmpi 1.6.3 (see
> config.log here:
> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_nodebug)
>
> (I have used it for reading with MPI I/O with success over 1500 procs
> with very large files)
>
> However, when I use openmpi compiled with "debug" options:
>
> ./configure --enable-mem-debug --enable-mem-profile --enable-memchecker
> --with-mpi-param-check --enable-debug --prefix=/opt/openmpi-1.6.3_debug
> (see the other config.log here:
> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_debug) the code
> is aborting with an assertion on a very small example on 2 processors.
> (the same very small example is working well without the debug mode)
>
> Here is the assertion causing an abort:
>
> ===================================
>
> openmpi-1.6.3/opal/datatype/opal_datatype.h:
>
> static inline int32_t
> opal_datatype_is_contiguous_memory_layout( const opal_datatype_t*
> datatype, int32_t count )
> {
> if( !(datatype->flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) ) return 0;
> if( (count == 1) || (datatype->flags & OPAL_DATATYPE_FLAG_NO_GAPS)
> ) return 1;
>
>
> /* This is the assertion: */
>
> assert( (OPAL_PTRDIFF_TYPE)datatype->size != (datatype->ub -
> datatype->lb) );
>
> return 0;
> }
>
> ===================================
>
> Can anyone tell me what this means?
>
> It happens while writing a file with MPI I/O, when I call
> "MPI_File_set_view" for the fourth time, each time with a different
> MPI_Datatype created with "MPI_Type_indexed".
>
> I am trying to reproduce the bug with a very small example to send
> here, but if anyone has a hint to give me...
> (I would like: this assert is not good! just ignore it ;-) )
>
> Thanks,
>
> Eric