Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Memchecker failure with empty struct type
From: Jeremiah Willcock (jewillco_at_[hidden])
Date: 2012-09-25 17:07:33


My config.log shows that it found Valgrind even though I didn't specify
--with-valgrind. It looks like the issue is in the datatype creation
code; looking at the data structure shows unusual values for true_ub and
true_lb:

{super = {super = {obj_magic_id = 16046253926196952813,
                   obj_class = 0x5005880, obj_reference_count = 1,
                   cls_init_file_name = 0x4da6f2f "ompi_datatype_create.c",
                   cls_init_lineno = 71},
          flags = 276, id = 0, bdt_used = 0, size = 0,
          true_lb = 9223372036854775807, true_ub = -9223372036854775808,
          lb = 0, ub = 0, align = 1, nbElems = 0,
          name = '\000' <repeats 63 times>,
          desc = {length = 1, used = 0, desc = 0x54348e0},
          opt_desc = {length = 0, used = 0, desc = 0x0},
          btypes = {0 <repeats 46 times>}},
 id = 68, d_f_to_c_index = 68, d_keyhash = 0x0, args = 0x8a7b780,
 packed_description = 0x0, name = '\000' <repeats 63 times>}

In particular, the true_extent computed on line 99 of memchecker.h comes
out as 1 (because true_ub - true_lb overflows), while the datatype has
size 0. The mismatch causes the datatype to be treated as non-contiguous,
while its desc field is NULL; the code then loops over elements of desc
as if it were an array. Fixing true_lb and true_ub might be enough to make
the current memchecker code work (since the datatype is actually
contiguous).
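
To make the arithmetic concrete, here is a minimal sketch of the overflow
and of one possible guard. It is not the Open MPI code: sketch_type_t,
wrapped_true_extent and is_contiguous are hypothetical names, and the rule
"extent equals size means contiguous" is an assumption based on the
description above. The subtraction is done on unsigned values because
signed overflow is undefined behaviour in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three fields of interest in the dump;
 * this is NOT the real Open MPI datatype structure. */
typedef struct {
    int64_t true_lb;
    int64_t true_ub;
    size_t  size;
} sketch_type_t;

/* true_ub - true_lb with two's-complement wraparound. Computed on
 * unsigned values so the wraparound is well defined. */
static uint64_t wrapped_true_extent(const sketch_type_t *t)
{
    return (uint64_t)t->true_ub - (uint64_t)t->true_lb;
}

/* Assumed contiguity test: an empty type has no gaps, so it is reported
 * as contiguous before the extent is ever compared with the size. */
static int is_contiguous(const sketch_type_t *t)
{
    if (t->size == 0) {
        return 1;
    }
    return wrapped_true_extent(t) == (uint64_t)t->size;
}
```

With true_lb = INT64_MAX and true_ub = INT64_MIN, as in the dump,
wrapped_true_extent returns 1, which does not match size 0; the early
return for size 0 is what keeps the empty type out of the non-contiguous
path in this sketch.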

-- Jeremiah Willcock

On Tue, 25 Sep 2012, Ralph Castain wrote:

> IIRC, we found a configure "bug" that allowed you to use --enable-memchecker without also passing the required --with-valgrind. You might try again with 1.6.2, which includes the change - and be sure to add the extra configure flag.
>
>
> On Sep 25, 2012, at 12:04 PM, Jeremiah Willcock <jewillco_at_[hidden]> wrote:
>
>> The following C program:
>>
>> #include <mpi.h>
>>
>> int main(int argc, char** argv) {
>>     int blocklengths;
>>     MPI_Aint displacements;
>>     MPI_Datatype types, dt;
>>     int x;
>>     MPI_Init(&argc, &argv);
>>     MPI_Type_struct(0, &blocklengths, &displacements, &types, &dt);
>>     MPI_Type_commit(&dt);
>>     MPI_Send(&x, 1, dt, MPI_PROC_NULL, 0, MPI_COMM_WORLD);
>>     MPI_Type_free(&dt);
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> produces a segmentation fault (caused by a NULL pointer dereference) when run with Open MPI 1.6.1, but only when using Valgrind. Running without Valgrind does not cause any issues; the failure appears to be in the code that checks whether MPI buffers are valid. The configure flags I used to build Open MPI were a prefix and:
>>
>> --disable-pretty-print-stacktrace --enable-mpi-thread-multiple --enable-memchecker --enable-mca-no-build=btl-openib --enable-debug
>>
>> and I am using GCC 4.7.1 on Linux. Is this a known issue? Thank you for your help.
>>
>> -- Jeremiah Willcock
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users