Subject: [OMPI devel] Problem when using struct types at specific offsets
From: Thomas Jahns (jahns_at_[hidden])
Date: 2013-04-08 10:08:07


A colleague of mine has investigated a difficult problem, which we traced to
OpenMPI delivering incorrect data for some struct datatypes that use specific
offsets (on the stack in our case, but the problem can be reproduced using
specifically chosen slices of an array). Our library aggregates several MPI
communications in a generic and transparent manner, and we therefore need to be
able to handle any combination of properly aligned offsets and component types.
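
To make "properly aligned offsets" concrete, here is a minimal sketch in plain C. It is not taken from the attached program, and the helper names byte_displacement and is_aligned are hypothetical. It only shows how the byte displacement of an array slice relative to a base address can be computed and checked for alignment, which is the kind of per-component value a struct datatype records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte displacement of p relative to base.
 * Both pointers must point into the same array object. */
static ptrdiff_t byte_displacement(const void *base, const void *p)
{
    return (const char *)p - (const char *)base;
}

/* Hypothetical helper: nonzero if the address p is a multiple of align. */
static int is_aligned(const void *p, size_t align)
{
    assert(align > 0);
    return (uintptr_t)p % align == 0;
}
```

For an int array a, byte_displacement(a, &a[6]) yields 6 * sizeof(int). Passing the alignment of the component type as the second argument of is_aligned checks whether a given slice start is a valid location for that type.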

The attached example program contains the necessary steps to reproduce the problem:

1. create the struct types in question
2. send/recv the data
3. compare to reference (said comparison works on several MPICH2 versions)

The code then prints any array indices/values that do not match the reference.
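
The comparison step could look roughly like the following sketch. This is an assumption about its shape, not the code from the attachment: the function name report_mismatches is hypothetical, and int is assumed as the element type. The print format follows the "name[i] = value" / "ref_name[i] = value" pairs the test program emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print every index where results differs from ref
 * and return the number of mismatching elements. */
static size_t report_mismatches(FILE *out, const char *name,
                                const int *results, const int *ref, size_t n)
{
    assert(out && name && results && ref);
    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (results[i] != ref[i]) {
            fprintf(out, "%s[%zu] = %d\n", name, i, results[i]);
            fprintf(out, "ref_%s[%zu] = %d\n", name, i, ref[i]);
            ++mismatches;
        }
    }
    return mismatches;
}
```

A return value of 0 corresponds to a run that prints nothing for that test.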

Our platform is Linux x86_64 with Debian squeeze. The tested versions of OpenMPI
are 1.4.2, as supplied with squeeze, and 1.6.4, which we compiled ourselves. For
1.4.2 I also did a quick test in an i386 chroot, and the code fails there too.
gcc 4.6.1 was used for the x86_64 cases and gcc 4.3.5 for the i386 chroot.

Sorry if the test is not of minimal size, but we were happy once he got this
down from several tens of thousands of lines of Fortran+C, and even that took
more than a day once we understood the problem was unrelated to the Fortran
program it originally occurred in.

When running the program with OpenMPI:

$ mpicc -std=gnu99 ./mpi_test.c && ./a.out
first tests:
second tests:
results_2[6] = 8
ref_results_2[6] = 12
results_2[7] = 9
ref_results_2[7] = 13

MPICH gives the expected result:
$ /sw/squeeze-x64/mpi/mpich2-1.4.1p1-gccsys/bin/mpicc -std=gnu99 ./mpi_test.c &&
first tests:
second tests:

Regards, Thomas

Thomas Jahns
DKRZ GmbH, Department: Application software
Deutsches Klimarechenzentrum
Bundesstraße 45a
D-20146 Hamburg
Phone: +49-40-460094-151
Fax: +49-40-460094-270
Email: Thomas Jahns <jahns_at_[hidden]>