Open MPI Development Mailing List Archives

Subject: [OMPI devel] Problem with attributes attached to communicators
From: Pascal Deveze (Pascal.Deveze_at_[hidden])
Date: 2011-01-06 10:08:41

I am having trouble finishing the port of ROMIO to Open MPI. The problem
involves MPI_Comm_dup used together with MPI_Keyval_create,
MPI_Keyval_free, MPI_Attr_get and MPI_Attr_put.

Here is a simple program that reproduces my problem:

#include <stdio.h>
#include "mpi.h"

/* Copy callback: intentionally does nothing. */
int copy_fct(MPI_Comm comm, int keyval, void *extra, void *attr_in,
             void **attr_out, int *flag) {
    return MPI_SUCCESS;
}

/* Delete callback: frees the keyval. */
int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra) {
    MPI_Keyval_free(&keyval);
    return MPI_SUCCESS;
}

int main(int argc, char **argv) {
    int i, found, attribute_val=100, keyval = MPI_KEYVAL_INVALID;
    int *attr_ptr = NULL; /* MPI_Attr_get stores a pointer to the value */
    MPI_Comm dupcomm;

    MPI_Init(&argc, &argv);

    for (i=0; i<100;i++) {
        /* This simulates the MPI_File_open() */
        if (keyval == MPI_KEYVAL_INVALID) {
                MPI_Keyval_create((MPI_Copy_function *) copy_fct,
                                  (MPI_Delete_function *) delete_fct,
                                  &keyval, NULL);
                MPI_Attr_put(MPI_COMM_WORLD, keyval, &attribute_val);
                MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
        }
        else {
                MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
                MPI_Attr_get(MPI_COMM_WORLD, keyval, &attr_ptr, &found);
        }
        /* This simulates the MPI_File_close(). The call is missing from
           the archived listing; freeing dupcomm is assumed here. */
        MPI_Comm_free(&dupcomm);
    }

    MPI_Finalize();
    return 0;
}

I run it on a single process and get this error:
*** An error occurred in MPI_Attr_get
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_OTHER: known error not in list
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

I think this error is displayed because keyval does not exist any more.

This program runs well on MPICH2 (ROMIO ships with MPICH2).
This program runs well when delete_fct() does not call MPI_Keyval_free.
This program runs well when I call MPI_Keyval_create with
"MPI_NULL_COPY_FN" instead of "(MPI_Copy_function *) copy_fct" (this is
quite strange: copy_fct does nothing!).

I suspect that there could be a bug in Open MPI: in
ompi/attribute/attribute.c two functions call OBJ_RELEASE:
ompi_attr_delete and ompi_attr_free_keyval. So the
reference count is decremented twice.
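
To make the suspected double decrement concrete, here is a toy model in
plain C. It is only a sketch under stated assumptions: toy_keyval,
toy_release, toy_delete_attr and toy_keyval_survives are hypothetical
names, the starting count of 2 is an assumption, and nothing here is
taken from ompi/attribute/attribute.c.

```c
/* Toy model only: these names are hypothetical and are not Open MPI's
 * internal API. It shows how two releases during one attribute deletion
 * can destroy an object that another holder still expects to use. */
typedef struct {
    int refcount; /* outstanding references */
    int alive;    /* 1 while refcount > 0, 0 once destroyed */
} toy_keyval;

/* Drop one reference; mark the object dead when none remain. */
void toy_release(toy_keyval *k)
{
    if (k->refcount > 0 && --k->refcount == 0) {
        k->alive = 0;
    }
}

/* Model the deletion of one attribute. If callback_frees_keyval is
 * nonzero, the delete callback releases the keyval first (like
 * delete_fct calling MPI_Keyval_free), and then the deletion path
 * releases it a second time. */
void toy_delete_attr(toy_keyval *k, int callback_frees_keyval)
{
    if (callback_frees_keyval) {
        toy_release(k); /* release from inside the callback */
    }
    toy_release(k);     /* release done by the deletion itself */
}

/* Start from an assumed count of 2, delete one attribute, and report
 * whether the keyval is still usable afterwards (1) or not (0). */
int toy_keyval_survives(int callback_frees_keyval)
{
    toy_keyval k = { 2, 1 };
    toy_delete_attr(&k, callback_frees_keyval);
    return k.alive;
}
```

Under these assumptions, toy_keyval_survives(0) returns 1 and
toy_keyval_survives(1) returns 0. That would match the observation that
the program runs well when delete_fct() does not call MPI_Keyval_free.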