
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Problem with attributes attached to communicators
From: George Bosilca (bosilca_at_[hidden])
Date: 2011-01-06 13:15:14


MPI_Comm_create_keyval and MPI_Comm_free_keyval are the functions you should use in order to be MPI 2.2 compliant.
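As a rough illustration of the MPI 2.2 names, here is a minimal sketch of the reproducer below rewritten with MPI_Comm_create_keyval, MPI_Comm_set_attr, MPI_Comm_get_attr and MPI_Comm_free_keyval. The names delete_fn and stored are made up for this example, and using MPI_COMM_NULL_COPY_FN (so the attribute is not propagated by MPI_Comm_dup) is an assumption, not something the original program did. The delete callback only deals with the attribute; the keyval is freed once, by the code that created it.

```c
#include <stdio.h>
#include <mpi.h>

/* Hypothetical delete callback: it would release per-attribute state if
 * there were any. It deliberately does NOT free the keyval. */
static int delete_fn(MPI_Comm comm, int keyval, void *attr_val, void *extra_state)
{
    (void)comm; (void)keyval; (void)attr_val; (void)extra_state;
    return MPI_SUCCESS;
}

int main(int argc, char **argv)
{
    int attribute_val = 100;
    int keyval = MPI_KEYVAL_INVALID;
    int *stored = NULL;   /* receives the pointer passed to MPI_Comm_set_attr */
    int found = 0;
    MPI_Comm dupcomm;

    MPI_Init(&argc, &argv);

    /* Create the key once and attach the attribute to MPI_COMM_WORLD. */
    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, delete_fn, &keyval, NULL);
    MPI_Comm_set_attr(MPI_COMM_WORLD, keyval, &attribute_val);

    for (int i = 0; i < 100; i++) {
        MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
        MPI_Comm_get_attr(MPI_COMM_WORLD, keyval, &stored, &found);
        MPI_Comm_free(&dupcomm);
    }
    if (found)
        printf("attribute value: %d\n", *stored);

    /* Drop the attribute instance, then free the key itself. */
    MPI_Comm_delete_attr(MPI_COMM_WORLD, keyval);
    MPI_Comm_free_keyval(&keyval);

    MPI_Finalize();
    return 0;
}
```

This needs an MPI installation to build (for example with the mpicc wrapper) and is meant to be run like the original, on a single process.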

Based on my understanding of the MPI standard, your application is incorrect, and therefore the MPICH behavior is incorrect. The delete function is not there for you to delete the keyval (!) but to delete the attribute. Here is what the MPI standard states about this:

> Note that it is not erroneous to free an attribute key that is in use, because the actual free does not transpire until after all references (in other communicators on the process) to the key have been freed. These references need to be explicitly freed by the program, either via calls to MPI_COMM_DELETE_ATTR that free one attribute instance, or by calls to MPI_COMM_FREE that free all attribute instances associated with the freed communicator.
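The deferred free described in that passage can be pictured with a toy reference-counting model. This is only a sketch under stated assumptions: the toy_* names and the struct are invented for illustration and are not Open MPI's (or any MPI library's) actual data structures. The key holds one reference for the user's handle plus one per attached attribute instance, and is destroyed only when the last reference goes away.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an attribute key (hypothetical, not an MPI type). */
typedef struct {
    int  refcount;      /* 1 for the user's handle + 1 per attached attribute */
    bool handle_freed;  /* the user has called the "free keyval" operation    */
    bool destroyed;     /* the key has really been released                   */
} toy_keyval;

void toy_keyval_init(toy_keyval *k)
{
    k->refcount = 1;    /* reference held by the user's handle */
    k->handle_freed = false;
    k->destroyed = false;
}

/* Drop one reference; the key is destroyed only when none remain. */
static void toy_release(toy_keyval *k)
{
    assert(k->refcount > 0);
    if (--k->refcount == 0)
        k->destroyed = true;
}

/* Attaching an attribute to a communicator takes a reference on the key. */
void toy_attr_set(toy_keyval *k)
{
    assert(!k->destroyed);
    k->refcount++;
}

/* Deleting one attribute instance gives that reference back. */
void toy_attr_delete(toy_keyval *k)
{
    toy_release(k);
}

/* Freeing the key drops the handle's reference exactly once. */
void toy_keyval_free(toy_keyval *k)
{
    assert(!k->handle_freed);
    k->handle_freed = true;
    toy_release(k);
}
```

In this model, freeing the key while an attribute is still attached leaves it alive until that attribute is deleted. Dropping the handle's reference a second time (the guard in toy_keyval_free) is exactly the kind of extra decrement that would destroy a key still in use.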

  george.

On Jan 6, 2011, at 10:08, Pascal Deveze wrote:

> I have a problem finishing the port of ROMIO to Open MPI. It is related to the routines MPI_Comm_dup, MPI_Keyval_create, MPI_Keyval_free, MPI_Attr_get and MPI_Attr_put.
>
> Here is a simple program that reproduces my problem:
>
> ===========================================
> #include <stdio.h>
> #include "mpi.h"
>
> /* Copy callback: does nothing and leaves *flag untouched. */
> int copy_fct(MPI_Comm comm, int keyval, void *extra, void *attr_in, void **attr_out, int *flag) {
>     return MPI_SUCCESS;
> }
>
> /* Delete callback: frees the keyval itself (the behavior in question). */
> int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra) {
>     MPI_Keyval_free(&keyval);
>     return MPI_SUCCESS;
> }
>
> int main(int argc, char **argv) {
>     int i, found, attribute_val=100, keyval = MPI_KEYVAL_INVALID;
>     MPI_Comm dupcomm;
>
>     MPI_Init(&argc,&argv);
>
>     for (i=0; i<100;i++) {
>         /* This simulates the MPI_File_open() */
>         if (keyval == MPI_KEYVAL_INVALID) {
>             MPI_Keyval_create((MPI_Copy_function *) copy_fct, (MPI_Delete_function *) delete_fct, &keyval, NULL);
>             MPI_Attr_put(MPI_COMM_WORLD, keyval, &attribute_val);
>             MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
>         }
>         else {
>             MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
>             MPI_Attr_get(MPI_COMM_WORLD, keyval, (void *) &attribute_val, &found);
>         }
>         /* This simulates the MPI_File_close() */
>         MPI_Comm_free(&dupcomm);
>     }
>     MPI_Finalize();
>     return 0;
> }
> ===============================================
> I run it on only one process and get the error:
> *** An error occurred in MPI_Attr_get
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_OTHER: known error not in list
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> I think this error is displayed because keyval does not exist any more.
>
> This program runs well on MPICH2 (ROMIO comes with MPICH2).
> This program runs well when delete_fct() does not call MPI_Keyval_free.
> This program runs well when I call MPI_Keyval_create with "MPI_NULL_COPY_FN" instead of "(MPI_Copy_function *) copy_fct" (this is quite strange: copy_fct does nothing!).
>
> I suspect that there could be a bug in Open MPI: in ompi/attribute/attribute.c, two functions call OBJ_RELEASE: ompi_attr_delete and ompi_attr_free_keyval. So the reference count is decremented twice.
>
> Pascal