
Open MPI User's Mailing List Archives


Subject: [OMPI users] Problem-Bug with MPI_Intercomm_create()
From: orel (aurelien.esnard_at_[hidden])
Date: 2011-10-25 09:01:30


For several days I have been trying to use advanced MPI-2 features in the
following scenario:

   1) a master code A (of size NPA) spawns, with MPI_Comm_spawn(), two slave
      codes B (of size NPB) and C (of size NPC), which provides intercomms
      A-B and A-C;
   2) I create intracomms AB and AC by merging these intercomms;
   3) I then create intercomm AB-C by calling MPI_Intercomm_create(),
      using AC as the bridge:

     MPI_Comm intercommABC;
     A: MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);
     B: MPI_Intercomm_create(intracommAB, 0, MPI_COMM_NULL,
     C: MPI_Intercomm_create(intracommC, 0, intracommAC, 0, TAG, &intercommABC);

       In these calls, A0 and C0 act as local leaders for AB and C
       respectively. In the bridge intracomm AC, C0 is the remote leader
       seen from AB, and A0 is the remote leader seen from C.

   4) MPI_Barrier(intercommABC);
   5) I merge intercomm AB-C into intracomm ABC;
   6) MPI_Barrier(intracommABC);
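
The leader ranks passed to the MPI_Intercomm_create() calls above depend on how the ranks are laid out in the merged bridge intracomm AC. Here is a minimal sketch in plain C (no MPI calls) of that layout. It assumes that A passed high = 0 and C passed high = 1 to MPI_Intercomm_merge(), and that the rank order inside each group is kept, so that A occupies ranks 0..NPA-1 of AC and C0 lands on rank NPA. The function names are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Rank, inside a merged intracomm, of the process that has rank
 * `local_rank` in its own group. Assumption: the group that passed
 * high = 0 to MPI_Intercomm_merge() comes first, and the rank order
 * inside each group is kept. */
int merged_rank(int local_rank, int is_high_group, int low_group_size)
{
    assert(local_rank >= 0 && low_group_size > 0);
    return is_high_group ? low_group_size + local_rank : local_rank;
}

/* remote_leader argument used on A's side: rank of C0 in bridge AC. */
int remote_leader_for_a(int npa)
{
    return merged_rank(0, 1, npa);   /* C is assumed to be the "high" group */
}

/* remote_leader argument used on C's side: rank of A0 in bridge AC. */
int remote_leader_for_c(int npa)
{
    return merged_rank(0, 0, npa);   /* A is assumed to be the "low" group */
}

/* Print both leader ranks for a given master size NPA. */
void print_leaders(int npa)
{
    printf("A passes remote_leader = %d, C passes remote_leader = %d\n",
           remote_leader_for_a(npa), remote_leader_for_c(npa));
}
```

Under these assumptions, print_leaders(NPA) reports NPA on A's side and 0 on C's side, which matches the arguments used in the calls above. If the merge used the opposite high flags, the two values would change accordingly.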

My bug: these calls succeed, but when I try to use intracommABC for a
collective communication such as MPI_Barrier(), I get the following error:

*** An error occurred in MPI_Barrier
*** on communicator
*** MPI_ERR_INTERN: internal error
*** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

I tried with Open MPI trunk, 1.5.3 and 1.5.4, and with MPICH2-1.4.1p1.

My code works perfectly if intracomms A, B and C are obtained with
MPI_Comm_split() instead of MPI_Comm_spawn()!

I found the same problem in a previous thread of the OMPI Users mailing list:


Is this bug/problem currently under investigation? :-)

I can provide detailed code, but the code posted by George Bosilca in that
previous thread produces the same error...

Thank you for your help...

Aurélien Esnard
University Bordeaux 1 / LaBRI / INRIA (France)