Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: about dynamic/intercomm_create test from ibm test suite
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-05-28 10:53:58


On May 28, 2014, at 7:50 AM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:

> Ralph,
>
> thanks for the info
>
>> can you detail your full mpirun command line, the number of servers you are using, the btl involved and the ompi release that can be used to reproduce the issue ?
>
> Running on only one server, using the current head of the svn repo. My cluster only has Ethernet, and I let it freely choose the BTLs (so I imagine the candidates are sm,self,tcp,vader). The cmd line is really trivial:
>
>
> Is MPSS installed and loaded?
> If yes, scif is also a candidate.

Nope - not on this machine

>
> mpirun -n 1 ./loop_spawn
>
> I modified loop_spawn to run only 100 iterations, as I am not patient enough to wait for 1000, and the number of iterations isn't a factor so long as it is greater than 1. When the parent calls finalize, I get one of the following emitted for every iteration that was done:
>
> dpm_base_disconnect_init: error -12 in isend to process 0
>
>
> So we do the same thing but see different behaviour...
>
> Just to be sure:
> are we talking about the loop_spawn test from the ibm test suite, available at
> http://svn.open-mpi.org/svn/ompi-tests/trunk/ibm/dynamic/loop_spawn.c
> and
> http://svn.open-mpi.org/svn/ompi-tests/trunk/ibm/dynamic/loop_child.c
>
> The number of iterations is 2000 (not 1000).
> MPI_Comm_disconnect is invoked by both the parent, in loop_spawn.c:
> MPI_Comm_free(&comm_merged);
> MPI_Comm_disconnect(&comm_spawned);
>
> and the children, in loop_child.c:
> MPI_Comm_free(&merged);
> MPI_Comm_disconnect(&parent);
>
> Is there any possibility you are running a different test called loop_spawn, or an older version of the dynamic/loop_spawn test from the ibm test suite?

Yeah, I'm running a version that was the parent of that one. Looks like it has diverged, so perhaps that is the issue. Let me refresh it and try again.

>
> Cheers,
>
> Gilles