Open MPI Development Mailing List Archives

Subject: [OMPI devel] reduce_scatter bug with hierarch
From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2009-01-13 14:09:43


I just debugged the Reduce_scatter bug mentioned previously. The bug is
unfortunately not in hierarch, but in tuned.

Here is the code snippet causing the problems:

int reduce_scatter (...., mca_coll_base_module_t *module)
{
...
    err = comm->c_coll.coll_reduce (...., module);
...
}

but it should be
{
...
   err = comm->c_coll.coll_reduce (..., comm->c_coll.coll_reduce_module);
...
}

The problem right now is that when using hierarch, only a subset of the
functions is set, e.g. reduce, allreduce, bcast and barrier. Thus,
reduce_scatter comes from tuned in most scenarios and calls the
subsequent functions with the wrong module. Hierarch of course does
not like that :-)
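
To make the failure mode concrete, here is a minimal, self-contained sketch
of the dispatch problem. It is not the actual Open MPI code: the struct
layouts, the module_t type, and all function names other than coll_reduce
and coll_reduce_module are simplified, hypothetical stand-ins. It only
assumes that each function pointer in c_coll has a companion module pointer
that must be passed along with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real Open MPI types. */
typedef struct module { const char *name; } module_t;

struct comm;
typedef int (*reduce_fn_t)(struct comm *comm, module_t *module);

typedef struct comm {
    struct {
        reduce_fn_t coll_reduce;         /* selected reduce implementation */
        module_t   *coll_reduce_module;  /* module that owns coll_reduce   */
    } c_coll;
} comm_t;

static module_t hierarch_module = { "hierarch" };
static module_t tuned_module    = { "tuned" };

/* Stand-in for hierarch's reduce: it only works with its own module. */
static int hierarch_reduce(comm_t *comm, module_t *module)
{
    assert(comm != NULL && module != NULL);
    return (module == &hierarch_module) ? 0 : -1;
}

/* Buggy pattern: forwards the caller's own (tuned) module to whatever
 * reduce implementation happens to be installed on the communicator. */
static int tuned_reduce_scatter_buggy(comm_t *comm, module_t *module)
{
    return comm->c_coll.coll_reduce(comm, module);
}

/* Fixed pattern: passes the module that belongs to coll_reduce. */
static int tuned_reduce_scatter_fixed(comm_t *comm, module_t *module)
{
    (void)module;  /* tuned's own module is not needed for this call */
    return comm->c_coll.coll_reduce(comm, comm->c_coll.coll_reduce_module);
}

/* Mixed setup as described above: reduce comes from hierarch, while
 * reduce_scatter (called directly by the user) comes from tuned. */
static comm_t make_mixed_comm(void)
{
    comm_t comm;
    comm.c_coll.coll_reduce        = hierarch_reduce;
    comm.c_coll.coll_reduce_module = &hierarch_module;
    return comm;
}
```

With a communicator from make_mixed_comm(), calling the buggy variant with
&tuned_module hands hierarch a module it does not own, while the fixed
variant passes hierarch its own module regardless of who the caller is.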

Anyway, a quick glance through the tuned code reveals a significant
number of instances where this appears (reduce_scatter, allreduce,
allgather, allgatherv). Basic, hierarch and inter seem to do that mostly
correctly.

Thanks
Edgar

-- 
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335