Open MPI Development Mailing List Archives

Subject: [OMPI devel] reduce_scatter bug with hierarch
From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2009-01-13 14:09:43

I just debugged the Reduce_scatter bug mentioned previously. The bug is
unfortunately not in hierarch, but in tuned.

Here is the code snippet causing the problems:

int reduce_scatter (...., mca_coll_base_module_t *module)
    err = comm->c_coll.coll_reduce (...., module)

but should be
   err = comm->c_coll.coll_reduce (..., comm->c_coll.coll_reduce_module);

The problem right now is that when using hierarch, only a subset of
the functions is set, e.g. reduce, allreduce, bcast and barrier. Thus,
reduce_scatter comes from tuned in most scenarios and calls the
subsequent functions with the wrong module. Hierarch of course does
not like that :-)
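
To make the mismatch concrete, here is a minimal, self-contained sketch. It
does not use the real Open MPI headers: module_t, coll_table_t, comm_t,
hierarch_reduce, the two tuned_reduce_scatter_* variants and demo are all
hypothetical stand-ins, and only the field names coll_reduce and
coll_reduce_module follow the snippet above. The assumption is that each
function pointer in the table has to be called with the module stored next
to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the real Open MPI definitions. */
typedef struct module { const char *owner; } module_t;

typedef int (*reduce_fn_t)(int *buf, int count, module_t *module);

typedef struct coll_table {
    reduce_fn_t coll_reduce;        /* who implements reduce ...        */
    module_t   *coll_reduce_module; /* ... and the module that goes with it */
} coll_table_t;

typedef struct comm { coll_table_t c_coll; } comm_t;

/* Stand-in for hierarch's reduce: it only accepts its own module. */
int hierarch_reduce(int *buf, int count, module_t *module)
{
    (void)buf; (void)count;
    assert(module != NULL);
    return strcmp(module->owner, "hierarch") == 0 ? 0 : -1;
}

/* Buggy pattern: forwards the caller's own (tuned) module. */
int tuned_reduce_scatter_buggy(comm_t *comm, int *buf, int count,
                               module_t *module)
{
    return comm->c_coll.coll_reduce(buf, count, module);
}

/* Fixed pattern: passes the module registered alongside coll_reduce. */
int tuned_reduce_scatter_fixed(comm_t *comm, int *buf, int count,
                               module_t *module)
{
    (void)module; /* tuned's own module is not what reduce expects */
    return comm->c_coll.coll_reduce(buf, count,
                                    comm->c_coll.coll_reduce_module);
}

/* Sets up a communicator where reduce comes from hierarch while
 * reduce_scatter comes from tuned, then runs one of the two variants. */
int demo(int use_fix)
{
    static module_t hierarch = { "hierarch" };
    static module_t tuned    = { "tuned" };
    comm_t comm;
    int buf[4] = { 1, 2, 3, 4 };

    comm.c_coll.coll_reduce        = hierarch_reduce;
    comm.c_coll.coll_reduce_module = &hierarch;

    return use_fix ? tuned_reduce_scatter_fixed(&comm, buf, 4, &tuned)
                   : tuned_reduce_scatter_buggy(&comm, buf, 4, &tuned);
}
```

In this sketch demo(0) returns -1, because the hierarch stand-in is handed
tuned's module, while demo(1) returns 0.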

Anyway, a quick glance through the tuned code reveals a significant
number of instances where this appears (reduce_scatter, allreduce,
allgather, allgatherv). Basic, hierarch and inter seem to do that mostly
correctly.


Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335