Open MPI Development Mailing List Archives

Subject: [OMPI devel] Implementation of MPI_Iprobe
From: Sébastien Boisvert (sebastien.boisvert.3_at_[hidden])
Date: 2011-09-27 14:36:42


Hello,

As I understand it, when MPI_Iprobe is called, the code that runs is the function pointed to by the field

mca_pml_base_module_iprobe_fn_t pml_iprobe;

in ompi/mca/pml/pml.h

In the file ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c (Open-MPI 1.4.3),
ompi_crcp_bkmrk_pml_iprobe calls drain_message_find_any.

In drain_message_find_any (in ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c), there is a loop over all MPI ranks
regardless of the peer parameter.
For instance, with 256 peers, probing for peer 255 requires 256 iterations while probing for peer 0 requires 1 iteration.
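
To make the cost concrete, here is a minimal sketch of a head-to-tail scan over a linked list of peers. It is a simplified model, not the actual code of drain_message_find_any: the type peer_ref_t and the functions build_peer_list, scan_cost and free_peer_list are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of the peer list (not the Open MPI type). */
typedef struct peer_ref {
    int rank;
    struct peer_ref *next;
} peer_ref_t;

/* Build a list holding ranks 0..nprocs-1 in ascending order. */
static peer_ref_t *build_peer_list(int nprocs)
{
    assert(nprocs > 0);
    peer_ref_t *head = NULL;
    for (int rank = nprocs - 1; rank >= 0; rank--) {
        peer_ref_t *node = malloc(sizeof *node);
        assert(node != NULL);
        node->rank = rank;
        node->next = head;   /* prepend, so rank 0 ends up first */
        head = node;
    }
    return head;
}

/* Walk from the head until `peer` is found.
 * Returns the number of entries visited, or -1 if `peer` is not in the list. */
static int scan_cost(const peer_ref_t *head, int peer)
{
    int visited = 0;
    for (const peer_ref_t *cur = head; cur != NULL; cur = cur->next) {
        visited++;
        if (cur->rank == peer) {
            return visited;
        }
    }
    return -1;
}

/* Release every node of the list. */
static void free_peer_list(peer_ref_t *head)
{
    while (head != NULL) {
        peer_ref_t *next = head->next;
        free(head);
        head = next;
    }
}
```

With a list built by build_peer_list(256), scan_cost visits 1 entry for peer 0 and 256 entries for peer 255, so the cost of a probe grows linearly with the rank being probed.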

As I understand it, the linked list ompi_crcp_bkmrk_pml_peer_refs is populated with nprocs entries where nprocs is presumably the number of MPI ranks in MPI_COMM_WORLD.

If my understanding is right, here are some suggestions:

1. ompi_crcp_bkmrk_pml_peer_refs should be an array indexed by rank so that, when peer is not MPI_ANY_SOURCE, MPI_Iprobe can return in constant time.

2. There should be some sort of round-robin mechanism for the case where the peer is MPI_ANY_SOURCE; otherwise lower ranks will be probed more often and higher ranks will
suffer from starvation. This could be done by keeping a current position in the peer list (or array, see point 1). Instead of starting at the first entry, the loop would start at the current position and
run for at most nprocs iterations.
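
The two suggestions above can be sketched together as follows. This is only an illustration under stated assumptions, not a patch against Open MPI: peer_table_t, probe_peer and ANY_SOURCE are hypothetical names, and a simple per-rank flag stands in for whatever per-peer state the real code keeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ANY_SOURCE (-1)   /* hypothetical stand-in for MPI_ANY_SOURCE */

/* Hypothetical per-communicator table, indexed directly by rank. */
typedef struct {
    bool *has_message;   /* has_message[r] is true when rank r has a pending message */
    int   nprocs;        /* number of ranks */
    int   cursor;        /* rank at which the next wildcard scan starts */
} peer_table_t;

/* Returns the rank that has a pending message, or -1 if none matches.
 * Like a probe, it does not consume the message. */
static int probe_peer(peer_table_t *t, int peer)
{
    assert(t != NULL && t->nprocs > 0);

    if (peer != ANY_SOURCE) {
        /* Suggestion 1: direct array lookup, constant time. */
        assert(peer >= 0 && peer < t->nprocs);
        return t->has_message[peer] ? peer : -1;
    }

    /* Suggestion 2: start at the saved position, visit at most nprocs entries. */
    for (int i = 0; i < t->nprocs; i++) {
        int rank = (t->cursor + i) % t->nprocs;
        if (t->has_message[rank]) {
            t->cursor = (rank + 1) % t->nprocs;   /* next scan starts after this rank */
            return rank;
        }
    }
    return -1;
}
```

For example, with 4 ranks where ranks 0, 2 and 3 have pending messages, successive wildcard probes return 0, 2, 3 and then 0 again, so no rank is favored just because it has a low number.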

A code review is on my blog: http://dskernel.blogspot.com/2011/09/code-review-what-happens-in-open-mpis.html

                                                     Sébastien