Pooja and I are working on a course project in which our main aim is to
schedule MPI and non-MPI calls, giving more priority to the MPI calls over
the non-MPI ones.
To keep things simple, we are making this scheduling static to some extent. By
static I mean that we know our clusters use InfiniBand for MPI (from our reading of
the Open MPI source code, this uses 'mca_btl_openib_send()' from
the ompi/mca/btl/openib/btl_openib.c file), so all the non-MPI communication can
be assumed to be TCP communication using 'mca_btl_tcp_send()' from the
To implement this, we plan to use the following simple algorithm:
- before calling 'mca_btl_openib_send()': Lock0(x);
- before calling 'mca_btl_tcp_send()': Lock1(x);
1. Allow Lock0(x) -> Lock0(x), meaning Lock0(x) may be followed by Lock0(x).
2. Allow Lock1(x) -> Lock1(x);
3. Do not allow Lock0(x) -> Lock1(x);
4. If Lock1(x) -> Lock0(x): since MPI calls are to have higher priority
than the non-MPI ones, in this case the non-MPI communication should be
paused and all its related data of course needs to be put into a queue
(meaning its status should be saved in a queue). All other, newer non-MPI
communications should also be added to this same queue. The MPI process
trying to perform Lock0(x) should then be allowed to complete, and only when
all the MPI communications are complete should the non-MPI communication be
allowed to continue.
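
The four rules above can be sketched as a small single-threaded model of the
gate. This is only a sketch under stated assumptions: 'gate_t', 'GATE_CAP' and
the 'gate_*' functions are hypothetical names, not part of Open MPI, each
non-MPI send is identified by a plain integer id, and "pausing" is modelled
simply as moving that id into the queue. How to actually pause an in-flight
TCP send is not addressed here.

```c
#include <assert.h>
#include <stddef.h>

#define GATE_CAP 32  /* hypothetical fixed capacity, for this sketch only */

typedef struct {
    int    mpi_active;        /* MPI sends currently holding Lock0 */
    int    active[GATE_CAP];  /* ids of non-MPI sends in flight (Lock1) */
    size_t n_active;
    int    queued[GATE_CAP];  /* paused or waiting non-MPI sends, oldest first */
    size_t n_queued;
} gate_t;

static void gate_init(gate_t *g)
{
    g->mpi_active = 0;
    g->n_active = 0;
    g->n_queued = 0;
}

/* Lock0: an MPI send arrives. It always proceeds (rule 1), and any non-MPI
 * sends in flight are paused by moving them to the queue (rule 4). */
static void gate_mpi_begin(gate_t *g)
{
    for (size_t i = 0; i < g->n_active; i++) {
        assert(g->n_queued < GATE_CAP);
        g->queued[g->n_queued++] = g->active[i];
    }
    g->n_active = 0;
    g->mpi_active++;
}

/* Lock1: a non-MPI send arrives. Returns 1 if it may proceed now (rule 2),
 * or 0 if it was queued because MPI traffic is in flight (rule 3). */
static int gate_tcp_begin(gate_t *g, int id)
{
    if (g->mpi_active > 0) {
        assert(g->n_queued < GATE_CAP);
        g->queued[g->n_queued++] = id;
        return 0;
    }
    assert(g->n_active < GATE_CAP);
    g->active[g->n_active++] = id;
    return 1;
}

/* A non-MPI send completes: drop its id from the in-flight list. */
static void gate_tcp_end(gate_t *g, int id)
{
    for (size_t i = 0; i < g->n_active; i++) {
        if (g->active[i] == id) {
            g->active[i] = g->active[--g->n_active];
            return;
        }
    }
}

/* An MPI send completes. Only when the last one finishes are the queued
 * non-MPI sends resumed, in queue order. Returns how many were resumed. */
static size_t gate_mpi_end(gate_t *g)
{
    assert(g->mpi_active > 0);
    if (--g->mpi_active > 0) {
        return 0;
    }
    size_t resumed = g->n_queued;
    for (size_t i = 0; i < g->n_queued; i++) {
        assert(g->n_active < GATE_CAP);
        g->active[g->n_active++] = g->queued[i];
    }
    g->n_queued = 0;
    return resumed;
}
```

In the scheme described above, 'gate_mpi_begin()' would correspond to the
Lock0(x) taken before 'mca_btl_openib_send()' and 'gate_tcp_begin()' to the
Lock1(x) taken before 'mca_btl_tcp_send()'. A multi-threaded version would
additionally need a mutex around the gate state.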
Currently we are working on a simple scheduling algorithm that does not give
any priority to the 'MPI_Send' calls.
However, to implement the project fully, we have the following questions:
- Can we abort or pause the non-MPI (TCP) communication in any way?
- Given the assumption that the non-MPI communication is TCP, can we
make use of the built-in structures (I mean the buffer already used) in
'mca_btl_tcp_send()' to implement point 4 of the algorithm above? And,
more importantly, how?