> So when you say you want your master to send "as fast as possible", I
> suppose you meant get back to running your code as soon as possible. In
> that case you would want nonblocking. However when you say you want the
> slaves to receive data faster, it seems you're implying the actual data
> transmission across the network. I believe the data transmission speed is
> not dependent on whether it is blocking or nonblocking.
Sorry, I did not express myself clearly. By 'as fast as possible' I meant
that I want all the data available on my slave nodes ASAP. The master has
nothing to do but send, so I do not care whether the sends are blocking or
non-blocking. In fact the master will use separate threads for the sending
anyway, so either I launch one thread per blocking send or just one thread
that does all the sending with non-blocking sends.
I do think there is plenty of reason to expect a difference in when the
data arrives at the slaves. If OpenMPI cannot offload the sending to
dedicated hardware (which is probably my case, since I'm on stock Linux
with stock Ethernet cards) and it tries to transmit the data from multiple
outstanding non-blocking sends simultaneously, then OpenMPI probably has to
multi-thread the sending of each message itself.
Then again, I do not know anything about the internals of OpenMPI, so I
really have no clue how it would do this or how it tries to optimise the
use of network bandwidth.