Just curious: has anyone compared latency measurements as the size of a
job changes? That is, vary the size of the job (and the number of nodes
used) and measure the half round-trip latency between just two of the
processes in the job. I am seeing the latency grow by roughly 5% for
each node added. This is with the TCP and uDAPL BTLs.
I am curious whether other BTLs show similar behavior, and whether some
sort of directed polling would help matters.