Josh ran some tests for me on Odin earlier today, and the results show a
major improvement in our startup/shutdown performance. As you may recall,
our times previously grew roughly exponentially with the number of
processes; as the attached graph shows, they now grow roughly linearly.
The data also show that the MPI_INIT penalty is fairly small. This is
because the data exchange is "encapsulated" in the initial data sent back
at the stage_1 trigger, which avoids any further overhead as the number
of processes grows. The data were taken using the rsh launcher.
We should be able to further improve our scalability once we (a)
incorporate a tree-based scheme into the rsh launcher and (b) utilize a
tree-based (or better) broadcast mechanism for sending the trigger
messages (right now, we send them linearly across the
Anyway, thought you might find this of interest.
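
P.S. To illustrate why a tree-based broadcast should help with the
trigger messages, here is a minimal Python sketch. It is not our actual
code: the function names are hypothetical, ranks are plain integers with
rank 0 as the sender, and a binomial tree is assumed as one possible
tree shape. It only compares the send schedules of the two schemes.

```python
def linear_sends(n):
    """Linear scheme: rank 0 sends the trigger to every other rank in turn."""
    return [(0, dst) for dst in range(1, n)]


def binomial_tree_rounds(n):
    """Binomial-tree scheme (an assumed tree shape, for illustration only).

    Returns a list of rounds; each round is a list of (src, dst) sends that
    could proceed in parallel. In every round, each rank that already holds
    the trigger forwards it to one rank that does not have it yet.
    """
    rounds = []
    have = 1  # ranks 0 .. have-1 already hold the trigger
    while have < n:
        sends = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(sends)
        have *= 2
    return rounds


if __name__ == "__main__":
    n = 8
    print("linear: sequential sends from rank 0 =", len(linear_sends(n)))
    for i, sends in enumerate(binomial_tree_rounds(n)):
        print("tree round", i, ":", sends)
```

Both schemes send the same total number of messages (n - 1), but in the
linear case they all leave rank 0 one after another, while in the tree
case the number of rounds grows only with the logarithm of n.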