I was pointing out that most programs
have some degree of elastic synchronization built in. Tasks (or groups
or components in a coupled model) seldom only produce data; they also consume
what other tasks produce, and that limits the potential skew.
If step n for a task (or group or coupled
component) depends on data produced by step n-1 in another task (or
group or coupled component), then no task can be more than one step
ahead of the task it depends on. If there are 2 tasks that
each need the other's step n-1 result to compute step n, then they can never
get more than one step out of sync. If there were a rank-ordered
loop of 8 tasks, so that each one needs the output of the prior step on
task ((me-1) mod tasks) to compute, then you can get more skew, because
if task 5 gets stalled in step 3:

  task 6 will finish step 3 and send results to 7, but stall on recv
    for step 4 (lacking the end-of-step-3 send by task 5)
  task 7 will finish step 4 and send results to 0, but stall on recv
    for step 5
  task 0 will finish step 5 and send results to 1, but stall on recv
    for step 6
  etc.
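
The ring example above can be checked with a small simulation. This is only a sketch under my own assumptions, not real MPI code: the function name ring_skew and its parameters are made up for illustration, a "step" is just a counter, and a task may finish step s only once its predecessor ((me-1) mod tasks) has finished step s-1.

```python
def ring_skew(num_tasks, stalled_task, stalled_in_step):
    """Return the last step each task completes while one task is stalled.

    Hypothetical model: task i can finish step s only after task
    (i - 1) % num_tasks has finished step s - 1.
    """
    # Assume every task has finished the step before the stall.
    done = [stalled_in_step - 1] * num_tasks
    progressed = True
    while progressed:
        progressed = False
        for task in range(num_tasks):
            if task == stalled_task:
                continue  # the stalled task never finishes its step
            prev = (task - 1) % num_tasks
            # Step done[task] + 1 needs step done[task] from the predecessor.
            if done[prev] >= done[task]:
                done[task] += 1
                progressed = True
    return done


if __name__ == "__main__":
    done = ring_skew(num_tasks=8, stalled_task=5, stalled_in_step=3)
    for task, step in enumerate(done):
        print(f"task {task}: last finished step {step}")
    print("max skew:", max(done) - min(done))
```

Under these assumptions, task 6 stops after step 3, task 7 after step 4, task 0 after step 5, and so on around the ring, so the task just before the stalled one ends up the farthest ahead.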
In a 2D or 3D grid, the dependency is
tighter, so the possible skew is less, but it is still significant on a
huge grid. In a program with frequent calls to MPI_Allreduce on MPI_COMM_WORLD,
the skew is very limited. The available skew gets harder to predict as
the interdependencies grow more complex.
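
To put a rough number on the grid case, here is a sketch that makes assumptions the discussion does not spell out: a non-periodic 2D grid where each task needs the previous step from its (up to) four nearest neighbours. The name grid_skew and its parameters are hypothetical.

```python
def grid_skew(rows, cols, stalled):
    """Largest lead any task can build over one stalled task.

    Hypothetical model: a task can finish step s only after all of its
    nearest neighbours (up, down, left, right) have finished step s - 1.
    """
    done = {(r, c): 0 for r in range(rows) for c in range(cols)}
    if stalled not in done:
        raise ValueError("stalled task must be a cell of the grid")

    def neighbours(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    progressed = True
    while progressed:
        progressed = False
        for cell in done:
            if cell == stalled:
                continue  # the stalled task makes no progress
            # Step done[cell] + 1 needs step done[cell] from every neighbour.
            if all(done[n] >= done[cell] for n in neighbours(*cell)):
                done[cell] += 1
                progressed = True
    return max(done.values()) - done[stalled]


if __name__ == "__main__":
    print("8x8 grid, corner task stalled:", grid_skew(8, 8, (0, 0)))
    print("8x8 grid, interior task stalled:", grid_skew(8, 8, (3, 3)))
```

In this model the lead a task can build equals its grid distance from the stalled task, so the skew is bounded by the size of the grid rather than by the total number of tasks. That is much tighter than a one-way ring with the same number of tasks, but it still grows as the grid gets bigger.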
I call this "elasticity" because
the amount of stretch varies but, like a bungee cord or a waistband,
only goes so far. Every parallel program has some degree of elasticity
built into the way its parts interact.
I assume a coupler has some elasticity
too. That is, ocean and atmosphere each model Monday and report in to the coupler,
but neither can model Tuesday until they get some of the Monday results
generated by the other. (I am pretending the granularity is day by day.) Wouldn't
the right level of synchronization among components result automatically
from the data dependencies among them?
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
From: Eugene Loh <eugene.loh@oracle.com>
To: Open MPI Users <users@open-mpi.org>
Date: 09/09/2010 12:40 PM
Subject: Re: [OMPI users] MPI_Reduce performance
Sent by: users-bounces@open-mpi.org
Gus Correa wrote:
> More often than not some components lag behind (regardless of how
> much you tune the number of processors assigned to each component),
> slowing down the whole scheme.
> The coupler must sit and wait for that late component,
> the other components must sit and wait for the coupler,
> and the (vicious) "positive feedback" cycle that
> Ashley mentioned goes on and on.
I think "sit and wait" is the "typical" scenario that Dick mentions.
Someone lags, so someone else has to wait.
In contrast, the "feedback" cycle Ashley mentions is where someone lags
and someone else keeps racing ahead, pumping even more data at the
laggard, forcing the laggard ever further behind.
_______________________________________________
users mailing list
users@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users