first of all, thanks for the info bill! i think i'm really starting to
piece things together now. you are right in that i'm working with a
6.x (6.2 with 6.1 devel libs ;) install here at cadence, without the
HPC extensions AFAIK. also, i think that our customers are mostly in
the same position -- i assume that the HPC extensions cost extra? or
perhaps admins just don't bother to install them.
so, there are at least three cases to consider:
LSF 7.0 or greater
LSF 6.x /w HPC
LSF 6.x 'base'
i'll try to gather more data, but my feeling is that the market
penetration of both HPC and LSF 7.0 is low in our market (EDA vendors
and customers). i'd love to just stall until 7.0 is widely available,
but perhaps in the meantime it would be nice to have some backward
support for LSF 6.0 'base'. it seems like supporting LSF 6.x /w HPC
might not be too useful, since:
a) it's not clear that the 'built in' "bsub -n N -a openmpi foo"
support will work with an MPI-2 dynamic-spawning application like mine
(or does it?),
b) i've heard that manually interfacing with the parallel application
manager directly is tricky?
c) most importantly, it's not clear that any of our customers have the
HPC support, and certainly not all of them, so i need to support LSF
6.0 'base' anyway -- it only needs to work until 7.0 is widely
available (< 1 year? i really have no idea ... will Platform end
support for 6.x at some particular time? or otherwise push customers
to upgrade? perhaps cadence can help there too ...).
under LSF 7.0 it looks like things are okay and that open-mpi will
support it in a released version 'soon' (< 6 months?). sooner than
our customers will have LSF 7.0 anyway, so that's fine.
as for LSF 6.0 'base', there are two workarounds that i see, and a
couple key questions that remain:
1) use bsub -n N, followed by N-1 ls_rtaske() calls (or similar).
while ls_rtaske() may not 'force' me to follow the queuing rules, if i
only launch on the proper machines, i should be okay, right? i don't
think IO and process marshaling (i'm not sure exactly what you mean by
that) are a problem since openmpi/orted handles those issues, i think?
2) use only bsub's of single processes, using some initial wrapper
script that bsub's all the jobs (master + N-1 slaves) needed to reach
the desired static allocation for openmpi. this seems to be what my
internal guy is suggesting is 'required'. integration with openmpi
might not be too hard, using suitable trickery. for example, the
wrapper script launches some wrapper processes that are basically
rexec daemons. the master waits for them to come up in the ras/lsf
component (tcp notify, perhaps via the launcher machine to avoid
needing to know the master hostname a priori), and then the pls/lsf
component uses the thin rexec daemons to launch orted. seems like a
bit of a silly workaround, but it does seem to both keep the queuing
system happy as well as not need ls_rtaske() or similar.
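
a rough sketch of the "wait for them to come up" step in (2), in python
rather than inside the actual ras/lsf component. everything here is
hypothetical: the function names, the one-line "hostname\n" check-in
message, and the use of localhost threads to stand in for the
bsub-launched wrapper processes. it only shows the tcp rendezvous, not
the bsub calls or the orted launch.

```python
import socket
import threading


def wait_for_daemons(server_sock, expected, timeout=30.0):
    """Accept connections until `expected` daemons have checked in.

    Each daemon is assumed (hypothetical protocol) to send one line
    containing its hostname. Returns the names in arrival order.
    """
    server_sock.settimeout(timeout)  # accept() raises socket.timeout if nobody shows up
    hosts = []
    while len(hosts) < expected:
        conn, _addr = server_sock.accept()
        with conn:
            conn.settimeout(timeout)
            buf = b""
            # read until newline or until the peer closes the connection
            while not buf.endswith(b"\n"):
                chunk = conn.recv(1024)
                if not chunk:
                    break
                buf += chunk
        hosts.append(buf.decode().strip())
    return hosts


def daemon_check_in(master_host, master_port, my_name):
    """What each thin wrapper process would do once its bsub job starts."""
    with socket.create_connection((master_host, master_port), timeout=10.0) as s:
        s.sendall((my_name + "\n").encode())


def demo(n_slaves=3):
    """Local stand-in: threads play the role of the N-1 slave jobs."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(n_slaves)
    port = server.getsockname()[1]

    threads = [
        threading.Thread(target=daemon_check_in,
                         args=("127.0.0.1", port, "slave%d" % i))
        for i in range(n_slaves)
    ]
    for t in threads:
        t.start()
    hosts = wait_for_daemons(server, n_slaves)
    for t in threads:
        t.join()
    server.close()
    return sorted(hosts)
```

in a real setup the master would publish its host/port somewhere the
slaves can find it (e.g. via the launcher machine, as mentioned above),
and the timeout would need to be much longer, since queued jobs may not
start for a while.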
[ Note: (1) will fail if admins disable the ls_rexec() type of
functionality, but on a LSF 6.0 'base' system, this would seem to
disable all parallel job launching -- i.e. the shipped mpijob/pvmjob all use
lsgrun and such, so they would be disabled -- is there any other way i
could start the sub-processes within my allocation in that case? can i
just have bsub start N copies of something (maybe orted?)? that seems
like it might be hard to integrate with openmpi, though -- in that
case, i'd probably only implement option (2). ]
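
for reference, the bookkeeping behind option (1) ("only launch on the
proper machines") could look roughly like the python sketch below. it
assumes the job was started with bsub -n N and that LSF has set
LSB_HOSTS to a space-separated list with one entry per allocated slot,
the first of which is taken by the master process itself. the
launch_remote callback is a hypothetical placeholder for whatever
actually wraps ls_rtaske(); no real LSF call is made here.

```python
import os


def remote_launch_plan(lsb_hosts):
    """Return the hosts for the N-1 remote launches, one entry per slot.

    `lsb_hosts` is the value of LSB_HOSTS. The first slot is assumed to
    be used by the master process that bsub started, so it is skipped.
    """
    slots = lsb_hosts.split()
    if not slots:
        raise ValueError("LSB_HOSTS is empty; not running inside an LSF job?")
    return slots[1:]


def launch_all(argv, launch_remote, lsb_hosts=None):
    """Call launch_remote(host, argv) once per remaining slot.

    `launch_remote` is a hypothetical stand-in for an ls_rtaske() wrapper.
    Returns the list of hosts that were launched on.
    """
    if lsb_hosts is None:
        lsb_hosts = os.environ.get("LSB_HOSTS", "")
    targets = remote_launch_plan(lsb_hosts)
    for host in targets:
        launch_remote(host, argv)
    return targets
```

staying strictly within this list is what keeps the launches inside the
allocation, even though nothing in the remote-task call itself enforces
the queuing rules.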
On 7/17/07, Bill McMillan <bmcmillan_at_[hidden]> wrote:
> > there appear to be some overlaps between the ls_* and lsb_* functions,
> > but they seem basically compatible as far as i can tell. almost all
> > the functions have a command line version as well, for example:
> > lsb_submit()/bsub
> Like openmpi and orte, there are two layers in LSF. The ls_* API's
> talk to what is/was historically called "LSF Base" and the lsb_* API's
> talk to what is/was historically called "LSF Batch".
> Bill McMillan
> Principal Technical Product Manager
> Platform Computing