Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] rankfile questions
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-03-19 08:01:41

Yes, you're right -- we should have reviewed this code 2 weeks ago
when you asked. Sorry about that. :-\

Per adding lots of affinity code in ompi_mpi_init.c: perhaps those
code belongs down in the paffinity (or rmaps?) base. It doesn't have
to become part of any specific paffinity component (because it can be
used with any paffinity component) This makes it callable by anyone
(including orte) and keeps the abstraction barriers clean.

On Mar 19, 2008, at 5:36 AM, Lenny Verkhovsky wrote:

>>>> 1. I see a getenv("slot_list") in the MPI side of the code; it
> looks
>>>> like $slot_list is set by the odls for the MPI process. Why isn't
> it
>>>> an MCA parameter? That's what all other values passed by the orted
> to
>>>> the MPI process appear to be.
> "slot_list" consist of socket:core pair for the rank to be bind to.
> This
> info changes according to rankfile and different for each node and
> rank,
> therefore it cannot be passed via mca parameter.

I don't follow the logic here. MCA parameters can certainly be unique
per MPI process...

Remember that MCA parameters can be environment variables. The
advantage of using MCA params as env variables is that we enforce a
common prefix to ensure that we don't collide with other environment
variables. There's functions to get the environment variable names of
MCA parameters, for example, so that you can setenv them to pass them
to another process (e.g., in the odls). Then you use the normal MCA
parameter lookup functions to retrieve them in the target/receiver

>>>> 2. I see that ompi_mpi_params.c is now registering 2 rmaps-level
>>>> parameters. Why? Shouldn't these be in ORTE somewhere?
> If you mean paffinity_alone and rank_file_debug, then
> 1. paffinity_alone was there before.
> 2. After getting some answers from Ralph about orte_debug in
> ompi_mpi_init I intend to introduce ompi_debug mca parameter that will
> be used in this library and rank_file_debug will be removed.

rmaps_rank_file_path and rmaps_rank_file_debug. These have no place
being registered in the OMPI layer.

It looks like rank_file_path is only registered in ompi_mpi_init.c as
an error check. Why isn't this done in the rmaps rankfile component
itself? This would execute in mpirun and avoid launching at all if an
error is detected (vs. detecting the error in each MPI process and
aborting each one).

>>> A few more notes:
>>> 3. Most of the files in orte/mca/rmaps/rankfile do not obey the
> prefix
>>> rule. I think that they should be renamed.
> Rank_file component was copied from round_robin, I thought it would be
> strange if it would look differently.

Blah -- it looks like round robin's files don't adhere to the prefix
rule. In fairness, those files *may* be so old to predate the prefix

Regardless, I think the rankfile files should be named in accordance
with the rest of the code base and adhere to the prefix rule. round
robin should probably be fixed as well.

>>> 4. A quick look through rankfile_lex.l seems to show that there are
>>> global variables that are not protected by the prefix rule (or
>>> static). Ditto in rmaps_rf.c. These should be fixed.
> What do you mean?

 From lex.l:

int rank_file_line=1;
rank_file_value_t rank_file_value;
bool rank_file_done = false;

These are neither static nor do they adhere to the prefix rule
(obviously, if a symbol is static, it doesn't have to adhere to the
prefix rule). Ditto for "rank_file_path" and "rankmap" in
rmaps_rf.c. There may be others; that's all I looked through (e.g., I
didn't check other files or check function symbols).

Jeff Squyres
Cisco Systems