Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] DDT and spawn issue?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-07-15 12:18:09


Perhaps we should add a requirement for testing on 2-3 different
systems before long-term (or "big change") branches like this come to
the trunk? I say this because it seems like at least some of these
problems were based on bad luck -- i.e., the stuff worked on the
platform that it was being tested and developed on, even though there
are bugs left. Having fallen victim to this myself many times
("worked for me on Cisco machines! I dunno why it's failing for
you... :-("), I think we all recognize the value of just running the
same code on someone else's systems -- it has a good tendency to turn
up issues that don't show up on yours. I'm not trying to say that
every little trunk commit needs to be validated -- but "big" changes
like this could certainly benefit from multiple validations.

Cisco is very willing to be a 2nd platform for testing for stuff that
we can run without too much trouble, especially via MTT (e.g., I
already have the right kind of networks to test, etc.).

BTW, is anyone going to comment about the latency issue that I asked
about?

(in case you can't tell, I'm moderately displeased about how this
whole branch came to the trunk... :-\ )

On Jul 15, 2009, at 12:04 PM, Rainer Keller wrote:

> Hi Jeff,
> Ralph and Edgar forwarded an email about this.
> We (George and myself) are currently looking into this.
>
> With the changes we have, I can get IBM/spawn to work "sometimes"; i.e.,
> at other times it segfaults.
>
> Thanks,
> Rainer
>
>
>
>
> On Wednesday 15 July 2009 11:50:13 am Jeff Squyres wrote:
> > I [very briefly] read about the DDT spawn issues, so I went to look at
> > ompi/op/op.c. I notice that there's a new comment above the op
> > datatype<-->op map construction area that says:
> >
> > /* XXX TODO */
> >
> > svn blame says:
> >
> > 21641 rusraink /* XXX TODO */
> >
> > r21641 is the big merge from the past weekend where the DDT split came
> > in.
> >
> > Has this area been looked at and the comment is out of date? Or does
> > it need to be updated with new mappings? (I honestly have not looked
> > any farther than this -- the new comment caught my eye)
>
> --
> ------------------------------------------------------------------------
> Rainer Keller, PhD Tel: +1 (865) 241-6293
> Oak Ridge National Lab Fax: +1 (865) 241-4811
> PO Box 2008 MS 6164 Email: keller_at_[hidden]
> Oak Ridge, TN 37831-2008 AIM/Skype: rusraink
>
>
>

-- 
Jeff Squyres
Cisco Systems