On Jun 7, 2007, at 7:25 AM, Nysal Jan wrote:
> I'll clean up the code and add the granular selection part. It should
> be ready by Monday.
> On 6/6/07, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> Ok -- so did you want to go ahead and make these changes, or did you
> want me to do it?
> Either way, I'd be in favor of all this stuff coming to the trunk in
> the Very Near Future. :-)
> On Jun 6, 2007, at 7:02 AM, Nysal Jan wrote:
> > Hi Jeff,
> > 1. The logic for if_exclude was not correct. I committed a fix for
> > it. https://svn.open-mpi.org/trac/ompi/changeset/14748
> > Thanks
> > 2. I'm a bit confused on a) how the new MCA params max_num_hcas and
> > map_num_procs_per_hca are supposed to be used and b) what their
> > default values should be.
> > Probably these params (and relevant code) should be removed now,
> > since there is a plan for a generic Socket/Core to HCA mapping
> > scheme. max_num_hcas is the maximum number of HCAs a task can use.
> > E.g., if map_num_procs_per_hca is 3 and max_num_hcas is 2, then on
> > any node, tasks 1/2/3 are mapped to hca1 & hca2, tasks 4/5/6 are
> > mapped to hca3 & hca4, ...
> > The default values were set to 1 (that's what we needed at that
> > point in time). They need to be modified so that ompi's default
> > behaviour remains unchanged (i.e., use all HCAs).
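The mapping in the example above can be sketched as follows. This is only an illustration in Python, not the C code in the openib BTL: the function name `hcas_for_task` and the 1-based task and HCA numbering are assumptions taken from the "task 1/2/3 ... hca1 & hca2" wording, and the thread does not say what happens when a node has fewer HCAs than the mapping asks for, so the sketch does not handle that case.

```python
def hcas_for_task(task, map_num_procs_per_hca, max_num_hcas):
    """Return the 1-based HCA indices used by a 1-based task on one node.

    Hypothetical helper: consecutive groups of `map_num_procs_per_hca`
    tasks share one block of `max_num_hcas` consecutive HCAs.
    """
    if task < 1 or map_num_procs_per_hca < 1 or max_num_hcas < 1:
        raise ValueError("all arguments must be >= 1")
    # Tasks 1..N fall into group 0, N+1..2N into group 1, and so on.
    group = (task - 1) // map_num_procs_per_hca
    # Each group owns its own block of max_num_hcas HCAs.
    first = group * max_num_hcas + 1
    return list(range(first, first + max_num_hcas))


if __name__ == "__main__":
    # Values from the example: 3 procs per HCA block, 2 HCAs per task.
    for task in range(1, 7):
        print(task, hcas_for_task(task, 3, 2))
```

With both parameters left at 1, this sketch gives every task exactly one HCA of its own, which is the departure from the "all procs use all HCAs" default that is discussed in 2b.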
> > 2a. I don't quite understand the logic of is_hca_allowed(); I could
> > not get it to work properly. Specifically, I have 2 machines each
> > with 2 HCAs (mthca0 has 1 port, mthca1 has 2 ports). If I ran 2
> > procs (regardless of byslot or bynode), is_hca_allowed() would
> > return false for the 2nd proc. So I put a temporary override in
> > is_hca_allowed() to simply always return true. Can you explain how
> > the logic is supposed to work in that function?
> > Explained above
> > 2b. The default values of max_num_hcas and map_num_procs_per_hca are
> > both 1. Based on my (potentially flawed) understanding of how these
> > MCA params are meant to be used, this is different than the current
> > default behavior. The current default is that all procs use all
> > ACTIVE ports on all HCAs. I *think* your new default param values
> > will set each proc to use the ACTIVE ports on exactly one HCA,
> > regardless of how many there are in the host. Did you mean to do that?
> > Also: both values must currently be >=1; should we allow -1 for both
> > of these values, meaning that they can be "infinite" (i.e., based on
> > the number of HCAs in the host)?
> > Yes, the defaults need to be changed. I'll also make the selection
> > logic more granular (e.g., -mca mca_btl_openib_if_include
> > mthca0:1,mthca1:1)
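One way to read a granular value such as `mthca0:1,mthca1:1` is as a comma-separated list of `device:port` entries. The sketch below assumes that reading, and also assumes that a bare device name with no `:port` suffix means "all ports on that device". It is written in Python for brevity; `parse_if_include` and `is_port_allowed` are hypothetical names, not functions from the Open MPI source.

```python
def parse_if_include(spec):
    """Parse 'mthca0:1,mthca1' into [(device, port_or_None), ...].

    A port of None stands for "every port on this device" (an assumption).
    """
    selections = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        device, sep, port = item.partition(":")
        if not device:
            raise ValueError("missing device name in %r" % item)
        if sep:
            # int() raises ValueError for an empty or non-numeric port.
            selections.append((device, int(port)))
        else:
            selections.append((device, None))
    return selections


def is_port_allowed(selections, device, port):
    """True if (device, port) matches any parsed entry."""
    return any(d == device and (p is None or p == port)
               for d, p in selections)


if __name__ == "__main__":
    sel = parse_if_include("mthca0:1,mthca1:1")
    print(sel)
    print(is_port_allowed(sel, "mthca1", 2))
```

Keeping the parsed result as (device, port) pairs means an exclude list could reuse the same parser and simply negate the match.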
> > --Nysal
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Jeff Squyres
> Cisco Systems