Nadia,
Interesting. I haven't tried pushing this to levels above 8 on a particular
machine. Do you think that cpuset / paffinity / hwloc applies only at
the machine level, at which point you need to employ a graph with carto?
Regards,
Ken
-----Original Message-----
From: devel-bounces_at_[hidden] [mailto:devel-bounces_at_[hidden]] On
Behalf Of nadia.derbey
Sent: Monday, August 29, 2011 5:45 AM
To: Open MPI Developers
Subject: [OMPI devel] known limitation or bug in hwloc?
Hi list,
I'm hitting a limitation with paffinity/hwloc with cpu numbers >= 64.
In opal/mca/paffinity/hwloc/paffinity_hwloc_module.c, module_set() is
the routine that sets the calling process affinity to the mask given as
parameter. Note that "mask" is an opal_paffinity_base_cpu_set_t (so we
allow the cpus to be potentially numbered up to
OPAL_PAFFINITY_BITMASK_CPU_MAX - 1).
The problem with module_set() is that it loops over
OPAL_PAFFINITY_BITMASK_T_NUM_BITS bits to check if these bits are set in
the mask:
    for (i = 0; ((unsigned int) i) < OPAL_PAFFINITY_BITMASK_T_NUM_BITS; ++i) {
        if (OPAL_PAFFINITY_CPU_ISSET(i, mask)) {
            hwloc_bitmap_set(set, i);
        }
    }
Given "mask"'s type, I think module_set() should instead loop over
OPAL_PAFFINITY_BITMASK_CPU_MAX bits.
Note that module_set() uses a type for its internal mask that is
coherent with OPAL_PAFFINITY_BITMASK_T_NUM_BITS (hwloc_bitmap_t).
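To illustrate the effect outside of Open MPI, here is a minimal standalone
sketch. Every name in it (bitmask_t, BITMASK_T_NUM_BITS, BITMASK_CPU_MAX,
cpu_set_sketch_t and the helper functions) is a hypothetical stand-in, and
the value 1024 and the multi-word layout are assumptions rather than the
actual OPAL definitions. It only shows why a loop bounded by the bit width
of one mask word cannot see bits stored in later words:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real OPAL definitions. */
typedef unsigned long bitmask_t;
#define BITMASK_T_NUM_BITS   ((unsigned int)(sizeof(bitmask_t) * CHAR_BIT))
#define BITMASK_CPU_MAX      1024u   /* assumed value, for illustration */
#define BITMASK_NUM_ELEMENTS (BITMASK_CPU_MAX / BITMASK_T_NUM_BITS)

/* A CPU set stored as an array of words, one bit per CPU. */
typedef struct {
    bitmask_t bitmask[BITMASK_NUM_ELEMENTS];
} cpu_set_sketch_t;

/* Clear every bit in the set. */
static void sketch_cpu_zero(cpu_set_sketch_t *m)
{
    memset(m, 0, sizeof(*m));
}

/* Mark one CPU as present in the set. */
static void sketch_cpu_set(unsigned int cpu, cpu_set_sketch_t *m)
{
    assert(cpu < BITMASK_CPU_MAX);
    m->bitmask[cpu / BITMASK_T_NUM_BITS] |=
        (bitmask_t)1 << (cpu % BITMASK_T_NUM_BITS);
}

/* Return nonzero if the given CPU is present in the set. */
static int sketch_cpu_isset(unsigned int cpu, const cpu_set_sketch_t *m)
{
    return (m->bitmask[cpu / BITMASK_T_NUM_BITS] >>
            (cpu % BITMASK_T_NUM_BITS)) & 1u;
}

/* Count the set bits that a loop over [0, limit) would find and copy. */
static unsigned int count_visible(const cpu_set_sketch_t *m,
                                  unsigned int limit)
{
    unsigned int i, n = 0;
    for (i = 0; i < limit; ++i) {
        if (sketch_cpu_isset(i, m)) {
            ++n;
        }
    }
    return n;
}
```

With CPUs 3 and 70 set in such a mask, count_visible(&mask,
BITMASK_T_NUM_BITS) finds only CPU 3 on platforms where unsigned long is at
most 64 bits wide, while count_visible(&mask, BITMASK_CPU_MAX) finds both.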
So I'm wondering whether this is a known limitation I've never heard of
or an actual bug?
Regards,
Nadia