Gus Correa wrote:
> 2) However, running with "sm" still breaks, unfortunately:
> I get the same errors that I reported in my very first email, if I
> increase the number of processes to 16, to explore the hyperthreading
> range. This is using "sm" (i.e., not excluded in the MCA config file) and
> btl_sm_num_fifos (set on the mpiexec command line).
> The machine hangs, requires a hard reboot, etc, etc, as reported earlier.
Okay. I think this is different from trac 2043, then, since that
involved a race condition that can be worked around by giving each
sender its own FIFO.
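For reference, that workaround amounts to raising btl_sm_num_fifos on the
command line, roughly as sketched below. The value 15 (np-1 for a
16-process run) and the program name ./a.out are illustrative assumptions
only, not settings taken from your run.

```shell
# Sketch only: the process count, the FIFO count, and ./a.out are placeholders.
# btl_sm_num_fifos is the sm BTL's MCA parameter for the number of FIFOs.
mpiexec --mca btl_sm_num_fifos 15 -np 16 ./a.out
```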
> So, I guess the conclusion is that I can use sm, but I have to remain
> within the range of physical cores (8), not oversubscribe, not try to
> explore the HT range. Should I expect it to work also for np>number
> of physical cores?
Yes, I believe that would be a reasonable expectation (under
circumstances other than the ones you're facing, in any case). I just
ran the examples/connectivity_c.c test with GCC on an 8-core Nehalem
system with HT turned on and tested up to np=64.
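A run of that kind looks roughly like the following. The output file name
and building from the top of the Open MPI source tree are assumptions for
illustration; lower np to taste.

```shell
# Sketch, assuming the current directory is the top of the Open MPI source
# tree and that the mpicc wrapper is using GCC underneath.
mpicc examples/connectivity_c.c -o connectivity_c
# Oversubscribed run: 64 processes on the 8-core (16 hardware thread) node.
mpiexec -np 64 ./connectivity_c
```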