>> We would rather that Open MPI use the shared-memory (sm) module when
>> running intra-node processes.
> Doesn't PSM use shared memory to communicate between peers on the same
> node?
Possibly, yes (I'm not sure). Even if it does, it appears to consume a
'hardware context' for each peer, which is what we want to avoid.
>> We believe that by using our scheduler's allocation policy (packing)
>> and considering our job mix, we might be able to add nodes to this
>> cluster using only one HCA per node (again, we would rather not use
>> 'shared contexts').
> Are you saying that you want Open MPI to not use PSM when the job
> entirely fits within a single node?
Yes, considering that using sm instead of psm would conserve
hardware contexts (and thus reduce the need for HCAs).
> If so, you might want to experiment with the pre-job hook in the job
> scheduler. You could try setting MCA parameters as environment
> variables (e.g., setenv OMPI_MCA_pml ob1 -- which would exclude the CM
> PML and therefore the PSM MTL) if your pre-job hook can tell if the job
> fits entirely on a single node.
> Does that help?
That's an interesting idea that I will investigate.
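
As a starting point, here is a minimal sketch of what such a hook fragment
might look like in POSIX sh. It assumes the scheduler provides a file listing
the allocated hosts, one per line (possibly repeated once per slot); the
function names count_nodes and select_pml are made up for illustration. It
also assumes that variables exported by the hook actually reach the job's
environment, which depends on the scheduler and needs to be verified.

```shell
#!/bin/sh
# Sketch only: count_nodes and select_pml are hypothetical helper names.
# The argument is assumed to be a hostfile with one allocated host per line,
# where a host may be repeated once per slot.

# Print the number of distinct hosts listed in the hostfile "$1".
count_nodes() {
    sort -u "$1" | grep -c .
}

# If the whole job sits on one node, select the ob1 PML so that the CM PML
# (and therefore the PSM MTL) is excluded, as suggested above.
select_pml() {
    if [ "$(count_nodes "$1")" -eq 1 ]; then
        OMPI_MCA_pml=ob1
        export OMPI_MCA_pml
    fi
}
```

Under Torque/PBS, for example, this could be called as
select_pml "$PBS_NODEFILE"; other schedulers expose the host list
differently. For multi-node jobs the variable is left untouched, so Open MPI
keeps its default selection (PSM in our case).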