Dave,
Thanks for the suggestion. Adding "-mca plm ^rshd" did force mpirun to
spawn things via qrsh rather than SSH. My problem is solved!
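
For anyone finding this in the archive: besides passing the option on the
mpirun command line, the same setting can go in an MCA parameter file, as
Dave mentions below. A minimal sketch, assuming the usual file locations
(the per-user path and the install prefix are assumptions about your setup,
not something confirmed in this thread):

```
# Per user:   $HOME/.openmpi/mca-params.conf
# Site-wide:  <prefix>/etc/openmpi-mca-params.conf
# Exclude the rshd launcher; in my case this made mpirun spawn via qrsh.
plm = ^rshd
```

With that in place, mpirun should not need the "-mca plm ^rshd" argument.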
--
Brian McNally
On 02/16/2012 03:05 AM, Dave Love wrote:
> Brian McNally <bmcnally_at_[hidden]> writes:
>
>> Hi Dave,
>>
>> I looked through the INSTALL, VERSION, NEWS, and README files in the
>> 1.5.4 openmpi tarball but didn't see what you were referring to.
>
> I can't access the web site, but there's an item in the notes on the
> download page about the bug. It must also be in the mail archive in a
> thread with me included.
>
>> Are
>> you suggesting that I launch mpirun similar to this?
>>
>> mpirun -mca plm ^rshd ...?
>
> Yes, or put it in openmpi-mca-params.conf. It's harmless with 1.4.
>
>> What I meant by "the same parallel environment setup" was that the PE
>> in SGE was defined the same way:
>>
>> $ qconf -sp orte
>> pe_name orte
>> slots 9999
>> user_lists NONE
>> xuser_lists NONE
>> start_proc_args /bin/true
>> stop_proc_args /bin/true
>> allocation_rule $round_robin
>> control_slaves TRUE
>> job_is_first_task FALSE
>> urgency_slots min
>> accounting_summary FALSE
>>
>> Even though I have RHEL 5 and RHEL 6 nodes in the same cluster they
>> never run the same MPI job; it's always either all RHEL 5 nodes or all
>> RHEL 6.
>
> OK. (I'd expect a separate PE for each node set to enforce that.)
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users