Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Fwd: How openmpi-1.6.3 using nodes which not LSF dispatch?
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-02-02 09:43:50


I'm afraid this doesn't make much sense to me. LSF has dispatched node1 and node2 - correct? It sounds like you have also given those names aliases that refer to their IB ports - generally a very bad practice, but let's set that aside for now.

If they are the same physical nodes, then the node name makes no difference - OMPI will see both TCP and IB on the node and use them. You can control which interfaces get used by simply telling OMPI on its command line:

mpirun -mca btl tcp,sm,self ... will use shared memory and TCP

mpirun -mca btl openib,sm,self ... will use IB and shared memory

Using host names to try and control which network gets used isn't going to work - the software is too smart to be fooled that way.
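As a rough sketch of how this could look inside an LSF job script: leave the hostfile LSF generated untouched (node1, node2) and pick the network with the btl parameter instead. The select_btl helper, its "ib"/"tcp" labels, and the hosts.txt and ./my_app defaults are made up for illustration; only the mpiexec options come from the messages in this thread.

```shell
#!/bin/sh
# Hypothetical helper: map a network label to an Open MPI BTL list.
# The labels "ib" and "tcp" are invented for this sketch; they are not
# Open MPI options themselves.
select_btl() {
    case "$1" in
        ib)  echo "openib,sm,self" ;;   # InfiniBand + shared memory
        tcp) echo "tcp,sm,self" ;;      # TCP + shared memory
        *)   echo "unknown network: $1" >&2; return 1 ;;
    esac
}

# HOSTFILE is assumed to be the unmodified file LSF produced, and
# COMMANDLINE the application plus its arguments. The defaults below
# are placeholders so the sketch can be tried outside LSF.
HOSTFILE="${HOSTFILE:-hosts.txt}"
COMMANDLINE="${COMMANDLINE:-./my_app}"

BTL=$(select_btl ib)
MPI_CMD="mpiexec -machinefile $HOSTFILE -mca btl $BTL $COMMANDLINE"

# Print the command rather than running it, so nothing is launched here.
echo "$MPI_CMD"
```

In a real job script the final echo would be replaced by running the command itself. The point of the design is that the host names stay exactly as LSF allocated them, so the "node not present in the allocation" check passes, and the interconnect is chosen only through -mca btl.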

On Feb 2, 2013, at 6:33 AM, HM Li <lihm0_at_[hidden]> wrote:

> Can you help me?
>
> bnode1,bnode2 and node1,node2 are hostnames of the same nodes, corresponding to the InfiniBand and Ethernet networks respectively.
> node1 and node2 are the nodes declared in lsf.cluster.name.
> To use the IB network, I modified the LSF mpijob script so that the HOSTFILE listing the nodes LSF dispatched uses the bnode names instead of the node names.
> I then use "mpiexec -machinefile $HOSTFILE $COMMANDLINE" to run my jobs.
> But the job exits and shows:
> -------------------------------------------------------------
> A hostfile was provided that contains at least one node not
> present in the allocation:
>
> hostfile: /home/nic/hmli/.lsbatch/bhost23263.node1
> node: bnode2
>
> If you are operating in a resource-managed environment, then only
> nodes that are in the allocation can be used in the hostfile. You
> may find relative node syntax to be a useful alternative to
> specifying absolute node names see the orte_hosts man page for
> further information.
> -------------------------------------------------------------
>
> I don't want to change the hostname from node to bnode in lsf.cluster.name.
>
> Thank you very much.