
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] How to add nodes while running job
From: Rayson Ho (raysonlogin_at_[hidden])
Date: 2011-08-27 10:28:15


On Sat, Aug 27, 2011 at 9:12 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> OMPI has no way of knowing that you will turn the node on at some future
> point. All it can do is try to launch the job on the provided node, which
> fails because the node doesn't respond.
> You'll have to come up with some scheme for telling the node to turn on in
> anticipation of starting a job - a resource manager is typically used for
> that purpose.

Hi Ralph,

Are you referring to a specific resource manager/batch system? AFAIK,
no common batch system supports MPI_Comm_spawn properly...
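
Short of a resource manager, one stopgap for the failure quoted below is to
drop unreachable nodes from the hostfile before calling mpirun, so the launch
does not abort on the powered-off machine. The following is only a minimal
sketch under stated assumptions: the function names (parse_hostfile,
ssh_port_open, reachable_lines) are hypothetical helpers, the check is a plain
TCP connect to port 22 (which says nothing about whether ssh login or Open MPI
itself works on that node), and hostfile lines are assumed to start with the
host name, optionally followed by fields such as "slots=N". It does not add
nodes to a job that is already running.

```python
import socket
import sys


def parse_hostfile(text):
    """Return (host, cleaned_line) pairs, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        stripped = line.split("#", 1)[0].strip()
        if stripped:
            # Assumption: the first whitespace-separated field is the host.
            entries.append((stripped.split()[0], stripped))
    return entries


def ssh_port_open(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "No route to host", refused connections, timeouts, DNS errors.
        return False


def reachable_lines(text, probe=ssh_port_open):
    """Keep only the hostfile lines whose host passes the probe."""
    return [line for host, line in parse_hostfile(text) if probe(host)]


if __name__ == "__main__":
    # Usage: python filter_hosts.py /tmp/hosts /tmp/hosts.up
    src, dst = sys.argv[1], sys.argv[2]
    with open(src) as f:
        kept = reachable_lines(f.read())
    with open(dst, "w") as f:
        f.write("\n".join(kept) + "\n")
```

The filtered file (here the hypothetical /tmp/hosts.up) would then be passed
to mpirun in place of /tmp/hosts. The probe is a parameter so that a different
check, such as a ping or a query to a resource manager, can be swapped in.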

Rayson

> On Aug 27, 2011, at 6:58 AM, Rafael Braga wrote:
>
> I would like to know how to add nodes during job execution.
> My hostfile currently lists the node 10.0.0.23, which is powered off.
> I would like to start this node during execution so that the job can use it.
> When I run the command:
>
> mpirun -np 2 -hostfile /tmp/hosts application
>
> the following message appears:
>
> ssh: connect to host 10.0.0.23 port 22: No route to host
> --------------------------------------------------------------------------
> A daemon (pid 10773) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
>
> thanks a lot,
>
> --
> Rafael Braga
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
Rayson
==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/