Open MPI User's Mailing List Archives

Subject: [OMPI users] --rankfile
From: Nulik Nol (nuliknol_at_[hidden])
Date: 2009-08-18 16:53:06


Hi,
I get this error when I use --rankfile:
"There are not enough slots available in the system to satisfy the 2 slots"
What could be the problem? I have tried using '*' for the 'slot' parameter and
many other configurations without any luck. Without --rankfile everything
works fine. I would appreciate any help.

master waver # cat neat.hostfile
n64 max-slots=1 slots=1
master max-slots=1 slots=1
master waver # cat neat.rankfile
rank 0=n64 slot=0
rank 1=master slot=0
master waver # mpirun --rankfile neat.rankfile --hostfile neat.hostfile -n 2 /tmp/neat
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
    /tmp/neat

Either request fewer slots for your application, or make more slots available
for use.

--------------------------------------------------------------------------
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

master waver # mpirun --hostfile neat.hostfile -n 2 /tmp/neat
entering master main loop
recieved msg from 1
unknown message 0
^Cmpirun: killing job...

--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 13064 on node master
exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpirun during cleanup)
mpirun: clean termination accomplished

master waver #
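
As a sanity check on the two files above, here is a rough Python sketch that cross-checks a rankfile against a hostfile. It is not Open MPI's parser: the functions `parse_hostfile`, `parse_rankfile` and `check` are hypothetical helpers, and the parsing rules (one host per line with `slots=N`, one `rank N=host slot=X` per line) are assumptions based only on the files shown in this post. It only verifies that every host named in the rankfile appears in the hostfile and that no host is given more ranks than its `slots` value.

```python
import re
from collections import Counter


def parse_hostfile(text):
    """Return {host: slots} from lines like 'n64 max-slots=1 slots=1' (assumed syntax)."""
    slots = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host, *opts = line.split()
        kv = dict(o.split("=", 1) for o in opts if "=" in o)
        slots[host] = int(kv.get("slots", 1))  # assumption: default to 1 slot
    return slots


def parse_rankfile(text):
    """Return {rank: host} from lines like 'rank 0=n64 slot=0' (assumed syntax)."""
    ranks = {}
    for line in text.splitlines():
        m = re.match(r"\s*rank\s+(\d+)\s*=\s*(\S+)\s+slot=(\S+)", line)
        if m:
            ranks[int(m.group(1))] = m.group(2)
    return ranks


def check(hostfile_text, rankfile_text):
    """List mismatches between the rankfile and the hostfile."""
    slots = parse_hostfile(hostfile_text)
    ranks = parse_rankfile(rankfile_text)
    problems = []
    for host, n in Counter(ranks.values()).items():
        if host not in slots:
            problems.append(f"{host}: not in hostfile")
        elif n > slots[host]:
            problems.append(f"{host}: {n} ranks but only {slots[host]} slots")
    return problems


# The two files from this post.
HOSTFILE = """n64 max-slots=1 slots=1
master max-slots=1 slots=1
"""
RANKFILE = """rank 0=n64 slot=0
rank 1=master slot=0
"""

print(check(HOSTFILE, RANKFILE))  # prints []
```

Under these assumptions the two files are consistent with each other (one rank per host, one slot per host), so this kind of mismatch does not explain the error.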

-- 
==================================
The power of zero is infinite