Open MPI User's Mailing List Archives

Subject: [OMPI users] problem: help-hostfile.txt: Too many open files in system.
From: Mariana Vargas Magana (mmarianav_at_[hidden])
Date: 2013-01-04 12:48:13


Hello Open MPI users:

I was running a program that usually works well on the cluster when suddenly, at the 32nd iteration, I got the strange set of errors shown below. I would appreciate it if someone could give me a hint about the cause of the problem and how to solve it.

Thanks!

Mariana

/usr/bin/ssh: error while loading shared libraries: libcrypt.so.1: cannot open shared object file: Error 23
/usr/bin/ssh: error while loading shared libraries: libutil.so.1: cannot open shared object file: Error 23
/usr/bin/ssh: error while loading shared libraries: libfipscheck.so.1: cannot open shared object file: Error 23
/usr/bin/ssh: error while loading shared libraries: libkrb5.so.3: cannot open shared object file: Error 23
--------------------------------------------------------------------------
A daemon (pid 1486) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
    no-hostfile
But I couldn't open the help file:
    /home/mvargas/openmpi/share/openmpi/help-hostfile.txt: Too many open files in system. Sorry!
--------------------------------------------------------------------------
[ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file base/ras_base_allocate.c at line 200
[ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file base/plm_base_launch_support.c at line 99
[ferrari:01490] [[65228,0],0] ORTE_ERROR_LOG: Not found in file plm_rsh_module.c at line 1167
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
    no-hostfile
But I couldn't open the help file:
    /home/mvargas/openmpi/share/openmpi/help-hostfile.txt: Too many open files in system. Sorry!
--------------------------------------------------------------------------
[ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file base/ras_base_allocate.c at line 200
[ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file base/plm_base_launch_support.c at line 99
[ferrari:01491] [[65229,0],0] ORTE_ERROR_LOG: Not found in file plm_rsh_module.c at line 1167
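
A note on reading the output above: on Linux, errno 23 is ENFILE, whose message is "Too many open files in system", so the "Error 23" lines from ssh and the help-file message appear to report the same condition (the system-wide file-handle limit, not a per-process one). The following is a minimal diagnostic sketch, not part of the original post, assuming a Linux node where /proc/sys/fs/file-nr is readable. The function names parse_file_nr and handle_usage are hypothetical helpers introduced here for illustration.

```python
import errno
from pathlib import Path


def parse_file_nr(text):
    """Parse the three integers found in Linux's /proc/sys/fs/file-nr.

    The fields are: allocated file handles, allocated-but-unused
    handles, and the system-wide maximum (fs.file-max).
    """
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def handle_usage(text):
    """Return the fraction of the system-wide handle limit in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


if __name__ == "__main__":
    # On Linux this prints 23, the number seen in the ssh errors.
    print("ENFILE =", errno.ENFILE)

    path = Path("/proc/sys/fs/file-nr")  # assumption: Linux-only location
    if path.exists():
        text = path.read_text()
        print("allocated, unused, max:", parse_file_nr(text))
        print("fraction in use: %.1f%%" % (100 * handle_usage(text)))
```

Running this on the head node while the job iterates would show whether handle usage climbs toward the limit before the failure; it does not identify which process is holding the handles.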