Hi,
mpiexec seems to need a file handle per started process.
By default the limit on open file handles is 1024 here, so I can
start roughly 900 processes.
With higher numbers I get
mca_oob_tcp_accept: accept() failed: Too many open files (24).
If I lower the file handle limit in the shell I run mpiexec from, I get
the error with fewer processes. However, no MPI process is started on the
local machine.
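
For anyone who wants to reproduce this, here is a minimal sketch of how the limit can be read and raised up to the hard limit (the same value `ulimit -n` reports in the shell). It assumes Linux and Python's standard `resource` module, which is just my choice for illustration and has nothing to do with Open MPI itself. The function names are made up for this example.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft limit towards `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. The limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may not exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only change anything if the soft limit is finite and below the target.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft/hard limit:", nofile_limits())
    # 5000 is the value asked about in this post, not a recommendation.
    print("soft limit now:", raise_soft_limit(5000))
```

The new limit only applies to the calling process and its children, so mpiexec would have to be started from the process that raised it.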
The first thing I am wondering about is why TCP is involved at all,
since InfiniBand is used for communication.
And secondly what are the files/connections used for?
Do I really have to set the file handle limit to 5000 (and to 32000 in a
few years) for large MPI programs, or is there a workaround?
Another thing that I don't get is that the problem only arises if I
start an MPI program.
mpiexec -np 2000 hostname
works fine.
best regards,
Samuel