On Sep 20, 2013, at 11:52 AM, Gus Correa <gus_at_[hidden]> wrote:
> Hi Noam
> Could it be that Torque, or probably more likely NFS,
> is too slow to create/make available the PBS_NODEFILE?
> What if you insert a "sleep 2",
> or whatever number of seconds you want,
> before the mpiexec command line?
> Or maybe better, a "ls -l $PBS_NODEFILE; cat $PBS_NODEFILE",
> just to make sure the file is available and
> filled with the node list, before mpiexec takes over?
I don't see how NFS could be involved, since it's on a local filesystem.
As for adding a sleep, I already tried that: if the file doesn't exist, I sleep a few
seconds and check again, and in every case where it's not there to begin with, it's not
there the second time either. And none of this explains the very
mysterious, even more infrequent situation where I can cat the file but
mpirun can't find it.
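
A minimal sketch of the kind of check-and-retry described above, assuming a POSIX shell job script. The function name wait_for_nodefile, the attempt count, and the delay are illustrative assumptions, not taken from the actual job script. It treats an empty file the same as a missing one, since mpirun would have no nodes to use in either case.

```shell
# Hypothetical helper: wait until a node file exists and is non-empty.
# Arguments: path, number of attempts (default 3), seconds between attempts (default 2).
# Returns 0 as soon as the file is usable, 1 if it never shows up.
wait_for_nodefile() {
    wfn_path="$1"
    wfn_attempts="${2:-3}"
    wfn_delay="${3:-2}"
    wfn_i=1
    while [ "$wfn_i" -le "$wfn_attempts" ]; do
        # -s is true only if the file exists and has a size greater than zero
        if [ -s "$wfn_path" ]; then
            return 0
        fi
        echo "attempt $wfn_i: '$wfn_path' missing or empty" >&2
        # Do not sleep after the final attempt
        if [ "$wfn_i" -lt "$wfn_attempts" ]; then
            sleep "$wfn_delay"
        fi
        wfn_i=$((wfn_i + 1))
    done
    return 1
}

# Possible use in a job script, before launching MPI:
#   if wait_for_nodefile "$PBS_NODEFILE" 3 5; then
#       ls -l "$PBS_NODEFILE"; cat "$PBS_NODEFILE"
#   else
#       echo "PBS_NODEFILE='$PBS_NODEFILE' never appeared on $(hostname)" >&2
#       exit 1
#   fi
```

Logging the hostname and the value of $PBS_NODEFILE on failure at least records which node and which path were involved, which may help with the rarer case where cat succeeds but mpirun does not.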