Often when I try to run larger jobs on our cluster, I get an error
like the following from some of the compute servers:
eu260 - daemon did not report back when launched
It does not happen every time, but pretty often. Any ideas what could
be wrong? The node is pingable, and I can log in to it successfully
as well. /var/log/messages shows no errors, but maybe there is
another log elsewhere?
--
Rahul