Not much we can say with that little info. :-/
Are you using Open MPI? If so, what version?
When you say the job gets restarted, do you mean that Condor restarts the entire MPI job? If so, you had best talk to the Condor folks. That behavior has nothing to do with Open MPI; it is due to a job control flag you are passing to Condor.
On Apr 14, 2011, at 6:37 PM, Asad Ali wrote:
> Hi all,
> I am using Condor to run my MPI jobs on a large cluster of nodes. The jobs run fine, but after some time they automatically get restarted. What could be the reason?
> "A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule."