Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] intermittent node file error running with torque/maui integration
From: Noam Bernstein (noam.bernstein_at_[hidden])
Date: 2013-09-20 10:40:58


On Sep 20, 2013, at 10:36 AM, Noam Bernstein <noam.bernstein_at_[hidden]> wrote:

>
> On Sep 20, 2013, at 10:22 AM, Reuti <reuti_at_[hidden]> wrote:
>
>>
>> Is the location for the spool directory local or shared by NFS? Disk full?
>
> No - locally mounted, and far from full on all the nodes.

Another new observation, which may shift the focus to Torque: I
just rebooted some of the nodes that were showing this behavior.
So far, none of them have shown it in a few hundred test jobs,
whereas before the reboot 1-5 jobs in each set of 100 failed.

Noam