Bala,
This is a known problem with the Open MPI 1.1 series. The bad news is
that I know of no fix for it, though many people work around it by
running a cleanup script after each unclean run. The good news is that
the 1.2 series is MUCH better, though still not perfect. I would
suggest trying 1.2 to see whether it works for you.
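For what it's worth, here is a rough sketch of what such a cleanup
script might look like. It is not something shipped with Open MPI, and
it rests on an assumption you should check on your own cluster: that an
"orted" you own which has no child processes is a leftover rather than
one serving a live job. The function name and the --list/--kill
switches are made up for this example. Run it on each node with --list
first, and only use --kill once the candidates look right.

```python
#!/usr/bin/env python
"""Sketch: find (and optionally kill) leftover 'orted' daemons on this node.

ASSUMPTION (not confirmed by Open MPI): an orted that belongs to the
current user and has no child processes is stale.
"""
import os
import signal
import subprocess
import sys


def find_childless_orted(ps_text, uid):
    """Return sorted PIDs of 'orted' processes owned by uid with no children.

    ps_text is the output of: ps -eo pid=,ppid=,uid=,comm=
    """
    rows = []
    for line in ps_text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue
        try:
            rows.append((int(parts[0]), int(parts[1]), int(parts[2]),
                         parts[3].strip()))
        except ValueError:
            continue  # skip any line that is not pid/ppid/uid/comm
    # Every PID that appears as some process's parent has a child.
    parents = set(ppid for _, ppid, _, _ in rows)
    return sorted(pid for pid, _, owner, comm in rows
                  if comm == "orted" and owner == uid and pid not in parents)


def main(argv):
    mode = argv[0] if argv else None
    if mode not in ("--list", "--kill"):
        print("usage: orted_cleanup.py --list | --kill")
        return
    ps_text = subprocess.check_output(
        ["ps", "-eo", "pid=,ppid=,uid=,comm="]).decode()
    for pid in find_childless_orted(ps_text, os.getuid()):
        if mode == "--kill":
            os.kill(pid, signal.SIGKILL)  # same effect as kill -9
            print("killed orted %d" % pid)
        else:
            print("candidate orted %d" % pid)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Checking for child processes is only a heuristic for telling an idle
"orted" from a busy one; if a job is still starting up on that node,
its daemon may not have launched its ranks yet, so do not run this
while you are launching jobs.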
Hope this helps,
Tim
On Mar 17, 2007, at 9:58 AM, Bala wrote:
> Hi All,
> We have installed a 16-node Intel x86_64 cluster
> (blade servers, dual CPU, dual core)
> with OFED-1.1, which installs OpenMPI as well.
>
> We are able to run some sample programs. However,
> after running a sample a few times and pressing
> Ctrl+C to stop it, we notice that some "orted"
> processes are still running and take 100% CPU
> as well.
>
> 1. Why is this "orted" process sometimes not stopped,
> and how can we avoid this?
>
> 2. We can kill it with the -9 option, but the problem is
> that when several OpenMPI programs are running, each
> one has its own "orted", and we don't know
> which process is idle and should be killed.
>
> regards,
> Bala.
>