I noticed that the new release of ORTE is not as good as it used to be
at cleaning up the mess left by crashed or aborted MPI processes. Recently
we have been seeing a lot of zombie or livelocked processes left
running on the cluster nodes and interfering with subsequent experiments.
I haven't really had time to investigate the issue; maybe Ralph can open a
ticket if he is able to reproduce this.
Aurelien
--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 974 6321