Dear All:

I run a parallel job on 6 nodes of an OpenMPI cluster. 

But I got error: 

rank 0 in job 82  system.cluster_37948   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9

It seems that there is segmentation fault on node 0. 

But, if the program is run for a short time, no problem.

Any help is appreciated. 

thanks,

Jack

July 22  2010


The New Busy is not the old busy. Search, chat and e-mail from your inbox. Get started.