Open MPI User's Mailing List Archives

Subject: [OMPI users] ompi-restart, ompi-ps problem
From: Nguyen Kim Son (nguyenkims_at_[hidden])
Date: 2010-06-07 04:48:24


Hello,

I'm trying to get tools like orte-checkpoint and orte-restart to work, but
I'm hitting some errors that I don't have any clue about.

BLCR (0.8.2) apparently works fine, and I have installed Open MPI 1.4.2 from
source with the BLCR option enabled.
The command
mpirun -np 4 -am ft-enable-cr ./checkpoint_test
seemed OK, but
orte-checkpoint --term PID_of_checkpoint_test (PID obtained from ps -ef |
grep mpirun)
does not return and prints no error message at all!

Then I checked with
ompi-ps
and this time I got:
oob-tcp: Communication retries exceeded. Can not communicate with peer

Does anyone have the same problem?
Any ideas are welcome!
Thanks,
Son.

-- 
---------------------------------------------------------
Son NGUYEN KIM
Antibes 06600
Tel: 06 48 28 37 47