Open MPI User's Mailing List Archives

Subject: [OMPI users] Processes Not Restarting On Requested Hosts
From: Doug Roberts (roberpj_at_[hidden])
Date: 2008-06-28 00:06:50

Using 1.3a1r18423 built against BLCR 0.7.1, I cannot get
ompi-restart to start processes on the machinefile hosts:
all processes start on the host where ompi-restart is
executed. No useful information appears in /var/log/messages
on the master or the intended hosts, and disabling prelinking
doesn't help either. Any suggestions on how to debug this
further?

# mpirun -np 4 -am ft-enable-cr -machinefile balhosts ./a.out
Process 0 of 4 is on bal12
Process 2 of 4 is on bal12
Process 1 of 4 is on bal20
Process 3 of 4 is on bal20

# ompi-checkpoint --term 27098
Snapshot Ref.: 0 ompi_global_snapshot_27098.ckpt

# ompi-restart -v -machinefile balhosts ompi_global_snapshot_27098.ckpt
[bal34:27204] Checking for the existence of
[bal34:27204] Restarting from file (ompi_global_snapshot_27098.ckpt)
[bal34:27204] Exec in self