Open MPI User's Mailing List Archives

Subject: [OMPI users] blcr_checkpoint_peer: execvp returned -1
From: Leonardo Fialho (lfialho_at_[hidden])
Date: 2008-04-28 05:30:28


Hi All,

Has anybody experienced this error?

[aogrdini:09070] Global) Receive a command message from [[13242,0],0].
...
[aogrd02:07642] Local) Receive a command message.
...
[aogrd01:07938] Local) Receive a command message.
...
[aogrd01:07941] App) signal_handler: Receive Checkpoint Request.
...
[aogrd02:07645] App) signal_handler: Receive Checkpoint Request.
...
[aogrd01:07941] crs:blcr: checkpoint(7941, ---)
[aogrd01:07941] crs:blcr: checkpoint_peer(7941, --)
[aogrd01:07941] crs:blcr: get_checkpoint_filename(--, 7941)
[aogrd01:07941] crs:blcr: checkpoint_cmd(7941)
[aogrd01:07941] crs:blcr: blcr_checkpoint_peer: exec :(cr_checkpoint, cr_checkpoint --pid 7941 --file /tmp/opal_snapshot_0.ckpt/ompi_blcr_context.7941):
[aogrd01:07941] crs:blcr: blcr_checkpoint_peer: Child failed to execute :(-1):
[aogrd01:07941] crs:blcr: blcr_checkpoint_peer: execvp returned -1
...
[aogrd02:07645] crs:blcr: blcr_checkpoint_peer: exec :(cr_checkpoint, cr_checkpoint --pid 7645 --file /tmp/opal_snapshot_1.ckpt/ompi_blcr_context.7645):
[aogrd02:07645] crs:blcr: blcr_checkpoint_peer: Child failed to execute :(-1):
[aogrd02:07645] crs:blcr: blcr_checkpoint_peer: execvp returned -1
...
[aogrd02:07642] Local) Location: [/tmp/opal_snapshot_1.ckpt]

The application stops here and does not continue execution. It is using
libcr version 0.6.5:
$ lsof -p 7518
/softs/blcr-0.6.5/0.6.5/lib/libcr.so.0.2.1

After the orte-checkpoint command, the application process is duplicated on
the nodes, as a child of the original process.
When I run an application with this version and take a checkpoint
manually, I have no problem...
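
One possible check, assuming the failure is a PATH lookup problem (this is an assumption; the log does not say why execvp failed): execvp returns -1 when it cannot find or execute the command, and the log shows cr_checkpoint being invoked by bare name, so it is resolved through the PATH of the application process. The sketch below is a minimal, hypothetical helper (resolve_like_execvp is not part of Open MPI or BLCR) that reports what a PATH search finds for cr_checkpoint. To be meaningful it should run in the same environment the application processes get on each node, not in an interactive shell.

```python
import os
import shutil


def resolve_like_execvp(command, path=None):
    """Return the full path a PATH search finds for command, or None."""
    # shutil.which walks the PATH entries in order and checks the execute
    # bit, which approximates the lookup execvp does for a bare name.
    return shutil.which(command, path=path)


if __name__ == "__main__":
    name = "cr_checkpoint"  # command name taken from the log above
    found = resolve_like_execvp(name)
    if found is None:
        # Print the PATH actually in effect so it can be compared with
        # the BLCR install location.
        print(f"{name}: not found in PATH={os.environ.get('PATH', '')}")
    else:
        print(f"{name}: {found}")
```

If the command is not found there, the PATH seen by the launched processes probably lacks the BLCR bin directory, even when a manual checkpoint from a login shell works.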

Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos
Phone: +34-93-581-2888
Fax: +34-93-581-2478