Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Problem with Filem
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-05-01 16:40:06

This typically means that one or more of the rcp/scp or rsh/ssh
commands failed. FileM should print an error message when one
of the copy commands fails. Try turning the verbose level up to 10 to
see if it indicates any problems:
  -mca filem_rsh_verbose 10
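
Since the failure usually comes down to an ssh or scp command that does not succeed, it can help to try those tools by hand against each node. Below is a minimal sketch in Python, not part of Open MPI and not a reproduction of the exact commands FileM issues. The helper names `build_checks` and `run_checks`, the test file `/etc/hostname`, and the 30-second timeout are all assumptions made for illustration. `BatchMode=yes` is a standard OpenSSH option that makes the command fail instead of prompting for a password.

```python
import subprocess
import tempfile


def build_checks(host, remote_file="/etc/hostname", local_dir="/tmp"):
    """Return the ssh and scp commands to try against `host`.

    Hypothetical helper: the remote file and local directory are
    placeholders, not paths that FileM itself uses.
    """
    batch = ["-o", "BatchMode=yes"]  # fail rather than prompt for a password
    return {
        "ssh": ["ssh", *batch, host, "true"],
        "scp": ["scp", *batch, f"{host}:{remote_file}", local_dir],
    }


def run_checks(hosts, timeout=30):
    """Run both checks on every host; return {(host, tool): exit_code}.

    An exit code of -1 means the command could not be started or timed out.
    """
    results = {}
    with tempfile.TemporaryDirectory() as scratch:
        for host in hosts:
            checks = build_checks(host, local_dir=scratch)
            for tool, cmd in checks.items():
                try:
                    proc = subprocess.run(
                        cmd, capture_output=True, text=True, timeout=timeout
                    )
                    results[(host, tool)] = proc.returncode
                except (OSError, subprocess.TimeoutExpired):
                    results[(host, tool)] = -1
    return results
```

Calling `run_checks(["node1", "node2"])` (hypothetical host names) from the node where mpirun is started returns one exit code per host and tool. Any non-zero value points at a node where non-interactive ssh or scp is not working, which would be worth fixing before retrying the checkpoint.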

Can you send me the MCA parameters that you are setting? That may
help narrow down the problem as well. I also cleaned up some of the
FileM (and SnapC) error reporting in the development trunk, if you
want to give that a try.

Let me know what you find out.


On Apr 30, 2009, at 6:40 AM, Bouguerra mohamed slim wrote:

> Hello,
> I have a problem with the FileM module when I checkpoint to a
> remote host without a shared file system.
> I use the new Open MPI 1.3.2 and it is the same problem as in
> version 1.3.1. However, when I use an NFS file system it works.
> So I guess the problem is in FileM.
> [] filem:rsh: wait_all(): Wait failed (-1)
> [] [[48784,0],0] ORTE_ERROR_LOG: Error in file /home/grenoble/msbouguerra/openmpi-1.3.2/orte/mca/snapc/full/snapc_full_global.c at line 1054
> --
> Cordialement,
> Mohamed-Slim BOUGUERRA PhD student INRIA-Grenoble / Projet MOAIS
> ENSIMAG - antenne de Montbonnot
> ZIRST 51, avenue Jean Kuntzmann
> Tel :+33 (0)4 76 61 20 79
> Fax :+33 (0)4 76 61 20 99