Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Cannot start (WAS: Segmentation fault / Address not mapped (1) with 2-node job on Rocks 5.2)
From: Riccardo Murri (riccardo.murri_at_[hidden])
Date: 2010-06-22 15:44:07


Hello,

On Tue, Jun 22, 2010 at 8:05 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Sorry for the problem - the issue is a bug in the handling of the
> pernode option in 1.4.2. This has been fixed and awaits release in
> 1.4.3.
>

Thank you for pointing this out. Unfortunately, I am still unable
to start remote processes::

  $ mpirun --host compute-0-11 -np 1 ./hello_mpi
  --------------------------------------------------------------------------
  mpirun noticed that the job aborted, but has no info as to the process
  that caused that situation.
  --------------------------------------------------------------------------

The same program runs fine if I use "--host localhost".

Running "strace -v" on "mpirun" shows a strange invocation of
"orted"::

 execve("//usr/bin/ssh", ["/usr/bin/ssh", "-x", "compute-0-11",
        " orted", "--daemonize", "-mca", "ess", "env",
        "-mca", "orte_ess_jobid", "2322006016", "-mca",
        "orte_ess_vpid", "1", "-mca", "orte_ess_num_procs", "2",
        "--hnp-uri", "\"2322006016.0;tcp://192.168.122.1"],
        ["MKLROOT=/opt/intel/mkl/10.0.3.02", ...])

The 192.168.122.1 address passed in the "--hnp-uri" argument belongs to
an internal Xen bridge, "virbr0", so it should not appear as a
"call-back" address.
Is there a command-line option to force mpirun to use a certain IP address?
I have tried starting "mpirun" with "--mca btl_tcp_if_exclude lo,virbr0",
to no avail.
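
To check which call-back address "mpirun" hands to "orted" without reading the whole trace, the address can be pulled out of the strace text. This is only a minimal sketch in Python, assuming the trace has been captured as text; the names `STRACE_LINE` and `hnp_addresses` are illustrative and not part of Open MPI, and the sample URI is truncated exactly as in the trace above.

```python
import re

# Sample line copied from the strace output above (the URI is truncated there too).
STRACE_LINE = r'"--hnp-uri", "\"2322006016.0;tcp://192.168.122.1"],'


def hnp_addresses(text):
    """Return the IPv4 addresses that appear in tcp:// URIs within the given text."""
    return re.findall(r"tcp://(\d{1,3}(?:\.\d{1,3}){3})", text)


if __name__ == "__main__":
    # Print each advertised address so it can be compared with the node's interfaces.
    for address in hnp_addresses(STRACE_LINE):
        print(address)
```

Comparing the printed address with the interfaces on the head node shows whether the bridge address is the one being advertised.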

Also, the " orted" argument to ssh starts with a space; is this OK?

I'm using OMPI 1.4.2, self-compiled on a Rocks 5.2 (i.e., CentOS 5.2) cluster.

Regards,
Riccardo