Open MPI Development Mailing List Archives

From: Graziano Giuliani (giuliani_at_[hidden])
Date: 2005-12-30 04:15:44


Ok Brian,

For the build part, my config.log is attached.
As for the stack trace, with my original compile options gdb gives:

#0 0xb7d105b9 in orte_pls_rsh_launch ()
   from /home/cluster/openmpi/lib/openmpi/mca_pls_rsh.so

and, after recompiling with -g:

#0 0xb7ca2599 in orte_pls_rsh_launch (jobid=1) at pls_rsh_module.c:716
716 if (mca_pls_rsh_component.debug) {

which means we have memory corruption somewhere else...
Investigating from the outside what may cause the problem, I found that I
can also make the job run by changing the hostname in my hostfile.

-) No localhost in hostfile -> run
-) "wowbagger" or "localhost" in hostfile -> run
-) FQDN wowbagger.cluster in hostfile -> SIGSEGV
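
The three cases above all name the same machine, so a launcher has to decide
somehow that they are equivalent. A minimal sketch of such a check follows.
It is not Open MPI's actual logic: the helper names `short_name` and
`is_local` are hypothetical, and comparing only the undotted part of the name
is a simplifying assumption that ignores IP addresses and distinct domains.

```python
def short_name(host: str) -> str:
    """Return the part of a host name before the first dot."""
    return host.split(".", 1)[0]


def is_local(entry: str, local_hostname: str) -> bool:
    """Hypothetical check: does a hostfile entry name the local node?

    Assumption: two names refer to the same node when they are equal or
    when their short (undotted) forms are equal. Real launchers may
    compare resolved addresses instead.
    """
    entry = entry.strip().lower()
    local = local_hostname.strip().lower()
    if entry == "localhost":
        return True
    # Treat "wowbagger" and "wowbagger.cluster" as the same node.
    return entry == local or short_name(entry) == short_name(local)


if __name__ == "__main__":
    # Host names taken from the report; "wowbagger" is what `hostname` prints.
    for entry in ("localhost", "wowbagger", "wowbagger.cluster"):
        print(entry, is_local(entry, "wowbagger"))
```

If the short name and the FQDN were handled by different code paths, this
would be one place where they could diverge.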

I have a private network (10.2.1.0) with the cluster master (the local node)
acting as DNS server, running BIND v9.

# hostname
wowbagger
# host wowbagger
wowbagger.cluster has address 10.2.1.100
# mpirun --hostfile wrf_openmpi.mac -np 10 -bynode wrf.exe
mpirun noticed that job rank 0 with PID 0 on node "wowbagger.cluster" exited
on signal 11.
[wowbagger:20400] ERROR: A daemon on node wowbagger.cluster failed to start as
expected.
[wowbagger:20400] ERROR: There may be more information available from
[wowbagger:20400] ERROR: the remote shell (see above).
[wowbagger:20400] The daemon received a signal 11 (with core).
mpirun: killing job...
9 processes killed (possibly by Open MPI)

Replacing wowbagger.cluster with simply wowbagger does the trick. Could it be
something in host name resolution?
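
One way to probe the name-resolution question outside of mpirun is to ask the
system resolver how each form of the name resolves. The sketch below uses only
Python's standard `socket` module; the helper names `resolve` and `report` are
hypothetical, and it only shows what the resolver returns, not what Open MPI
does with the result.

```python
import socket


def resolve(name):
    """Return the IPv4 address for name, or None if the lookup fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None


def report(names):
    """Map each name to its resolved address (None if unresolved)."""
    return {name: resolve(name) for name in names}


if __name__ == "__main__":
    local = socket.gethostname()
    # Compare the short name, the FQDN the resolver reports, and localhost.
    candidates = [local, socket.getfqdn(), "localhost"]
    for name, addr in report(candidates).items():
        print(f"{name!r} -> {addr}")
```

If the short name and the FQDN resolve to different addresses, or one of them
fails to resolve, that would support the resolution hypothesis.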

Attached is my hostfile.

                   Graziano.

P.S.: Sorry for the delay, but yesterday here in Florence we had heavy
snowfall!

-- 
                             \ | /
                             (@ @)
 -------------------------o00-(_)-00o -----------------------------
 LaMMA - Laboratorio per la Meteorologia e la Modellistica Ambientale
         Laboratory for Meteorology and Environmental Modelling
 Via Madonna del Piano, 50019 Sesto Fiorentino (FI)
     tel: + 39 055 4483049
     fax: + 39 055 444083
     web: www.lamma.rete.toscana.it
  e-mail: giuliani_at_[hidden]





