
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Fwd: ssh MPi and program tests
From: Gus Correa (gus_at_[hidden])
Date: 2009-04-07 12:28:19


Hi Francesco

Sorry, I was out of the loop, doing some real work ... :)
Jody and Terry already gave you great advice (as they always do),
and got you moving in the right direction, which is great news!

More comments below.

I think we need to cut this message short,
for good mailing list etiquette.
I trimmed it a bit, but it is still too long.
Alternatively, you can open a new thread when you try Amber (was it Amber?) again.

Francesco Pietra wrote:
> Hi Gustavo:
>
> "I feel myself stupid enough in this circumstance."
>

Oh, well, the folks on the list
certainly didn't want you to feel this way.
We are always learning something.

> That was the case. Adjusted as indicated by Jody, the connectivity
> test passed and the hello test:
>
> Hello, world, I am 0 of 4
> 1 of 4
> 2 of 4
> 3 of 4
>
> Combined with all the other investigations, the installation of openmpi
> 1.3.1 is correct.

Great: with Jody's help you now know for sure that Open MPI is sane,
and so is the mpirun launching mechanism.

Now, I would guess you ran it on a single host,
with four processes,
something like
"mpirun -host host1 -n 4 hello"
right?
MPI runs depend heavily on the mpirun command and its options.
Hence, it would help us help you
if you send the exact mpirun command you used as well.

Are you happy running on a single host,
or does your Amber program require more computing power than one host provides?
If it does,
and if you want to test the connection across the two AMD64 machines,
you need to run again with both hosts.
Something like this:
Something like this:

"mpirun -host host1,host2 -n 8 hello"

With the appropriate names/IP addresses for host1, and host2, of course.

Does this work for you?
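If the two-host run works, a hostfile saves retyping the host list on every
run. A minimal sketch; the host names (deb64, tya64) are the ones from your
message, and the slot counts of 4 per host are an assumption, so match them
to your actual core counts:

```shell
# Hostfile sketch: host names are from this thread; "slots" caps the
# number of processes placed on each host (4 per host is an assumption).
cat > myhosts <<'EOF'
deb64 slots=4
tya64 slots=4
EOF

# Then launch across both hosts (needs Open MPI on both machines):
# mpirun --hostfile myhosts -n 8 hello
```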

>
> thanks a lot for your lesson

Not trying to lecture you, just to help you get started.
The combination of MPI, network environment, etc., can be daunting in the
beginning, and I and most people on this list faced the same
difficulties you are facing now at some point.
Somebody helped us
(I continue to get help from mailing lists and
their knowledgeable and generous subscribers),
so why not help you and others out?

> francesco
>
> ---------- Forwarded message ----------
> From: Francesco Pietra <chiendarret_at_[hidden]>
> Date: Tue, Apr 7, 2009 at 11:39 AM
> Subject: Re: [OMPI users] ssh MPi and program tests
> To: Open MPI Users <users_at_[hidden]>
>
>
> Hi Gus:
> I should have set clear at the beginning that on the Zyxel router
> (connected to Internet by dynamic IP afforded by the provider) there
> are three computers. Their host names:
>
> deb32 (desktop debian i386)
>
> deb64 (multisocket debian amd 64 lenny)
>
> tya64 (multisocket debian amd 64 lenny)
>
> The three are ssh passwordless interconnected from the same user
> (myself). I never established connections as root user because I have
> direct access to all three computers. So, if I slogin as user,
> passwordless connection is established. If I try to slogin as root
> user, it says that the authenticity of the host to which I intended to
> connect can't be established, RSA key fingerprint .. Connect?
>

Not a problem.
You don't need to ssh as root into the remote machines.
Actually, it is safer not to.
If you need to do sysadmin work on a remote machine,
you can become root after you ssh in as a regular user,
using "su" (or preferably "su -", to get the superuser environment).

> Moreover, I appended to the pub keys known to deb64 those that deb64
> had sent to either deb32 or tya64. Whereby, when I command
>
> ssh 192.168.#.## date (where the numbers stand for the hostname)
>
> the date is returned passwordless. With certain programs (conceived
> for batch runs), the execution on deb64 is launched from deb32.
>
>
> I copied /examples to my deb64 home, chown to me, compiled as user and
> ran "connectivity" as user. (I have not compiled in the openmpi
> directory, as this belongs to the root user, while ssh has been
> adjusted for me as user.)
>
> Running as user in my home
>
> /usr/local/bin/mpirun -deb64 -1 connectivity_c 2>&1 | tee n=1.connectivity.out
>

OK, Jody already clarified that one.
You passed the wrong options to mpirun.
You may be mixing the mpirun syntax with some other software's syntax.
Remember: "mpirun --help" is your friend!

If you follow the correct mpirun syntax, as recommended by Jody,
the connectivity test program will run correctly,
just as the "hello" program did.
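For comparison, here is a sketch of the failing command next to a corrected
form. The mpirun lines are commented out since they need a live Open MPI
install, and the host name and process count are guesses for your setup:

```shell
# Wrong: "-deb64" and "-1" are not valid mpirun options, which is why
# mpirun ended up hunting for an executable it could not find:
#   mpirun -deb64 -1 connectivity_c
#
# A corrected form: name the host with -host, the process count with -n,
# and then the executable (use ./ or a full path if it is not in PATH):
#   mpirun -host deb64 -n 4 ./connectivity_c
```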

> it asked to add the host (itself) to the list of known hosts (on
> repeating the command, that was no longer asked). The unabridged output:
>
> ===========
> [deb64:03575] procdir: /tmp/openmpi-sessions-francesco_at_deb64_0/38647/0/0
> [deb64:03575] jobdir: /tmp/openmpi-sessions-francesco_at_deb64_0/38647/0
> [deb64:03575] top: openmpi-sessions-francesco_at_deb64_0
> [deb64:03575] tmp: /tmp
> [deb64:03575] mpirun: reset PATH:
> /usr/local/bin:/usr/local/mcce/bin:/opt/intel/cce/10.1.015/bin:/opt/intel/fce/10.1.015/bin:/home/francesco/gmmx06:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/amber10/exe:/usr/local/dock6/bin
> [deb64:03575] mpirun: reset LD_LIBRARY_PATH:
> /usr/local/lib:/opt/intel/mkl/10.0.1.014/lib/em64t:/opt/intel/cce/10.1.015/lib:/opt/intel/fce/10.1.015/lib:/usr/local/lib:/opt/acml4.1.0/gfortran64_mp_int64/lib
> [deb64:03583] procdir: /tmp/openmpi-sessions-francesco_at_deb64_0/38647/0/1
> [deb64:03583] jobdir: /tmp/openmpi-sessions-francesco_at_deb64_0/38647/0
> [deb64:03583] top: openmpi-sessions-francesco_at_deb64_0
> [deb64:03583] tmp: /tmp
> [deb64:03575] [[38647,0],0] node[0].name deb64 daemon 0 arch ffc91200
> [deb64:03575] [[38647,0],0] node[1].name deb64 daemon 1 arch ffc91200
> [deb64:03583] [[38647,0],1] node[0].name deb64 daemon 0 arch ffc91200
> [deb64:03583] [[38647,0],1] node[1].name deb64 daemon 1 arch ffc91200
> --------------------------------------------------------------------------
> mpirun was unable to launch the specified application as it could not
> find an executable:
>
> Executable: -e
> Node: deb64
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
> [deb64:03575] sess_dir_finalize: job session dir not empty - leaving
> [deb64:03575] sess_dir_finalize: proc session dir not empty - leaving
> orterun: exiting with status -123
> [deb64:03583] sess_dir_finalize: job session dir not empty - leaving
> =========================
>
> I have changed the command, setting 4 for n and giving the full path
> to the executable "connectivity_c", to no avail. I do not understand
> the message "Executable: -e" in the out file and I feel myself stupid
> enough in this circumstance.
>
> The ssh is working for slogin, and "ssh deb64 date" gives the date
> passwordless, both before and after the "connectivity" run; i.e.,
> deb64 knew, and knows, itself.
>
> The output of ompi_info between xxxxxxxxxx should probably clarify
> your other questions.
>
> xxxxxxxxxxxxxxxxxxx
> Package: Open MPI root_at_deb64 Distribution
> Open MPI: 1.3.1
> Open MPI SVN revision: r20826
> Open MPI release date: Mar 18, 2009
> Open RTE: 1.3.1
> Open RTE SVN revision: r20826
> Open RTE release date: Mar 18, 2009
> OPAL: 1.3.1
> OPAL SVN revision: r20826
> OPAL release date: Mar 18, 2009
> Ident string: 1.3.1
> Prefix: /usr/local
> Configured architecture: x86_64-unknown-linux-gnu
> Configure host: deb64
> Configured by: root
> Configured on: Fri Apr 3 23:03:30 CEST 2009
> Configure host: deb64
> Built by: root
> Built on: Fri Apr 3 23:12:28 CEST 2009
> Built host: deb64
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: /opt/intel/fce/10.1.015/bin/ifort
> Fortran77 compiler abs:
> Fortran90 compiler: /opt/intel/fce/10.1.015/bin/ifort
> Fortran90 compiler abs:
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Sparse Groups: no
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: no
> mpirun default --prefix: no
> MPI I/O support: yes
> MPI_WTIME support: gettimeofday
> Symbol visibility support: yes
> FT Checkpoint support: no (checkpoint thread: no)
> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.1)
> MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.1)
> MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.1)
> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.1)
> MCA carto: file (MCA v2.0, API v2.0, Component v1.3.1)
> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.1)
> MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.3.1)
> MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.1)
> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.1)
> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.1)
> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.1)
> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.1)
> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.1)
> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: self (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.1)
> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.1)
> MCA io: romio (MCA v2.0, API v2.0, Component v1.3.1)
> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.1)
> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.1)
> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.1)
> MCA pml: v (MCA v2.0, API v2.0, Component v1.3.1)
> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.1)
> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.1)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.3.1)
> MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.1)
> MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.1)
> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.1)
> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.1)
> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.1)
> MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.1)
> MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.1)
> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.1)
> MCA odls: default (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.1)
> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.1)
> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.1)
> MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.1)
> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.1)
> MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.1)
> MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.1)
> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.1)
> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.1)
> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ess: env (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.1)
> MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.1)
> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.1)
> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.1)
> xxxxxxxxxxxxxxxxxxxxxxxxxx
>

Note that your build didn't use icc, but gcc (and g++), along with ifort.
I don't remember what your intent was,
but if you wanted to use icc (and icpc),
somehow the Open MPI configure script didn't pick them up.
If you really want icc, rebuild Open MPI, giving the full path
to icc (CC=/full/path/to/icc), and likewise for icpc and ifort.
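A rebuild along those lines might look like this. The compiler paths are the
ones shown in your ompi_info output; the source directory name and the
--prefix are assumptions taken from that same output, so adjust as needed:

```shell
# Reconfigure Open MPI with the Intel compilers (paths taken from the
# ompi_info output above; adjust them to your installation):
cd openmpi-1.3.1
./configure --prefix=/usr/local \
    CC=/opt/intel/cce/10.1.015/bin/icc \
    CXX=/opt/intel/cce/10.1.015/bin/icpc \
    F77=/opt/intel/fce/10.1.015/bin/ifort \
    FC=/opt/intel/fce/10.1.015/bin/ifort
make all install
```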

> thanks
> francesco
>

Good luck,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------