
Open MPI User's Mailing List Archives


Subject: [OMPI users] Error run mpiexec
From: mariognu-outside_at_[hidden]
Date: 2008-07-21 09:47:58


Hi all,

First, please excuse my English, it isn't good :)

I have two machines: a Xeon with 2 CPUs (64-bit) and a Pentium 4 with a single CPU. On both machines I have installed Ubuntu 8 Server and all the packages for Open MPI and GROMACS.

I use GROMACS for my work.

On both machines, in my user's home folder, I have a file like this:
machine1 cpu=2
machine2

machine1 is the Xeon (192.168.0.10) and machine2 is the Pentium 4 (192.168.0.11).

My /etc/hosts file is configured on both machines too.
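
For illustration, the layout this host file describes can be sketched in Python. This is only a sketch under assumptions: the `cpu=N` field is read as the number of processes a machine may take, and ranks are assumed to be handed out round-robin across the machines. That policy is a guess, chosen because it matches the MYRANK/HOSTNAME lines in the mdrun_mpi output further down; it is not taken from the launcher's documentation. The helper names `parse_hostfile` and `place_ranks` are hypothetical.

```python
# Sketch only: parse a host file like the one above and show which machine
# each rank would land on. The "cpu=N" syntax comes from the file above;
# the round-robin placement policy is an assumption for illustration.

def parse_hostfile(text):
    """Return a list of (hostname, cpu_count) pairs; cpu defaults to 1."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        cpus = 1
        for field in fields[1:]:
            if field.startswith("cpu="):
                cpus = int(field[len("cpu="):])
        hosts.append((fields[0], cpus))
    return hosts


def place_ranks(hosts, nprocs):
    """Assign ranks round-robin over hosts, never exceeding a host's cpu count."""
    total = sum(cpus for _, cpus in hosts)
    if nprocs > total:
        raise ValueError(f"{nprocs} processes requested, only {total} cpus listed")
    remaining = [cpus for _, cpus in hosts]
    placement = []
    i = 0
    while len(placement) < nprocs:
        if remaining[i] > 0:
            placement.append(hosts[i][0])
            remaining[i] -= 1
        i = (i + 1) % len(hosts)
    return placement


HOSTFILE = """\
machine1 cpu=2
machine2
"""

hosts = parse_hostfile(HOSTFILE)
placement = place_ranks(hosts, 3)
for rank, host in enumerate(placement):
    print(f"rank {rank} -> {host}")
```

With the file above and 3 processes, this puts ranks 0 and 2 on machine1 and rank 1 on machine2, so two processes run on the Xeon and one on the Pentium 4.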

When I run mpiexec on machine2, I get this:
mariojose_at_machine2:~/lam-mpi$ mpiexec -n 3 hostname
machine1
machine2
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
machine1
mpirun failed with exit status 252

When I run it on machine1, I get this:

mariojose_at_machine1:~/lam-mpi$ mpiexec -n 3 hostname
machine1
machine1
machine2
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
mpirun failed with exit status 252

I don't know why I get this message. I think it is an error.

I also tried running GROMACS. If anybody uses GROMACS and can help me, I would appreciate it very much :)

mariojose_at_machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr
mariojose_at_machine1:~/mpiexec -n 3 mdrun -v -deffnm run

This works OK. I can see the CPUs of both machines running at 100%, which looks good to me. But I get an error when I run mdrun_mpi, which is the binary meant for running on a cluster.

mariojose_at_machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr -np 3 -sort -shuffle
mariojose_at_machine1:~/lam-mpi$ mpiexec -n 3 mdrun_mpi -v -deffnm run
NNODES=3, MYRANK=0, HOSTNAME=machine1
NNODES=3, MYRANK=2, HOSTNAME=machine1
NODEID=0 argc=4
NODEID=2 argc=4
NNODES=3, MYRANK=1, HOSTNAME=machine2
NODEID=1 argc=4
                         :-) G R O M A C S (-:

                     Gyas ROwers Mature At Cryogenic Speed

                            :-) VERSION 3.3.3 (-:

      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                              :-) mdrun_mpi (-:

Option   Filename  Type          Description
------------------------------------------------------------
     -s  run.tpr   Input         Generic run input: tpr tpb tpa xml
     -o  run.trr   Output        Full precision trajectory: trr trj
     -x  run.xtc   Output, Opt.  Compressed trajectory (portable xdr format)
     -c  run.gro   Output        Generic structure: gro g96 pdb xml
     -e  run.edr   Output        Generic energy: edr ene
     -g  run.log   Output        Log file
  -dgdl  run.xvg   Output, Opt.  xvgr/xmgr file
 -field  run.xvg   Output, Opt.  xvgr/xmgr file
 -table  run.xvg   Input, Opt.   xvgr/xmgr file
-tablep  run.xvg   Input, Opt.   xvgr/xmgr file
 -rerun  run.xtc   Input, Opt.   Generic trajectory: xtc trr trj gro g96 pdb
   -tpi  run.xvg   Output, Opt.  xvgr/xmgr file
    -ei  run.edi   Input, Opt.   ED sampling input
    -eo  run.edo   Output, Opt.  ED sampling output
     -j  run.gct   Input, Opt.   General coupling stuff
    -jo  run.gct   Output, Opt.  General coupling stuff
 -ffout  run.xvg   Output, Opt.  xvgr/xmgr file
-devout  run.xvg   Output, Opt.  xvgr/xmgr file
 -runav  run.xvg   Output, Opt.  xvgr/xmgr file
    -pi  run.ppa   Input, Opt.   Pull parameters
    -po  run.ppa   Output, Opt.  Pull parameters
    -pd  run.pdo   Output, Opt.  Pull data output
    -pn  run.ndx   Input, Opt.   Index file
   -mtx  run.mtx   Output, Opt.  Hessian matrix
    -dn  run.ndx   Output, Opt.  Index file

Option        Type    Value  Description
------------------------------------------------------
-[no]h        bool    no     Print help info and quit
-nice         int     19     Set the nicelevel
-deffnm       string  run    Set the default filename for all file options
-[no]xvgr     bool    yes    Add specific codes (legends etc.) in the output
                             xvg files for the xmgrace program
-np           int     1      Number of nodes, must be the same as used for
                             grompp
-nt           int     1      Number of threads to start on each node
-[no]v        bool    yes    Be loud and noisy
-[no]compact  bool    yes    Write a compact log file
-[no]sepdvdl  bool    no     Write separate V and dVdl terms for each
                             interaction type and node to the log file(s)
-[no]multi    bool    no     Do multiple simulations in parallel (only with
                             -np > 1)
-replex       int     0      Attempt replica exchange every # steps
-reseed       int     -1     Seed for replica exchange, -1 is generate a seed
-[no]glas     bool    no     Do glass simulation with special long range
                             corrections
-[no]ionize   bool    no     Do a simulation including the effect of an X-Ray
                             bombardment on your system

Back Off! I just backed up run2.log to ./#run2.log.5#

Back Off! I just backed up run0.log to ./#run0.log.12#
Getting Loaded...
Reading file run.tpr, VERSION 3.3.3 (single precision)

Back Off! I just backed up run1.log to ./#run1.log.12#

-------------------------------------------------------
Program mdrun_mpi, VERSION 3.3.3
Source code file: ../../../../src/gmxlib/block_tx.c, line: 74

Fatal error:
0: size=672, len=840, rx_count=0

-------------------------------------------------------

"They're Red Hot" (Red Hot Chili Peppers)

Error on node 1, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 1 out of 3

gcq#220: "They're Red Hot" (Red Hot Chili Peppers)

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 15964 failed on node n0 (192.168.0.10) with exit status 1.
-----------------------------------------------------------------------------
mpirun failed with exit status 1

I don't know what the problem is.

Can anybody help me?

Thanks

Mario Jose

/* WE ARE FREE */
Hack to learn, don't learn to hack.

/* Free Software Foundation */
"Free software" is a matter of liberty, not price
GNU's Not UNIX. Be free, use GNU/Linux
www.gnu.org
www.fsf.org

/* Free Culture */
free-culture.org
creativecommons.org

/* ... Hoarders may get piles of money,
That is true, hackers, that is true.
But they cannot help their neighbors;
That's not good, hackers, that's not good ...

Richard Stallman (www.stallman.org) */

/* Human knowledge belongs to the world */
