
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Error run mpiexec
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-07-21 10:00:00


If you look closely at the error messages, you will see that you were
executing LAM-MPI, not Open MPI. If you truly wanted to run Open MPI,
I would check your path to ensure that mpiexec is pointing at the Open
MPI binary.
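
For reference, a quick way to check which implementation mpiexec
actually resolves to (a sketch, not specific to this system; the
/usr/lib/openmpi prefix and the "mpi" alternatives name are
assumptions):

   # show which mpiexec comes first in the PATH, and where it leads
   which mpiexec
   ls -l `which mpiexec`

   # Open MPI installs ompi_info; LAM/MPI installs laminfo instead
   ompi_info | head -3

   # on Ubuntu, both MPIs may be managed by the alternatives system
   update-alternatives --display mpi

   # otherwise, put Open MPI's bin directory first in the PATH
   export PATH=/usr/lib/openmpi/bin:$PATH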

Ralph

On Jul 21, 2008, at 7:47 AM, mariognu-outside_at_[hidden] wrote:

> Hi all,
>
> First, excuse my English, it isn't good :)
>
> Well, I have 2 machines: a Xeon with 2 CPUs (64-bit) and a Pentium 4
> with only one CPU. On both machines I have installed Ubuntu 8 Server
> and all the packages for Open MPI and GROMACS.
>
> I use GROMACS for my work.
>
> OK, on both machines, in my user folder, I have a hostfile like this:
> machine1 cpu=2
> machine2
>
> Machine1 is the Xeon (192.168.0.10) and Machine2 is the Pentium 4
> (192.168.0.11).
>
> My /etc/hosts file is configured too.
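>
> (A side note on syntax: "cpu=2" is the LAM/MPI boot-schema form. An
> Open MPI hostfile expresses the same thing with "slots", roughly like
> this, where the filename my_hostfile is an arbitrary choice:
>
> machine1 slots=2
> machine2 slots=1
>
> It is then passed explicitly, e.g.:
> mpiexec --hostfile my_hostfile -n 3 hostname)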
>
> When I run mpiexec on machine2, I get this:
> mariojose_at_machine2:~/lam-mpi$ mpiexec -n 3 hostname
> machine1
> machine2
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
> machine1
> mpirun failed with exit status 252
>
> When I run it on machine1, I get this:
>
> mariojose_at_machine1:~/lam-mpi$ mpiexec -n 3 hostname
> machine1
> machine1
> machine2
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
> mpirun failed with exit status 252
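>
> (The warning text above comes from LAM/MPI: only LAM's mpirun refuses
> to start non-MPI programs and points at "lamexec". Open MPI's mpiexec
> will happily launch a plain executable such as hostname, so one way
> to rule out PATH confusion is to invoke the Open MPI binary by
> absolute path; /usr/lib/openmpi/bin is an assumed install location:
>
> /usr/lib/openmpi/bin/mpiexec -n 3 hostname)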
>
> I don't know why I get this message. I think it is an error.
>
> I am trying to run it with GROMACS; if anybody uses GROMACS and can
> help me, I would appreciate it very much :).
>
> mariojose_at_machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr
> mariojose_at_machine1:~/lam-mpi$ mpiexec -n 3 mdrun -v -deffnm run
>
> It works OK. I see that the CPUs of both machines run at 100%. It
> looks fine to me. But I get an error when I run mdrun_mpi, which is
> the binary built to work on the cluster.
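>
> (For GROMACS 3.3.x the parallel workflow is: preprocess with grompp
> -np N, then start exactly N ranks, e.g.:
>
> grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr -np 3
> mpiexec -n 3 mdrun_mpi -v -deffnm run
>
> As the mdrun_mpi help below notes, the -np given to grompp must match
> the node count the run is started with.)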
>
> mariojose_at_machine1:~/lam-mpi$ grompp -f run.mdp -p topol.top -c pr.gro -o run.tpr -np 3 -sort -shuffle
> mariojose_at_machine1:~/lam-mpi$ mpiexec -n 3 mdrun_mpi -v -deffnm run
> NNODES=3, MYRANK=0, HOSTNAME=machine1
> NNODES=3, MYRANK=2, HOSTNAME=machine1
> NODEID=0 argc=4
> NODEID=2 argc=4
> NNODES=3, MYRANK=1, HOSTNAME=machine2
> NODEID=1 argc=4
> :-) G R O M A C S (-:
>
> Gyas ROwers Mature At Cryogenic Speed
>
> :-) VERSION 3.3.3 (-:
>
>
> Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2008, The GROMACS development team,
> check out http://www.gromacs.org for more information.
>
> This program is free software; you can redistribute it and/or
> modify it under the terms of the GNU General Public License
> as published by the Free Software Foundation; either version 2
> of the License, or (at your option) any later version.
>
> :-) mdrun_mpi (-:
>
> Option Filename Type Description
> ------------------------------------------------------------
> -s run.tpr Input Generic run input: tpr tpb tpa xml
> -o run.trr Output Full precision trajectory: trr trj
> -x run.xtc Output, Opt. Compressed trajectory (portable xdr format)
> -c run.gro Output Generic structure: gro g96 pdb xml
> -e run.edr Output Generic energy: edr ene
> -g run.log Output Log file
> -dgdl run.xvg Output, Opt. xvgr/xmgr file
> -field run.xvg Output, Opt. xvgr/xmgr file
> -table run.xvg Input, Opt. xvgr/xmgr file
> -tablep run.xvg Input, Opt. xvgr/xmgr file
> -rerun run.xtc Input, Opt. Generic trajectory: xtc trr trj gro g96 pdb
> -tpi run.xvg Output, Opt. xvgr/xmgr file
> -ei run.edi Input, Opt. ED sampling input
> -eo run.edo Output, Opt. ED sampling output
> -j run.gct Input, Opt. General coupling stuff
> -jo run.gct Output, Opt. General coupling stuff
> -ffout run.xvg Output, Opt. xvgr/xmgr file
> -devout run.xvg Output, Opt. xvgr/xmgr file
> -runav run.xvg Output, Opt. xvgr/xmgr file
> -pi run.ppa Input, Opt. Pull parameters
> -po run.ppa Output, Opt. Pull parameters
> -pd run.pdo Output, Opt. Pull data output
> -pn run.ndx Input, Opt. Index file
> -mtx run.mtx Output, Opt. Hessian matrix
> -dn run.ndx Output, Opt. Index file
>
> Option Type Value Description
> ------------------------------------------------------
> -[no]h bool no Print help info and quit
> -nice int 19 Set the nicelevel
> -deffnm string run Set the default filename for all file options
> -[no]xvgr bool yes Add specific codes (legends etc.) in the output xvg files for the xmgrace program
> -np int 1 Number of nodes, must be the same as used for grompp
> -nt int 1 Number of threads to start on each node
> -[no]v bool yes Be loud and noisy
> -[no]compact bool yes Write a compact log file
> -[no]sepdvdl bool no Write separate V and dVdl terms for each interaction type and node to the log file(s)
> -[no]multi bool no Do multiple simulations in parallel (only with -np > 1)
> -replex int 0 Attempt replica exchange every # steps
> -reseed int -1 Seed for replica exchange, -1 is generate a seed
> -[no]glas bool no Do glass simulation with special long range corrections
> -[no]ionize bool no Do a simulation including the effect of an X-Ray bombardment on your system
>
>
> Back Off! I just backed up run2.log to ./#run2.log.5#
>
> Back Off! I just backed up run0.log to ./#run0.log.12#
> Getting Loaded...
> Reading file run.tpr, VERSION 3.3.3 (single precision)
>
> Back Off! I just backed up run1.log to ./#run1.log.12#
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 3.3.3
> Source code file: ../../../../src/gmxlib/block_tx.c, line: 74
>
> Fatal error:
> 0: size=672, len=840, rx_count=0
>
> -------------------------------------------------------
>
> "They're Red Hot" (Red Hot Chili Peppers)
>
> Error on node 1, will try to stop all the nodes
> Halting parallel program mdrun_mpi on CPU 1 out of 3
>
> gcq#220: "They're Red Hot" (Red Hot Chili Peppers)
>
> -----------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 15964 failed on node n0 (192.168.0.10) with exit status 1.
> -----------------------------------------------------------------------------
> mpirun failed with exit status 1
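>
> (To see which MPI library this mdrun_mpi binary was linked against,
> ldd on the binary is usually enough; the grep patterns are only a
> rough filter:
>
> ldd `which mdrun_mpi` | grep -i -e mpi -e lam
>
> A LAM build lists liblam together with LAM's libmpi, while an Open
> MPI build pulls in Open MPI's libmpi and its own support libraries.)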
>
> I don't know what the problem is.
>
> Can anybody help me?
>
> Thanks
>
> Mario Jose
>
>
> /* WE ARE FREE */
> Hack to learn, don't learn to hack.
>
> /* Free Software Foundation */
> "Free software" is a matter of liberty, not price
> GNU's Not UNIX. Be free, use GNU/Linux
> www.gnu.org
> www.fsf.org
>
> /* Free Culture */
> free-culture.org
> creativecommons.org
>
> /* ... Hoarders may get piles of money,
> That is true, hackers, that is true.
> But they cannot help their neighbors;
> That's not good, hackers, that's not good ...
>
> Richard Stallman (www.stallman.org) */
>
> /* Human knowledge belongs to the world */
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users