
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Please help me with this simple setup. i am stuck
From: Gus Correa (gus_at_[hidden])
Date: 2009-05-09 20:32:59


Hi Venu

As a general suggestion, take a look at the OpenMPI FAQs,
especially those about running MPI jobs and troubleshooting.
They will probably point you in the right direction:

http://www.open-mpi.org/faq/

Luis Vitorio Cargnini wrote:
> maybe add the slots=1 for example to your first node
>
> Le 09-05-09 à 11:42, Venu Gopal a écrit :
>
>> I am venu,
>>
>> I have tried to setup a simple 2 node openmpi system.
>>
>> on two machines one is running debian lenny (ip 10.0.3.1)
>> other is running ubuntu hardy (ip 10.0.3.3)
>>
>> I am getting error when i try to execute a file using mpiexec, i am
>> sure password is correct. as ssh is working
>> and the file pi3 is in directory code which in turn is in my home
>> directory venu.
>>
>> the file pi.c is below
>>
>>
>>
>> /* To run this program: */
>> /*--------------------- */
>> /* */
>> /* */
>> /* Issue: time mpirun -np [nprocs] ./pi (SGI, Beowulf) */
>> /* */
>> /* */
>> /* ------------------------------------------------------------------ */
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> #include "mpi.h"
>>
>> int main(int argc, char *argv[])
>> {
>>     int i, n;
>>     double h, pi, x;
>>
>>     int me, nprocs;
>>     double piece;
>>
>>     /* --------------------------------------------------- */
>>
>>     MPI_Init (&argc, &argv);
>>
>>     MPI_Comm_size (MPI_COMM_WORLD, &nprocs);
>>     MPI_Comm_rank (MPI_COMM_WORLD, &me);
>>
>>     /* --------------------------------------------------- */
>>
>>     if (me == 0)
>>     {
>>         printf("%s", "Input number of intervals:\n");
>>         scanf ("%d", &n);

This is not why your runs are failing,
but it may cause future runs to fail.

This interactive step above is fine when you are running
on a single machine, i.e. your local machine.
However, it may get tricky if your rank 0 process (i.e. me == 0)
is located on a remote machine, where you don't have an interactive
shell connection.
Who is going to read the "Input number of intervals:" message,
and who is going to type in the value of "n" for scanf to read
then?

I suggest that you change this part of the code: remove the printf
and scanf lines in the "if (me == 0)" block, and hardwire the value
of n (say, n = 10000 or 100000).
This is easy to do.

A slightly harder alternative is to pass n on the command line
as argv[1], or to redirect STDIN from a file:
mpiexec ..... pi < input_file,
where input_file has a single number in it (the value of n).
In the argv[1] case you need to modify the "if (me == 0)" block
to assign argv[1] to n, I guess.

>>     }
>>
>>     /* --------------------------------------------------- */
>>
>>     MPI_Bcast (&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
>>
>>     /* --------------------------------------------------- */
>>
>>     h = 1. / (double) n;
>>
>>     piece = 0.;
>>
>>     for (i = me+1; i <= n; i += nprocs)
>>     {
>>         x = (i-1)*h;
>>
>>         piece = piece + ( 4/(1+(x)*(x)) + 4/(1+(x+h)*(x+h)) ) / 2 * h;
>>     }
>>
>>     printf("%d: pi = %25.15f\n", me, piece);
>>
>>     /* --------------------------------------------------- */
>>
>>     MPI_Reduce (&piece, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
>>
>>     /* --------------------------------------------------- */
>>
>>     if (me == 0)
>>     {
>>         printf("pi = %25.15f\n", pi);
>>     }
>>
>>     /* --------------------------------------------------- */
>>
>>     MPI_Finalize();
>>
>>     return 0;
>> }
>>
>>
>>
>> the code directory is nfs shared and mounted on the client system
>> which is 10.0.3.3.

That is a very nice way to set things up,
much better than having to copy everything over to the local
file system.

>> the server system is 10.0.3.1
>>
>> i can ping the client from server and also server from client. ssh is
>> working bothways.

Did you set up passwordless ssh on both hosts?
You must have a passwordless ssh connection for OpenMPI (or any MPI)
to work with more than one host.

>>
>> the /etc/openmpi/openmpi-default-hostfile is having the line on the
>> first node ie. 10.0.3.1
>>
>> 10.0.3.3 slots=2
>>

If you want to run on both hosts you need a hostfile with both hosts.
Assuming both have two slots, something like this:

10.0.3.1 slots=2
10.0.3.3 slots=2

Actually, it may be better to create this file in your execution
directory, instead of using the openmpi-default-hostfile.
If you do so, also use the -hostfile option of mpiexec.

Alternatively, you can list the hosts on the mpiexec command line,
using the -host option.

"man mpiexec" is your friend!
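For example (just a sketch; this assumes the two-host hostfile above
is saved as "myhosts" in the directory you launch from, and that two
slots per host are right for your machines):

```shell
# Run on both hosts, using the hostfile in the current directory:
mpiexec -hostfile myhosts -np 4 ./code/pi3

# Or list the hosts directly on the command line instead:
mpiexec -host 10.0.3.1,10.0.3.3 -np 4 ./code/pi3
```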

>>
>> the other nodes file is just empty. i mean only comments are there.
>>
>>
>> this is the error is get when i execute.
>>
>>
>> venu_at_mainframe:~$ mpiexec -np 3 ./code/pi3
>> venu_at_10.0.3.3's password:

It looks like 10.0.3.3 is asking for password when ssh tries to connect
to it from 10.0.3.1 (where you launched the mpiexec command).

You must setup ssh with **passwordless** connections!
Here is one way to do it:

http://agenda.clustermonkey.net/index.php/Passwordless_SSH_(and_RSH)_Logins

Or you can Google for equivalent solutions.
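In a nutshell, one common way looks like this (a sketch; it assumes
OpenSSH, as shipped with Debian and Ubuntu):

```shell
# On the machine where you launch mpiexec (10.0.3.1):
ssh-keygen -t rsa          # accept the defaults, leave the passphrase empty
ssh-copy-id venu@10.0.3.3  # appends your public key to ~/.ssh/authorized_keys there
ssh venu@10.0.3.3 hostname # should now run without asking for a password
```

If your home directory happens to be NFS-shared between the two
hosts, the same authorized_keys file is visible on both, which makes
this even simpler.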

>> --------------------------------------------------------------------------
>> Could not execute the executable "./code/pi3": Exec format error
>>

Do you have the same architecture on both computers?
(I.e., both x86 or both x86_64, but not one of each?)
An "Exec format error" usually means the binary was built for a
different architecture (or binary format) than the machine that is
trying to run it.

>> This could mean that your PATH or executable name is wrong, or that
>> you do not
>> have the necessary permissions. Please ensure that the executable is
>> able to be
>> found and executed.
>>

Ssh will put you in your home directory on the remote machine.
However, your executable is not there, but in whatever/code/pi3.

You need to use the mpiexec -path option to tell it where to find
the executable.
"man mpiexec" is your friend!

>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> Could not execute the executable "./code/pi3": Exec format error
>>
>> This could mean that your PATH or executable name is wrong, or that
>> you do not
>> have the necessary permissions. Please ensure that the executable is
>> able to be
>> found and executed.
>>
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> Could not execute the executable "./code/pi3": Exec format error
>>
>> This could mean that your PATH or executable name is wrong, or that
>> you do not
>> have the necessary permissions. Please ensure that the executable is
>> able to be
>> found and executed.
>>
>> --------------------------------------------------------------------------
>>
>>
>>
>> now, when i remove that line from
>> /etc/openmpi/openmpi-default-hostfile on the first node
>>
>> the program compiles and executes on the first node node.
>>
>> same, when i compile it and execute it on the second node, it works.
>>
>> only problem is when i try to run it on both.
>>
>> i get the error mesage as above.
>>

See the suggestions above.

>>
>> someone, please help me. as i am trying to setup this system for the
>> first time.
>>
>> and i am stuck.
>>
>> i am fairly good with linux. so i know my way around linux. but am
>> stuck with open mpi.

The main new ingredient, besides Linux, is the network.
First you must tell OpenMPI which hosts in the network
you want to use (in a correct hostfile).
Moreover, the two hosts must talk to each other smoothly:
they must agree about passwordless connections,
about where the executables are, etc.
You are the master, and you must tell both hosts how to agree
on these things.

You'll get there; just be patient and read the available
documentation carefully.

Set up passwordless ssh connections.
Read the OpenMPI FAQ.
Read the mpiexec man page.

They will help you.

Good luck!
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

>> --
>>
>> Regards,
>>
>> Venu Gopal
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users