It looks like the daemon isn't seeing the other interface address on host x2. Can you ssh to x2 and send the contents of ifconfig -a?

Ralph

On Oct 31, 2008, at 9:18 AM, Allan Menezes wrote:

users-request@open-mpi.org wrote:

Today's Topics:

   1. Openmpi ver1.3beta1 (Allan Menezes)
   2. Re: Openmpi ver1.3beta1 (Ralph Castain)
   3. Re: Equivalent .h files (Benjamin Lamptey)
   4. Re: Equivalent .h files (Jeff Squyres)
   5. ompi-checkpoint is hanging (Matthias Hovestadt)
   6. unsubscibe (Bertrand P. S. Russell)
   7. Re: ompi-checkpoint is hanging (Tim Mattox)


----------------------------------------------------------------------

Message: 1
Date: Fri, 31 Oct 2008 02:06:09 -0400
From: Allan Menezes <amenezes007@sympatico.ca>
Subject: [OMPI users] Openmpi ver1.3beta1
To: users@open-mpi.org
Message-ID: <BLU0-SMTP224B5E356302AC7AA4481088200@phx.gbl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,
    I built Open MPI version 1.3b1 with the following configure command:
./configure --prefix=/opt/openmpi13b1 --enable-mpi-threads --with-threads=posix --disable-ipv6
I have six nodes, x1 through x6.
I distributed /opt/openmpi13b1 with scp from the head node to all the other nodes.
When I run the following command:
mpirun --prefix /opt/openmpi13b1 --host x1 hostname
it works on x1, printing out the hostname of x1.
But when I type
mpirun --prefix /opt/openmpi13b1 --host x2 hostname
it hangs and does not give me any output.
I have a 6-node Intel quad-core cluster with OSCAR and PCI Express gigabit Ethernet for eth0.
Can somebody advise?
Thank you very much.
Allan Menezes


------------------------------

Message: 2
Date: Fri, 31 Oct 2008 02:41:59 -0600
From: Ralph Castain <rhc@lanl.gov>
Subject: Re: [OMPI users] Openmpi ver1.3beta1
To: Open MPI Users <users@open-mpi.org>
Message-ID: <E8AF5AAF-99CB-4EFC-AA97-5385CE333AD2@lanl.gov>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

When you typed the --host x1 command, were you sitting on x1?  
Likewise, when you typed the --host x2 command, were you not on host x2?

If the answer to both questions is "yes", then my guess is that  
something is preventing you from launching a daemon on host x2. Try  
adding --leave-session-attached to your cmd line and see if any error  
messages appear. And check the FAQ for tips on how to set up for ssh  
launch (I'm assuming that is what you are using).

http://www.open-mpi.org/faq/?category=rsh

Ralph

On Oct 31, 2008, at 12:06 AM, Allan Menezes wrote:

  
Hi Ralph,
   Yes, that is true: I ran both commands on x1, and version 1.2.8 works on the same setup without a problem.
Here is the output with --leave-session-attached added:
[allan@x1 ~]$ mpiexec --prefix /opt/openmpi13b2  --leave-session-attached -host x2 hostname
[x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0] mca_oob_tcp_peer_try_connect: connect to 192.168.0.198:0 failed: Network is unreachable (101)
[x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0] mca_oob_tcp_peer_try_connect: connect to 192.168.122.1:0 failed: Network is unreachable (101)
[x2.brampton.net:02236] [[1354,0],1] routed:binomial: Connection to lifeline [[1354,0],0] lost
--------------------------------------------------------------------------
A daemon (pid 7665) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpiexec: clean termination accomplished

[allan@x1 ~]$
However, my main eth0 IP is 192.168.1.1 and the internet gateway is 192.168.0.1.
Any solutions?
Allan Menezes
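
A note on reading the output above: the daemon on x2 is trying to connect back to 192.168.0.198 and 192.168.122.1, and neither is on the 192.168.1.x network that eth0 uses. The sketch below, in plain sh, pulls the failed addresses out of the two log lines and lists the ones outside that network. The 192.168.1. prefix is an assumption inferred from the eth0 address given in the message, not something the log itself states.

```shell
#!/bin/sh
# The two connection-failure lines copied from the mpiexec output above.
log='[x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0] mca_oob_tcp_peer_try_connect: connect to 192.168.0.198:0 failed: Network is unreachable (101)
[x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0] mca_oob_tcp_peer_try_connect: connect to 192.168.122.1:0 failed: Network is unreachable (101)'

# Assumption: the cluster's private network is 192.168.1.x, inferred from
# the eth0 address 192.168.1.1 mentioned in the message.
cluster_prefix='192.168.1.'

# Extract each address the daemon on x2 tried and failed to reach.
unreachable=$(printf '%s\n' "$log" |
  sed -n 's/.*connect to \([0-9.]*\):[0-9]* failed: Network is unreachable.*/\1/p' |
  sort -u)

# Collect every failed address that lies outside the cluster network.
off_cluster=''
for addr in $unreachable; do
  case "$addr" in
    "$cluster_prefix"*) ;;                   # on the cluster network
    *) off_cluster="$off_cluster $addr" ;;   # outside the cluster network
  esac
done
off_cluster=${off_cluster# }                 # drop the leading space

echo "failed addresses outside $cluster_prefix*: $off_cluster"
```

Both addresses fall outside the assumed cluster network, which fits Ralph's reading at the top of this thread that x2 is not being offered an address it can route to. 192.168.122.1 is also the default address of libvirt's virbr0 bridge, so it may belong to a virtual interface on x1; the ifconfig -a output would confirm or rule that out.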



  
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
    


------------------------------

Message: 3
Date: Fri, 31 Oct 2008 09:48:43 +0000
From: "Benjamin Lamptey" <bllamptey@gmail.com>
Subject: Re: [OMPI users] Equivalent .h files
To: users@open-mpi.org
Message-ID:
	<71ec5a370810310248g91a4d9ftca708e6e6306d0c9@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hello again,
I have to be more specific with my problem.

1) I am using the Mac OS X (Leopard) operating system.
When I do uname -a, I get Darwin Kernel Version 9.5.0

2) My code is Fortran 90

3) I tried using the mpif90 wrapper and I got the following message

xxxxxxxxxxxxx
mpif90  -c -O3   /Users/lamptey/projectb/src/blag_real_burnmpi.f90
--------------------------------------------------------------------------
Unfortunately, this installation of Open MPI was not compiled with
Fortran 90 support.  As such, the mpif90 compiler is non-functional.

--------------------------------------------------------------------------
make: *** [blag_real_burnmpi.o] Error 1
xxxxxxxxxxxxx

4) I have the g95 compiler installed. When I try using g95 directly
(with include "mpif.h" or 'mpif.h'), I get the following message:

xxxxxxxxxxxxxx
g95 -fno-pic -c -O3   /Users/lamptey/projectb/src/blag_real_burnmpi.f90
Error: Can't open included file 'mpif.h'
make: *** [blag_real_burnmpi.o] Error 1
xxxxxxxxxxxxxxx

5) What is people's experience in this case?

Thanks
Ben

On Thu, Oct 30, 2008 at 2:33 PM, Benjamin Lamptey <bllamptey@gmail.com> wrote:

  
Hello,
I am new to Open MPI and would like to know something basic.

What is the Open MPI equivalent of "mpif.h", which is normally
"included" at the beginning of MPI codes (Fortran in this case)?

I would appreciate the same for cpp as well.

Thanks
Ben

    

------------------------------

Message: 4
Date: Fri, 31 Oct 2008 06:51:01 -0400
From: Jeff Squyres <jsquyres@cisco.com>
Subject: Re: [OMPI users] Equivalent .h files
To: Open MPI Users <users@open-mpi.org>
Message-ID: <A493DF4D-3DFF-46E4-8C90-D3771527379D@cisco.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

The Open MPI that ships with Leopard does not include Fortran support
because OS X does not ship with a Fortran compiler (this was Apple's
decision, not ours).  If you have Fortran MPI applications, you'll
need to a) download and install your own Fortran compiler (e.g., from
http://hpc.sf.net/), and b) install your own copy of Open MPI that
includes Fortran support (e.g., install it to /opt/openmpi or somesuch --
I do not recommend installing it over the system-installed Open MPI).

Once you do this, mpif90 should work as expected, and statements like
"use mpi" or "include 'mpif.h'" should function properly.

