Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] users Digest, Vol 1052, Issue 1
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-10-31 11:34:52


It looks like the daemon isn't seeing the other interface address on
host x2. Can you ssh to x2 and send the output of ifconfig -a?

Ralph
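
For readers following the thread, here is a minimal sketch of the subnet reasoning behind the "Network is unreachable" errors in the quoted log. It only checks whether each address the x2 daemon tried to contact lies on a network directly attached to x2. The x2 address 192.168.1.2 and the /24 netmask are assumptions made for illustration (the thread does not state them), and the check ignores any default route x2 may have, so the ifconfig output requested above is still what settles the question.

```python
import ipaddress

# Addresses the daemon on x2 tried to reach, copied from the quoted log.
advertised = ["192.168.0.198", "192.168.122.1"]

# ASSUMPTION: x2 has one cluster interface on the same /24 as the head
# node's eth0 (192.168.1.1). The address 192.168.1.2 is hypothetical.
local_interfaces = [ipaddress.ip_interface("192.168.1.2/24")]


def directly_reachable(target, interfaces):
    """Return True if target lies inside a network attached to an interface."""
    addr = ipaddress.ip_address(target)
    return any(addr in iface.network for iface in interfaces)


# Check the advertised addresses plus the head node's eth0 address.
results = {
    ip: directly_reachable(ip, local_interfaces)
    for ip in advertised + ["192.168.1.1"]
}

for ip, ok in results.items():
    status = "on a local subnet" if ok else "no directly attached network"
    print(f"{ip}: {status}")
```

Under these assumptions, neither address from the log is on a subnet attached to x2, while 192.168.1.1 is. If ifconfig confirms that picture, one commonly suggested direction is to restrict Open MPI to the cluster interface through its TCP interface MCA parameters (for example btl_tcp_if_include). Parameter names vary between releases, so check ompi_info on the installed version before relying on any of them.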

On Oct 31, 2008, at 9:18 AM, Allan Menezes wrote:

> users-request_at_[hidden] wrote:
>> Today's Topics:
>>
>> 1. Openmpi ver1.3beta1 (Allan Menezes)
>> 2. Re: Openmpi ver1.3beta1 (Ralph Castain)
>> 3. Re: Equivalent .h files (Benjamin Lamptey)
>> 4. Re: Equivalent .h files (Jeff Squyres)
>> 5. ompi-checkpoint is hanging (Matthias Hovestadt)
>> 6. unsubscibe (Bertrand P. S. Russell)
>> 7. Re: ompi-checkpoint is hanging (Tim Mattox)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 31 Oct 2008 02:06:09 -0400
>> From: Allan Menezes <amenezes007_at_[hidden]>
>> Subject: [OMPI users] Openmpi ver1.3beta1
>> To: users_at_[hidden]
>> Message-ID: <BLU0-SMTP224B5E356302AC7AA4481088200_at_phx.gbl>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Hi,
>> I built Open MPI version 1.3b1 with the following configure command:
>> ./configure --prefix=/opt/openmpi13b1 --enable-mpi-threads --with-threads=posix --disable-ipv6
>> I have six nodes, x1 through x6.
>> I distributed /opt/openmpi13b1 with scp from the head node to all
>> other nodes.
>> When I run the following command:
>> mpirun --prefix /opt/openmpi13b1 --host x1 hostname
>> it works on x1, printing out the hostname of x1.
>> But when I type
>> mpirun --prefix /opt/openmpi13b1 --host x2 hostname
>> it hangs and does not give me any output.
>> I have a 6-node Intel quad-core cluster with OSCAR and PCI Express
>> gigabit Ethernet for eth0.
>> Can somebody advise?
>> Thank you very much.
>> Allan Menezes
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 31 Oct 2008 02:41:59 -0600
>> From: Ralph Castain <rhc_at_[hidden]>
>> Subject: Re: [OMPI users] Openmpi ver1.3beta1
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <E8AF5AAF-99CB-4EFC-AA97-5385CE333AD2_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> When you typed the --host x1 command, were you sitting on x1?
>> Likewise, when you typed the --host x2 command, were you not on
>> host x2?
>>
>> If the answer to both questions is "yes", then my guess is that
>> something is preventing you from launching a daemon on host x2. Try
>> adding --leave-session-attached to your cmd line and see if any error
>> messages appear. And check the FAQ for tips on how to set up for ssh
>> launch (I'm assuming that is what you are using).
>>
>> http://www.open-mpi.org/faq/?category=rsh
>>
>> Ralph
>>
>> On Oct 31, 2008, at 12:06 AM, Allan Menezes wrote:
>>
>>
> Hi Ralph,
> Yes, that is true: I ran both commands on x1, and version 1.2.8 works
> on the same setup without a problem.
> Here is the output with --leave-session-attached added:
> [allan_at_x1 ~]$ mpiexec --prefix /opt/openmpi13b2 --leave-session-attached -host x2 hostname
> [x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0]
> mca_oob_tcp_peer_try_connect: connect to 192.168.0.198:0 failed:
> Network is unreachable (101)
> [x2.brampton.net:02236] [[1354,0],1]-[[1354,0],0]
> mca_oob_tcp_peer_try_connect: connect to 192.168.122.1:0 failed:
> Network is unreachable (101)
> [x2.brampton.net:02236] [[1354,0],1] routed:binomial: Connection to
> lifeline [[1354,0],0] lost
> --------------------------------------------------------------------------
> A daemon (pid 7665) died unexpectedly with status 1 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpiexec: clean termination accomplished
>
> [allan_at_x1 ~]$
> However, my main eth0 IP is 192.168.1.1 and the internet gateway is
> 192.168.0.1.
> Any solutions?
> Allan Menezes
>
>
>
>>> Hi,
>>> I built Open MPI version 1.3b1 with the following configure command:
>>> ./configure --prefix=/opt/openmpi13b1 --enable-mpi-threads --with-threads=posix --disable-ipv6
>>> I have six nodes, x1 through x6.
>>> I distributed /opt/openmpi13b1 with scp from the head node to all
>>> other nodes.
>>> When I run the following command:
>>> mpirun --prefix /opt/openmpi13b1 --host x1 hostname
>>> it works on x1, printing out the hostname of x1.
>>> But when I type
>>> mpirun --prefix /opt/openmpi13b1 --host x2 hostname
>>> it hangs and does not give me any output.
>>> I have a 6 node intel quad core cluster with OSCAR and pci express
>>> gigabit ethernet for eth0
>>> Can somebody advise?
>>> Thank you very much.
>>> Allan Menezes
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Fri, 31 Oct 2008 09:48:43 +0000
>> From: "Benjamin Lamptey" <bllamptey_at_[hidden]>
>> Subject: Re: [OMPI users] Equivalent .h files
>> To: users_at_[hidden]
>> Message-ID:
>> <71ec5a370810310248g91a4d9ftca708e6e6306d0c9_at_[hidden]>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hello again,
>> I have to be more specific with my problem.
>>
>> 1) I am using the Mac OS X (Leopard) operating system.
>> When I do uname -a, I get Darwin Kernel Version 9.5.0
>>
>> 2) My code is Fortran 90
>>
>> 3) I tried using the mpif90 wrapper and I got the following message
>>
>> xxxxxxxxxxxxx
>> mpif90 -c -O3 /Users/lamptey/projectb/src/blag_real_burnmpi.f90
>> --------------------------------------------------------------------------
>> Unfortunately, this installation of Open MPI was not compiled with
>> Fortran 90 support. As such, the mpif90 compiler is non-functional.
>>
>> --------------------------------------------------------------------------
>> make: *** [blag_real_burnmpi.o] Error 1
>> xxxxxxxxxxxxx
>>
>> 4) I have the g95 compiler installed. When I try using g95
>> (with include "mpif.h" or 'mpif.h'), I get the following message:
>>
>> xxxxxxxxxxxxxx
>> g95 -fno-pic -c -O3 /Users/lamptey/projectb/src/blag_real_burnmpi.f90
>> Error: Can't open included file 'mpif.h'
>> make: *** [blag_real_burnmpi.o] Error 1
>> xxxxxxxxxxxxxxx
>>
>> 5) What is people's experience in this case?
>>
>> Thanks
>> Ben
>>
>> On Thu, Oct 30, 2008 at 2:33 PM, Benjamin Lamptey <bllamptey_at_[hidden]> wrote:
>>
>>
>>> Hello,
>>> I am new to Open MPI and would like to know something basic.
>>>
>>> What is the equivalent of "mpif.h" in Open MPI, which is normally
>>> included at the beginning of MPI codes (Fortran in this case)?
>>>
>>> I would appreciate the same for cpp as well.
>>>
>>> Thanks
>>> Ben
>>>
>>>
>> ------------------------------
>>
>> Message: 4
>> Date: Fri, 31 Oct 2008 06:51:01 -0400
>> From: Jeff Squyres <jsquyres_at_[hidden]>
>> Subject: Re: [OMPI users] Equivalent .h files
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <A493DF4D-3DFF-46E4-8C90-D3771527379D_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> The Open MPI that ships with Leopard does not include Fortran support
>> because OS X does not ship with a Fortran compiler (this was Apple's
>> decision, not ours). If you have Fortran MPI applications, you'll
>> need to a) download and install your own Fortran compiler (e.g.,
>> http://hpc.sf.net/), and b) install your own copy of Open MPI that
>> includes Fortran support (e.g., install it to /opt/openmpi or somesuch;
>> I do not recommend installing it over the system-installed Open MPI).
>>
>> Once you do this, mpif90 should work as expected, and statements like
>> "use mpi" or "include 'mpif.h'" should function properly.
>>
>>