Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Fwd: problem for multiple clusters using mpirun
From: Hamid Saeed (e.hamidsaeed_at_[hidden])
Date: 2014-03-21 10:24:35


Here is the output of /sbin/ifconfig on karp:

hsaeed@karp:~$ /sbin/ifconfig
br0 Link encap:Ethernet HWaddr 00:25:90:59:c9:ba
          inet addr:134.106.3.231 Bcast:134.106.3.255 Mask:255.255.255.0
          inet6 addr: fe80::225:90ff:fe59:c9ba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:49080961 errors:0 dropped:50263 overruns:0 frame:0
          TX packets:43279252 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:41348407558 (38.5 GiB) TX bytes:80505842745 (74.9 GiB)

br1 Link encap:Ethernet HWaddr 00:25:90:59:c9:bb
          inet addr:134.106.53.231 Bcast:134.106.53.255 Mask:255.255.255.0
          inet6 addr: fe80::225:90ff:fe59:c9bb/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:41573060 errors:0 dropped:50261 overruns:0 frame:0
          TX packets:1693509 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6177072160 (5.7 GiB) TX bytes:230617435 (219.9 MiB)

br2 Link encap:Ethernet HWaddr 00:c0:0a:ec:02:e7
          inet addr:10.231.2.231 Bcast:10.231.2.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eth0 Link encap:Ethernet HWaddr 00:25:90:59:c9:ba
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:69108377 errors:0 dropped:0 overruns:0 frame:0
          TX packets:86459066 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:43533091399 (40.5 GiB) TX bytes:83359370885 (77.6 GiB)
          Memory:dfe60000-dfe80000

eth1 Link encap:Ethernet HWaddr 00:25:90:59:c9:bb
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:43531546 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1716151 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7201915977 (6.7 GiB) TX bytes:232026383 (221.2 MiB)
          Memory:dfee0000-dff00000

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:10890707 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10890707 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:36194379576 (33.7 GiB) TX bytes:36194379576 (33.7 GiB)

tap0 Link encap:Ethernet HWaddr 00:c0:0a:ec:02:e7
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

When I execute the following command:

hsaeed@karp:~/Task4_mpi/scatterv$ mpiexec -n 2 -host wirth,karp ./a.out

I receive this error:

[wirth][[59430,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
connect() to 10.231.2.231 failed: Connection refused (111)

NOTE: karp and wirth are two machines in an SSH cluster.
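
For anyone reading along, here is a minimal sketch (not something run as part of this thread) that cross-checks the address in the connect() error against the ifconfig output above. The helper names `parse_ifconfig`, `owner_of`, and `addressed_but_not_running` are made up for illustration, and the parsing assumes the older net-tools ifconfig layout shown in this mail. The embedded text is an abbreviated copy of the output above, limited to the interfaces that have an IPv4 address.

```python
import re

# Abbreviated copy of the ifconfig output from karp (header, inet and flag lines only).
IFCONFIG_OUTPUT = """\
br0 Link encap:Ethernet HWaddr 00:25:90:59:c9:ba
          inet addr:134.106.3.231 Bcast:134.106.3.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

br1 Link encap:Ethernet HWaddr 00:25:90:59:c9:bb
          inet addr:134.106.53.231 Bcast:134.106.53.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

br2 Link encap:Ethernet HWaddr 00:c0:0a:ec:02:e7
          inet addr:10.231.2.231 Bcast:10.231.2.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          UP LOOPBACK RUNNING MTU:16436 Metric:1
"""


def parse_ifconfig(text):
    """Return {interface: {"addr": IPv4 string or None, "flags": set of flag names}}."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new interface stanza.
            current = line.split()[0]
            interfaces[current] = {"addr": None, "flags": set()}
            continue
        if current is None:
            continue
        match = re.search(r"inet addr:(\S+)", line)
        if match:
            interfaces[current]["addr"] = match.group(1)
        if "MTU:" in line:
            # The flags line mixes bare flags (UP, RUNNING, ...) with key:value pairs.
            interfaces[current]["flags"] = {t for t in line.split() if ":" not in t}
    return interfaces


def owner_of(interfaces, ip):
    """Return the name of the interface that carries the given IPv4 address, or None."""
    for name, info in interfaces.items():
        if info["addr"] == ip:
            return name
    return None


def addressed_but_not_running(interfaces):
    """Return interfaces that have an IPv4 address but lack the RUNNING flag."""
    return sorted(
        name for name, info in interfaces.items()
        if info["addr"] and "RUNNING" not in info["flags"]
    )


if __name__ == "__main__":
    interfaces = parse_ifconfig(IFCONFIG_OUTPUT)
    name = owner_of(interfaces, "10.231.2.231")  # address from the connect() error
    print("address belongs to:", name)
    print("flags:", sorted(interfaces[name]["flags"]))
    print("addressed but not RUNNING:", addressed_but_not_running(interfaces))
```

On the output above, the address from the error maps to br2 on karp, which is UP but not RUNNING and shows zero RX/TX packets. If that interface is indeed the cause, Open MPI's `btl_tcp_if_include` and `btl_tcp_if_exclude` MCA parameters can restrict the TCP BTL to chosen interfaces; whether that resolves this particular failure has not been confirmed here.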

On Fri, Mar 21, 2014 at 3:13 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:

> On Mar 21, 2014, at 10:09 AM, Hamid Saeed <e.hamidsaeed_at_[hidden]> wrote:
>
> > > I think I have a TCP connection. As far as I know, my cluster is not
> > > configured for InfiniBand (IB).
>
> Ok.
>
> > > But even for TCP connections, these commands
> > >
> > > mpirun -n 2 -host master,node001 --mca btl tcp,sm,self ./helloworldmpi
> > > mpirun -n 2 -host master,node001 ./helloworldmpi
> > >
> > > do not work. They output an error like:
> > > [btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect]
> > > connect() to xx.xxx.x.xxx failed: Connection refused (111)
>
> What are the IP addresses reported by connect()? (i.e., the address you
> X'ed out)
>
> Send the output from ifconfig on each of your servers. Note that some
> Linux distributions do not put ifconfig in the default PATH of normal
> users; look for it in /sbin/ifconfig or /usr/sbin/ifconfig.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
_______________________________________________
Hamid Saeed
CoSynth GmbH & Co. KG
Escherweg 2 - 26121 Oldenburg - Germany
Tel +49 441 9722 738 | Fax -278
http://www.cosynth.com
_______________________________________________