Open MPI User's Mailing List Archives

From: Mostyn Lewis (Mostyn.Lewis_at_[hidden])
Date: 2007-09-30 19:15:46


Any ideas about this? One dual-core Opteron box is talking to another using
Infinicon/SilverStorm/QLogic hardware and mvapi (actually, the result is the
same using just Ethernet and TCP):

$OPENMPI_INFINICON_GCC_MVAPI/bin/mpicc cpi.c
$OPENMPI_INFINICON_GCC_MVAPI/bin/-np 4 -machinefile j ./a.out
[s0121:07450] [1,0]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)
[s0121:07451] [1,1]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)
[s0121:07453] [1,3]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)
[s0121:07452] [1,2]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)
Process 2 of 4 on s0121
Process 0 of 4 on s0121
Process 1 of 4 on s0121
Process 3 of 4 on s0121
7451:a.out *->3 (f=noaffinity,0,1,2,3)
7453:a.out *->2 (f=noaffinity,0,1,2,3)
7450:a.out *->3 (f=noaffinity,0,1,2,3)
7452:a.out *->3 (f=noaffinity,0,1,2,3)

The "Process N of 4" messages and the affinity output show that the job ran. The
mca_oob_tcp_peer_try_connect messages are somewhat annoying (imagine hundreds of
nodes). The 10.173.128.48 address is the launch node (s0120).
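
With hundreds of nodes it may help to collapse the oob messages into one line per host and target. Here is a minimal sketch, assuming the messages keep exactly the layout shown in the output above. The regular expression and the helper name summarize_oob_failures are illustrative only and are not part of Open MPI. (On Linux, errno 103 is ECONNABORTED.)

```python
import re
from collections import Counter

# Pattern inferred from the lines quoted in this post only; it is an
# assumption, not a documented Open MPI log format.
OOB_LINE = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+"
    r"\[(?P<src>\d+,\d+)\]-\[(?P<dst>\d+,\d+)\]\s+"
    r"mca_oob_tcp_peer_try_connect: connect to "
    r"(?P<addr>[\d.]+):(?P<port>\d+) failed: "
    r"(?P<reason>.*?) \((?P<errno>\d+)\)"
)


def summarize_oob_failures(lines):
    """Count failed OOB connects per (host, target address:port, errno)."""
    counts = Counter()
    for line in lines:
        m = OOB_LINE.search(line)
        if m:
            key = (m["host"], f'{m["addr"]}:{m["port"]}', int(m["errno"]))
            counts[key] += 1
    return counts


# Sample input copied from the job output above, plus one unrelated line.
sample = [
    "[s0121:07450] [1,0]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)",
    "[s0121:07451] [1,1]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)",
    "[s0121:07453] [1,3]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)",
    "[s0121:07452] [1,2]-[0,0] mca_oob_tcp_peer_try_connect: connect to 10.173.128.48:43359 failed: Software caused connection abort (103)",
    "Process 2 of 4 on s0121",
]

for (host, target, err), n in summarize_oob_failures(sample).items():
    print(f"{host} -> {target}: {n} failed connects (errno {err})")
```

This only condenses the output for reading; it does not address why the first connect attempt to the launch node fails.
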
This is SuSE SLES10:
s0120 Sun Sep 30 16:15:02 PDT 2007
SUSE Linux Enterprise Server 10 (x86_64)
Linux version 2.6.16.21-0.8-smp.lustre-1.6.1.X2200.MRL-0.8-smp (geeko_at_buildhost) (gcc version 4.1.0 (SUSE Linux)) #1 SMP Tue Aug 28 09:51:26 PDT 2007
Machine Model Sun Fire X2200 M2
Bus Speed 202 MHz
4 Cpus
CPU0 Dual-Core AMD Opteron(tm) Processor 2220(2814.485Mhz) stepping 3
L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
L2 cache: 1024 KB
CPU1 Dual-Core AMD Opteron(tm) Processor 2220(2814.485Mhz) stepping 3
L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
L2 cache: 1024 KB
CPU2 Dual-Core AMD Opteron(tm) Processor 2220(2814.485Mhz) stepping 3
L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
L2 cache: 1024 KB
CPU3 Dual-Core AMD Opteron(tm) Processor 2220(2814.485Mhz) stepping 3
L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
L2 cache: 1024 KB
16.0 GB memory

Regards,
Mostyn