Open MPI User's Mailing List Archives

Subject: [OMPI users] Bad performance when scattering big size of data?
From: Storm Zhang (stormzhg_at_[hidden])
Date: 2010-10-04 12:53:51

We have 64 compute nodes with dual quad-core, hyperthreaded CPUs, so 1024
compute units show up in the ROCKS 5.3 system. I'm trying to scatter an array
from the master node to the compute nodes with a C++ program compiled with
mpiCC and launched with mpirun.

Here is my test:

The array size is 18 KB * the number of compute nodes, and the array is
scattered to the compute nodes 5000 times repeatedly.
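For reference, the test described above can be sketched roughly as follows. This is a minimal illustration, not the attached code: the buffer sizes, variable names, and timing placement are assumptions.

```cpp
// Sketch of the benchmark: scatter ~18 KB per process from rank 0,
// repeated 5000 times, with the total time measured on the root.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 18 * 1024;   // per-process payload (assumption)
    const int iters = 5000;

    std::vector<char> sendbuf;
    if (rank == 0) sendbuf.resize(static_cast<size_t>(chunk) * size);
    std::vector<char> recvbuf(chunk);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; ++i) {
        MPI_Scatter(rank == 0 ? sendbuf.data() : nullptr, chunk, MPI_CHAR,
                    recvbuf.data(), chunk, MPI_CHAR, 0, MPI_COMM_WORLD);
    }
    double t1 = MPI_Wtime();

    if (rank == 0) std::printf("total time: %f s\n", t1 - t0);
    MPI_Finalize();
    return 0;
}
```

Built and run in the same way as the attached test, e.g. `mpiCC scatter_sketch.cpp -o scatter_sketch` followed by the mpirun command shown below.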

The average running time (seconds):

100 nodes: 170
400 nodes: 690
500 nodes: 855
600 nodes: 2550
700 nodes: 2720
800 nodes: 2900

There is a big jump in running time from 500 nodes to 600 nodes, and I don't
know what the problem is.
I tried both OMPI 1.3.2 and OMPI 1.4.2. The running time is a little faster
for all the tests in 1.4.2, but the jump still exists.
I tried using either the Bcast function or simple Send/Recv, which give very
close results.
I tried both running it directly and running it under SGE, and got the same
results.
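The Send/Recv variant mentioned above might look something like the sketch below; this is an illustration of the general approach (the root sends each rank its chunk point-to-point instead of calling MPI_Scatter), not the attached code, and the names and sizes are assumptions.

```cpp
// Sketch of a point-to-point replacement for one MPI_Scatter call:
// rank 0 sends each other rank its chunk; everyone else receives.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 18 * 1024;   // per-rank payload (assumption)
    std::vector<char> recvbuf(chunk);

    if (rank == 0) {
        std::vector<char> sendbuf(static_cast<size_t>(chunk) * size);
        // Root keeps its own chunk; one message per remaining rank.
        for (int r = 1; r < size; ++r) {
            MPI_Send(sendbuf.data() + static_cast<size_t>(r) * chunk,
                     chunk, MPI_CHAR, r, /*tag=*/0, MPI_COMM_WORLD);
        }
    } else {
        MPI_Recv(recvbuf.data(), chunk, MPI_CHAR, 0, /*tag=*/0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

With this variant the root serializes all sends, whereas MPI_Scatter may use a tree-based algorithm internally, so close timings between the two can itself be a clue about where the bottleneck is.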

The code and the ompi_info output are attached to this email. The command for
a direct run is:
/opt/openmpi/bin/mpirun --mca btl_tcp_if_include eth0 --machinefile
../machines -np 600 scatttest

The ifconfig of head node for eth0 is:
eth0 Link encap:Ethernet HWaddr 00:26:B9:56:8B:44
          inet addr: Bcast: Mask:
          inet6 addr: fe80::226:b9ff:fe56:8b44/64 Scope:Link
          RX packets:1096060373 errors:0 dropped:2512622 overruns:0 frame:0
          TX packets:513387679 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:832328807459 (775.1 GiB) TX bytes:250824621959 (233.5 GiB)
          Interrupt:106 Memory:d6000000-d6012800

A typical ifconfig of a compute node is:
eth0 Link encap:Ethernet HWaddr 00:21:9B:9A:15:AC
          inet addr: Bcast: Mask:
          inet6 addr: fe80::221:9bff:fe9a:15ac/64 Scope:Link
          RX packets:362716422 errors:0 dropped:0 overruns:0 frame:0
          TX packets:349967746 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:139699954685 (130.1 GiB) TX bytes:338207741480 (314.9 GiB)
          Interrupt:82 Memory:d6000000-d6012800

Can anyone help me out with this? It has been bothering me a lot.

Thank you very much.


  • application/octet-stream attachment: ompi_info