Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] slowdown with infiniband and latest CentOS kernel
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-12-17 11:04:16


Are you binding the procs? We don't bind by default (this will change in 1.7.4), and binding can play a significant role when comparing across kernels.

add "--bind-to-core" to your cmd line

On Dec 17, 2013, at 7:09 AM, Noam Bernstein <noam.bernstein_at_[hidden]> wrote:

> On Dec 16, 2013, at 5:40 PM, Noam Bernstein <noam.bernstein_at_[hidden]> wrote:
>
>>
>> Once I have some more detailed information I'll follow up.
>
> OK - I've tried to characterize the behavior with VASP, which accounts for
> most of our cluster usage, and it's quite odd. I ran my favorite benchmarking
> job, repeated 4 times. As you can see below, in some cases using sm it's as
> fast as before (kernel 2.6.32-358.23.2.el6.x86_64), but mostly it's a factor
> of 2 slower. With openib and our older nodes it's always a factor of 2-4
> slower. With the newer nodes, in situations where using sm is possible, it's
> occasionally as fast as before, but sometimes it's 10-20 times slower. When
> using ib with the new nodes it's always much slower than before.
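
(For reference: the exact mpirun invocations aren't shown in this thread. BTL combinations like "sm only" or "sm+openib" are typically selected with the btl MCA parameter, roughly like the following, with "./vasp" standing in for the actual executable:

    mpirun --mca btl sm,self -np 16 ./vasp         # shared memory only, single node
    mpirun --mca btl openib,sm,self -np 32 ./vasp  # openib plus shared memory, multi-node

Leaving the parameter unset gives the "default" case, where Open MPI picks from the available BTLs itself.)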
>
> Open MPI is 1.7.3, recompiled under the new kernel. VASP is 5.3.3, which we've
> been using for months. Everything is compiled with an older stable version
> of the Intel compiler, as we've been doing for a long time.
>
> Some more, perhaps useful, information: I don't have actual data from the previous
> setup (perhaps I should roll back some nodes and check), but I generally
> expect to see 100% CPU usage on all the processes, either because they're
> doing numeric work or busy-waiting in MPI. However, now I see a few
> of the VASP processes at 100% and the others at 50-70% (say 4-6 on a given
> node at 100%, and the rest lower).
>
> If anyone has any ideas on what's going on, or how to debug further, I'd
> really appreciate some suggestions.
>
> Noam
>
> 8 core nodes (dual Xeon X5550)
>
> 8 MPI procs (single node)
> used to be 5.74 s
> now:
> btl: default or sm only or sm+openib: 5.5-9.3 s, mostly the larger times
> btl: openib: 10.0-12.2 s
>
> 16 MPI procs (2 nodes)
> used to be 2.88 s
> btl default or openib or sm+openib: 4.8 - 6.23 s
>
> 32 MPI procs (4 nodes)
> used to be 1.59 s
> btl default or openib or sm+openib: 2.73-4.49 s, but sometimes just fails
>
> At least once it gave these errors (the stack trace is incomplete, but it's probably in mpi_comm_rank, mpi_comm_size, or mpi_barrier):
> [compute-3-24:32566] [[59587,0],0]:route_callback trying to get message from [[59587,1],20] to [[59587,1],28]:102, routing loop
> [0] func:/usr/local/openmpi/1.7.3/x86_64/ib/gnu/lib/libopen-pal.so.6(opal_backtrace_print+0x1f) [0x2b5940c2dd9f]
> [1] func:/usr/local/openmpi/1.7.3/x86_64/ib/gnu/lib/openmpi/mca_rml_oob.so(+0x22b6) [0x2b5941f0f2b6]
> [2] func:/usr/local/openmpi/1.7.3/x86_64/ib/gnu/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_recv_complete+0x27f) [0x2b594333341f]
> [3] func:/usr/local/openmpi/1.7.3/x86_64/ib/gnu/lib/openmpi/mca_oob_tcp.so(+0x9d3a) [0x2b5943334d3a]
> [4] func:/usr/local/openmpi/1.7.3/x86_64/ib/gnu/lib/libopen-pal.so.6(opal_libevent2021_event_base_loop+0x8bc) [0x2b5940c3592c]
> [5] func:mpirun(orterun+0xe25) [0x404565]
> [6] func:mpirun(main+0x20) [0x403594]
> [7] func:/lib64/libc.so.6(__libc_start_main+0xfd) [0x3091c1ed1d]
> [8] func:mpirun() [0x4034b9]
>
>
> 16 core nodes (dual Xeon E5-2670)
>
> 8 MPI procs (single node)
> not sure what it used to be, but 3.3 s is plausible
> btl: default or sm or openib+sm: 3.3-3.4 s
> btl: openib 3.9-4.14 s
>
> 16 MPI procs (single node)
> used to be 2.07 s
> btl default or openib: 23.0-32.56 s
> btl sm or sm+openib: 1.94 s - 39.27 s (mostly the slower times)
>
> 32 MPI procs (2 nodes)
> used to be 1.24 s
> btl default or sm or openib or sm+openib: 30-97 s