Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Openib with > 32 cores per node
From: Samuel K. Gutierrez (samuel_at_[hidden])
Date: 2011-05-19 10:27:46


Hi,

Try the following QP (queue pair) parameters, which use only shared receive queues:

-mca btl_openib_receive_queues S,12288,128,64,32:S,65536,128,64,32
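
For anyone unsure how to read that value, here is a minimal sketch that splits a btl_openib_receive_queues string into its queue specifications. It assumes the layout described in the Open MPI FAQ: colon-separated queues, each starting with a type letter (P for per-peer, S for shared receive queue, X for XRC) followed by the buffer size in bytes, the number of buffers, and optional tuning fields. The names QueueSpec, parse_receive_queues and uses_only_shared are hypothetical helpers for illustration, not part of Open MPI.

```python
from dataclasses import dataclass

# Type letters as described in the Open MPI FAQ (assumed layout).
KINDS = {"P": "per-peer", "S": "shared receive queue", "X": "XRC"}


@dataclass
class QueueSpec:
    """One colon-separated entry of btl_openib_receive_queues (hypothetical helper)."""
    kind: str          # "P", "S" or "X"
    buffer_size: int   # receive buffer size in bytes
    num_buffers: int   # number of receive buffers
    extras: tuple      # optional tuning fields (watermark, window, ...), kept as-is


def parse_receive_queues(value):
    """Split a receive_queues string into a list of QueueSpec objects."""
    specs = []
    for chunk in value.split(":"):
        fields = chunk.split(",")
        kind = fields[0]
        if kind not in KINDS:
            raise ValueError("unknown queue type: %r" % kind)
        numbers = [int(f) for f in fields[1:]]
        if len(numbers) < 2:
            raise ValueError("need at least buffer size and count: %r" % chunk)
        specs.append(QueueSpec(kind, numbers[0], numbers[1], tuple(numbers[2:])))
    return specs


def uses_only_shared(specs):
    """True when every queue in the list is a shared receive queue."""
    return all(spec.kind == "S" for spec in specs)


if __name__ == "__main__":
    suggested = "S,12288,128,64,32:S,65536,128,64,32"
    for spec in parse_receive_queues(suggested):
        print(KINDS[spec.kind], spec.buffer_size, spec.num_buffers, spec.extras)
```

Run against the value above, the sketch reports two queues, both shared, with 12288-byte and 65536-byte buffers. Run against the original setting quoted below, it reports one per-peer queue followed by three shared ones.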

Samuel K. Gutierrez
Los Alamos National Laboratory

On May 19, 2011, at 5:28 AM, Robert Horton wrote:

> Hi,
>
> I'm having problems getting the MPIRandomAccess part of the HPCC
> benchmark to run with more than 32 processes on each node (each node has
> 4 x AMD 6172 so 48 cores total). Once I go past 32 processes I get an
> error like:
>
> [compute-1-13.local][[5637,1],18][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
> [compute-1-13.local][[5637,1],18][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:815:rml_recv_cb] error in endpoint reply start connect
> [compute-1-13.local:06117] [[5637,0],0]-[[5637,1],18] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
> [compute-1-13.local:6137] *** An error occurred in MPI_Isend
> [compute-1-13.local:6137] *** on communicator MPI_COMM_WORLD
> [compute-1-13.local:6137] *** MPI_ERR_OTHER: known error not in list
> [compute-1-13.local:6137] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [compute-1-13.local][[5637,1],26][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one] error creating qp errno says Cannot allocate memory
> [[5637,1],66][../../../../../ompi/mca/btl/openib/btl_openib_component.c:3227:handle_wc] from compute-1-13.local to: compute-1-13 error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 278870912 opcode
>
> I've tried changing btl_openib_receive_queues from
> P,128,256,192,128:S,2048,256,128,32:S,12288,256,128,32:S,65536,256,128,32
> to
> P,128,512,256,512:S,2048,512,256,32:S,12288,512,256,32:S,65536,512,256,32
>
> Doing this lets the code run without the error, but extremely slowly.
> I'm also seeing errors in dmesg such as:
>
> CPU 12:
> Modules linked in: nfs fscache nfs_acl blcr(U) blcr_imports(U) autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state
> ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables cpufreq_ondemand powernow_k8 freq_table rdma_ucm(U)
> ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U)
> mlx4_ib(U) mlx4_en(U) mlx4_core(U) ib_mthca(U) dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac
> parport_pc lp parport joydev shpchp sg i2c_piix4 i2c_core ib_qib(U) dca ib_mad(U) ib_core(U) igb 8021q serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod
> dm_mem_cache ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
> Pid: 3980, comm: qib/12 Tainted: G 2.6.18-164.6.1.el5 #1
> RIP: 0010:[<ffffffff80094409>] [<ffffffff80094409>] tasklet_action+0x90/0xfd
> RSP: 0018:ffff810c2f1bff40 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff810c2f1bff30
> RDX: 0000000000000000 RSI: ffff81042f063400 RDI: ffffffff8030d180
> RBP: ffff810c2f1bfec0 R08: 0000000000000001 R09: ffff8104aec2d000
> R10: ffff810c2f1bff00 R11: ffff810c2f1bff00 R12: ffffffff8005dc8e
> R13: ffff81042f063480 R14: ffffffff80077874 R15: ffff810c2f1bfec0
> FS: 00002b20829592e0(0000) GS:ffff81042f186bc0(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00002b2080b70720 CR3: 0000000000201000 CR4: 00000000000006e0
>
> Call Trace:
> <IRQ> [<ffffffff8001235a>] __do_softirq+0x89/0x133
> [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
> [<ffffffff8006cb20>] do_softirq+0x2c/0x85
> [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
> <EOI> [<ffffffff800da30c>] __kmalloc+0x97/0x9f
> [<ffffffff88220d8b>] :ib_qib:qib_verbs_send+0xdb3/0x104a
> [<ffffffff80064b20>] _spin_unlock_irqrestore+0x8/0x9
> [<ffffffff881f66ca>] :ib_qib:qib_make_rc_req+0xbb1/0xbbf
> [<ffffffff881f5b19>] :ib_qib:qib_make_rc_req+0x0/0xbbf
> [<ffffffff881f8187>] :ib_qib:qib_do_send+0x0/0x950
> [<ffffffff881f8aa1>] :ib_qib:qib_do_send+0x91a/0x950
> [<ffffffff8002e2e3>] __wake_up+0x38/0x4f
> [<ffffffff881f8187>] :ib_qib:qib_do_send+0x0/0x950
> [<ffffffff8004d7fb>] run_workqueue+0x94/0xe4
> [<ffffffff8004a043>] worker_thread+0x0/0x122
> [<ffffffff8009f9f0>] keventd_create_kthread+0x0/0xc4
> [<ffffffff8004a133>] worker_thread+0xf0/0x122
> [<ffffffff8008c3bd>] default_wake_function+0x0/0xe
> [<ffffffff8009f9f0>] keventd_create_kthread+0x0/0xc4
> [<ffffffff8003297c>] kthread+0xfe/0x132
> [<ffffffff8005dfb1>] child_rip+0xa/0x11
> [<ffffffff8009f9f0>] keventd_create_kthread+0x0/0xc4
> [<ffffffff8003287e>] kthread+0x0/0x132
> [<ffffffff8005dfa7>] child_rip+0x0/0x11
>
> Any thoughts on how to proceed?
>
> I'm running Open MPI 1.4.3 compiled with gcc 4.1.2 and OFED 1.5.3.1.
>
> Thanks,
> Rob
> --
> Robert Horton
> System Administrator (Research Support) - School of Mathematical Sciences
> Queen Mary, University of London
> r.horton_at_[hidden] - +44 (0) 20 7882 7345
>