Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Performance difference on OpenMPI, IntelMPI and ScaliMPI
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-05 10:21:18


Okay, one problem is fairly clear. As Terry indicated, you have to tell
Open MPI to bind processes or else you lose a lot of performance. Set -mca
opal_paffinity_alone 1 on your command line and it should make a significant
difference.
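
With the wrapper invocation you showed, that would be something along the
lines of (the same command with just the binding parameter added; wrapped
here only for readability):

/software/mpi/openmpi/1.3b2/i101017/bin/mpirun -np 144 -npernode 8 \
    -mca opal_paffinity_alone 1 \
    -mca mpi_show_mca_params env,file \
    /nobackup/rossby11/faxen/RCO_scobi/src_161.openmpi/rco2.24pe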

On Wed, Aug 5, 2009 at 8:10 AM, Torgny Faxen <faxen_at_[hidden]> wrote:

> Ralph,
> I am running through a locally provided wrapper but it translates to:
> /software/mpi/openmpi/1.3b2/i101017/bin/mpirun -np 144 -npernode 8 -mca mpi_show_mca_params env,file /nobackup/rossby11/faxen/RCO_scobi/src_161.openmpi/rco2.24pe
>
> a) Upgrade: this will take some time, since it has to go through the
> administrator; this is a production cluster.
> b) -mca ..: see the output below.
> c) I used exactly the same optimization flags for all three versions
> (ScaliMPI, OpenMPI and IntelMPI), and this is Fortran, so I am using
> mpif90 :-)
>
> Regards / Torgny
>
> [n70:30299] ess=env (environment)
> [n70:30299] orte_ess_jobid=482607105 (environment)
> [n70:30299] orte_ess_vpid=0 (environment)
> [n70:30299] mpi_yield_when_idle=0 (environment)
> [n70:30299] mpi_show_mca_params=env,file (environment)
>
>
> Ralph Castain wrote:
>
>> Could you send us the mpirun cmd line? I wonder if you are missing some
>> options that could help. Also, you might:
>>
>> (a) upgrade to 1.3.3 - it looks like you are using some kind of
>> pre-release version
>>
>> (b) add -mca mpi_show_mca_params env,file - this will cause rank=0 to
>> output what mca params it sees, and where they came from
>>
>> (c) check that you built a non-debug version, and remembered to compile
>> your application with a -O3 flag - i.e., "mpicc -O3 ...". Remember, OMPI
>> does not automatically add optimization flags to mpicc!
>>
>> Thanks
>> Ralph
>>
>>
>> On Wed, Aug 5, 2009 at 7:15 AM, Torgny Faxen <faxen_at_[hidden]> wrote:
>>
>> Pasha,
>> no collectives are being used.
>>
>> A simple grep in the code reveals the following MPI functions
>> being used:
>> MPI_Init
>> MPI_wtime
>> MPI_COMM_RANK
>> MPI_COMM_SIZE
>> MPI_BUFFER_ATTACH
>> MPI_BSEND
>> MPI_PACK
>> MPI_UNPACK
>> MPI_PROBE
>> MPI_GET_COUNT
>> MPI_RECV
>> MPI_IPROBE
>> MPI_FINALIZE
>>
>> where MPI_IPROBE is the clear winner in terms of number of calls.
>>
>> /Torgny
>>
>>
>> Pavel Shamis (Pasha) wrote:
>>
>> Do you know if the application uses any collective operations?
>>
>> Thanks
>>
>> Pasha
>>
>> Torgny Faxen wrote:
>>
>> Hello,
>> we are seeing a large difference in performance for some
>> applications depending on what MPI is being used.
>>
>> Attached are performance numbers and oprofile output
>> (first 30 lines) from one out of 14 nodes from one
>> application run using OpenMPI, IntelMPI and Scali MPI
>> respectively.
>>
>> Scali MPI is faster than the other two MPIs, by a factor of about
>> 1.75 over OpenMPI and 1.6 over Intel MPI:
>>
>> ScaliMPI: walltime for the whole application is 214 seconds
>> OpenMPI: walltime for the whole application is 376 seconds
>> Intel MPI: walltime for the whole application is 346 seconds.
>>
>> The application's main send/receive calls are:
>> MPI_Bsend
>> MPI_Iprobe followed by MPI_Recv (when a message is actually pending).
>> Quite often MPI_Iprobe is called just to check whether a certain
>> message is pending.
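>>
>> For reference, a minimal, hypothetical sketch of that pattern in
>> Fortran (an illustration only, not the actual RCO code; the buffer
>> size, tag and message length below are made up):
>>
>> ! Sketch of the MPI_BSEND + MPI_IPROBE + MPI_RECV idiom described
>> ! above: each rank sends one buffered message to a neighbour and
>> ! then polls until a matching message is pending before receiving.
>> program iprobe_pattern
>>    implicit none
>>    include 'mpif.h'
>>    integer, parameter :: bufsz = 4096
>>    character(len=bufsz) :: attachbuf
>>    double precision :: payload(16), incoming(16)
>>    integer :: ierr, rank, nprocs, partner, count
>>    integer :: status(MPI_STATUS_SIZE)
>>    logical :: flag
>>
>>    call MPI_INIT(ierr)
>>    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>>    call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
>>
>>    ! Attach a buffer so MPI_BSEND can complete locally.
>>    call MPI_BUFFER_ATTACH(attachbuf, bufsz, ierr)
>>
>>    partner = mod(rank + 1, nprocs)
>>    payload = dble(rank)
>>
>>    ! Buffered send: returns once the data is copied into the buffer.
>>    call MPI_BSEND(payload, 16, MPI_DOUBLE_PRECISION, partner, 99, &
>>                   MPI_COMM_WORLD, ierr)
>>
>>    ! Poll until a message with tag 99 is pending, then receive it.
>>    ! An empty poll still drives the MPI progress engine, which is
>>    ! where the profiles below show much of the time going.
>>    flag = .false.
>>    do while (.not. flag)
>>       call MPI_IPROBE(MPI_ANY_SOURCE, 99, MPI_COMM_WORLD, flag, &
>>                       status, ierr)
>>    end do
>>    call MPI_GET_COUNT(status, MPI_DOUBLE_PRECISION, count, ierr)
>>    call MPI_RECV(incoming, count, MPI_DOUBLE_PRECISION, &
>>                  status(MPI_SOURCE), 99, MPI_COMM_WORLD, status, ierr)
>>
>>    call MPI_FINALIZE(ierr)
>> end program iprobe_pattern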
>>
>> Any idea on tuning tips, performance analysis, code
>> modifications to improve the OpenMPI performance? A lot of
>> time is being spent in "mca_btl_sm_component_progress",
>> "btl_openib_component_progress" and other internal routines.
>>
>> The code is running on a cluster with 140 HP ProLiant
>> DL160 G5 compute servers. Infiniband interconnect. Intel
>> Xeon E5462 processors. The profiled application is using
>> 144 cores on 18 nodes over Infiniband.
>>
>> Regards / Torgny
>>
>> =====================================================================================================================
>>
>> OpenMPI 1.3b2
>>
>> =====================================================================================================================
>>
>> Walltime: 376 seconds
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name            app name    symbol name
>> 668288   22.2113  mca_btl_sm.so         rco2.24pe   mca_btl_sm_component_progress
>> 441828   14.6846  rco2.24pe             rco2.24pe   step_
>> 335929   11.1650  libmlx4-rdmav2.so     rco2.24pe   (no symbols)
>> 301446   10.0189  mca_btl_openib.so     rco2.24pe   btl_openib_component_progress
>> 161033    5.3521  libopen-pal.so.0.0.0  rco2.24pe   opal_progress
>> 157024    5.2189  libpthread-2.5.so     rco2.24pe   pthread_spin_lock
>>  99526    3.3079  no-vmlinux            no-vmlinux  (no symbols)
>>  93887    3.1204  mca_btl_sm.so         rco2.24pe   opal_using_threads
>>  69979    2.3258  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_iprobe
>>  58895    1.9574  mca_bml_r2.so         rco2.24pe   mca_bml_r2_progress
>>  55095    1.8311  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_recv_request_match_wild
>>  49286    1.6381  rco2.24pe             rco2.24pe   tracer_
>>  41946    1.3941  libintlc.so.5         rco2.24pe   __intel_new_memcpy
>>  40730    1.3537  rco2.24pe             rco2.24pe   scobi_
>>  36586    1.2160  rco2.24pe             rco2.24pe   state_
>>  20986    0.6975  rco2.24pe             rco2.24pe   diag_
>>  19321    0.6422  libmpi.so.0.0.0       rco2.24pe   PMPI_Unpack
>>  18552    0.6166  libmpi.so.0.0.0       rco2.24pe   PMPI_Iprobe
>>  17323    0.5757  rco2.24pe             rco2.24pe   clinic_
>>  16194    0.5382  rco2.24pe             rco2.24pe   k_epsi_
>>  15330    0.5095  libmpi.so.0.0.0       rco2.24pe   PMPI_Comm_f2c
>>  13778    0.4579  libmpi_f77.so.0.0.0   rco2.24pe   mpi_iprobe_f
>>  13241    0.4401  rco2.24pe             rco2.24pe   s_recv_
>>  12386    0.4117  rco2.24pe             rco2.24pe   growth_
>>  11699    0.3888  rco2.24pe             rco2.24pe   testnrecv_
>>  11268    0.3745  libmpi.so.0.0.0       rco2.24pe   mca_pml_base_recv_request_construct
>>  10971    0.3646  libmpi.so.0.0.0       rco2.24pe   ompi_convertor_unpack
>>  10034    0.3335  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_recv_request_match_specific
>>  10003    0.3325  libimf.so             rco2.24pe   exp.L
>>   9375    0.3116  rco2.24pe             rco2.24pe   subbasin_
>>   8912    0.2962  libmpi_f77.so.0.0.0   rco2.24pe   mpi_unpack_f
>>
>>
>>
>>
>> =====================================================================================================================
>>
>> Intel MPI, version 3.2.0.011
>>
>> =====================================================================================================================
>>
>> Walltime: 346 seconds
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name           app name    symbol name
>> 486712   17.7537  rco2                 rco2        step_
>> 431941   15.7558  no-vmlinux           no-vmlinux  (no symbols)
>> 212425    7.7486  libmpi.so.3.2        rco2        MPIDI_CH3U_Recvq_FU
>> 188975    6.8932  libmpi.so.3.2        rco2        MPIDI_CH3I_RDSSM_Progress
>> 172855    6.3052  libmpi.so.3.2        rco2        MPIDI_CH3I_read_progress
>> 121472    4.4309  libmpi.so.3.2        rco2        MPIDI_CH3I_SHM_read_progress
>>  64492    2.3525  libc-2.5.so          rco2        sched_yield
>>  52372    1.9104  rco2                 rco2        tracer_
>>  48621    1.7735  libmpi.so.3.2        rco2        .plt
>>  45475    1.6588  libmpiif.so.3.2      rco2        pmpi_iprobe__
>>  44082    1.6080  libmpi.so.3.2        rco2        MPID_Iprobe
>>  42788    1.5608  libmpi.so.3.2        rco2        MPIDI_CH3_Stop_recv
>>  42754    1.5595  libpthread-2.5.so    rco2        pthread_mutex_lock
>>  42190    1.5390  libmpi.so.3.2        rco2        PMPI_Iprobe
>>  41577    1.5166  rco2                 rco2        scobi_
>>  40356    1.4721  libmpi.so.3.2        rco2        MPIDI_CH3_Start_recv
>>  38582    1.4073  libdaplcma.so.1.0.2  rco2        (no symbols)
>>  37545    1.3695  rco2                 rco2        state_
>>  35597    1.2985  libc-2.5.so          rco2        free
>>  34019    1.2409  libc-2.5.so          rco2        malloc
>>  31841    1.1615  rco2                 rco2        s_recv_
>>  30955    1.1291  libmpi.so.3.2        rco2        __I_MPI___intel_new_memcpy
>>  27876    1.0168  libc-2.5.so          rco2        _int_malloc
>>  26963    0.9835  rco2                 rco2        testnrecv_
>>  23005    0.8391  libpthread-2.5.so    rco2        __pthread_mutex_unlock_usercnt
>>  22290    0.8131  libmpi.so.3.2        rco2        MPID_Segment_manipulate
>>  22086    0.8056  libmpi.so.3.2        rco2        MPIDI_CH3I_read_progress_expected
>>  19146    0.6984  rco2                 rco2        diag_
>>  18250    0.6657  rco2                 rco2        clinic_
>>
>> =====================================================================================================================
>>
>> Scali MPI, version 3.13.10-59413
>>
>> =====================================================================================================================
>>
>> Walltime: 214 seconds
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name           app name    symbol name
>> 484267   30.0664  rco2.24pe            rco2.24pe   step_
>> 111949    6.9505  libmlx4-rdmav2.so    rco2.24pe   (no symbols)
>>  73930    4.5900  libmpi.so            rco2.24pe   scafun_rq_handle_body
>>  57846    3.5914  libmpi.so            rco2.24pe   invert_decode_header
>>  55836    3.4667  libpthread-2.5.so    rco2.24pe   pthread_spin_lock
>>  53703    3.3342  rco2.24pe            rco2.24pe   tracer_
>>  40934    2.5414  rco2.24pe            rco2.24pe   scobi_
>>  40244    2.4986  libmpi.so            rco2.24pe   scafun_request_probe_handler
>>  37399    2.3220  rco2.24pe            rco2.24pe   state_
>>  30455    1.8908  libmpi.so            rco2.24pe   invert_matchandprobe
>>  29707    1.8444  no-vmlinux           no-vmlinux  (no symbols)
>>  29147    1.8096  libmpi.so            rco2.24pe   FMPI_scafun_Iprobe
>>  27969    1.7365  libmpi.so            rco2.24pe   decode_8_u_64
>>  27475    1.7058  libmpi.so            rco2.24pe   scafun_rq_anysrc_fair_one
>>  25966    1.6121  libmpi.so            rco2.24pe   scafun_uxq_probe
>>  24380    1.5137  libc-2.5.so          rco2.24pe   memcpy
>>  22615    1.4041  libmpi.so            rco2.24pe   .plt
>>  21172    1.3145  rco2.24pe            rco2.24pe   diag_
>>  20716    1.2862  libc-2.5.so          rco2.24pe   memset
>>  18565    1.1526  libmpi.so            rco2.24pe   openib_wrapper_poll_cq
>>  18192    1.1295  rco2.24pe            rco2.24pe   clinic_
>>  17135    1.0638  libmpi.so            rco2.24pe   PMPI_Iprobe
>>  16685    1.0359  rco2.24pe            rco2.24pe   k_epsi_
>>  16236    1.0080  libmpi.so            rco2.24pe   PMPI_Unpack
>>  15563    0.9662  libmpi.so            rco2.24pe   scafun_r_rq_append
>>  14829    0.9207  libmpi.so            rco2.24pe   scafun_rq_test_finished
>>  13349    0.8288  rco2.24pe            rco2.24pe   s_recv_
>>  12490    0.7755  libmpi.so            rco2.24pe   flop_matchandprobe
>>  12427    0.7715  libibverbs.so.1.0.0  rco2.24pe   (no symbols)
>>  12272    0.7619  libmpi.so            rco2.24pe   scafun_rq_handle
>>  12146    0.7541  rco2.24pe            rco2.24pe   growth_
>>  10175    0.6317  libmpi.so            rco2.24pe   wrp2p_test_finished
>>   9888    0.6139  libimf.so            rco2.24pe   exp.L
>>   9179    0.5699  rco2.24pe            rco2.24pe   subbasin_
>>   9082    0.5639  rco2.24pe            rco2.24pe   testnrecv_
>>   8901    0.5526  libmpi.so            rco2.24pe   openib_wrapper_purge_requests
>>   7425    0.4610  rco2.24pe            rco2.24pe   scobimain_
>>   7378    0.4581  rco2.24pe            rco2.24pe   scobi_interface_
>>   6530    0.4054  rco2.24pe            rco2.24pe   setvbc_
>>   6471    0.4018  libfmpi.so           rco2.24pe   pmpi_iprobe
>>   6341    0.3937  rco2.24pe            rco2.24pe   snap_
>>
>>
>>
>>
>>
>>
>> --
>> ---------------------------------------------------------
>> Torgny Faxén
>> National Supercomputer Center
>> Linköping University
>> S-581 83 Linköping
>> Sweden
>>
>> Email: faxen_at_[hidden]
>> Telephone: +46 13 285798 (office) +46 13 282535 (fax)
>> http://www.nsc.liu.se
>> ---------------------------------------------------------
>>
>>
>>
>
>
> --
> ---------------------------------------------------------
> Torgny Faxén
> National Supercomputer Center
> Linköping University
> S-581 83 Linköping
> Sweden
>
> Email: faxen_at_[hidden]
> Telephone: +46 13 285798 (office) +46 13 282535 (fax)
> http://www.nsc.liu.se
> ---------------------------------------------------------
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>