Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-05-27 10:03:56


Answers inline...
>-----Original Message-----
>From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Maxime
>Boissonneault
>Sent: Friday, May 23, 2014 4:31 PM
>To: Open MPI Users
>Subject: [OMPI users] Advices for parameter tuning for CUDA-aware MPI
>
>Hi,
>I am currently configuring a GPU cluster. The cluster has 8 K20 GPUs per node
>on two sockets, 4 PCIe bus (2 K20 per bus, 4 K20 per socket), with a single QDR
>InfiniBand card on each node. We have the latest NVidia drivers and Cuda 6.0.
>
>I am wondering if someone could tell me if all the default MCA parameters are
>optimal for CUDA. More precisely, I am interested in GDR and IPC. It
>seems from the parameters (see below) that they are both available
>(although GDR is disabled by default). However, my notes from
>GTC14 mention the btl_openib_have_driver_gdr parameter, which I do not
>see at all.
>
>So, I guess, my questions :
>1) Why is GDR disabled by default when available ?
It was disabled by default because it did not always give optimum performance. That may change in the future but for now, as you mentioned, you have to turn on the feature explicitly.
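
As a rough sketch of what "turning it on explicitly" can look like: the parameter name btl_openib_want_cuda_gdr comes from the ompi_info listing quoted below, and Open MPI picks up MCA parameters from environment variables prefixed with OMPI_MCA_. The application name and process count in the commented mpirun line are placeholders, not something from this thread.

```shell
# Option A: set the MCA parameter through the environment, so every mpirun
# started from this shell sees it (Open MPI reads OMPI_MCA_<param> variables).
export OMPI_MCA_btl_openib_want_cuda_gdr=1

# Option B: pass it per run on the command line.
# "./my_cuda_app" and "-np 2" are placeholders, so the line stays commented out.
# mpirun -np 2 --mca btl_openib_want_cuda_gdr 1 ./my_cuda_app

# Show what is currently set in the environment.
env | grep '^OMPI_MCA_btl_openib_want_cuda_gdr='
```

Whether this helps depends on the hardware layout, so it is worth benchmarking with the parameter both on and off.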

>2) Is the absence of btl_openib_have_driver_gdr an indicator of something
>missing ?
Yes, that means GPU Direct RDMA support is somehow not installed correctly. All that check does is make sure that the file /sys/kernel/mm/memory_peers/nv_mem/version exists. Does that file exist on your nodes?
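
A quick way to run the same test by hand on a compute node might look like the sketch below. The path is the one named above; the function name check_gdr_driver and the GDR_VERSION_FILE override are only illustrative helpers, not part of Open MPI.

```shell
# Path quoted in the reply above; GDR_VERSION_FILE is a hypothetical override
# so the check can be pointed at another file for testing.
GDR_VERSION_FILE=${GDR_VERSION_FILE:-/sys/kernel/mm/memory_peers/nv_mem/version}

check_gdr_driver() {
    # $1: path to test (defaults to the nv_mem version file)
    file=${1:-$GDR_VERSION_FILE}
    if [ -e "$file" ]; then
        echo "found: $file"
        return 0
    fi
    echo "missing: $file"
    return 1
}

# Run this on a compute node, not only on the login node.
check_gdr_driver || echo "the file Open MPI looks for is absent on this host"
```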

>3) Are the default parameters, especially the rdma limits and such, optimal for
>our configuration ?
That is hard to say. GPU Direct RDMA does not work well when the GPU and IB card are not "close" on the system. Can you run "nvidia-smi topo -m" on your system?

>4) Do I want to enable or disable IPC by default (my notes state that bandwidth
>is much better with MPS than IPC)?
You should leave IPC enabled, which is the default. That should give good performance. There were some issues with earlier CUDA versions, but they were all fixed in CUDA 6.
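
If you want to compare runs with and without IPC yourself, something along these lines should do. The parameter names btl_smcuda_use_cuda_ipc and btl_smcuda_cuda_ipc_verbose are taken from the listing below; the helper set_cuda_ipc is made up for illustration, and the verbosity value of 100 is an assumption (any sufficiently large integer is expected to behave the same way).

```shell
# Toggle CUDA IPC for the smcuda BTL through the environment.
# Open MPI reads MCA parameters from variables prefixed with OMPI_MCA_.
set_cuda_ipc() {
    case "$1" in
        on)  export OMPI_MCA_btl_smcuda_use_cuda_ipc=1 ;;
        off) export OMPI_MCA_btl_smcuda_use_cuda_ipc=0 ;;
        *)   echo "usage: set_cuda_ipc on|off" >&2; return 2 ;;
    esac
    echo "btl_smcuda_use_cuda_ipc=$OMPI_MCA_btl_smcuda_use_cuda_ipc"
}

# Assumption: a high verbosity level makes the smcuda BTL report its IPC decisions.
export OMPI_MCA_btl_smcuda_cuda_ipc_verbose=100

set_cuda_ipc off   # run the intra-node benchmark here for the non-IPC number
set_cuda_ipc on    # back to the default; rerun the same benchmark to compare
```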
>
>Thanks,
>
>Here is what I get from
>ompi_info --all | grep cuda
>
>[mboisson_at_login-gpu01 ~]$ ompi_info --all | grep cuda
>[login-gpu01.calculquebec.ca:11486] mca: base: components_register:
>registering filem components
>[login-gpu01.calculquebec.ca:11486] mca: base: components_register:
>found loaded component raw
>[login-gpu01.calculquebec.ca:11486] mca: base: components_register:
>component raw register function successful
>[login-gpu01.calculquebec.ca:11486] mca: base: components_register:
>registering snapc components
> Prefix: /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37
> Exec_prefix: /software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37
> Bindir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/bin
> Sbindir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/sbin
> Libdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib
> Incdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/include
> Mandir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/man
> Pkglibdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/openmpi
> Libexecdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/libexec
> Datarootdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share
> Datadir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share
> Sysconfdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/etc
> Sharedstatedir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/com
> Localstatedir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/var
> Infodir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/info
> Pkgdatadir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi
> Pkglibdir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/openmpi
> Pkgincludedir:
>/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/include/openmpi
> MCA mca: parameter "mca_param_files" (current value:
>"/home/mboisson/.openmpi/mca-params.conf:/software-
>gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/etc/openmpi-mca-params.conf",
>data source: default, level: 2 user/detail, type: string, deprecated, synonym
>of: mca_base_param_files)
> MCA mca: parameter "mca_component_path" (current
>value:
>"/software-
>gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/openmpi:/home/mboisson/.o
>penmpi/components",
>data source: default, level: 9 dev/all, type: string, deprecated, synonym of:
>mca_base_component_path)
> MCA mca: parameter "mca_base_param_files" (current
>value:
>"/home/mboisson/.openmpi/mca-params.conf:/software-
>gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/etc/openmpi-mca-params.conf",
>data source: default, level: 2 user/detail, type: string, synonyms:
>mca_param_files)
> MCA mca: informational "mca_base_override_param_file"
>(current value:
>"/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/etc/openmpi-mca-
>params-override.conf",
>data source: default, level: 2 user/detail, type: string)
> MCA mca: parameter "mca_base_param_file_path" (current
>value:
>"/software-
>gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi/amca-param-
>sets:/home/mboisson",
>data source: default, level: 3 user/all, type: string)
> MCA mca: parameter "mca_base_component_path" (current
>value:
>"/software-
>gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/lib/openmpi:/home/mboisson/.o
>penmpi/components",
>data source: default, level: 9 dev/all, type: string, synonyms:
>mca_component_path)
> MCA orte: parameter "orte_default_hostfile" (current
>value:
>"/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/etc/openmpi-
>default-hostfile",
>data source: default, level: 9 dev/all, type: string)
> MCA mpi: informational "mpi_built_with_cuda_support"
>(current value: "true", data source: default, level: 4 tuner/basic,
>type: bool)
> MCA mpi: parameter "mpi_cuda_support" (current value:
>"true", data source: default, level: 4 tuner/basic, type: bool)
> MCA btl: parameter "btl_self_cuda_eager_limit"
>(current value: "0", data source: default, level: 5 tuner/detail, type:
>size_t)
> MCA btl: parameter "btl_self_cuda_rdma_limit" (current
>value: "18446744073709551615", data source: default, level: 5 tuner/detail,
>type: size_t)
> MCA btl: parameter "btl_smcuda_free_list_num" (current
>value: "8", data source: default, level: 5 tuner/detail, type: int)
> MCA btl: parameter "btl_smcuda_free_list_max" (current
>value: "-1", data source: default, level: 5 tuner/detail, type: int)
> MCA btl: parameter "btl_smcuda_free_list_inc" (current
>value: "64", data source: default, level: 5 tuner/detail, type: int)
> MCA btl: parameter "btl_smcuda_max_procs" (current
>value: "-1", data source: default, level: 5 tuner/detail, type: int)
> MCA btl: parameter "btl_smcuda_fifo_size" (current
>value: "4096", data source: default, level: 4 tuner/basic, type: unsigned)
> MCA btl: parameter "btl_smcuda_num_fifos" (current
>value: "1", data source: default, level: 4 tuner/basic, type: int)
> MCA btl: parameter "btl_smcuda_fifo_lazy_free"
>(current value: "120", data source: default, level: 5 tuner/detail,
>type: unsigned)
> MCA btl: parameter "btl_smcuda_sm_extra_procs"
>(current value: "0", data source: default, level: 9 dev/all, type: int)
> MCA btl: parameter "btl_smcuda_use_cuda_ipc" (current
>value: "1", data source: default, level: 4 tuner/basic, type: int)
> MCA btl: parameter "btl_smcuda_use_cuda_ipc_same_gpu"
>(current value: "1", data source: default, level: 4 tuner/basic, type: int)
> MCA btl: parameter "btl_smcuda_cuda_ipc_verbose"
>(current value: "0", data source: default, level: 4 tuner/basic, type: int)
> MCA btl: parameter "btl_smcuda_exclusivity" (current
>value: "65537", data source: default, level: 7 dev/basic, type: unsigned)
> MCA btl: parameter "btl_smcuda_flags" (current value:
>"1", data source: default, level: 5 tuner/detail, type: unsigned)
> MCA btl: parameter "btl_smcuda_rndv_eager_limit"
>(current value: "4096", data source: default, level: 4 tuner/basic,
>type: size_t)
> MCA btl: parameter "btl_smcuda_eager_limit" (current
>value: "4096", data source: default, level: 4 tuner/basic, type: size_t)
> MCA btl: parameter "btl_smcuda_cuda_eager_limit"
>(current value: "0", data source: default, level: 5 tuner/detail, type:
>size_t)
> MCA btl: parameter "btl_smcuda_cuda_rdma_limit"
>(current value: "18446744073709551615", data source: default, level: 5
>tuner/detail, type: size_t)
> MCA btl: parameter "btl_smcuda_max_send_size" (current
>value: "32768", data source: default, level: 4 tuner/basic, type: size_t)
> MCA btl: parameter "btl_sm_cuda_eager_limit" (current
>value: "0", data source: default, level: 5 tuner/detail, type: size_t)
> MCA btl: parameter "btl_sm_cuda_rdma_limit" (current
>value: "18446744073709551615", data source: default, level: 5 tuner/detail,
>type: size_t)
> MCA btl: parameter "btl_tcp_cuda_eager_limit" (current
>value: "0", data source: default, level: 5 tuner/detail, type: size_t)
> MCA btl: parameter "btl_tcp_cuda_rdma_limit" (current
>value: "18446744073709551615", data source: default, level: 5 tuner/detail,
>type: size_t)
> MCA btl: parameter "btl_openib_device_param_files"
>(current value:
>"/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi/mca-
>btl-openib-device-params.ini",
>data source: default, level: 9 dev/all, type: string, synonyms:
>btl_openib_hca_param_files)
> MCA btl: parameter "btl_openib_hca_param_files"
>(current value:
>"/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi/mca-
>btl-openib-device-params.ini",
>data source: default, level: 9 dev/all, type: string, deprecated, synonym of:
>btl_openib_device_param_files)
> MCA btl: parameter "btl_openib_cuda_async_send"
>(current value: "true", data source: default, level: 9 dev/all, type: bool)
> MCA btl: parameter "btl_openib_cuda_async_recv"
>(current value: "true", data source: default, level: 9 dev/all, type: bool)
> MCA btl: informational "btl_openib_have_cuda_gdr"
>(current value: "true", data source: default, level: 5 tuner/detail,
>type: bool)
> MCA btl: parameter "btl_openib_want_cuda_gdr" (current
>value: "false", data source: default, level: 9 dev/all, type: bool)
> MCA btl: parameter "btl_openib_cuda_eager_limit"
>(current value: "0", data source: default, level: 5 tuner/detail, type:
>size_t)
> MCA btl: parameter "btl_openib_cuda_rdma_limit"
>(current value: "18446744073709551615", data source: default, level: 5
>tuner/detail, type: size_t)
> MCA btl: parameter "btl_vader_cuda_eager_limit"
>(current value: "0", data source: default, level: 5 tuner/detail, type:
>size_t)
> MCA btl: parameter "btl_vader_cuda_rdma_limit"
>(current value: "18446744073709551615", data source: default, level: 5
>tuner/detail, type: size_t)
> MCA coll: parameter "coll_ml_config_file" (current
>value:
>"/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi/mca-
>coll-ml.config",
>data source: default, level: 9 dev/all, type: string)
> MCA io: informational "io_romio_complete_configure_params"
>(current value:
>"--with-file-system=nfs+lustre FROM_OMPI=yes
>CC='/software6/compilers/gcc/4.8/bin/gcc -std=gnu99' CFLAGS='-O3 -
>DNDEBUG -finline-functions -fno-strict-aliasing -pthread' CPPFLAGS='
>-I/software-gpu/src/openmpi-1.8.1/opal/mca/hwloc/hwloc172/hwloc/include
>-I/software-gpu/src/openmpi-1.8.1/opal/mca/event/libevent2021/libevent
>-I/software-gpu/src/openmpi-
>1.8.1/opal/mca/event/libevent2021/libevent/include'
>FFLAGS='' LDFLAGS=' ' --enable-shared --enable-static --with-file-
>system=nfs+lustre
>--prefix=/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37
>--disable-aio", data source: default, level: 9 dev/all, type: string)
>[login-gpu01.calculquebec.ca:11486] mca: base: close: unloading component Q
>
>
>--
>---------------------------------
>Maxime Boissonneault
>Analyste de calcul - Calcul Québec, Université Laval Ph. D. en physique
>
>_______________________________________________
>users mailing list
>users_at_[hidden]
>http://www.open-mpi.org/mailman/listinfo.cgi/users