Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] CUDA RDMA not selected by default
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2012-03-19 16:15:45


The selection of cm is not wrong per se. You will find that the psm mtl performs much better than the openib btl on QLogic hardware.

-Nathan

On Mon, 19 Mar 2012, Jens Glaser wrote:

> Hello,
>
> I am using the latest trunk version of OMPI to take advantage of the new CUDA RDMA features (smcuda BTL). RDMA support is superb; however, I have to pass a parameter manually
>
> mpirun --mca pml ob1 ...
>
> to have the OB1 upper layer selected and, consequently, to get smcuda activated. Otherwise mpirun chooses the cm upper layer, which is wrong. The hardware is a
>
> InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02).
>
> This is the output of
> mpirun --mca pml_base_verbose 100
>
> [cas002:05518] select: component cm selected
> [cas002:05518] mca: base: close: component v closed
> [cas002:05518] mca: base: close: unloading component v
> [cas002:05518] mca: base: close: component bfo closed
> [cas002:05518] mca: base: close: unloading component bfo
> [cas002:05518] mca: base: close: component csum closed
> [cas002:05518] mca: base: close: unloading component csum
> [cas002:05518] mca: base: close: component dr closed
> [cas002:05518] mca: base: close: unloading component dr
> [cas002:05518] mca: base: close: component ob1 closed
> [cas002:05518] mca: base: close: unloading component ob1
> [cas002:05520] mca: base: components_open: component cm open function successful
> [cas002:05520] mca: base: components_open: found loaded component csum
> [cas002:05520] mca: base: components_open: component csum has no register function
> [cas002:05520] mca: base: components_open: component csum open function successful
> [cas002:05520] mca: base: components_open: found loaded component dr
> [cas002:05520] mca: base: components_open: component dr has no register function
> [cas002:05520] mca: base: components_open: component dr open function successful
> [cas002:05520] mca: base: components_open: found loaded component ob1
> [cas002:05520] mca: base: components_open: component ob1 has no register function
> [cas002:05520] mca: base: components_open: component ob1 open function successful
> [cas002:05520] select: component v not in the include list
> [cas002:05520] select: component bfo not in the include list
> [cas002:05520] select: initializing pml component cm
> [cas002:05520] select: init returned priority 30
> [cas002:05520] select: component csum not in the include list
> [cas002:05520] select: component dr not in the include list
> [cas002:05520] select: initializing pml component ob1
> [cas002:05520] select: init returned failure for component ob1
> [cas002:05520] selected cm best priority 30
> [cas002:05520] select: component cm selected
> [cas002:05520] mca: base: close: component v closed
> [cas002:05520] mca: base: close: unloading component v
> [cas002:05520] mca: base: close: component bfo closed
> [cas002:05520] mca: base: close: unloading component bfo
> [cas002:05520] mca: base: close: component csum closed
> [cas002:05520] mca: base: close: unloading component csum
> [cas002:05520] mca: base: close: component dr closed
> [cas002:05520] mca: base: close: unloading component dr
> [cas002:05520] mca: base: close: component ob1 closed
> [cas002:05520] mca: base: close: unloading component ob1
> [cas002:05518] check:select: checking my pml cm against rank=0 pml cm
> [cas002:05517] check:select: rank=0
> [cas002:05520] check:select: checking my pml cm against rank=0 pml cm
> [cas002:05519] check:select: checking my pml cm against rank=0 pml cm
>
> Configure options:
> ./configure --with-openib --with-cuda --prefix=/home/it1/glaser/local --with-tm=/opt/torque --enable-shared
>
> Does anyone have any idea what causes openmpi to select cm by default?
>
> Thanks,
> Jens.
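The quoted verbose log follows a recognizable pattern: components outside the include list are skipped, each remaining component is initialized and either reports a priority or fails, and the highest reported priority wins. The following is only a minimal Python sketch of that pattern as it appears in the log, not Open MPI's actual selection code. The function name `select_component`, the helper `_not_called`, and the data layout are all hypothetical; the component names and the priority of 30 are copied from the log for process 05520.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# An init function returns a priority on success or None on failure.
InitFn = Callable[[], Optional[int]]


def select_component(
    components: Dict[str, InitFn], include: Set[str]
) -> Tuple[Optional[str], Optional[int]]:
    """Return (name, priority) of the best component, or (None, None)."""
    best_name: Optional[str] = None
    best_priority: Optional[int] = None
    for name, init in components.items():
        if name not in include:
            continue  # "select: component X not in the include list"
        priority = init()  # "select: initializing pml component X"
        if priority is None:
            continue  # "select: init returned failure for component X"
        if best_priority is None or priority > best_priority:
            best_name, best_priority = name, priority
    return best_name, best_priority


def _not_called() -> Optional[int]:
    # Guard for components the log never initializes.
    raise AssertionError("component outside the include list was initialized")


# Outcomes as reported in the quoted log (process 05520).
LOG_COMPONENTS: Dict[str, InitFn] = {
    "v": _not_called,
    "bfo": _not_called,
    "cm": lambda: 30,      # "init returned priority 30"
    "csum": _not_called,
    "dr": _not_called,
    "ob1": lambda: None,   # "init returned failure for component ob1"
}
# Inferred from the log: only cm and ob1 were actually initialized.
LOG_INCLUDE: Set[str] = {"cm", "ob1"}

if __name__ == "__main__":
    print(select_component(LOG_COMPONENTS, LOG_INCLUDE))
```

In this model, the workaround from the original message (`mpirun --mca pml ob1 ...`) corresponds to shrinking the include set so that cm is never considered. Why ob1 reports failure in the default run is not visible in the log, so the sketch does not attempt to model it.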