Actually, GPUDirect is not yet officially released, but you may want to contact hpc_at_[hidden] for the information you need and to find out when the drivers will be released. Thanks!
- Pak
-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Pak Lui
Sent: Monday, February 28, 2011 11:30 AM
To: Open MPI Users
Subject: Re: [OMPI users] anybody tried OMPI with gpudirect?
Hi Brice,
You will need MLNX_OFED with GPUDirect support in order for this to work. I will check whether there is a release of it that supports SLES and let you know.
[pak_at_maia001 ~]$ /sbin/modinfo ib_core
filename: /lib/modules/2.6.18-194.nvel5/updates/kernel/drivers/infiniband/core/ib_core.ko
<snip...>
parm: gpu_direct_enable:Enable GPU Direct [default 1] (int)
parm: gpu_direct_shares:GPU Direct Calls Number [default 0] (int)
parm: gpu_direct_pages:GPU Direct Shared Pages Number [default 0] (int)
parm: gpu_direct_fail:GPU Direct Failures Number [default 0] (int)
Once that IB driver is loaded, you should find additional counters available from ib_core. If you are using GPUDirect, the gpu_direct_shares and gpu_direct_pages counters will be incremented. The counters are located at:
/sys/module/ib_core/parameters/gpu_direct_shares
/sys/module/ib_core/parameters/gpu_direct_pages
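As a rough sketch of how the two counters could be inspected, the snippet below reads them from the sysfs paths listed above. The function name show_gpu_direct_counters and its optional directory argument are made up for illustration; only the paths and counter names come from this message. Running it before and after an MPI job and comparing the values should show whether GPUDirect was actually used.

```shell
#!/bin/sh
# Print the GPUDirect counters exposed by ib_core.
# The directory defaults to the sysfs location given above; the optional
# argument is a hypothetical convenience so another path can be tried.
show_gpu_direct_counters() {
    param_dir="${1:-/sys/module/ib_core/parameters}"
    for name in gpu_direct_shares gpu_direct_pages; do
        file="$param_dir/$name"
        if [ -r "$file" ]; then
            # Each parameter file holds a single integer value.
            printf '%s=%s\n' "$name" "$(cat "$file")"
        else
            # File missing: driver not loaded or built without GPUDirect.
            printf '%s=unavailable\n' "$name"
        fi
    done
}

show_gpu_direct_counters
```

If both lines report "unavailable", the loaded ib_core most likely does not carry the GPUDirect parameters shown in the modinfo output.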
Regards,
- Pak
-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Brice Goglin
Sent: Monday, February 28, 2011 11:14 AM
To: Open MPI Users
Subject: Re: [OMPI users] anybody tried OMPI with gpudirect?
On 28/02/2011 19:49, Rolf vandeVaart wrote:
> For the GPU Direct to work with Infiniband, you need to get some updated OFED bits from your Infiniband vendor.
>
> In terms of checking the driver updates, you can grep for the string get_driver_pages in the file /proc/kallsyms. If it is there, then the Linux kernel is updated correctly.
>
The kernel looks OK then. But I couldn't find any kernel modules (I tried
nvidia.ko and all the ib modules) that reference this symbol. So I guess
my OFED kernel modules aren't OK. I'll check on the Mellanox website (we
have some very recent Mellanox ConnectX QDR boards).
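For reference, the kernel check described in the quoted text can be sketched as follows. The helper name has_symbol and its optional file argument are hypothetical; the symbol name and /proc/kallsyms come from the quoted message. This only confirms the kernel side, not that the OFED modules use the symbol.

```shell
#!/bin/sh
# Return success if the given string appears in the kernel symbol table.
# The second argument is an illustrative override; by default the
# function reads /proc/kallsyms as suggested in the quoted message.
has_symbol() {
    symbol="$1"
    symfile="${2:-/proc/kallsyms}"
    # -F treats the symbol as a fixed string, -q suppresses output.
    grep -qF -- "$symbol" "$symfile" 2>/dev/null
}

if has_symbol get_driver_pages; then
    echo "get_driver_pages found: the kernel appears to be updated"
else
    echo "get_driver_pages not found in the kernel symbol table"
fi
```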
thanks
Brice
_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users