On 22/01/2013 10:27, Samuel Thibault wrote:
> Kenneth A. Lloyd, on Mon 21 Jan 2013 22:46:37 +0100, wrote:
>> Thanks for making this tutorial available. Using hwloc 1.7, how far down
>> into, say, NVIDIA cards can the architecture be reflected? Global memory
>> size? SMX cores? None of the above?
> None of the above for now. Both are available in the cuda svn branch,
Now the question to Kenneth is "what do YOU need?"
I didn't merge the GPU internals into the trunk yet because I'd like to
see if that matches what we would do with OpenCL and other accelerators
such as the Xeon Phi.
One thing to keep in mind is that most hwloc/GPU users will use hwloc to
get locality information but they will also still use CUDA to use the
GPU. So they will still be able to use CUDA to get in-depth GPU
information anyway. Then the question is how much CUDA info do we want
to duplicate in hwloc. hwloc could have the basic/uniform GPU
information and let users rely on CUDA for everything CUDA-specific for
instance. Right now, the basic/uniform part is almost empty (it just
contains the GPU model name or so).
Also the CUDA branch creates hwloc objects inside the GPU to describe
the memory/cores/caches/... Would you use these objects in your
application? Or would you rather just have a basic GPU attribute
structure containing the number of SMX, the memory size, ...? One problem
with this is that it may be hard to define a structure that works for
all GPUs, even only the NVIDIA ones. We may need a union of structs...
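
To make the "union of structs" idea concrete, here is a rough sketch in C.
Every name in it (gpu_attr, gpu_attr_kind, gpu_total_cores, the field
names) is hypothetical and made up for illustration; nothing like this
exists in hwloc today, and the real layout could end up quite different.
It only shows a uniform part that is valid for any GPU, plus a
vendor-specific part selected by a tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: none of these types exist in hwloc. */
enum gpu_attr_kind {
    GPU_ATTR_GENERIC, /* only the uniform fields are valid */
    GPU_ATTR_NVIDIA   /* the u.nvidia member is valid as well */
};

struct gpu_attr {
    /* Basic/uniform part, meaningful for every GPU. */
    const char *model_name;
    size_t global_mem_bytes;

    /* Vendor-specific part, selected by kind. */
    enum gpu_attr_kind kind;
    union {
        struct {
            unsigned multiprocessors;     /* number of SMX */
            unsigned cores_per_multiproc; /* cores in each SMX */
        } nvidia;
    } u;
};

/* Total core count, or 0 when no vendor-specific part is present. */
static unsigned gpu_total_cores(const struct gpu_attr *attr)
{
    assert(attr != NULL);
    if (attr->kind == GPU_ATTR_NVIDIA)
        return attr->u.nvidia.multiprocessors
             * attr->u.nvidia.cores_per_multiproc;
    return 0;
}
```

An application would check kind before touching u.nvidia, and anything
more detailed than this would still come from CUDA itself. Supporting
another accelerator would mean adding a tag value and another struct
inside the union, which is exactly where it gets hard to keep the
structure uniform.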
I am talking about "your application" above because having lstopo draw
very nice GPU internals doesn't mean the corresponding hwloc objects are
useful to real applications.