Pffff... painful answer.
Is there anything easy that the administrators of the cluster could do?
How could I persuade them that this is an easy task?
On Thu 16 Feb 2012 14:18:07 GMT, Brice Goglin wrote:
> Your machine has a buggy BIOS: it reports empty locality info for the
> PCI devices. That's why the hwloc cpuset is empty as well. I guess we
> should detect this case and return the entire machine cpuset instead.
> Something like this should help:
> Index: include/hwloc/cuda.h
> ===================================================================
> --- include/hwloc/cuda.h	(revision 4302)
> +++ include/hwloc/cuda.h	(working copy)
> @@ -92,6 +92,8 @@
>      return -1;
>
>    hwloc_linux_parse_cpumap_file(sysfile, set);
> +  if (hwloc_bitmap_iszero(set))
> +    hwloc_bitmap_copy(set, hwloc_topology_get_complete_cpuset(topology));
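>
> Until a fixed hwloc is deployed, the same fallback can also be applied from
> the application side. Here is a minimal, untested sketch (error checking
> omitted; it assumes the CUDA driver API and the existing hwloc CUDA helper
> from <hwloc/cuda.h>):
>
>   #include <cuda.h>
>   #include <hwloc.h>
>   #include <hwloc/cuda.h>
>
>   int main(void)
>   {
>     hwloc_topology_t topology;
>     hwloc_cpuset_t set;
>     CUdevice dev;
>
>     cuInit(0);
>     cuDeviceGet(&dev, 0);               /* first CUDA device */
>
>     hwloc_topology_init(&topology);
>     hwloc_topology_load(topology);
>
>     set = hwloc_bitmap_alloc();
>     if (!hwloc_cuda_get_device_cpuset(topology, dev, set)
>         && hwloc_bitmap_iszero(set))
>       /* buggy BIOS reported no locality: fall back to the whole machine */
>       hwloc_bitmap_copy(set, hwloc_topology_get_complete_cpuset(topology));
>
>     /* bind near the GPU, e.g. hwloc_set_cpubind(topology, set, 0); */
>
>     hwloc_bitmap_free(set);
>     hwloc_topology_destroy(topology);
>     return 0;
>   }
>
> hwloc_cuda_get_device_cpuset() returns 0 on success, so the iszero check
> only triggers when the sysfs file was read but reported an empty mask.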
> On 16/02/2012 15:09, Albert Solernou wrote:
>> Hi Brice,
>> I attach a tar ball with the outputs.
>> It may be also relevant to specify that I am running hwloc on a
>> cluster, so this is the output on a node with two GPU cards.
>> Thank you,
>> On 16/02/12 13:56, Brice Goglin wrote:
>>> Hello Albert,
>>> Does lstopo show PCI devices properly?
>>> Can you send these outputs?
>>> lstopo -.xml
>>> for i in /sys/bus/pci/devices/* ; do echo -n "$i " ; cat
>>> $i/local_cpus ; done
>>> On 16/02/2012 14:28, Albert Solernou wrote:
>>>> I am receiving cpuset 0x0 when I call hwloc_cuda_get_device_cpuset.
>>>> The exact output of tests/cuda.c is:
>>>> got cpuset 0x0 for device 0
>>>> got cpuset 0x0 for device 1
>>>> I have tried hwloc 1.3 and 1.4, using the GNU and Intel compilers. I am on
>>>> a ROCKS cluster with two NVIDIA C2050 GPU cards.
>>>> Everything else seems to be working fine... What could I check for?
>>>> What information do you need to help me?
>>>> Thank you,