Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] Windows api threading functions equivalent to hwloc?
From: Samuel Thibault (samuel.thibault_at_[hidden])
Date: 2012-11-20 11:58:00


Andrew Somorjai, on Tue 20 Nov 2012 09:45:12 +0100, wrote:
> I'm also confused about these two lines and whether it's necessary for the second one to exist?
>
> HANDLE thread[num_threads];
> HANDLE pthread_getw32threadhandle_np(thread);
>
> Does the second api call fill the thread array or just the first element?

It does not fill anything; it returns the converted value.
The second API call should be made between pthread_create and
hwloc_set_thread_cpubind, since it needs to be called for each thread.
Otherwise it's not surprising that the threads keep migrating between
processors: if you checked the error returned by hwloc_set_thread_cpubind,
you would see that it says the thread id is invalid.
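
As a minimal sketch of the error check mentioned above: the helper below is
hypothetical (it is not part of hwloc) and only assumes the hwloc convention
of returning 0 on success and -1 with errno set on failure. It turns the
result of a binding call into a readable message, so a failed binding does
not go unnoticed.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of hwloc: describe the outcome of a
 * binding call. Assumes the call returned rc == 0 on success, or -1 with
 * errno set on failure; saved_errno is the errno value captured right
 * after the call. Returns 0 if the binding succeeded, -1 otherwise. */
int describe_bind_result(int rc, int saved_errno, long thread_index,
                         char *msg, size_t msg_size)
{
    if (rc == 0) {
        snprintf(msg, msg_size, "thread %ld: bound", thread_index);
        return 0;
    }
    snprintf(msg, msg_size, "thread %ld: binding failed: %s",
             thread_index, strerror(saved_errno));
    return -1;
}

/* Intended use inside the thread creation loop (needs hwloc, so it is
 * shown here only as a comment):
 *
 *   int rc = hwloc_set_thread_cpubind(topology, handle, set, 0);
 *   int err = errno;   // capture errno before another call changes it
 *   char msg[128];
 *   if (describe_bind_result(rc, err, t, msg, sizeof msg) != 0)
 *       fprintf(stderr, "%s\n", msg);
 */
```

Capturing errno immediately after the call matters because later library
calls (including printf) are allowed to overwrite it.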

What you need to understand is that pthread_create fills
a pthread_t, not a HANDLE. That's why one then needs to use
pthread_getw32threadhandle_np to convert from the pthread_t into the
HANDLE before passing it to hwloc_set_thread_cpubind.

I.e.

> pthread_t thread[num_threads];
>
> for (t = 0; t < num_threads; t++)
> {
>     printf("Creating thread %ld\n", t);
>     rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
>     /* Convert this thread's pthread_t into a Windows HANDLE. */
>     HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
>
>     hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
>     hwloc_bitmap_set_only(bitmap, t);
>     hwloc_set_thread_cpubind(topology, handle, bitmap, 0);
>     hwloc_bitmap_free(bitmap);
> }

In addition to that, remember what I mentioned in a previous mail (Mon,
19 Nov 2012 23:36:09 +0100): using hwloc_bitmap_set_only will use
physical indexes, which are most probably not what you want because they
depend on phases of the moon. Depending on whether you want to run one
thread per core or per hyperthread, use the first or second of these:

rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
/* One thread per core: bind to the cpuset of the t-th core (logical index). */
hwloc_set_thread_cpubind(topology, handle,
        hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, t)->cpuset,
        0);

rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
/* One thread per hyperthread: bind to the cpuset of the t-th PU (logical index). */
hwloc_set_thread_cpubind(topology, handle,
        hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, t)->cpuset,
        0);

and use

n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);

or

n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);

to get the number of cores or hyperthreads.
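
To illustrate why physical indexes are a poor choice, here is a small
stand-alone sketch. The machine it describes is hypothetical (2 cores with 2
hyperthreads each, numbered by the OS in an interleaved way), and the table
and function names are made up for the example; they are not hwloc calls.

```c
#include <stddef.h>

/* Hypothetical machine, for illustration only: 2 cores, 2 hyperthreads
 * each. Entries are listed in logical (topology) order, so hyperthreads
 * of the same core are adjacent. os_index is the physical index that the
 * operating system happens to assign; here it numbers the first
 * hyperthread of every core before the second ones. */
struct pu { unsigned core; unsigned os_index; };

static const struct pu pus[] = {
    { 0, 0 },  /* logical PU 0 */
    { 0, 2 },  /* logical PU 1 */
    { 1, 1 },  /* logical PU 2 */
    { 1, 3 },  /* logical PU 3 */
};
#define NUM_PUS (sizeof pus / sizeof pus[0])

/* Core reached when thread t is bound to logical PU index t. */
int core_of_logical_pu(unsigned t)
{
    return t < NUM_PUS ? (int)pus[t].core : -1;
}

/* Core reached when thread t is bound to physical (OS) index t. */
int core_of_physical_pu(unsigned t)
{
    for (size_t i = 0; i < NUM_PUS; i++)
        if (pus[i].os_index == t)
            return (int)pus[i].core;
    return -1;  /* no PU has that physical index */
}
```

On this made-up machine, threads 0 and 1 share core 0 when logical indexes
are used, but land on cores 0 and 1 when physical indexes are used. Another
machine or BIOS may number things differently, which is why the logical
index is the one to rely on.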

Samuel