Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] Segmentation fault in collect_proc_cpuset, topology.c line 1074
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2013-01-17 13:01:59


Great, thanks, I'll release the final 1.6.1 later tonight then.

Brice

On 17/01/2013 17:25, cessenat_at_[hidden] wrote:
> Hello Brice,
>
> I mistakenly tested with 1.6rc2 instead of 1.6.1rc2!
> It works fine with 1.6.1rc2, so this is the end of the thread for me.
>
> Thank you and sorry again,
>
> Olivier Cessenat
>
> ----- Original message -----
> From: "Brice Goglin" <Brice.Goglin_at_[hidden]>
> To: cessenat_at_[hidden]
> Cc: "Hardware locality user list" <hwloc-users_at_[hidden]>
> Sent: Wednesday 16 January 2013 18:56:52
> Subject: Re: [hwloc-users] Segmentation fault in collect_proc_cpuset, topology.c line 1074
>
> Please send the tarball generated by hwloc-gather-topology in hwloc 1.5
> Thanks
> Brice
>
>
>
> On 16/01/2013 18:49, cessenat_at_[hidden] wrote:
>> Hello,
>>
>> Unfortunately it fails as well.
>> Failure happens when the proc involved is not proc number 0 of the node.
>>
>> Cheers
>> Olivier Cessenat.
>>
>> ----- Original message -----
>> From: "Brice Goglin" <brice.goglin_at_[hidden]>
>> To: "Hardware locality user list" <hwloc-users_at_[hidden]>, cessenat_at_[hidden]
>> Sent: Tuesday 15 January 2013 19:26:30
>> Subject: Re: [hwloc-users] Segmentation fault in collect_proc_cpuset, topology.c line 1074
>>
>> Hello
>> Indeed, there's a big cgroup crash in 1.6. Can you verify that 1.6.1rc2 works fine?
>> Thanks
>> Brice
>>
>>
>>
>>
>> cessenat_at_[hidden] wrote:
>>
>> Hello,
>>
>> When updating from 1.5.1 to 1.6 I get a segfault when inside a
>> cgroup/cpuset in collect_proc_cpuset, file topology.c line 1074.
>>
>> It appears that an HWLOC_OBJ_CORE has a child that is its HWLOC_OBJ_GROUP's parent!
>>
>> $ cat /proc/self/cgroup
>> 2: cpuset: /slurm/test
>> 1: freezer: /
>> $ lssubsys -m cpuset
>> cpuset /cgroup/cpuset
>> $ cat /cgroup/cpuset/slurm/test/cpuset.cpus
>> 31
>> $ hwloc-1.6/bis/lstopo
>> Segmentation fault (core dumped)
>> $ gdb...
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x00007ffd758d225e in collect_proc_cpuset (obj=<value opt out>, sys=0x1f4dba0) at topology.c: 1074
>>
>> The machine is made of bullx super-node S6010 (CEA Tera 100).
>>
>> Thanks for your help,
>>
>> Olivier Cessenat.
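
For anyone reproducing the report, the cgroup checks quoted above can be scripted. What follows is only a minimal Python sketch and is not part of hwloc: the helper names `cpuset_path` and `parse_cpu_list` are hypothetical, and it assumes the plain "a-b,c" list format for `cpuset.cpus` and the "id: controllers: path" layout shown in the report.

```python
def parse_cpu_list(text):
    """Expand a cpuset list such as "0-3,8" into a sorted list of CPU indexes.

    Assumption: only single indexes and inclusive "first-last" ranges appear.
    """
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpuset_path(cgroup_text):
    """Return the cpuset cgroup path found in /proc/self/cgroup content, or None."""
    for line in cgroup_text.splitlines():
        # Each line has three colon-separated fields: id, controllers, path.
        fields = [field.strip() for field in line.split(":", 2)]
        if len(fields) == 3 and "cpuset" in fields[1].split(","):
            return fields[2]
    return None


if __name__ == "__main__":
    # Sample values copied from the report above.
    sample_cgroup = "2: cpuset: /slurm/test\n1: freezer: /\n"
    print(cpuset_path(sample_cgroup))
    print(0 in parse_cpu_list("31"))
```

With the values from the report, the cpuset path is `/slurm/test` and CPU 0 is not in the allowed set, which matches the condition under which the crash was observed.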