Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bug in openmpi 1.5.4 in paffinity
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2011-09-04 17:30:53


On 04/09/2011 22:35, Ake Sandgren wrote:
> On Sun, 2011-09-04 at 22:13 +0200, Brice Goglin wrote:
>> Hello,
>>
>> Could you log again on this node (with same cgroups enabled), run
>> hwloc-gather-topology <name>
>> and send the resulting <name>.output and <name>.tar.bz2?
>>
>> Send them to the hwloc-devel list or open a ticket on
>> https://svn.open-mpi.org/trac/hwloc (or send them to me in private if
>> you don't want to subscribe).
> Since it's a bit late here, I'm being lazy and sending them to you directly.
>
> Output from both nodes involved in the batch job.
> slurm -N 2 --ntasks-per-node=1 ... was what I was using.
>
> Hope it helps. If not, let me know if there is anything else I can do.
>
> /Åke S.

Thanks, I understand the problem but it's not easy to fix. To work around
the crash until I come up with a real fix, you can comment out the call to
    hwloc_topology__set_distance_matrix()
at the end of look_sysfsnode() in topology-linux.c.

Brice