Network Locality users Mailing List Archives

Subject: Re: [netloc-users] Trying Netloc on a small IB cluster
From: Raghu (rajachan_at_[hidden])
Date: 2013-12-05 17:17:56


Josh,

I was using a much older version indeed (v1.5). I wanted to get netloc
up and running functionally before updating to the bleeding-edge, but
this info helps. I will update hwloc and give this another spin.
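
For anyone hitting the same thing, here is a rough sketch of the check Josh
suggested: print the hwloc version and count the PCI devices that lstopo
reports. It assumes lstopo (installed with hwloc) is on PATH, that the first
line of `lstopo --version` ends with the version number, that the installed
release accepts `--of console`, and that GNU `sort -V` is available.
MIN_VERSION and the version_at_least helper are illustrative names of mine,
not part of hwloc or netloc.

```shell
#!/bin/sh
# Sketch only: check the hwloc version and whether lstopo lists any PCI devices.
# MIN_VERSION and version_at_least are illustrative, not part of hwloc or netloc.

MIN_VERSION="1.8"   # release mentioned in this thread

# version_at_least HAVE WANT: succeeds when HAVE >= WANT.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v lstopo >/dev/null 2>&1; then
    # Assumption: the first line of `lstopo --version` ends with the version.
    have=$(lstopo --version 2>/dev/null | awk 'NR == 1 { print $NF }')
    echo "hwloc version: ${have:-unknown}"
    if [ -n "$have" ] && version_at_least "$have" "$MIN_VERSION"; then
        echo "hwloc is at least $MIN_VERSION"
    else
        echo "hwloc is older than $MIN_VERSION (or the version could not be parsed)"
    fi
    # Count PCI lines in the text rendering; zero suggests PCI discovery is
    # disabled in this build or no PCI devices are visible.
    pci=$(lstopo --of console 2>/dev/null | grep -c 'PCI')
    echo "PCI devices listed by lstopo: $pci"
else
    echo "lstopo not found on PATH; is the hwloc bin directory exported?"
fi
```

Run it with the bin directory of the hwloc install that netloc was configured
against first on PATH, so it reports that build rather than a system copy.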

Thanks!

Raghu

On Thu, Dec 5, 2013 at 3:09 PM, Josh Hursey <jjhursey_at_[hidden]> wrote:
> I wonder if this is a hwloc version/configuration issue then. What version
> of hwloc are you using? Does it have the PCI access mechanism enabled?
>
> Note that the latest hwloc release (1.8) has some other special sauce that
> works nicely with netloc (i.e., hwloc topology compression).
>
> Brice might be able to help with this question a bit more when he gets back
> from travel if the above does not help.
>
>
>
> On Thu, Dec 5, 2013 at 4:04 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]>
> wrote:
>>
>> (Raghu's mail didn't go through to the list because he's accidentally not
>> subscribed)
>>
>> Thanks for your patience with netloc -- it's still early days with this
>> stuff.
>>
>> It *works* for us, but a) that doesn't necessarily mean it works for
>> everyone yet (i.e., we haven't shaken out all the bugs / addressed all
>> corner cases yet), and b) as Josh said, it's not yet terribly user-friendly.
>>
>>
>>
>> On Dec 5, 2013, at 4:50 PM, Raghu <rajachan_at_[hidden]> wrote:
>>
>> > Jeff, Josh,
>> >
>> > Thanks for the responses. I did stumble upon the other email on the
>> > devel list, and tried the steps you listed there. With that route
>> > though, the gather tool couldn't find any subnets with the hwloc xmls
>> > I had provided (exited after printing "Found 0 subnets in hwloc
>> > directory"). I will try to fiddle around a little more with the other
>> > method to see if I can get things to work.
>> >
>> > Raghu
>> >
>> >
>> > On Thu, Dec 5, 2013 at 2:28 PM, Joshua Hursey <jjhursey_at_[hidden]>
>> > wrote:
>> >> Raghu,
>> >>
>> >> The problem is likely that the subnet has not been specified. The
>> >> netloc_reader_ib tool is not terribly user friendly at the moment. We
>> >> have some supporting tools that make it easier to use. I outlined the
>> >> steps for another user in the mail linked below:
>> >> http://www.open-mpi.org/community/lists/netloc-devel/2013/11/0005.php
>> >>
>> >> Notice that it does not call netloc_reader_ib explicitly; that call is
>> >> wrapped up as part of the netloc-ib-extract-dats script.
>> >>
>> >>
>> >> You will also need to install Jansson (if you have not already), as that
>> >> is how netloc currently represents the data. It can be downloaded from:
>> >> http://www.digip.org/jansson/
>> >>
>> >>
>> >> I am currently working on some FAQs to hopefully help in the future. In
>> >> the meantime, feel free to email us directly or post to the users/devel
>> >> mailing lists.
>> >>
>> >> Thanks,
>> >> Josh
>> >>
>> >>
>> >>
>> >> On Thu, Dec 5, 2013 at 2:01 PM, Raghu <rajachan_at_[hidden]>
>> >> wrote:
>> >>>
>> >>> Btw, I have a pretty standard installation -- ./configure
>> >>> --prefix=/home/rajachan/netloc/install
>> >>> --with-hwloc=/home/rajachan/hwloc-1.5/install
>> >>>
>> >>> Raghu
>> >>>
>> >>>
>> >>> On Thu, Dec 5, 2013 at 2:56 PM, Raghu <rajachan_at_[hidden]>
>> >>> wrote:
>> >>>> Hi Josh, Jeff,
>> >>>>
>> >>>> I am trying out netloc (the master branch) on a small IB cluster
>> >>>> (which I have sudo access to). I got stuff built fine, but when I try
>> >>>> to generate the .ndat files, I am getting this:
>> >>>>
>> >>>> Output Directory : /home/rajachan/netloc/install/bin/output/
>> >>>> Subnet : unknown
>> >>>> ibnetdiscover File : /home/rajachan/netloc/install/bin/ibnetdata
>> >>>> ibroutes Directory : None Specified
>> >>>> Status: Querying the ibnetdiscover data for subnet unknown...
>> >>>> Error: Invalid network type provided
>> >>>> Error: Failed to create a new data file
>> >>>>
>> >>>> Here's how I am running the reader : ./netloc_reader_ib -o
>> >>>> /home/rajachan/netloc/install/bin/output/ -f
>> >>>> /home/rajachan/netloc/install/bin/ibnetdata
>> >>>>
>> >>>> Do you guys see any glaring config mistake from my end?
>> >>>>
>> >>>> Raghu
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Joshua Hursey
>> >> Assistant Professor of Computer Science
>> >> University of Wisconsin-La Crosse
>> >> http://cs.uwlax.edu/~jjhursey
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> netloc-users mailing list
>> netloc-users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/netloc-users
>
>
>
>
> --
> Joshua Hursey
> Assistant Professor of Computer Science
> University of Wisconsin-La Crosse
> http://cs.uwlax.edu/~jjhursey