Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-05-02 22:21:39


Hmmm...well, a few points here. First, the Phis sadly don't show up in the hwloc tree, as they are apparently hidden behind the PCIe bridge. I don't know if there is a way for hwloc to "probe" and find processors on PCI cards, but that's something I'll have to defer to Jeff and Brice.

So the first problem is: how do we know the Phis are present, how many there are on each node, etc.? We could push that into something like the hostfile, but that requires that someone build the file. Still, it would only have to be built once, so maybe that's not too bad - we could have a "wildcard" entry if every node is the same, etc.
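
To make the hostfile idea concrete, here is a minimal sketch of how a per-node Phi count with a wildcard fallback might be looked up. The file format shown (a "phis=<n>" key and a "*" host entry) is purely hypothetical and is not Open MPI's actual hostfile syntax; the function names are likewise made up for illustration.

```python
# Hypothetical hostfile format (NOT Open MPI's real syntax):
#   <hostname or "*"> key=value key=value ...
# A "*" line supplies defaults for any node without its own entry.

def parse_hostfile(text):
    """Parse the hypothetical format into {hostname: {key: int_value}}."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, *fields = line.split()
        attrs = {}
        for field in fields:
            key, _, value = field.partition("=")
            attrs[key] = int(value)
        entries[name] = attrs
    return entries


def phis_on(entries, host):
    """Number of Phi cards on `host`, falling back to the wildcard entry."""
    attrs = entries.get(host, entries.get("*", {}))
    return attrs.get("phis", 0)


if __name__ == "__main__":
    hostfile = """
    *      slots=16 phis=2   # every node looks like this...
    node07 slots=16 phis=0   # ...except this one
    """
    entries = parse_hostfile(hostfile)
    for host in ("node01", "node07"):
        print(host, phis_on(entries, host))
```

With a wildcard line, the file only needs explicit entries for nodes that differ from the common configuration.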

Next, we have to launch processes across the PCI bus. On Roadrunner (RR), we had to do an "rsh" launch of the MPI procs onto the Cell processors, as they appeared to be separate "hosts", though only visible on the local node (i.e., there was a stripped-down OS running on the Cell) - Paul's cmd line implies this may also be the case here. If the same method works here, then we have most of that code still available (it needs some updating). We would probably also want to look at whether binding could be supported on the Phi's local OS.

Finally, we have to wire everything up. This is where RR got a little tricky, and we may encounter the same thing here. On RR, the Cells didn't have direct access to the interconnects - any messaging had to be relayed by a process running on the main CPU. So we had to create the ability to "route" MPI messages from processes running on the Cells to processes residing on other nodes.
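
The relay scheme described above can be sketched roughly as follows. This is only an illustration of the routing decision, not the code that was used on RR or anything in Open MPI today; the Proc type, the "relay@<node>" hop labels, and the assumption that same-node traffic needs no relay are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    rank: int
    node: str
    on_card: bool  # True if the process runs on a card with no direct NIC access


def route(src, dst):
    """Return the list of hops a message takes from src to dst.

    Assumption for this sketch: processes on the same node can reach each
    other directly over the local bus; only off-node traffic from or to a
    card has to pass through a relay process on that card's host CPU.
    """
    hops = [f"rank{src.rank}"]
    if src.node != dst.node:
        if src.on_card:
            hops.append(f"relay@{src.node}")  # leave the card via its host
        if dst.on_card:
            hops.append(f"relay@{dst.node}")  # enter the remote card via its host
    hops.append(f"rank{dst.rank}")
    return hops


if __name__ == "__main__":
    card_a = Proc(rank=0, node="n1", on_card=True)
    card_b = Proc(rank=1, node="n2", on_card=True)
    print(" -> ".join(route(card_a, card_b)))
```

If processes on the Phis turn out to be able to "see" the NIC themselves, on_card would simply be False for them and no relay hops would be added.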

Solving the first two is relatively straightforward. In my mind, the primary issue is the last one - does anyone know if a process on the Phis can "see" interconnects like a TCP NIC or an InfiniBand adapter?

On May 2, 2013, at 6:36 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:

> Jeff,
>
> I know Intel MPI (MPICH based) "just works" with Phi, but you need to do things like:
> mpirun -n 2 -host cpu host.exe : -n 4 -host mic0 mic.exe
> if you want to use the Phi for more than just kernel-offload (in which case they won't have/need an MPI rank).
> So, launching procs is PART of the problem, but certainly not all of it.
>
> At least, unlike RR, the processing elements all share the same endianness!
>
> -Paul
>
>
> On Thu, May 2, 2013 at 6:28 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
> I know the MPICH guys did a bunch of work to support the Phis. I don't know exactly what that means (I haven't read their docs about this stuff), but I suspect that it's more than just launching MPI processes on them...
>
>
> On May 2, 2013, at 8:54 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>
> > Ralph,
> >
> > I am not an expert, by any means, but based on a presentation I heard 4 hours ago:
> >
> > The Xeon and Phi instruction sets have a large intersection, but neither is a subset of the other.
> > In particular, Phi has its own SIMD instructions *instead* of Xeon's MMX, SSEn, etc.
> > There is also no CMPXCHG16B instruction on Phi, among others.
> > So, there will need to be different binaries, or "fat" binaries that branch based on CPU type.
> >
> > -Paul
> >
> >
> > On Thu, May 2, 2013 at 5:47 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> >
> > On May 2, 2013, at 5:12 PM, Christopher Samuel <samuel_at_[hidden]> wrote:
> >
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > Hi folks,
> > >
> > > The new system we're bringing up has 10 nodes with dual Xeon Phi MIC
> > > cards, are there any plans to support them by launching MPI tasks
> > > directly on the Phis themselves (rather than just as offload devices
> > > for code on the hosts)?
> >
> > We had something similar at one time - I developed it for the Roadrunner cluster so you could run MPI tasks on the GPUs. Worked well, but eventually fell into disrepair due to lack of use.
> >
> > In this case, I suspect it will be much easier to do as the Phis appear to be a lot more visible to the host than the GPU did on RR. Looking at the documentation, the Phis just sit directly on the PCIe bus, so they should look just like any other processor, and they are Xeon binary compatible - so there is no issue with tracking which binary to run on which processor.
> >
> > Brice: do the Phis appear in the hwloc topology object?
> >
> > Chris: can you run lstopo on one of the nodes and send me the output (off-list)?
> >
> >
> > >
> > > All the best,
> > > Chris
> > > - --
> > > Christopher Samuel Senior Systems Administrator
> > > VLSCI - Victorian Life Sciences Computation Initiative
> > > Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545
> > > http://www.vlsci.org.au/ http://twitter.com/vlsci
> > >
> > > -----BEGIN PGP SIGNATURE-----
> > > Version: GnuPG v1.4.11 (GNU/Linux)
> > > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> > >
> > > iEYEARECAAYFAlGDAPYACgkQO2KABBYQAh+y9ACfZ0SdqDuV7Euq3B0ANtxPhH1D
> > > 3h4An1Zlhu2Ut+OFvbTa9xbLBkspwwPY
> > > =TbIy
> > > -----END PGP SIGNATURE-----
> > > _______________________________________________
> > > devel mailing list
> > > devel_at_[hidden]
> > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
> > --
> > Paul H. Hargrove PHHargrove_at_[hidden]
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>