-----BEGIN PGP SIGNED MESSAGE-----
On 03/05/13 14:30, Ralph Castain wrote:
> On May 2, 2013, at 9:18 PM, Christopher Samuel
> <samuel_at_[hidden]> wrote:
>> We're using Slurm, and it supports them already apparently, so I'm
>> not sure if that helps?
> It does - but to be clear: you're saying that you can directly launch
> processes onto the Phi's via srun?
Ah no, Slurm 2.5 supports them as coprocessors, allocated as GPUs are.
I've been told Slurm 2.6 (under development) may support them as nodes
in their own right, but that's not something I've had time to look into.
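For reference, a minimal sketch of what a coprocessor-style request could look like. It assumes the site has defined the cards as a generic resource named "mic" in its Slurm configuration; that name, the counts, and the ./offload_app binary are illustrative assumptions, not details confirmed in this thread. The helper only prints the command line, so it is safe to run anywhere.

```shell
#!/bin/sh
# Build (and print) an srun command line that requests Phi cards as a
# generic resource, the same way GPUs are requested.
# ASSUMPTION: the GRES name "mic" matches the site's slurm.conf/gres.conf.
phi_srun_cmd() {
    nodes="$1"
    phis_per_node="$2"
    shift 2
    # Remaining arguments are the program and its options.
    echo "srun --nodes=$nodes --gres=mic:$phis_per_node $*"
}

# Hypothetical example: one node, one card, an offload-style binary.
phi_srun_cmd 1 1 ./offload_app
```

Note that this allocates the card to a job running on the host; it does not launch processes on the Phi itself.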
> If so, then this may not be a problem, assuming you can get
> confirmation that the Phi's have direct access to the interconnects.
I'll see what I can do. There is a long README which will be my light
reading on the train home tonight here:
This seems to indicate how that works, but other parts imply that it
*may* require Intel True Scale InfiniBand adapters:
3.4 Starting Intel(R) MPSS with OFED Support
1) Start the Intel(R) MPSS service. Section 2.3, "Starting Intel(R) MPSS
   Services" explains how. Do not proceed any further if Intel(R) MPSS is not
2) Start IB and HCA services.
     user_prompt> sudo service openibd start
     user_prompt> sudo service opensmd start
3) Start the Intel(R) Xeon Phi(TM) coprocessor specific OFED service.
     user_prompt> sudo service ofed-mic start
4) To start the experimental ccl-proxy service (see /etc/mpxyd.conf)
     user_prompt> sudo service mpxyd start
3.5 Stopping Intel(R) MPSS with OFED Support
o If the installed version is earlier than 2.x.28xx unload the driver using:
     user_prompt> sudo modprobe -r mic
o If the installed version is 2.x.28xx or later, unload the driver using:
     user_prompt> sudo service ofed-mic stop
     user_prompt> sudo service mpss stop
     user_prompt> sudo service mpss unload
     user_prompt> sudo service opensmd stop
     user_prompt> sudo service openibd stop
o If the experimental ccl-proxy driver was started, unload the driver using:
     user_prompt> sudo service mpxyd stop
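The start sequence from section 3.4 of that excerpt could be wrapped up roughly like this. It is only a sketch: the service names are the ones quoted above, step 1 (starting MPSS itself) is left out because the excerpt does not show the command for it, and the DRY_RUN switch and function names are my own invention. By default it just prints what it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the IB/HCA services and the Phi-specific OFED service in the order
# the README gives, stopping at the first failure.
# $1 = "with-proxy" to also start the experimental ccl-proxy service.
start_ofed_stack() {
    for svc in openibd opensmd ofed-mic; do
        run sudo service "$svc" start || return 1
    done
    if [ "${1:-}" = "with-proxy" ]; then
        run sudo service mpxyd start || return 1
    fi
}

start_ofed_stack "$@"
```

MPSS has to be up before any of this is run, as the README says.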
> If the answer to both is "yes", then just srun the MPI procs
> directly - we support direct launch and use PMI to wireup. Problem
> solved :-)
That would be ideal. I'll do more digging into Slurm 2.6 (we had
planned on starting off with it, treating the cards as coprocessors,
but this may be enough for us to change).
> And yes - that support is indeed in the 1.6 series...just configure
> --with-pmi. You may need to provide the path to where pmi.h is
> located under the slurm install, but probably not.
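So the build and launch would presumably look something like the sketch below. The /opt/slurm path, the process count and the ./mpi_app binary are hypothetical placeholders, and exactly which directory --with-pmi expects is worth confirming with ./configure --help. The helper only prints the commands.

```shell
#!/bin/sh
# Print the Open MPI configure command for a PMI-enabled build.
# $1 (optional) = directory to hand to --with-pmi.
# ASSUMPTION: /opt/slurm below is a made-up install location.
ompi_configure_cmd() {
    if [ -n "${1:-}" ]; then
        echo "./configure --with-pmi=$1"
    else
        echo "./configure --with-pmi"
    fi
}

ompi_configure_cmd              # let configure look in its default locations
ompi_configure_cmd /opt/slurm   # point it at a specific Slurm install

# Once built, a direct launch under Slurm would be along these lines:
echo "srun -n 4 ./mpi_app"
```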
All the best,
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel_at_[hidden] Phone: +61 (0)3 903 55545
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
-----END PGP SIGNATURE-----