$subject_val = "Re: [MTT users] FW: ALPS modifications for MTT"; include("../../include/msg-header.inc"); ?>
Subject: Re: [MTT users] FW: ALPS modifications for MTT
From: Matney Sr, Kenneth D. (matneykdsr_at_[hidden])
Date: 2008-08-20 10:38:22
Hi Jeff,
The trunk needs an additional patch to make ALPS work (without
complaints). I have attached it hereto. Also, I will send along the
ornl.ini script when I get it finalized. This wlll show how we do Cray
XT builds, run, etc.
-- Ken -----Original Message----- From: mtt-users-bounces_at_[hidden] [mailto:mtt-users-bounces_at_[hidden]] On Behalf Of Jeff Squyres Sent: Thursday, August 14, 2008 10:47 AM To: General user list for the MPI Testing Tool Subject: Re: [MTT users] FW: ALPS modifications for MTT BTW, I committed this patch to the MTT trunk. I feel a little sheepish; I should have told you to use the trunk these days, not the release branch (I know the wiki specifically says otherwise). We really need to finally make a release out of what is on the trunk -- it's much more advanced than what is on the release branch (look at the CHANGES file in the top-level dir to see what has changed since the release branch). The Cisco MTT files in SVN are for the trunk; it's possible that the features that the release branch doesn't understand will just be ignored, but I haven't tried this in a long time. On Aug 14, 2008, at 10:35 AM, Jeff Squyres wrote: > This patch looks good to me. > > I'll commit. If you want to do any more work on MTT, perhaps ORNL > can add you to its "Schedule A" form for the Open MPI Third Party > Contribution form (it's very easy to amend Schedule A -- doesn't > require any authoritative signatures), we could get you an MTT SVN > account and you could commit this stuff directly. > > > On Aug 14, 2008, at 10:24 AM, Matney Sr, Kenneth D. wrote: > >> Hi, >> >> When running MTT on the Cray XT3/XT4 machines, I found that MTT >> does not >> contain any support for ALPS. As a result, it always executes mpirun >> with "-np 1". I patched lib/MTT/Values/Functions.pm with the >> following >> to overcome this: >> >> -----Original Message----- >> From: Matney Sr, Kenneth D. >> Sent: Wednesday, August 13, 2008 5:57 PM >> To: Shipman, Galen M. >> Cc: Graham, Richard L. >> Subject: FW: ALPS modifications for MTT >> >> --- Functions-bak.pm 2008-08-06 14:31:26.256538000 -0400 >> +++ Functions.pm 2008-08-13 17:43:40.273641000 -0400 >> @@ -602,6 +602,8 @@ >> # Resource managers >> return "SLURM" >> if slurm_job(); >> + return "ALPS" >> + if alps_job(); >> return "TM" >> if pbs_job(); >> return "N1GE" >> @@ -638,6 +640,8 @@ >> # Resource managers >> return slurm_max_procs() >> if slurm_job(); >> + return alps_max_procs() >> + if alps_job(); >> return pbs_max_procs() >> if pbs_job(); >> return n1ge_max_procs() >> @@ -670,6 +674,8 @@ >> # Resource managers >> return slurm_hosts() >> if slurm_job(); >> + return alps_hosts() >> + if alps_job(); >> return pbs_hosts() >> if pbs_job(); >> return n1ge_hosts() >> @@ -1004,6 +1010,70 @@ >> >> >> #----------------------------------------------------------------------- >> --- >> >> +# Return "1" if we're running in an ALPS job; "0" otherwise. >> +sub alps_job { >> + Debug("&alps_job\n"); >> + >> +# It is true that ALPS can be run in an interactive access mode; >> however, >> +# this would not be a true managed environment. Such only can be >> +# achieved under a batch scheduler. >> + return ((exists($ENV{BATCH_PARTITION_ID}) && >> + exists($ENV{PBS_NNODES})) ? "1" : "0"); >> +} >> + >> + >> #---------------------------------------------------------------------- >> ---- >> + >> +# If in an ALPS job, return the max number of processes we can run. >> +# Otherwise, return 0. >> +sub alps_max_procs { >> + Debug("&alps_max_procs\n"); >> + >> + return "0" >> + if (!alps_job()); >> + >> +# If we were not running under PBS or some other batch system, we >> would >> +# not have the foggiest idea of how many processes mpirun could >> spawn. >> + my $ret; >> + $ret=$ENV{PBS_NNODES}; >> + >> + Debug("&alps_max_procs returning: $ret\n"); >> + return "$ret"; >> +} >> + >> + >> #---------------------------------------------------------------------- >> ---- >> + >> +# If in an ALPS job, return the hosts we can run on. Otherwise, >> return >> +# "". >> +sub alps_hosts { >> + Debug("&alps_hosts\n"); >> + >> + return "" >> + if (!alps_job()); >> + >> +# Again, we need a batch system to achieve management; return the >> uniq'ed >> +# contents of $PBS_HOSTFILE. Actually, on the Cray XT, we can >> return >> the >> +# NIDS allocated by ALPS; but, without launching servers to other >> service >> +# nodes, all communication is via the launching node and NIDS >> actually >> +# have no persistent resource allocated to the user. That is, all >> file >> +# resources accessible from a NID are shared with the launching >> node. >> >> +# And, since ALPS is managed by the batch system, only the >> launching >> node >> +# can initiate communication with a NID. In effect, the Cray XT >> model is >> +# of a single service node with a varying number of compute >> processors. >> + open (FILE, $ENV{PBS_NODEFILE}) || return ""; >> + my $lines; >> + while (<FILE>) { >> + chomp; >> + $lines->{$_} = 1; >> + } >> + >> + my @hosts = sort(keys(%$lines)); >> + my $hosts = join(",", @hosts); >> + Debug("&alps_hosts returning: $hosts\n"); >> + return "$hosts"; >> +} >> + >> + >> #---------------------------------------------------------------------- >> ---- >> + >> # Return "1" if we're running in a PBS job; "0" otherwise. >> sub pbs_job { >> Debug("&pbs_job\n"); >> >> >> >> >> -- >> Ken >> >> _______________________________________________ >> mtt-users mailing list >> mtt-users_at_[hidden] >> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users > > > -- > Jeff Squyres > Cisco Systems > > _______________________________________________ > mtt-users mailing list > mtt-users_at_[hidden] > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users -- Jeff Squyres Cisco Systems _______________________________________________ mtt-users mailing list mtt-users_at_[hidden] http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users