
Subject: Re: [MTT users] FW: ALPS modifications for MTT
From: Matney Sr, Kenneth D. (matneykdsr_at_[hidden])
Date: 2008-08-20 10:38:22


Hi Jeff,

The trunk needs an additional patch to make ALPS work (without
complaints). I have attached it hereto. Also, I will send along the
ornl.ini script when I get it finalized. This will show how we do Cray
XT builds, runs, etc.

-- 
Ken
-----Original Message-----
From: mtt-users-bounces_at_[hidden]
[mailto:mtt-users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: Thursday, August 14, 2008 10:47 AM
To: General user list for the MPI Testing Tool
Subject: Re: [MTT users] FW: ALPS modifications for MTT
BTW, I committed this patch to the MTT trunk.
I feel a little sheepish; I should have told you to use the trunk  
these days, not the release branch (I know the wiki specifically says  
otherwise).  We really need to finally make a release out of what is  
on the trunk -- it's much more advanced than what is on the release  
branch (look at the CHANGES file in the top-level dir to see what has  
changed since the release branch).
The Cisco MTT files in SVN are for the trunk; it's possible that the  
features that the release branch doesn't understand will just be  
ignored, but I haven't tried this in a long time.
On Aug 14, 2008, at 10:35 AM, Jeff Squyres wrote:
> This patch looks good to me.
>
> I'll commit.  If you want to do any more work on MTT, perhaps ORNL
> can add you to the "Schedule A" of its Open MPI Third Party
> Contribution form (it's very easy to amend Schedule A -- it doesn't
> require any authoritative signatures); then we could get you an MTT
> SVN account and you could commit this stuff directly.
>
>
> On Aug 14, 2008, at 10:24 AM, Matney Sr, Kenneth D. wrote:
>
>> Hi,
>>
>> When running MTT on the Cray XT3/XT4 machines, I found that MTT  
>> does not
>> contain any support for ALPS.  As a result, it always executes mpirun
>> with "-np 1".  I patched lib/MTT/Values/Functions.pm with the  
>> following
>> to overcome this:
>>
>> -----Original Message-----
>> From: Matney Sr, Kenneth D.
>> Sent: Wednesday, August 13, 2008 5:57 PM
>> To: Shipman, Galen M.
>> Cc: Graham, Richard L.
>> Subject: FW: ALPS modifications for MTT
>>
>> --- Functions-bak.pm	2008-08-06 14:31:26.256538000 -0400
>> +++ Functions.pm	2008-08-13 17:43:40.273641000 -0400
>> @@ -602,6 +602,8 @@
>>    # Resource managers
>>    return "SLURM"
>>        if slurm_job();
>> +    return "ALPS"
>> +        if alps_job();
>>    return "TM"
>>        if pbs_job();
>>    return "N1GE"
>> @@ -638,6 +640,8 @@
>>    # Resource managers
>>    return slurm_max_procs()
>>        if slurm_job();
>> +    return alps_max_procs()
>> +        if alps_job();
>>    return pbs_max_procs()
>>        if pbs_job();
>>    return n1ge_max_procs()
>> @@ -670,6 +674,8 @@
>>    # Resource managers
>>    return slurm_hosts()
>>        if slurm_job();
>> +    return alps_hosts()
>> +        if alps_job();
>>    return pbs_hosts()
>>        if pbs_job();
>>    return n1ge_hosts()
>> @@ -1004,6 +1010,70 @@
>>
>>
>>
>> #--------------------------------------------------------------------------
>>
>> +# Return "1" if we're running in an ALPS job; "0" otherwise.
>> +sub alps_job {
>> +    Debug("&alps_job\n");
>> +
>> +#   It is true that ALPS can be run in an interactive access mode; however,
>> +#   that would not be a truly managed environment; such an environment can
>> +#   only be achieved under a batch scheduler.
>> +    return ((exists($ENV{BATCH_PARTITION_ID}) &&
>> +             exists($ENV{PBS_NNODES})) ? "1" : "0");
>> +}
>> +
>> + 
>>
>> +#--------------------------------------------------------------------------
>> +
>> +# If in an ALPS job, return the max number of processes we can run.
>> +# Otherwise, return 0.
>> +sub alps_max_procs {
>> +    Debug("&alps_max_procs\n");
>> +
>> +    return "0"
>> +        if (!alps_job());
>> +
>> +#   If we were not running under PBS or some other batch system, we would
>> +#   not have the foggiest idea of how many processes mpirun could spawn.
>> +    my $ret;
>> +    $ret=$ENV{PBS_NNODES};
>> +
>> +    Debug("&alps_max_procs returning: $ret\n");
>> +    return "$ret";
>> +}
>> +
>> + 
>>
>> +#--------------------------------------------------------------------------
>> +
>> +# If in an ALPS job, return the hosts we can run on.  Otherwise, return
>> +# "".
>> +sub alps_hosts {
>> +    Debug("&alps_hosts\n");
>> +
>> +    return ""
>> +        if (!alps_job());
>> +
>> +#   Again, we need a batch system to achieve management; return the uniq'ed
>> +#   contents of $PBS_NODEFILE.  Actually, on the Cray XT, we could return the
>> +#   NIDs allocated by ALPS; but, without launching servers to other service
>> +#   nodes, all communication is via the launching node, and NIDs actually
>> +#   have no persistent resource allocated to the user.  That is, all file
>> +#   resources accessible from a NID are shared with the launching node.
>> +#   And, since ALPS is managed by the batch system, only the launching node
>> +#   can initiate communication with a NID.  In effect, the Cray XT model is
>> +#   that of a single service node with a varying number of compute processors.
>> +    open (FILE, $ENV{PBS_NODEFILE}) || return "";
>> +    my $lines;
>> +    while (<FILE>) {
>> +        chomp;
>> +        $lines->{$_} = 1;
>> +    }
>> +
>> +    my @hosts = sort(keys(%$lines));
>> +    my $hosts = join(",", @hosts);
>> +    Debug("&alps_hosts returning: $hosts\n");
>> +    return "$hosts";
>> +}
>> +
>> + 
>>
>> +#--------------------------------------------------------------------------
>> +
>> # Return "1" if we're running in a PBS job; "0" otherwise.
>> sub pbs_job {
>>    Debug("&pbs_job\n");
>>
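For reference, here is a minimal standalone sketch in plain Perl (not part of MTT, and not how MTT itself invokes these functions) that fakes the batch environment the patch checks for and walks through the same detection, max-procs, and host-uniquing logic. The hostnames and environment values below are made up for illustration.

#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Fake a PBS node file with duplicate entries (PBS writes one line per slot).
my ($fh, $nodefile) = tempfile();
print $fh "nid00012\nnid00012\nnid00013\nnid00013\n";
close($fh);

# Fake the batch environment that alps_job() tests for (values are made up).
$ENV{BATCH_PARTITION_ID} = "1234";    # set by ALPS under a batch job
$ENV{PBS_NNODES}         = "4";       # slots granted by PBS
$ENV{PBS_NODEFILE}       = $nodefile;

# Same test as alps_job() in the patch.
my $in_alps = (exists($ENV{BATCH_PARTITION_ID}) &&
               exists($ENV{PBS_NNODES})) ? "1" : "0";

# Same value alps_max_procs() would return.
my $max_procs = $in_alps ? $ENV{PBS_NNODES} : "0";

# Same uniq-and-join that alps_hosts() performs on the node file.
open(my $nf, "<", $ENV{PBS_NODEFILE}) or die "cannot open node file: $!";
my %seen;
while (<$nf>) {
    chomp;
    $seen{$_} = 1;
}
close($nf);
my $hosts = join(",", sort(keys(%seen)));

print "alps_job       -> $in_alps\n";    # expect 1
print "alps_max_procs -> $max_procs\n";  # expect 4
print "alps_hosts     -> $hosts\n";      # expect nid00012,nid00013

Running it should print the three values shown in the trailing comments, which is a quick way to sanity-check local changes to the detection logic before dropping them into lib/MTT/Values/Functions.pm.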
>>
>>
>>
>> -- 
>> Ken
>>
>
>
> -- 
> Jeff Squyres
> Cisco Systems
>
-- 
Jeff Squyres
Cisco Systems
_______________________________________________
mtt-users mailing list
mtt-users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users