PLPA Users' Mailing List Archives

Subject: Re: [PLPA users] Newbie question
From: Hughes, Mike (msh_at_[hidden])
Date: 2008-01-15 12:31:42


Point 1: "- it doesn't seem like you are checking the return code of
plpa_sched_setaffinity. You might want to ensure that it's returning
success." -- I modified the test program to check the return value of every
call to plpa_sched_setaffinity(); it is 0 for all ranks.

Point 2: "- you might want to call plpa_sched_getaffinity and verify that the
mask you set is the mask that the OS actually uses (it *should* be,
but...)" -- This is the toughest one to check, since I don't really understand
the plpa_cpu_set_t type. I've modified my test program (included below) to
have two variables of this type (mycpu, mycpu_check). One is set with the
function:
        PLPA_CPU_ZERO(&mycpu);
The other is set with:
        error_code=plpa_sched_getaffinity(0, sizeof(plpa_cpu_set_t), &mycpu_check);
(This is done after the call to plpa_sched_setaffinity; I've also checked
that the return value stored in error_code is zero.)
I can use my debugger (kdbg) to compare the values of the two variables, and
they appear to be the same.
(I've included a screenshot of the debugger to illustrate exactly what I
mean -- I hope.)
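
The mask comparison can also be done in code instead of in the debugger. The sketch below is only an illustration under stated assumptions: it calls the Linux/glibc affinity functions (sched_setaffinity, sched_getaffinity and the CPU_* macros) directly rather than going through PLPA, and the function name pin_and_verify is hypothetical. It pins the calling process to the first CPU it is currently allowed to run on, reads the mask back, and compares the two with CPU_EQUAL. In the MPI test program the target CPU would be myrank instead.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // needed for the CPU_* macros with some compilers
#endif
#include <sched.h>    // sched_setaffinity, sched_getaffinity, cpu_set_t (Linux/glibc only)

// Hypothetical helper, not part of PLPA or Open MPI.
// Pins the calling process to one CPU and checks that the kernel
// reports exactly the same mask back. Returns true on a match.
bool pin_and_verify()
{
    // Ask which CPUs this process may currently use.
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return false;

    // Pick the first allowed CPU (an MPI program would use its rank here).
    int target = -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            target = cpu;
            break;
        }
    }
    if (target < 0)
        return false;

    // Build a mask with only that CPU and apply it (pid 0 = calling process).
    cpu_set_t wanted;
    CPU_ZERO(&wanted);
    CPU_SET(target, &wanted);
    if (sched_setaffinity(0, sizeof(wanted), &wanted) != 0)
        return false;

    // Read the mask back and compare it bit for bit with what was requested.
    cpu_set_t actual;
    CPU_ZERO(&actual);
    if (sched_getaffinity(0, sizeof(actual), &actual) != 0)
        return false;

    return CPU_EQUAL(&wanted, &actual) != 0;
}
```

Each rank could call such a check right after setting its affinity and print the result next to its rank number, which avoids having to interpret the raw contents of the mask type by eye.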

Point 3: "also make sure that there aren't other processes consuming your
cores -- e.g., try running the same test on 6 cores instead of 8 to let other
OS/daemons have 2 free cores without trashing the execute times of your MPI
processes." -- I modified my execution to use 6 processors; execution time
still varies by over a factor of two.
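
One further thing that may be worth ruling out is the measurement itself. The test program times the run with clock(), which reports processor time used by the calling process rather than elapsed time, and its sin() loop never uses the result, so an optimizing compiler is free to drop it. The sketch below is a hedged illustration only: the names Timing and time_sin_loop are hypothetical, and it is not claimed to explain the variation seen here. It records both CPU time and wall-clock time around the same kind of loop, so that the two can be compared from run to run.

```cpp
#include <chrono>   // std::chrono::steady_clock for elapsed (wall-clock) time
#include <cmath>    // std::sin
#include <ctime>    // std::clock for processor time

// Hypothetical result type: both time measurements plus a checksum
// that keeps the loop's work observable.
struct Timing {
    double cpu_sec;
    double wall_sec;
    double checksum;
};

// Hypothetical helper: runs the sin() loop 'limit' times and reports
// how much CPU time and how much wall-clock time it took.
Timing time_sin_loop(long limit)
{
    // Reading the argument through a volatile stops the compiler from
    // folding sin(3.14159) into a constant at compile time.
    volatile double x = 3.14159;
    double sum = 0.0;

    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    for (long i = 0; i < limit; ++i)
        sum += std::sin(x);

    const std::clock_t cpu_stop = std::clock();
    const auto wall_stop = std::chrono::steady_clock::now();

    const double cpu = static_cast<double>(cpu_stop - cpu_start) / CLOCKS_PER_SEC;
    const double wall = std::chrono::duration<double>(wall_stop - wall_start).count();
    return {cpu, wall, sum};
}
```

If the wall-clock figure for a rank is much larger than its CPU figure, the process spent time not running, which would point toward scheduling or contention rather than the computation itself.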

Is there anything else I should check?

Thanks for your help.

Regards,

Mike Hughes

New test program.....
#include <string>
#include <iostream>

#include <stdio.h>   //printf, getchar
#include <stdlib.h>  //atoi
#include <math.h>
#include <time.h>
#include <mpi.h>

#ifdef __cplusplus
extern "C"
{
#include <plpa.h>
}
#else
#include <plpa.h>
#endif
const int DUMMYTAG=1;

//I usually use my favorite debugger for whatever platform I'm on (I use
//Sun's workshop debugger on Solaris) and put in this type of code, usually
//pretty soon after MPI_Init():
static void wait_for_debugger(void)
{
        int error_code;
        int rank;

        error_code=MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
        {
                printf("Waiting for debugger attachment. Please hit enter.\n");
                getchar();
        }

        error_code=MPI_Barrier(MPI_COMM_WORLD);
}

/*****************************************************************************************/
// compilation: mpic++ plpa_use.cpp -lplpa -o plpa_use
// Invocation: (assuming you want eight processes on eight processors)
// mpirun -np <numberOfProcessors> plpa_use <limit>
// will compute sin(3.14159) <limit> times on <numberOfProcessors> processors
// typical invocation: mpirun -np 8 plpa_use 10000000
/*****************************************************************************************/
int main(int argc, char *argv[]);

int main(int argc, char *argv[])
{
        int ntasks,error_code, myrank,rank,limit,i;
        double z;
        clock_t startTime, stopTime;
        MPI_Status status;
        plpa_cpu_set_t mycpu,mycpu_check;

        MPI_Init(&argc, &argv);
        limit = atoi(argv[1]);
        error_code=MPI_Errhandler_set(MPI_COMM_WORLD,MPI_ERRORS_RETURN);
        error_code=MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
        error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);

        wait_for_debugger();

        //USE PLPA To lock each rank onto different processors--------------------------------------------------
        PLPA_CPU_ZERO(&mycpu);
        PLPA_CPU_SET(myrank,&mycpu);
        //set first argument to zero to use calling process pid
        error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
        std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_setaffinity returned: " << error_code <<std::endl;
        //set first argument to zero to use calling process pid
        error_code=plpa_sched_getaffinity(0, sizeof(plpa_cpu_set_t), &mycpu_check);
        std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_getaffinity returned: " << error_code <<std::endl;
        if(1) //placeholder: mycpu and mycpu_check were compared by hand in the debugger
                std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_setaffinity seems to have worked" <<std::endl;
        if(myrank==0)
        {

                error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);

                if (PLPA_PROBE_OK == plpa_api_probe())
                {
                        std::cout << "PLPA is working on processor for rank 0" << std::endl;
                }

                startTime = clock ();
                for(rank=1;rank<ntasks;rank++)
                {
                        std::cout << "Processor rank=0, Sent limit="<<limit<<" To Processor: "<< rank << std::endl;
                        //limit is an int, so the MPI datatype must be MPI_INT (was MPI_LONG)
                        error_code=MPI_Send(&limit,1,MPI_INT,rank,DUMMYTAG,MPI_COMM_WORLD);
                }
                for(rank=1;rank<ntasks;rank++)//Get back results in order
                {
                        error_code=MPI_Recv(&rank,1,MPI_INT,rank,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
                        std::cout << "\tProcess of rank:"<< rank << " Completed execution" << std::endl;
                }
                stopTime = clock ();
                std::cout << "MANAGER::main: Execution Time: " << ((double)(stopTime-startTime))/((double)CLOCKS_PER_SEC) << " sec" << std::endl;
        }
        else
        {
                error_code=MPI_Recv(&limit,1,MPI_INT,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
                for(i=0;i<limit;i++)
                        z=sin(3.14159);
                error_code=MPI_Send(&myrank,1,MPI_INT,0,DUMMYTAG,MPI_COMM_WORLD);
        }
        error_code=MPI_Finalize(); /* cleanup MPI */

        return 0;
}

-----Original Message-----
From: plpa-users-bounces_at_[hidden]
[mailto:plpa-users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: Monday, January 14, 2008 10:13 PM
To: PLPA users list
Subject: Re: [PLPA users] Newbie question

One would hope that OMPI's affinity_alone param should do what it is supposed
to do, but perhaps it's not -- your explicit use of PLPA is a good way to
test it.

There could be a few things occurring here:

- it doesn't seem like you are checking the return code of
plpa_sched_setaffinity. You might want to ensure that it's returning
success.

- you might want to call plpa_sched_getaffinity and verify that the mask you
set is the mask that the OS actually uses (it *should* be,
but...)

- also make sure that there aren't other processes consuming your cores --
e.g., try running the same test on 6 cores instead of 8 to let other
OS/daemons have 2 free cores without trashing the execute times of your MPI
processes.

On Jan 12, 2008, at 10:16 PM, Hughes, Mike wrote:

> I am trying to use the plpa library to lock each rank of an MPI
> program onto different processors. I have eight cores on my computer.
> I've written the test program below, but I am not sure it is actually
> doing what I intended (assigning a process to one and only one core)
> since the execution times of the program vary by up to a factor of two.
>
> (I used to have an eight-core workstation built using an Intel
> motherboard and I could run MPI programs on it with the command:
> mpirun --mca mpi_paffinity_alone 1 -np 8 <ProgramName> and execution
> times varied by only
> 0.01 seconds out of ~48 sec. I'm now using a Supermicro-based
> workstation running Ubuntu Gutsy and the --mca mpi_paffinity_alone 1
> results in execution times varying by over a factor of two. In
> addition, total execution time has more than doubled compared to the
> Intel-based system running the same version of Linux. The Open MPI FAQ
> seems to suggest that this may be due to processor affinity not working.)
>
> The relevant lines in the code below are:
>
> //USE PLPA To lock each rank onto different processors---------------------------------
> PLPA_CPU_ZERO(&mycpu);
> PLPA_CPU_SET(myrank,&mycpu);
> error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
> //set first argument to zero to use calling process pid
>
> Is this the correct way to use the library?
>
> BTW the results of plpa_info are:
> [msh_at_hugherNaught] $ plpa_info
> PLPA_PROBE_OK
> [msh_at_hugherNaught] $
>
> Regards,
>
> msh
>
> Test program listing....
> #include <string>
> #include <iostream>
>
> #include <math.h>
> #include <time.h>
> #include <mpi.h>
>
> #ifdef __cplusplus
> extern "C"
> {
> #include <plpa.h>
> }
> #else
> #include <plpa.h>
> #endif
> const int DUMMYTAG=1;
>
> //I usually use my favorite debugger for whatever platform I'm on (I use
> //Sun's workshop debugger on Solaris) and put in this type of code, usually
> //pretty soon after MPI_Init():
> static void wait_for_debugger(void)
> {
> int error_code;
> int rank;
>
> error_code=MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> if (rank == 0)
> {
> printf("Waiting for debugger attachment. Please hit enter.\n");
> getchar();
> }
>
> error_code=MPI_Barrier(MPI_COMM_WORLD);
> }
>
> /*****************************************************************************************/
> // compilation: mpic++ plpa_use.cpp -lplpa -o plpa_use
> // Invocation: (assuming you want eight processes on eight processors)
> // mpirun -np <numberOfProcessors> plpa_use <limit>
> // will compute sin(3.14159) <limit> times on <numberOfProcessors> processors
> // typical invocation: mpirun -np 8 plpa_use 10000000
> /*****************************************************************************************/
> int main(int argc, char *argv[]);
>
> int main(int argc, char *argv[])
> {
> int ntasks,error_code, myrank,rank,limit,i;
> double z;
> clock_t startTime, stopTime;
> MPI_Status status;
> plpa_cpu_set_t mycpu;
>
> MPI_Init(&argc, &argv);
> limit = atoi(argv[1]);
>
> error_code=MPI_Errhandler_set(MPI_COMM_WORLD,MPI_ERRORS_RETURN);
> error_code=MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>
> wait_for_debugger();
>
> //USE PLPA To lock each rank onto different processors---------------------------------
> PLPA_CPU_ZERO(&mycpu);
> PLPA_CPU_SET(myrank,&mycpu);
> error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
> //set first argument to zero to use calling process pid
>
> if(myrank==0)
> {
> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>
> if (PLPA_PROBE_OK == plpa_api_probe())
> {
> std::cout << "PLP is working on processor for rank 0" << std::endl;
> }
>
> startTime = clock ();
> for(rank=1;rank<ntasks;rank++)
> {
> std::cout << "Processor rank=0, Sent limit="<<limit<<" To Processor: "<< rank << std::endl;
> error_code=MPI_Send(&limit,1,MPI_LONG,rank,DUMMYTAG,MPI_COMM_WORLD);
> }
> for(rank=1;rank<ntasks;rank++)//Get back results in order
> {
> error_code=MPI_Recv(&rank,1,MPI_LONG,rank,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
> std::cout << "\tProcess of rank:"<< rank << " Completed execution" << std::endl;
> }
> stopTime = clock ();
> std::cout << "MANAGER::main: Execution Time: " << ((double)(stopTime-startTime))/((double)CLOCKS_PER_SEC) << " sec" << std::endl;
> }
> else
> {
> error_code=MPI_Recv(&limit,1,MPI_LONG,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
> for(i=0;i<limit;i++)
> z=sin(3.14159);
> error_code=MPI_Send(&myrank,1,MPI_LONG,0,DUMMYTAG,MPI_COMM_WORLD);
> }
> error_code=MPI_Finalize(); /* cleanup MPI */
>
> return 0;
> }
>
> _______________________________________________
> plpa-users mailing list
> plpa-users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/plpa-users

--
Jeff Squyres
Cisco Systems


gkrellShoot_01-15-08_111754.jpg