PLPA Users' Mailing List Archives

Subject: Re: [PLPA users] Newbie question
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-01-18 21:04:37


On Jan 15, 2008, at 12:31 PM, Hughes, Mike wrote:

> Point 1: "- it doesn't seem like you are checking the return code of
> plpa_sched_setaffinity. You might want to ensure that it's returning
> success." -- I modified the test program to check the return value of
> every call to plpa_sched_setaffinity(); the return value is 0 for all
> ranks.

Bummer. :-(

> Point 2: "- you might want to call plpa_sched_getaffinity and verify
> that the mask you set is the mask that the OS actually uses (it
> *should* be, but...)" -- This is the toughest one to check since I
> don't really understand the plpa_cpu_set_t type. I've modified my test
> program (included below) to have two variables of this type (mycpu,
> mycpu_check). One is set with:
> PLPA_CPU_ZERO(&mycpu);
> The other is set with:
> error_code=plpa_sched_getaffinity(0, sizeof(plpa_cpu_set_t),
> &mycpu_check);
> (This is done after the call to plpa_sched_setaffinity; I've also
> checked that the return value stored in error_code is zero.)
> I can use my debugger (kdbg) to compare the values of the two
> variables, and they appear to be the same.
> (I've included a screen shot of the debugger to illustrate exactly
> what I mean -- I hope.)

The PLPA_CPU_* macros are similar to the FD_* macros (see select(2)).
So what I meant was something like this (typed off the top of my head;
not verified):

   for (i = 0; i < PLPA_BITMASK_CPU_MAX; ++i) {
     printf("cpu %d is %d\n", i, PLPA_CPU_ISSET(i, &mycpu));
   }

Or, it would probably be better to compare, CPU by CPU, the
PLPA_CPU_ISSET value of the mask you passed to setaffinity against that
of the mask you got back from getaffinity...?
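
To make that comparison concrete, here is a minimal sketch. It uses the
glibc CPU_* macros and sched_setaffinity(2)/sched_getaffinity(2) as
stand-ins so that it builds without PLPA; the assumption is that the
PLPA_CPU_* macros and plpa_sched_* calls in the test program behave the
same way. The helper names first_allowed_cpu and pin_and_verify are made
up for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes the CPU_* macros; g++ usually defines it already */
#endif
#include <sched.h>
#include <cassert>
#include <cstdio>

// Hypothetical helper: lowest CPU ID in the caller's current affinity
// mask, or -1 if the mask cannot be read or is empty.
int first_allowed_cpu()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;
    }
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &mask)) {
            return i;
        }
    }
    return -1;
}

// Hypothetical helper: pin the caller to `cpu`, read the mask back, and
// compare it bit by bit with the requested mask. Returns 0 if the masks
// match, 1 if they differ, and -1 if either system call fails.
int pin_and_verify(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    cpu_set_t wanted, actual;
    CPU_ZERO(&wanted);
    CPU_SET(cpu, &wanted);
    if (sched_setaffinity(0, sizeof(wanted), &wanted) != 0) {
        std::perror("sched_setaffinity");
        return -1;
    }

    CPU_ZERO(&actual);
    if (sched_getaffinity(0, sizeof(actual), &actual) != 0) {
        std::perror("sched_getaffinity");
        return -1;
    }

    int mismatches = 0;
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        const bool w = CPU_ISSET(i, &wanted) != 0;
        const bool a = CPU_ISSET(i, &actual) != 0;
        if (w != a) {
            std::printf("cpu %d: requested %d, got %d\n", i, w, a);
            ++mismatches;
        }
    }
    return mismatches == 0 ? 0 : 1;
}
```

In the test program, the equivalent would be to call something like
pin_and_verify(myrank) once per rank and print the result, instead of
eyeballing the two masks in a debugger.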

I wonder if your virtual processor IDs are not contiguous, perhaps...?
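
One way to check for gaps, sketched with the glibc affinity calls rather
than PLPA's own API (that substitution is an assumption): list the CPU
IDs in the process's starting affinity mask, then bind rank N to the
N-th ID in that list rather than to ID N itself. The helper names
allowed_cpus, is_contiguous_from_zero, and cpu_for_rank are
hypothetical.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes the CPU_* macros; g++ usually defines it already */
#endif
#include <sched.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: CPU IDs present in the caller's current affinity
// mask, in ascending order. Empty if the mask cannot be read.
std::vector<int> allowed_cpus()
{
    std::vector<int> ids;
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return ids;
    }
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &mask)) {
            ids.push_back(i);
        }
    }
    return ids;
}

// True if the IDs are exactly 0, 1, ..., n-1 with no gaps.
bool is_contiguous_from_zero(const std::vector<int>& ids)
{
    for (std::size_t k = 0; k < ids.size(); ++k) {
        if (ids[k] != static_cast<int>(k)) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: CPU ID to use for an MPI rank, wrapping around
// when there are more ranks than CPUs.
int cpu_for_rank(int rank, const std::vector<int>& ids)
{
    assert(rank >= 0 && !ids.empty());
    return ids[static_cast<std::size_t>(rank) % ids.size()];
}
```

With IDs {0, 2, 4, 6}, for example, rank 1 would be bound to CPU 2
instead of a CPU 1 that the OS does not expose. allowed_cpus() needs to
run before any setaffinity call, since pinning shrinks the mask it
reads.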

> Point 3: "also make sure that there aren't other processes consuming
> your cores -- e.g., try running the same test on 6 cores instead of 8
> to let other OS/daemons have 2 free cores without trashing the execute
> times of your MPI processes." -- I modified my run to use 6
> processors; execution time still varies by over a factor of two.

Bummer.

>
> Is there anything else I should check?
>
> Thanks for your help.
>
> Regards,
>
> Mike Hughes
>
>
> New test program.....
> #include <string>
> #include <iostream>
>
> #include <math.h>
> #include <time.h>
> #include <mpi.h>
>
> #ifdef __cplusplus
> extern "C"
> {
> #include <plpa.h>
> }
> #else
> #include <plpa.h>
> #endif
> const int DUMMYTAG=1;
>
> //I usually use my favorite debugger for whatever platform I'm on (I use
> //Sun's workshop debugger on Solaris) and put in this type of code,
> //usually pretty soon after MPI_Init():
> static void wait_for_debugger(void)
> {
> int error_code;
> int rank;
>
> error_code=MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> if (rank == 0)
> {
> printf("Waiting for debugger attachment. Please hit enter.\n");
> getchar();
> }
>
> error_code=MPI_Barrier(MPI_COMM_WORLD);
> }
>
> /*****************************************************************************/
> // compilation: mpic++ plpa_use.cpp -lplpa -o plpa_use
> // Invocation: (assuming you want eight processes on eight processors)
> // mpirun -np <numberOfProcessors> plpa_use <limit>
> // will compute sin(3.14159) <limit> times on <numberOfProcessors> processors
> // typical invocation: mpirun -np 8 plpa_use 10000000
> /*****************************************************************************/
> int main(int argc, char *argv[]);
>
> int main(int argc, char *argv[])
> {
> int ntasks,error_code, myrank,rank,limit,i;
> double z;
> clock_t startTime, stopTime;
> MPI_Status status;
> plpa_cpu_set_t mycpu,mycpu_check;
>
> MPI_Init(&argc, &argv);
> limit = atoi(argv[1]);
> error_code=MPI_Errhandler_set(MPI_COMM_WORLD,MPI_ERRORS_RETURN);
> error_code=MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>
> wait_for_debugger();
>
> //USE PLPA To lock each rank onto different processors-----------------
> PLPA_CPU_ZERO(&mycpu);
> PLPA_CPU_SET(myrank,&mycpu);
> //set first argument to zero to use calling process pid
> error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
> std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_setaffinity returned: " << error_code <<std::endl;
> //set first argument to zero to use calling process pid
> error_code=plpa_sched_getaffinity(0, sizeof(plpa_cpu_set_t), &mycpu_check);
> std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_getaffinity returned: " << error_code <<std::endl;
> if(1)
> std::cout << "\tProcess of rank:"<< myrank << " plpa_sched_setaffinity seems to have worked" <<std::endl;
> if(myrank==0)
> {
>
> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>
> if (PLPA_PROBE_OK == plpa_api_probe())
> {
> std::cout << "PLP is working on processor for rank 0"
> << std::endl;
> }
>
> startTime = clock ();
> for(rank=1;rank<ntasks;rank++)
> {
> std::cout << "Processor rank=0, Sent limit="<<limit<<" To Processor: "<< rank << std::endl;
> error_code=MPI_Send(&limit,1,MPI_INT,rank,DUMMYTAG,MPI_COMM_WORLD);
> }
> for(rank=1;rank<ntasks;rank++)//Get back results in order
> {
> error_code=MPI_Recv(&rank,1,MPI_INT,rank,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
> std::cout << "\tProcess of rank:"<< rank << " Completed execution" << std::endl;
> }
> stopTime = clock ();
> std::cout << "MANAGER::main: Execution Time: " << ((double)(stopTime-startTime))/((double)CLOCKS_PER_SEC) << " sec" << std::endl;
> }
> else
> {
> error_code=MPI_Recv(&limit,1,MPI_INT,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
> for(i=0;i<limit;i++)
> z=sin(3.14159);
> error_code=MPI_Send(&myrank,1,MPI_INT,0,DUMMYTAG,MPI_COMM_WORLD);
> }
> error_code=MPI_Finalize(); /* cleanup MPI */
>
> return 0;
> }
>
> -----Original Message-----
> From: plpa-users-bounces_at_[hidden]
> [mailto:plpa-users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
> Sent: Monday, January 14, 2008 10:13 PM
> To: PLPA users list
> Subject: Re: [PLPA users] Newbie question
>
> One would hope that OMPI's affinity_alone param should do what it is
> supposed to do, but perhaps it's not -- your explicit use of PLPA is a
> good way to test it.
>
> There could be a few things occurring here:
>
> - it doesn't seem like you are checking the return code of
> plpa_sched_setaffinity. You might want to ensure that it's returning
> success.
>
> - you might want to call plpa_sched_getaffinity and verify that the
> mask you set is the mask that the OS actually uses (it *should* be,
> but...)
>
> - also make sure that there aren't other processes consuming your
> cores -- e.g., try running the same test on 6 cores instead of 8 to
> let other OS/daemons have 2 free cores without trashing the execute
> times of your MPI processes.
>
> On Jan 12, 2008, at 10:16 PM, Hughes, Mike wrote:
>
>> I am trying to use the plpa library to lock each rank of an MPI
>> program onto different processors. I have eight cores on my computer.
>> I've written the test program below, but I am not sure it is actually
>> doing what I intended (assigning a process to one and only one core)
>> since the execution times of the program vary by up to a factor of
>> two.
>>
>> (I used to have an eight-core workstation built using an Intel
>> motherboard and I could run MPI programs on it with the command:
>> mpirun --mca mpi_paffinity_alone 1 -np 8 <ProgramName> and execution
>> times varied by only 0.01 seconds out of ~48 sec. I'm now using a
>> Supermicro-based workstation running Ubuntu Gutsy, and --mca
>> mpi_paffinity_alone 1 results in execution times varying by over a
>> factor of two. In addition, total execution time has more than doubled
>> compared to the Intel-based system running the same version of Linux.
>> The Open MPI FAQ seems to suggest that this may be due to processor
>> affinity not working.)
>>
>> The relevant lines in the code below are:
>>
>> //USE PLPA To lock each rank onto different processors----------------
>> PLPA_CPU_ZERO(&mycpu);
>> PLPA_CPU_SET(myrank,&mycpu);
>> //set first argument to zero to use calling process pid
>> error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
>>
>> Is this the correct way to use the library?
>>
>> BTW the results of plpa_info are:
>> [msh_at_hugherNaught] $ plpa_info
>> PLPA_PROBE_OK
>> [msh_at_hugherNaught] $
>>
>> Regards,
>>
>> msh
>>
>> Test program listing....
>> #include <string>
>> #include <iostream>
>>
>> #include <math.h>
>> #include <time.h>
>> #include <mpi.h>
>>
>> #ifdef __cplusplus
>> extern "C"
>> {
>> #include <plpa.h>
>> }
>> #else
>> #include <plpa.h>
>> #endif
>> const int DUMMYTAG=1;
>>
>> //I usually use my favorite debugger for whatever platform I'm on (I use
>> //Sun's workshop debugger on Solaris) and put in this type of code,
>> //usually pretty soon after MPI_Init():
>> static void wait_for_debugger(void)
>> {
>> int error_code;
>> int rank;
>>
>> error_code=MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>> if (rank == 0)
>> {
>> printf("Waiting for debugger attachment. Please hit enter.\n");
>> getchar();
>> }
>>
>> error_code=MPI_Barrier(MPI_COMM_WORLD);
>> }
>>
>> /****************************************************************************/
>> // compilation: mpic++ plpa_use.cpp -lplpa -o plpa_use
>> // Invocation: (assuming you want eight processes on eight processors)
>> // mpirun -np <numberOfProcessors> plpa_use <limit>
>> // will compute sin(3.14159) <limit> times on <numberOfProcessors> processors
>> // typical invocation: mpirun -np 8 plpa_use 10000000
>> /****************************************************************************/
>> int main(int argc, char *argv[]);
>>
>> int main(int argc, char *argv[])
>> {
>> int ntasks,error_code, myrank,rank,limit,i;
>> double z;
>> clock_t startTime, stopTime;
>> MPI_Status status;
>> plpa_cpu_set_t mycpu;
>>
>> MPI_Init(&argc, &argv);
>> limit = atoi(argv[1]);
>>
>> error_code=MPI_Errhandler_set(MPI_COMM_WORLD,MPI_ERRORS_RETURN);
>> error_code=MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
>> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>>
>> wait_for_debugger();
>>
>> //USE PLPA To lock each rank onto different processors----------------
>> PLPA_CPU_ZERO(&mycpu);
>> PLPA_CPU_SET(myrank,&mycpu);
>> //set first argument to zero to use calling process pid
>> error_code=plpa_sched_setaffinity(0, sizeof(plpa_cpu_set_t), &mycpu);
>>
>> if(myrank==0)
>> {
>> error_code=MPI_Comm_size(MPI_COMM_WORLD,&ntasks);
>>
>> if (PLPA_PROBE_OK == plpa_api_probe())
>> {
>> std::cout << "PLP is working on processor for rank 0" << std::endl;
>> }
>>
>> startTime = clock ();
>> for(rank=1;rank<ntasks;rank++)
>> {
>> std::cout << "Processor rank=0, Sent limit="<<limit<<" To Processor: "<< rank << std::endl;
>> error_code=MPI_Send(&limit,1,MPI_INT,rank,DUMMYTAG,MPI_COMM_WORLD);
>> }
>> for(rank=1;rank<ntasks;rank++)//Get back results in order
>> {
>> error_code=MPI_Recv(&rank,1,MPI_INT,rank,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
>> std::cout << "\tProcess of rank:"<< rank << " Completed execution" << std::endl;
>> }
>> stopTime = clock ();
>> std::cout << "MANAGER::main: Execution Time: " << ((double)(stopTime-startTime))/((double)CLOCKS_PER_SEC) << " sec" << std::endl;
>> }
>> else
>> {
>> error_code=MPI_Recv(&limit,1,MPI_INT,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
>> for(i=0;i<limit;i++)
>> z=sin(3.14159);
>> error_code=MPI_Send(&myrank,1,MPI_INT,0,DUMMYTAG,MPI_COMM_WORLD);
>> }
>> error_code=MPI_Finalize(); /* cleanup MPI */
>>
>> return 0;
>> }
>>
>> _______________________________________________
>> plpa-users mailing list
>> plpa-users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/plpa-users
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> <gkrellShoot_01-15-08_111754.jpg><mime-attachment.txt>

-- 
Jeff Squyres
Cisco Systems