
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 2 to 1 oversubscription
From: Robert Kubrick (robertkubrick_at_[hidden])
Date: 2009-07-15 20:11:59


Jody,

On Linux, you can check which processes are running on which cores in
top, but I don't think the macOS version allows this. The OS *will*
move processes between cores because of the time-sharing nature of
the scheduling algorithm. There are a lot more details online about
what this means, but basically a time-sharing system tries to
distribute CPU time "equally" between processes. In some cases this
translates into reducing the priority of, or migrating, the most
CPU-hungry tasks. Sometimes this is exactly the opposite of how a
real-time or parallel application is supposed to run.
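
For example, one way to watch core placement on Linux (a sketch;
the PSR field is procps behavior, and exact output varies by system):

```shell
# Sketch: checking which core a process last ran on (Linux, procps).
# PSR is the processor the process most recently executed on; in
# interactive top you can show the same field via 'f' -> 'P'.
psr=$(ps -o psr= -p $$ | tr -d ' ')
echo "this shell last ran on core $psr"
```

If you sample this repeatedly while a job runs, you can see the
scheduler migrating the process between cores.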

Anyway, in my view any parallel benchmark should be run under a
real-time scheduling policy with processor affinity set for each
process. On Linux, the two commands used to do this are 'chrt' and
'taskset'.
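
Something along these lines (a sketch only: ./mybench is a
hypothetical binary, and SCHED_FIFO via chrt usually requires root
or CAP_SYS_NICE):

```shell
# Sketch: pinning processes to cores and using real-time scheduling.
# taskset sets CPU affinity; chrt selects the scheduling policy.

# Build an affinity mask covering cores 0-3: bits 0..3 set -> 0xf.
mask=0
for cpu in 0 1 2 3; do
  mask=$(( mask | (1 << cpu) ))
done
printf 'affinity mask: 0x%x\n' "$mask"

# Pin an already-running process (PID 1234) to cores 0-3:
#   taskset -p f 1234
# Or launch pinned to core 0 with SCHED_FIFO priority 50:
#   chrt -f 50 taskset -c 0 ./mybench
```

With mpirun you would apply the same idea per rank, so each tile
stays on its own core instead of drifting with the scheduler.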

On Jul 15, 2009, at 9:17 AM, Klymak Jody wrote:

> Hi Robert,
>
> Sorry if this is off-topic for the more knowledgeable here...
>
> On 14-Jul-09, at 7:50 PM, Robert Kubrick wrote:
>> By setting processor affinity you can force execution of each
>> process on a specific core, thus limiting context switching. I
>> know affinity wasn't supported on MacOS last year, I don't know if
>> the situation has changed.
>> But running oversubscription without process affinity might cancel
>> the benefit of SMT because the OS will try to allocate each
>> process to whatever core becomes available, thus increasing
>> context switching.
>
> This is a little over my head (i.e. SMT?). However, to explain,
> the jobs were a gridded simulation, with the grid divided into 8,
> or 16 'tiles'. Each core gets a tile and passes info to the
> adjacent ones. I would be very surprised to find out that the
> tiles were changing cores mid simulation. Why would the OS do
> something so silly?
>
> The machines were certainly still running other processes to keep
> the operating system going. If you watch the cpu monitor, the
> total would occasionally drop from 100% to 98% as some operating
> system process kicked in, but in general the jobs were pegged,
> leaving little opportunity for one core to decide to take over what
> another core was already doing.
>
> Thanks, and if I'm incorrect about how the jobs get distributed
> between cores, I'd be more than happy to be corrected. As I said,
> my knowledge of this stuff is pretty abstract.
>
> Thanks, Jody
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users