Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Change in OPAL / OMPI DPM system time during MPI_INIT
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2010-11-22 11:35:49


Um, the counter is initialized to one.

Brian

On Nov 22, 2010, at 9:32 AM, Jeff Squyres wrote:

> A user noticed a specific change that we made between 1.4.2 and 1.4.3:
>
> https://svn.open-mpi.org/trac/ompi/changeset/23448
>
> which is from CMR https://svn.open-mpi.org/trac/ompi/ticket/2489, and originally from trunk https://svn.open-mpi.org/trac/ompi/changeset/23434. I removed the opal_progress_event_users_decrement() from ompi_mpi_init() because the ORTE DPM does its own _increment() and _decrement().
>
> However, it seems that there was an unintended consequence of this -- look at the annotated Ganglia graph that the user sent (see attached). In 1.4.2, all of the idle time was "user" CPU usage. In 1.4.3, it's split between user and system CPU usage. The application that he used to test is basically an init / finalize test (with some additional MPI middleware). See:
>
> http://www.open-mpi.org/community/lists/users/2010/11/14773.php
>
> Can anyone think of why this occurs, and/or if it's a Bad Thing?
>
> If removing this decrement enabled a bunch more system CPU time, that would seem to imply that we're calling libevent more frequently than we used to (vs. polling the opal event callbacks), and therefore that there might now be an unmatched increment somewhere.
>
> Right...?
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> <openmpi143.jpeg><ATT00002..txt>

-- 
  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories