Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Process Migration
From: Ioannis Papadopoulos (giannis.papadopoulos_at_[hidden])
Date: 2011-11-11 14:51:02


Offtopic: You might want to have a look at AMPI:
http://charm.cs.uiuc.edu/research/ampi
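
AMPI virtualizes MPI ranks as migratable user-level threads, so you can
over-decompose (e.g., your 2000 ranks on 200 cores) and let the runtime
migrate ranks for load balance. A minimal sketch of what an AMPI program
looks like; note that MPI_Migrate() is AMPI's extension for handing
control to the load balancer (the exact name varies across AMPI
versions), not standard MPI:

#include <mpi.h>

/* Compiled with ampicc and launched via charmrun, typically with
   many more virtual ranks (+vp) than physical cores. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    for (int iter = 0; iter < 100; iter++) {
        /* ... compute and communicate ... */
        if (iter % 10 == 0)
            MPI_Migrate(); /* AMPI extension, not standard MPI:
                              hands control to the load balancer,
                              which may migrate this rank */
    }
    MPI_Finalize();
    return 0;
}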

On 11/10/2011 10:30 AM, Mudassar Majeed wrote:
> For example, suppose there are 10 nodes and each node contains 20
> cores, so we have 200 cores in total, and say there are 2000 MPI
> processes. We start the application with 10 MPI processes on each
> core. Let Comm(Pi, Pj) denote how much communication Pi and Pj do
> with each other, and suppose each process Pi has to communicate with
> a few other processes Pj, Pk, Pl, Pm, ..., Pz. Further, let Load(Pi)
> denote the computational load of process Pi.
>
> Now, we know that sending a message between two nodes is more
> expensive than sending a message within a node (i.e., between two
> communicating processes that reside on cores of the same node). This
> is true at least at the supercomputing centers I use. In my previous
> work I considered only Load[ ] and not Comm[ ]. In that work, all
> the MPI processes calculate their new ranks and then call
> MPI_Comm_split with key = new_rank and color = 0. So all the
> processes get their new ranks, and then the actual data is provided
> to each process for computation. We found that the total execution
> time decreases. Now we need to consider the communication as well.
> We will balance the computational load, but those MPI processes that
> communicate more will be mapped to the same node (not necessarily
> the same core). I have solved this optimization problem using ILP,
> and that shows good results. But the thing is, in the solution I
> have found that after applying ILP or my heuristic, the cores (on
> all nodes) will no longer hold the same number of MPI processes
> (load and communication are balanced instead of the count of MPI
> processes per core). So this means either I use process migration
> for a few processes, or I run more than 2000 processes (i.e., a few
> extra processes on every core) so that in the end an imbalance in
> the number of MPI processes per core can be achieved (to achieve
> balance in load and communication). I need your suggestions in this
> regard.
>
> thanks and best regards,
> Mudassar
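
A minimal sketch of the MPI_Comm_split remapping described above,
assuming the load-balancing step has already produced a new rank for
each process (compute_new_rank below is a hypothetical placeholder):

#include <mpi.h>
#include <stdio.h>

/* Hypothetical placeholder: returns whatever rank the
   load-balancing step assigns to this process. */
static int compute_new_rank(int old_rank) { return old_rank; }

int main(int argc, char **argv)
{
    int old_rank, new_rank;
    MPI_Comm balanced;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &old_rank);

    /* color = 0 for everyone, so the new communicator contains all
       processes; the key argument orders the new ranks. */
    MPI_Comm_split(MPI_COMM_WORLD, 0, compute_new_rank(old_rank),
                   &balanced);

    MPI_Comm_rank(balanced, &new_rank);
    printf("old rank %d -> new rank %d\n", old_rank, new_rank);

    MPI_Comm_free(&balanced);
    MPI_Finalize();
    return 0;
}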
> ------------------------------------------------------------------------
> From: Josh Hursey <jjhursey_at_[hidden]>
> To: Open MPI Users <users_at_[hidden]>
> Cc: Mudassar Majeed <mudassarm30_at_[hidden]>
> Sent: Thursday, November 10, 2011 5:11 PM
> Subject: Re: [OMPI users] Process Migration
>
> Note that the "migrate me from my current node to node <foo>" scenario
> is covered by the migration API exported by the C/R infrastructure, as
> I noted earlier.
> http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_migrate
>
> The "move rank N to node <foo>" scenario could probably be added as an
> extension of this interface (since you can do that via the command
> line now) if that is what you are looking for.
>
> -- Josh
>
> On Thu, Nov 10, 2011 at 11:03 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> > So what you are looking for is an MPI extension API that lets you
> > say "migrate me from my current node to node <foo>"? Or do you
> > have a rank that is the "master" that would order "move rank N to
> > node <foo>"?
> > Either could be provided, I imagine - just want to ensure I
> > understand what you need. Can you pass along a brief description
> > of the syntax and functionality you would need?
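
Purely as an illustration of what such an extension could look like,
here is a hypothetical sketch; neither function below exists in Open
MPI, and the names and signatures are invented only to make the two
scenarios concrete:

#include <mpi.h>

/* Hypothetical: migrate the calling process to the named node. */
int OMPI_Migrate_self(const char *target_node);

/* Hypothetical, master-driven: move rank 'rank' of 'comm' to the
   named node. */
int OMPI_Migrate_rank(MPI_Comm comm, int rank, const char *target_node);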
> >
> > On Nov 10, 2011, at 8:27 AM, Mudassar Majeed wrote:
> >
> > Thank you for your reply. In our previous publication, we found
> > that running more than one process per core while balancing the
> > computational load considerably reduces the total execution time.
> > Analogous to the MPI_Graph_create function, we created another
> > function, MPI_Load_create, that maps the processes to cores such
> > that the computational load is balanced across the cores. We had
> > some issues with an increase in communication cost due to rank
> > rearrangement (caused by MPI_Comm_split with color = 0), so in
> > this research work we will see how we can balance both the
> > computational load on each core and the communication load on
> > each node. Those processes that communicate more will reside on
> > the same node while keeping the computational load balanced over
> > the cores. I solved this problem using ILP, but ILP takes time
> > and can't be used at run time, so I am thinking about a
> > heuristic. That's why I want to see whether it is possible to
> > migrate a process from one core to another. Then I will see how
> > good my heuristic is.
> >
> > thanks
> > Mudassar
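
As an aside, MPI-2.2 already provides a communication-aware remapping
hook: MPI_Dist_graph_create accepts per-edge weights and a reorder
flag, allowing the library (where it implements reordering) to place
heavily-communicating ranks close together. A minimal sketch, with an
illustrative ring edge and a weight standing in for Comm(Pi, Pj):

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Comm graph_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each rank declares one weighted edge to its right neighbor;
       the weight stands in for Comm(Pi, Pj). */
    int sources[1]      = { rank };
    int degrees[1]      = { 1 };
    int destinations[1] = { (rank + 1) % nprocs };
    int weights[1]      = { 10 };

    /* reorder = 1 lets the implementation renumber ranks so that
       heavily-communicating pairs can land on the same node. */
    MPI_Dist_graph_create(MPI_COMM_WORLD, 1, sources, degrees,
                          destinations, weights, MPI_INFO_NULL, 1,
                          &graph_comm);

    MPI_Comm_free(&graph_comm);
    MPI_Finalize();
    return 0;
}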
> >
> > ________________________________
> > From: Jeff Squyres <jsquyres_at_[hidden]>
> > To: Mudassar Majeed <mudassarm30_at_[hidden]>; Open MPI Users <users_at_[hidden]>
> > Cc: Ralph Castain <rhc_at_[hidden]>
> > Sent: Thursday, November 10, 2011 2:19 PM
> > Subject: Re: [OMPI users] Process Migration
> >
> > On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
> >
> >> Thank you for your reply. I am implementing a load-balancing
> >> function for MPI that will balance both the computational load
> >> and the communication at the same time. So my algorithm assumes
> >> that the cores may, in the end, get different numbers of
> >> processes to run.
> >
> > Are you talking about over-subscribing cores? I.e., putting more
> > than 1 MPI process on each core?
> >
> > In general, that's not a good idea.
> >
> >> In the beginning (before that function is called), each core
> >> will have an equal number of processes. So I am thinking of
> >> starting more processes on each core than needed, running my
> >> function for load balancing, and then blocking the remaining
> >> processes on each core. In this way I will be able to achieve
> >> different numbers of processes per core.
> >
> > Open MPI spins aggressively looking for network progress. For
> > example, if you block in an MPI_RECV waiting for a message, Open
> > MPI is actively banging on the CPU looking for network progress.
> > Because of this (and other reasons), you probably do not want to
> > over-subscribe your processors (meaning: you probably don't want
> > to put more than 1 process per core).
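
A common way to soften that spinning when cores are oversubscribed is
to poll with MPI_Test and yield between polls instead of blocking in
MPI_RECV; Open MPI also has the mpi_yield_when_idle MCA parameter for
oversubscribed runs. A sketch of the polling pattern:

#include <mpi.h>
#include <sched.h> /* sched_yield */

/* Receive that yields the CPU between progress polls, so a
   co-scheduled process on the same core gets a chance to run. */
static void yielding_recv(void *buf, int count, MPI_Datatype type,
                          int src, int tag, MPI_Comm comm)
{
    MPI_Request req;
    int done = 0;

    MPI_Irecv(buf, count, type, src, tag, comm, &req);
    while (!done) {
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        if (!done)
            sched_yield(); /* give up the core instead of spinning */
    }
}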
> >
> > --
> > Jeff Squyres
> > jsquyres_at_[hidden]
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users