Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] users Digest, Vol 2064, Issue 3
From: Mudassar Majeed (mudassarm30_at_[hidden])
Date: 2011-11-10 14:22:55

I have mentioned twice that we got good results by doing this, and that the question was only about process migration. I have explained the current work as well.

regards,
Mudassar

________________________________
From: "users-request_at_[hidden]" <users-request_at_[hidden]>
To: users_at_[hidden]
Sent: Thursday, November 10, 2011 7:48 PM
Subject: users Digest, Vol 2064, Issue 3

Today's Topics:

  1. Re: Process Migration (Jeff Squyres)
  2. Re: Problems compiling and running openmpi-1.4.4 (amosleff_at_[hidden])

----------------------------------------------------------------------

Message: 1
Date: Thu, 10 Nov 2011 12:59:17 -0500
From: Jeff Squyres <jsquyres_at_[hidden]>
Subject: Re: [OMPI users] Process Migration
To: Open MPI Users <users_at_[hidden]>
Message-ID: <811FFDFC-C3B6-4BF7-9E53-95C0B572FD74_at_[hidden]>
Content-Type: text/plain; charset=us-ascii

On Nov 10, 2011, at 11:30 AM, Mudassar Majeed wrote:

> For example, there are 10 nodes, and each node contains 20 cores. We will have 200 cores in total, and let's say there are 2000 MPI processes. We start the application with 10 MPI processes on each core.

Is this just to be able to simulate very large MPI jobs, or are you thinking that people will actually run that way (heavily over-subscribing cores)?

> Let's say Comm(Pi, Pj) denotes how much communication Pi and Pj make with each other, and let's say each process Pi has to communicate with a few other processes Pj, Pk, Pl, Pm, ..., Pz. Secondly, let's say Load(Pi) denotes the computational load of process Pi.
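
The Comm(Pi, Pj) and Load(Pi) quantities in the quoted text can be made concrete with a small Python sketch. Everything in it is an illustrative assumption rather than the poster's code: the function name evaluate_mapping, the dictionary layouts, and the toy numbers are all hypothetical. It only scores a given placement by its inter-node traffic and per-core load; it does not implement the ILP or any heuristic.

```python
from collections import defaultdict


def evaluate_mapping(comm, load, core_of, node_of_core):
    """Score a process-to-core placement (hypothetical helper).

    comm[(i, j)]    -- assumed traffic volume between ranks i and j
    load[i]         -- assumed computational load of rank i
    core_of[i]      -- core that rank i is placed on
    node_of_core[c] -- node that hosts core c

    Returns (inter_node_traffic, per_core_load).
    """
    inter_node = 0
    for (i, j), volume in comm.items():
        # Only traffic that crosses a node boundary is counted as expensive.
        if node_of_core[core_of[i]] != node_of_core[core_of[j]]:
            inter_node += volume

    per_core = defaultdict(int)
    for rank, core in core_of.items():
        per_core[core] += load[rank]
    return inter_node, dict(per_core)


# Toy setup (made-up numbers): 2 nodes with 2 cores each, 5 ranks.
node_of_core = {0: 0, 1: 0, 2: 1, 3: 1}
load = {0: 2, 1: 1, 2: 1, 3: 2, 4: 2}
comm = {(0, 1): 10, (1, 2): 8, (3, 4): 6, (0, 3): 1}

# Round-robin placement: ranks are dealt out to cores in order.
round_robin = {0: 0, 1: 1, 2: 2, 3: 3, 4: 0}

# Placement that keeps heavy communicators on the same node.
packed = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}

print(evaluate_mapping(comm, load, round_robin, node_of_core))
print(evaluate_mapping(comm, load, packed, node_of_core))
```

In this toy case the packed placement lowers the inter-node traffic from 15 to 1 and evens out the per-core load, at the price of putting two ranks on one core while the other cores hold one each.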

Depending on how you define Load(Pi), this really only matters if you're over-subscribing processors. Meaning: if you have only one MPI process per processor core, then Load(Pi) is probably irrelevant (excluding other effects, like cache thrashing, memory and PCI bandwidth usage, etc.). Right?

> Now, we know that sending a message between two nodes is more expensive than sending a message within a node (where the two communicating processes reside on cores of the same node). This is true at least in the supercomputing centers that I use. In my previous work I considered only Load[ ] and not Comm[ ]. In that work, all the MPI processes calculate their new ranks and then call MPI_Comm_split with key = new_rank and color = 0. So all the processes get their new ranks, and then the actual data is provided to each process for computation. We have found that the total execution time decreases.

In an oversubscribed case, I'm still not sure how this works. Do you have some MPI processes doing work and some not? (e.g., blocking in sleep() or something)

I think the reason for my confusion is that MPI processes are generally designed to run 1 per core (or perhaps 1 MPI process per more-than-1-core, if the MPI process is multi-threaded). MPI processes are generally assumed to aggressively use the entire computational resource that is given to them -- sharing computational resources (e.g., cores) between multiple MPI processes would seem to violate that assumption, and therefore result in bad overall performance.

I feel like I must be missing something in what you're trying to describe...

> Now we need to consider the communications as well. We will balance the computational load, but those MPI processes which communicate more will be mapped to the same node (not necessarily the same cores). I have solved this optimization problem using ILP and that shows good results.
> But the thing is, in the solution I have found that after applying ILP or my heuristic, the cores (on all nodes) will no longer contain the same number of MPI processes (load and communications are balanced instead of the count of MPI processes per core). So this means either I use process migration for a few processes, or I run more than 2000 processes (meaning that on every core I run a few more processes) so that in the end an imbalance in the number of MPI processes per core can be achieved (to achieve balance in load and communications). I need your suggestions in these regards,
>
> thanks and best regards,
> Mudassar
>
> From: Josh Hursey <jjhursey_at_[hidden]>
> To: Open MPI Users <users_at_[hidden]>
> Cc: Mudassar Majeed <mudassarm30_at_[hidden]>
> Sent: Thursday, November 10, 2011 5:11 PM
> Subject: Re: [OMPI users] Process Migration
>
> Note that the "migrate me from my current node to node <foo>" scenario is covered by the migration API exported by the C/R infrastructure, as I noted earlier.
>
> The "move rank N to node <foo>" scenario could probably be added as an extension of this interface (since you can do that via the command line now) if that is what you are looking for.
>
> -- Josh
>
> On Thu, Nov 10, 2011 at 11:03 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> > So what you are looking for is an MPI extension API that lets you say "migrate me from my current node to node <foo>"? Or do you have a rank that is the "master" that would order "move rank N to node <foo>"?
> > Either could be provided, I imagine - just want to ensure I understand what you need. Can you pass along a brief description of the syntax and functionality you would need?
> >
> > On Nov 10, 2011, at 8:27 AM, Mudassar Majeed wrote:
> >
> > Thank you for your reply. In our previous publication, we found that running more than one process per core and balancing the computational load considerably reduces the total execution time.
> > You know the MPI_Graph_create function; we created another function, MPI_Load_create, that maps the processes onto cores such that the computational load is balanced across cores. We were having some issues with an increase in communication cost due to rank rearrangement (due to MPI_Comm_split, with color=0), so in this research work we will see how we can balance both the computational load on each core and the communication load on each node. Those processes that communicate more will reside on the same node, while keeping the computational load balanced over the cores. I solved this problem using ILP, but ILP takes time and can't be used at run time, so I am thinking about a heuristic. That's why I want to see whether it is possible to migrate a process from one core to another or not. Then I will see how good my heuristic will be.
> >
> > thanks
> > Mudassar
> >
> > ________________________________
> > From: Jeff Squyres <jsquyres_at_[hidden]>
> > To: Mudassar Majeed <mudassarm30_at_[hidden]>; Open MPI Users <users_at_[hidden]>
> > Cc: Ralph Castain <rhc_at_[hidden]>
> > Sent: Thursday, November 10, 2011 2:19 PM
> > Subject: Re: [OMPI users] Process Migration
> >
> > On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
> >
> >> Thank you for your reply. I am implementing a load balancing function for MPI that will balance the computational load and the communication at the same time. So my algorithm assumes that all the cores may in the end get a different number of processes to run.
> >
> > Are you talking about over-subscribing cores? I.e., putting more than 1 MPI process on each core?
> >
> > In general, that's not a good idea.
> >
> >> In the beginning (before that function is called), each core will have an equal number of processes.
> >> So I am thinking of starting more processes on each core (than needed), running my load-balancing function, and then blocking the remaining processes (on each core). In this way I will be able to achieve a different number of processes per core.
> >
> > Open MPI spins aggressively looking for network progress. For example, if you block in an MPI_RECV waiting for a message, Open MPI is actively banging on the CPU looking for network progress. Because of this (and other reasons), you probably do not want to over-subscribe your processors (meaning: you probably don't want to put more than 1 process per core).
> >
> > --
> > Jeff Squyres
> > jsquyres_at_[hidden]
> > For corporate legal information go to:
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
>
> _______________________________________________
> users mailing list
> users_at_[hidden]

--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:

------------------------------

Message: 2
Date: Thu, 10 Nov 2011 13:48:06 -0500
From: "amosleff_at_[hidden]" <amosleff_at_[hidden]>
Subject: Re: [OMPI users] Problems compiling and running openmpi-1.4.4
To: Open MPI Users <users_at_[hidden]>
Message-ID: <CAHNB0nPJzkK7Q9jxR8UhDrPO_Vvm+Myjy+zZ6LEh6r38k-_-Ag_at_[hidden]>
Content-Type: text/plain; charset="iso-8859-1"

Hi Jeff,

In the attached file Compile_out.tar.bz2 I have included the output files from config, make, and install. I also included another copy of the out_test file so that you have all of the information that I have. Again, your help is much appreciated.

Amos Leffler

On Wed, Nov 9, 2011 at 12:23 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> On Nov 9, 2011, at 12:16 PM, amosleff_at_[hidden] wrote:
>
> > The file was the output of the command:
> >     "mpicc hello_cc.c -o hello_cc"
> > and lists files which do not appear to be present. I checked the permissions and they seem to be correct, so I am stumped. I did use the make and install commands and they seemed to go properly. I have the out files for the three commands and could send them to you if you want.
>
> Please send all the information listed here:
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
>
> _______________________________________________
> users mailing list
> users_at_[hidden]

-------------- next part --------------
HTML attachment scrubbed and removed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: out_test~
Type: application/octet-stream
Size: 581 bytes
Desc: not available
URL: <>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Compile_out.tar.bz2
Type: application/x-bzip2
Size: 85433 bytes
Desc: not available
URL: <>

------------------------------

_______________________________________________
users mailing list
users_at_[hidden]

End of users Digest, Vol 2064, Issue 3
**************************************