where "dir" is a process-number-dependent directory, which ensures the processes don't overwrite each other. The machinefile is written earlier, using hostname to obtain the node of the current process, so the new program launches on the same node as the process that launches it.
In Fortran, the program automatically waits until the system call is complete.
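The launch described above can be sketched as a shell fragment. This is a minimal sketch under stated assumptions, not the exact command used: the helper names (run_dir, write_machinefile, launch_cmd) and the "run_<rank>" directory layout are hypothetical, and the string it prints is the kind of command that would be passed to CALL SYSTEM.

```shell
#!/bin/sh
# Sketch only: run_dir, write_machinefile and launch_cmd are hypothetical
# helper names, and the "run_<rank>" directory layout is an assumption.

# Directory that depends on the process number, so ranks never share files.
run_dir() {
    printf 'run_%s' "$1"
}

# Record the current node in a one-line machinefile inside that directory.
write_machinefile() {
    mkdir -p "$1" && hostname > "$1/machinefile"
}

# Build the command string that would be handed to CALL SYSTEM.
# "&&" skips the launch if the cd fails.
launch_cmd() {
    printf 'cd %s && mpiexec -n 1 -machinefile machinefile %s' \
        "$(run_dir "$1")" "$2"
}

# Dry run for process number 3: print the command instead of executing it.
launch_cmd 3 /path/myprogram.ex
echo
```

Building the string separately from running it makes it easy to check what each process will execute before handing it to the blocking system call.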
Since you mentioned MPI_COMM_SPAWN, I looked into this. I read that it's non-blocking, so somehow I'd have to prevent the program from moving forwards until it was complete, and secondly, I need to "cd" into the directory I mentioned above, before running the external command, and I don't know how one would achieve this with this command.
Do you think MPI_COMM_SPAWN can help? I appreciate your time.
From: firstname.lastname@example.org
Date: Fri, 5 Mar 2010 07:55:59 -0700
To: email@example.com
Subject: Re: [OMPI users] running external program on same processor (Fortran)
How are you trying to start this external program? With an MPI_Comm_spawn? Or are you just fork/exec'ing it?
How are you waiting for this external program to finish?
On Mar 5, 2010, at 7:52 AM, abc def wrote:
Thanks for the comments. Indeed, until yesterday, I didn't realise the difference between MVAPICH, MVAPICH2 and Open-MPI.
This problem has moved from mvapich2 to open-mpi now however, because I now realise that the production environment uses Open-MPI, which means my solution for mvapich2 doesn't work now. So if I may ask your kind assistance:
Just to recap, I have an MPI Fortran program, which runs on N nodes, and each node needs to run an external program. This external program was written for MPI, although I want to run it in serial as part of the process on each node.
Is there any way at all to launch this external MPI program so it's treated simply as a (serial) extension of the current node's processes? As I say, the originating MPI program simply waits for the external program to finish before continuing, so it's essentially a bit like a "subroutine" in that sense.
I'm having problems launching this external program from within my MPI program, under the open-mpi system, even without worrying about node assignment, and I think this might be because the system detects that I'm trying to launch another process from one of the nodes, and stops it. I'm guessing here, but it simply stops with an error saying the MPI process was stopped.
Any help is very much appreciated; I have been going at this for more than a day now and don't seem to be getting anywhere.
It also would have been really helpful to know that you were using MVAPICH and -not- Open MPI as this mailing list is for the latter. We could have directed you to the appropriate place if we had known.
On Mar 3, 2010, at 5:17 AM, abc def wrote:
I don't know (I'm a little new to this area), but I figured out how to get around the problem:
Using SGE and MVAPICH2, the "-env MV2_CPU_MAPPING 0:1....." option in mpiexec seems to do the trick.
So when calling the external program with mpiexec, I map the called process to the current core rank, and it seems to stay distributed and separated as I want.
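The mapping described above might look like the following shell sketch. It is an illustration under assumptions, not the exact command used: pinned_cmd is a hypothetical helper, deriving the core as rank modulo cores-per-node assumes ranks are placed on consecutive cores within each node, and passing a single core number to MV2_CPU_MAPPING for a one-process launch is likewise an assumption.

```shell
#!/bin/sh
# Sketch only: pinned_cmd is a hypothetical helper, and deriving the core
# as (rank mod cores-per-node) is an assumption about rank placement.

# Build an mpiexec command that pins one process to one core.
#   $1 = MPI rank of the caller, $2 = cores per node, $3 = executable
pinned_cmd() {
    core=$(( $1 % $2 ))
    printf 'mpiexec -n 1 -env MV2_CPU_MAPPING %s %s' "$core" "$3"
}

# Dry run: rank 5 on 4-core nodes is assumed to sit on core 1.
pinned_cmd 5 4 /path/myprogram.ex
echo
```

If the scheduler places ranks differently, the core number would have to come from wherever the calling process is actually bound.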
Hope someone else finds this useful in the future.
> Date: Wed, 3 Mar 2010 13:13:01 +1100
> Subject: Re: [OMPI users] running external program on same processor (Fortran)
>
> Surely this is the problem of the scheduler that your system uses,
> rather than MPI?
>
>
> On Wed, 2010-03-03 at 00:48 +0000, abc def wrote:
> > Hello,
> >
> > I wonder if someone can help.
> >
> > The situation is that I have an MPI-parallel fortran program. I run it
> > and it's distributed on N cores, and each of these processes must call
> > an external program.
> >
> > This external program is also an MPI program, however I want to run it
> > in serial, on the core that is calling it, as if it were part of the
> > fortran program. The fortran program waits until the external program
> > has completed, and then continues.
> >
> > The problem is that this external program seems to run on any core,
> > and not necessarily the (now idle) core that called it. This slows
> > things down a lot as you get one core doing multiple tasks.
> >
> > Can anyone tell me how I can call the program and ensure it runs only
> > on the core that's calling it? Note that there are several cores per
> > node. I can ID the node by running the hostname command (I don't know
> > a way to do this for individual cores).
> >
> > Thanks!
> >
> > ====
> >
> > Extra information that might be helpful:
> >
> > If I simply run the external program from the command line (ie, type
> > "/path/myprogram.ex <enter>"), it runs fine. If I run it within the
> > fortran program by calling it via
> >
> > CALL SYSTEM("/path/myprogram.ex")
> >
> > it doesn't run at all (doesn't even start) and everything crashes. I
> > don't know why this is.
> >
> > If I call it using mpiexec:
> >
> > CALL SYSTEM("mpiexec -n 1 /path/myprogram.ex")
> >
> > then it does work, but I get the problem that it can go on any core.
> >
> > _______________________________________________
> > users mailing list
> > firstname.lastname@example.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users