Hello,

Following advice on other branches of this thread, I've managed to get to the point where I can spawn the second program. I've also managed to ensure that I'm running in an open-mpi environment during the testing.

I would like to run my next steps by everyone in case anyone with more experience knows a better way:

1. Since Spawn is non-blocking, but I need the parent to wait until the child completes, I am thinking there must be a way to pass a variable from the child to the parent just prior to the FINALIZE command in the child, to signal that the parent can pick up the output files from the child. Am I right in assuming that the message from the child to the parent will go to the correct parent process? The value of "parent" in "CALL MPI_COMM_GET_PARENT(parent, ierr)" is the same in all spawned processes, which is why I ask this question.

2. By launching the parent with the "--mca mpi_yield_when_idle 1" option, the child should be able to take CPU power from any blocked parent process, thus avoiding the busy-poll problem mentioned below. If each host has 4 processors and I'm running on 2 hosts (ie, 8 processors in total), then I also assume that the spawned child will launch on the same host as the associated parent?

Does anyone have any better suggestions? Since I'm quite new to this, I thought it might be best to check...

Thanks!

> From: jsquyres@cisco.com
> Date: Fri, 5 Mar 2010 15:02:57 -0500
> To: users@open-mpi.org
> Subject: Re: [OMPI users] running externalprogram on same processor (Fortran)
>
> On Mar 5, 2010, at 2:38 PM, Ralph Castain wrote:
>
> >> CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile -np 1 /home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > /dev/null")
> >
> > That is guaranteed not to work. The problem is that mpirun sets environmental variables for the original launch. Your system call carries over those envars, causing mpirun to become confused.
>
> You should be able to use MPI_COMM_SPAWN to launch this MPI job. Check the man page for MPI_COMM_SPANW; I believe we have info keys to specify things like what hosts to launch on, etc.
>
> >> Do you think MPI_COMM_SPAWN can help?
> >
> > It's the only method supported by the MPI standard. If you need it to block until this new executable completes, you could use a barrier or other MPI method to determine it.
>
> I believe that the user said they wanted to use the same cores as their original MPI job occupies for the new job -- they basically want the old job to block until the new job completes. Keep in mind that OMPI busy-polls waiting for progress, so you might actually get hosed here (two procs competing for time on the same core).
>
> I'm not immediately thinking of a good way to avoid this issue -- perhaps you could kludge something up such that the parent job polls on sleep() and checking to see if a message has arrived from the child (i.e., the last thing the child does before it calls MPI_FINALIZE is to send a message to its parents and then MPI_COMM_DISCONNECT from its parents). If the parent finds that it has a message from the child(ren), it can MPI_COMM_DISCONNECT and continue processing.
>
> Kinda hackey, but it might work...?
>
> --
> Jeff Squyres
> jsquyres@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Do you want a Hotmail account? Sign-up now - Free