Open MPI User's Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI users] Spawn and OpenFabrics
From: Allen Barnett (allen_at_[hidden])
Date: 2009-06-02 11:37:53


On Tue, 2009-05-19 at 08:29 -0400, Jeff Squyres wrote:
> fork() support in OpenFabrics has always been dicey -- it can lead to
> random behavior like this. Supposedly it works in a specific set of
> circumstances, but I don't have a recent enough kernel on my machines
> to test.
>
> It's best not to use calls to system() if they can be avoided.
> Indeed, Open MPI v1.3.x will warn you if you create a child process
> after MPI_INIT when using OpenFabrics networks.

My C++ OMPI program uses system() to invoke an external mesh partitioner
program after MPI_INIT is called. Sometimes (with frustrating
randomness), on systems using OFED, the system() call fails with EFAULT
(Bad address). The Linux kernel appears to think that execve() is being
passed a string which isn't in the process's address space. The exec
string is constructed immediately before calling system(), like this:

std::stringstream ss;
ss << "partitioner_program " << COMM_WORLD_SIZE;
system( ss.str().c_str() );
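
For reference, here is a minimal standalone sketch of the same pattern with the error reporting added. It is only an illustration, not the actual program: build_command and run_command are hypothetical helper names, world_size stands in for COMM_WORLD_SIZE, and MPI is left out entirely so the snippet compiles on its own. It assumes a POSIX system where system() returns -1 and sets errno when the child process cannot be created.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: builds the command line the same way as the
// snippet above. `world_size` stands in for COMM_WORLD_SIZE.
std::string build_command(const std::string& program, int world_size) {
    assert(!program.empty());
    std::stringstream ss;
    ss << program << " " << world_size;
    return ss.str();
}

// Hypothetical helper: runs the command through system() and prints
// errno if the call itself fails (return value -1), which is where an
// EFAULT would show up. Returns the raw status from system().
int run_command(const std::string& cmd) {
    errno = 0;
    int status = std::system(cmd.c_str());
    if (status == -1) {
        std::cerr << "system(\"" << cmd << "\") failed: "
                  << std::strerror(errno) << " (errno " << errno << ")\n";
    }
    return status;
}
```

Holding the command in a named std::string keeps the buffer alive for the whole call, so if EFAULT still appears with this form, the way the string is built is presumably not the cause.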

Could this behavior be related to this admonition?

Also, would MPI_COMM_SPAWN suffer from the same difficulties?

Thanks,
Allen

-- 
Allen Barnett
E-Mail: allen_at_[hidden]
Skype:  allenbarnett
Ph:     518-887-2930