Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Multi-threading with OpenMPI ?
From: Ashika Umanga Umagiliya (aumanga_at_[hidden])
Date: 2009-10-05 03:33:59


Greetings all,

First of all thank you all for the help.

I tried using locks and I still get the following problems:

1) When multiple threads call MPI_Comm_spawn (sequentially or in
parallel), some spawned processes hang in their
"MPI_Init_thread(NULL,NULL,MPI_THREAD_MULTIPLE,&sup);"
call. (I can see all of the spawned processes stacked up in the
'top' command.)

2) Randomly, the program (a web service) crashes with the error:

"[umanga:06488] [[4594,0],0] ORTE_ERROR_LOG: The system limit on number
of pipes a process can open was reached in file odls_default_module.c at
line 218
[umanga:06488] [[4594,0],0] ORTE_ERROR_LOG: The system limit on number
of network connections a process can open was reached in file oob_tcp.c
at line 447
--------------------------------------------------------------------------
Error: system limit exceeded on number of network connections that can
be open

This can be resolved by setting the mca parameter
opal_set_max_sys_limits to 1,
increasing your limit descriptor setting (using limit or ulimit commands),
or asking the system administrator to increase the system limit.
--------------------------------------------------------------------------"
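
For reference, the remedies listed in that error text can be applied along these lines (a sketch only; the MCA parameter name comes from the error message itself, and the exact mpirun syntax and the binary name `webservice_master` are assumptions to adapt):

```shell
# Raise the per-process file descriptor limit in the launching shell;
# each spawned child consumes pipes and TCP connections (one fd each).
ulimit -n 4096

# Or let Open MPI raise the limits itself via the MCA parameter
# named in the error message (hypothetical invocation):
mpirun -np 1 --mca opal_set_max_sys_limits 1 ./webservice_master
```

Note that raising the limit only delays the failure if spawned processes (and their connections) are never cleaned up; freeing the intercommunicators from finished spawns matters as much as the limit itself.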

Any advice?

Thank you,
umanga

Richard Treumann wrote:
>
> MPI_COMM_SELF is one example. The only task it contains is the local task.
>
> The other case I had in mind is where there is a master doing all
> spawns. Master is launched as an MPI "job" but it has only one task.
> In that master, even MPI_COMM_WORLD is what I called a "single task
> communicator".
>
> Because the spawn call is "collective" across only one task
> in this case, it does not have the same sort of dependency on what
> other tasks do.
>
> I think it is common for a single task master to have responsibility
> for all spawns in the kind of model yours sounds like. I did not study
> the conversation enough to know if you are doing all spawn calls from
> a "single task communicator" and I was trying to give a broadly useful
> explanation.
>
>
> Dick Treumann - MPI Team
> IBM Systems & Technology Group
> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846 Fax (845) 433-8363
>
>
> On 09/25/2009 02:59 AM, Ashika Umanga Umagiliya wrote to Open MPI Users:
> >
> > Thank you Dick for your detailed reply,
> >
> > I am sorry, could you explain more of what you meant by "unless you are
> > calling MPI_Comm_spawn on a single task communicator you would need
> > to have a different input communicator for each thread that will
> > make an MPI_Comm_spawn call"? I am confused by the term "single
> > task communicator".
> >
> > Best Regards,
> > umanga
> >
> > Richard Treumann wrote:
> > It is dangerous to hold a local lock (like a mutex) across a
> > blocking MPI call unless you can be 100% sure everything that must
> > happen remotely will be completely independent of what is done with
> > local locks & communication dependencies on other tasks.
> >
> > It is likely that an MPI_Comm_spawn call in which the spawning
> > communicator is MPI_COMM_SELF would be safe to serialize with a
> > mutex. But be careful and do not view this as an approach to making
> > MPI applications thread safe in general. Also, unless you are
> > calling MPI_Comm_spawn on a single task communicator you would need
> > to have a different input communicator for each thread that will
> > make an MPI_Comm_spawn call. MPI requires that collective calls on a
> > given communicator be made in the same order by all participating tasks.
> >
> > If there are two or more tasks making the MPI_Comm_spawn call
> > collectively from multiple threads (even with per-thread input
> > communicators) then using a local lock this way is pretty sure to
> > deadlock at some point. Say task 0 serializes spawning threads as A
> > then B and task 1 serializes them as B then A. The job will deadlock
> > because task 0 cannot free its lock for thread A until task 1 makes
> > the spawn call for thread A as well. That will never happen if task
> > 1 is stuck in a lock that will not release until task 0 makes its
> > call for thread B.
> >
> > When you look at the code for a particular task and consider thread
> > interactions within the task, the use of the lock looks safe. It is
> > only when you consider the dependencies on what other tasks are
> > doing that the danger becomes clear. This particular case is pretty
> > easy to see, but sometimes when there is a temptation to hold a local
> > mutex across a blocking MPI call, the chain of dependencies that
> > can lead to deadlock becomes very hard to predict.
> >
> > BTW - maybe this is obvious, but you also need to protect the logic
> > which calls MPI_Init_thread to make sure you do not have a race in
> > which two threads each race to test the flag for whether
> > MPI_Init_thread has already been called. If two threads do:
> > 1) if (MPI_Inited_flag == FALSE) {
> > 2) set MPI_Inited_flag
> > 3) MPI_Init_thread
> > 4) }
> > You have a couple of race conditions:
> > 1) Two threads may both try to call MPI_Init_thread if one thread
> > tests "if (MPI_Inited_flag == FALSE)" while the other is between
> > statements 1 & 2.
> > 2) If some thread tests "if (MPI_Inited_flag == FALSE)" while
> > another thread is between statements 2 and 3, that thread could
> > assume MPI_Init_thread is done and make the MPI_Comm_spawn call
> > before the thread that is trying to initialize MPI manages to do it.
> >
> > Dick
> >
> >
> > Dick Treumann - MPI Team
> > IBM Systems & Technology Group
> > Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> > Tele (845) 433-7846 Fax (845) 433-8363
> >
> >
> > On 09/17/2009 11:37 PM, Ralph Castain wrote to Open MPI Users:
> > >
> > > Only thing I can suggest is to place a thread lock around the call
> > > to comm_spawn so that only one thread at a time can execute that
> > > function. The call to mpi_init_thread is fine - you just need to
> > > explicitly protect the call to comm_spawn.
> > >
> > >
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>