Open MPI User's Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2006-07-06 20:26:39


Hi Saadat

That's the problem, then: you need to run comm_spawn applications using
mpirun, I'm afraid. We plan to fix this in the near future, but for now we
can only offer that workaround.
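
As a minimal sketch of that workaround, assuming mpirun is on the PATH and
the program is the ./foobar binary mentioned later in this thread, the parent
can be started as a one-process job instead of as a singleton. The helper
name run_under_mpirun is hypothetical and not part of Open MPI:

```shell
# Hypothetical helper (not part of Open MPI): start the parent program
# under mpirun with a single process, so it is not run as a singleton
# and can then call MPI_Comm_spawn.
run_under_mpirun() {
    mpirun -np 1 "$@"
}

# Usage (instead of running ./foobar directly):
#   run_under_mpirun ./foobar
```

Typing mpirun -np 1 ./foobar directly at the prompt is equivalent; the
wrapper only keeps the launch line in one place.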

Ralph

On 7/6/06 5:30 PM, "s anwar" <sanwar_at_[hidden]> wrote:

> Ralph:
>
> I am running the application without mpirun, i.e. ./foobar. So, according to
> your definition of singleton above, I am calling comm_spawn from a singleton.
>
> Thanks.
> Saadat.
>
>
> On 7/6/06, Ralph Castain <rhc_at_[hidden]> wrote:
>> Thanks Saadat
>>
>> Could you clarify how you are running this application? We have a known
>> problem with comm_spawn from a singleton (i.e., if you just did a.out instead
>> of mpirun -np 1 a.out) - the errors look somewhat like what you are showing
>> here, hence our curiosity.
>>
>> Thanks
>> Ralph
>>
>>
>>
>>
>> On 7/6/06 3:12 PM, "s anwar" <sanwar_at_[hidden]> wrote:
>>
>>> Ralph:
>>>
>>> I am using Fedora Core 4 (Linux turkana 2.6.12-1.1390_FC4smp #1 SMP Tue Jul
>>> 5 20:21:11 EDT 2005 i686 athlon i386 GNU/Linux). The machine is a
>>> dual-processor Athlon-based machine. No cluster resource manager, just an
>>> rsh/ssh based setup.
>>>
>>> Thanks.
>>> Saadat.
>>>
>>> On 7/6/06, Ralph H Castain <rhc_at_[hidden]> wrote:
>>>> Hi Saadat
>>>>
>>>> Could you tell us something more about the system you are using? What type
>>>> of processors, operating system, any resource manager (e.g., SLURM, PBS),
>>>> etc?
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>>
>>>>
>>>>
>>>> On 7/6/06 10:49 AM, "s anwar" <sanwar_at_[hidden]> wrote:
>>>>
>>>> Good Day:
>>>>
>>>> I am getting the following error messages every time I run a very simple
>>>> program that spawns child processes:
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/soh_base_get_proc_soh.c at line 80
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/oob_base_xcast.c at line 108
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/rmgr_base_stage_gate.c at line 276
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/soh_base_get_proc_soh.c at line 80
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/oob_base_xcast.c at line 108
>>>> [turkana:27949] [0,0,0] ORTE_ERROR_LOG: Not found in file
>>>> base/rmgr_base_stage_gate.c at line 276
>>>>
>>>> These errors are being generated by the master process. Does anybody know
>>>> what they mean?
>>>>
>>>> Also, if I spawn four child processes, not all of them run to completion,
>>>> i.e. till MPI_Finalize.
>>>>
>>>> Saadat.
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users