Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] mpirun hangs
From: Greg Watson (g.watson_at_[hidden])
Date: 2008-05-28 09:11:01


That fixed it, thanks. I wonder if this is the same problem I'm seeing
for 1.2.x?

Greg

On May 27, 2008, at 10:34 PM, Ralph Castain wrote:

> Aha! This is a problem that continues to bite us - it relates to the pty
> problem in Mac OSX. Been a ton of chatter about this, but Mac doesn't seem
> inclined to fix it.
>
> Try configuring --disable-pty-support and see if that helps. FWIW, you will
> find a platform file for Mac OSX in the trunk - I always build with it, and
> have spent considerable time fine-tuning it. You configure with:
>
> ./configure --prefix=whatever \
>     --with-platform=contrib/platform/lanl/macosx-dynamic
>
> In that directory, you will also find platform files for static builds under
> both Tiger and Leopard (slight differences).
>
> ralph
>
>
> On 5/27/08 8:01 PM, "Greg Watson" <g.watson_at_[hidden]> wrote:
>
>> Ralph,
>>
>> I tried rolling back to 18513 but no luck. Steps:
>>
>> $ ./autogen.sh
>> $ ./configure --prefix=/usr/local/openmpi-1.3-devel
>> $ make
>> $ make install
>> $ mpicc -g -o xxx xxx.c
>> $ mpirun -np 2 ./xxx
>> $ ps x
>> 44832 s001 R+ 0:50.00 mpirun -np 2 ./xxx
>> 44833 s001 S+ 0:00.03 ./xxx
>> $ gdb /usr/local/openmpi-1.3-devel/bin/mpirun
>> ...
>> (gdb) attach 44832
>> Attaching to program: `/usr/local/openmpi-1.3-devel/bin/mpirun', process 44832.
>> Reading symbols for shared libraries ++++
>> +.......................................... done
>> 0x9371b3dd in ioctl ()
>> (gdb) where
>> #0 0x9371b3dd in ioctl ()
>> #1 0x93754812 in grantpt ()
>> #2 0x9375470b in openpty ()
>> #3 0x001446d9 in opal_openpty ()
>> #4 0x000bf3bf in orte_iof_base_setup_prefork ()
>> #5 0x003da62f in odls_default_fork_local_proc (context=0x216a60, child=0x216dd0, environ_copy=0x217930) at odls_default_module.c:191
>> #6 0x000c3e76 in orte_odls_base_default_launch_local ()
>> #7 0x003daace in orte_odls_default_launch_local_procs (data=0x216780) at odls_default_module.c:360
>> #8 0x000ad2f6 in process_commands (sender=0x216768, buffer=0x216780, tag=1) at orted/orted_comm.c:441
>> #9 0x000acd52 in orte_daemon_cmd_processor (fd=-1, opal_event=1, data=0x216750) at orted/orted_comm.c:346
>> #10 0x0012bd21 in event_process_active () at opal_object.h:498
>> #11 0x0012c3c5 in opal_event_base_loop () at opal_object.h:498
>> #12 0x0012bf8c in opal_event_loop () at opal_object.h:498
>> #13 0x0011b334 in opal_progress () at runtime/opal_progress.c:169
>> #14 0x000cd9b4 in orte_plm_base_report_launched () at opal_object.h:498
>> #15 0x000cc2b7 in orte_plm_base_launch_apps () at opal_object.h:498
>> #16 0x0003d626 in orte_plm_rsh_launch (jdata=0x200ae0) at plm_rsh_module.c:1126
>> #17 0x00002604 in orterun (argc=4, argv=0xbffff880) at orterun.c:549
>> #18 0x00001bd6 in main (argc=4, argv=0xbffff880) at main.c:13
>>
>> On May 27, 2008, at 9:11 PM, Ralph Castain wrote:
>>
>>> Yo Greg
>>>
>>> I'm not seeing any problem on my Mac OSX - I'm running Leopard. Can you tell
>>> me how you configured, and the precise command you executed?
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>>
>>> On 5/27/08 5:15 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>>>
>>>> Hmmm...well, it was working about 3 hours ago! I'll try to take a look
>>>> tonight, but it may be tomorrow.
>>>>
>>>> Try rolling it back just a little to r18513 - that's the last rev I tested
>>>> on my Mac.
>>>>
>>>>
>>>> On 5/27/08 5:00 PM, "Greg Watson" <g.watson_at_[hidden]> wrote:
>>>>
>>>>> Something seems to be broken in the trunk for MacOS X. I can run a 1
>>>>> process job, but a >1 process job hangs. It was working a few days ago.
>>>>>
>>>>> Greg
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
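
For readers following the backtrace quoted in this thread (frames #0 to #4: ioctl, grantpt, openpty, opal_openpty, orte_iof_base_setup_prefork), here is a minimal C sketch of the general pattern involved: try to allocate a pty for a child's output, and fall back to a plain pipe when ptys are disallowed or unavailable. This is not Open MPI's code. The names `io_channel` and `setup_stdout_channel` are hypothetical, and the assumption that --disable-pty-support amounts to always taking the pipe branch is an inference from this thread, not something confirmed here. The sketch uses the POSIX calls posix_openpt(), grantpt(), unlockpt() and ptsname() rather than openpty(), since they need no extra library.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper type, not part of Open MPI. */
struct io_channel {
    int fds[2];   /* fds[0]: parent (read) side, fds[1]: child (write) side */
    int use_pty;  /* 1 if fds refer to a pty master/slave pair, 0 for a pipe */
};

/*
 * Set up a channel for a child's stdout.
 * If allow_pty is non-zero, try a pty first; grantpt() is the call that
 * appears in frame #1 of the backtrace. On any failure, or if allow_pty
 * is zero, fall back to pipe(). Returns 0 on success, -1 on error.
 */
static int setup_stdout_channel(struct io_channel *ch, int allow_pty)
{
    ch->use_pty = 0;

    if (allow_pty) {
        int master = posix_openpt(O_RDWR | O_NOCTTY);
        if (master >= 0) {
            if (grantpt(master) == 0 && unlockpt(master) == 0) {
                const char *name = ptsname(master);
                int slave = (name != NULL) ? open(name, O_RDWR | O_NOCTTY) : -1;
                if (slave >= 0) {
                    ch->fds[0] = master;
                    ch->fds[1] = slave;
                    ch->use_pty = 1;
                    return 0;
                }
            }
            close(master);  /* pty setup failed part-way; clean up */
        }
    }

    /* Pipe fallback: fds[0] is the read end, fds[1] is the write end. */
    return pipe(ch->fds);
}
```

Calling `setup_stdout_channel(&ch, 0)` never touches the pty code path at all, which is the behavior one would want on a platform where pty allocation can hang. Note that the fallback only helps when pty allocation fails; it does nothing if grantpt() blocks, as it appears to in the backtrace.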