
Open MPI Development Mailing List Archives


From: Brian Barrett (brbarret_at_[hidden])
Date: 2005-08-18 16:19:37


Rainer's problem looks different from the one in orte_init_stage1.
ompi_info reports that he doesn't have any sds components built.
Actually, it doesn't list *any* orte components, which seems broken
to me. I should pretty-print an error message and abort if an sds
component isn't found, but it looks like the bigger problem is a lack
of components in his install.

Brian
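
For reference, the line Rainer's trace points at below (sds_base_interface.c:29) dereferences orte_sds_base_module with no check. A minimal sketch of the check-and-abort behavior Brian describes, assuming opal_output() and ORTE_ERR_NOT_FOUND are usable in that file, might look like:

    /* Sketch only, not the committed fix: guard the module pointer before
     * calling through it.  Assumes opal_output() and ORTE_ERR_NOT_FOUND are
     * available in sds_base_interface.c. */
    int orte_sds_base_contact_universe(void)
    {
        if (NULL == orte_sds_base_module) {
            /* No sds component was selected or built; print something useful
             * instead of segfaulting on the NULL dereference. */
            opal_output(0, "orte_sds: no sds component found; check that the "
                           "ORTE components were built and installed");
            return ORTE_ERR_NOT_FOUND;  /* lets orte_init_stage1 bail out cleanly */
        }
        return orte_sds_base_module->contact_universe();
    }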

On Aug 18, 2005, at 3:33 PM, Tim S. Woodall wrote:

> I'm seeing a problem in orte_init_stage1 when running with a persistent
> daemon. The problem is that the orte_init call attempts to call the rds
> subsystem directly, which is not supposed to be exposed at that level.
> rds is used internally by the rmgr and is only initialized on the seed.
> The proxy rmgr is loaded when a persistent daemon is available, and
> therefore the rds is not loaded.
>
> So... orte_init_stage1 shouldn't be calling rds directly...
>
> Tim
>
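The layering Tim describes, sketched with made-up names (these are not the real ORTE interfaces, only an illustration of the shape of the fix):

    /* Hypothetical sketch: orte_init_stage1 should not know about rds at all. */
    int stage1_discover_resources(void)
    {
        /* Problematic: calling rds here ties stage1 to a framework that only
         * the seed initializes, so it breaks under a persistent daemon. */

        /* Layered: ask the rmgr instead.  The seed's rmgr uses rds internally,
         * while the proxy rmgr forwards the request to the persistent daemon. */
        return rmgr_query_resources();  /* hypothetical rmgr entry point */
    }
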
> Brian Barrett wrote:
>
>
>> Yeah, although there really shouldn't be a way for the pointer to be
>> NULL. Was this a static build? I was seeing some weird memory
>> issues on static builds last night... I'll take a look on odin and
>> see what I can find.
>>
>> Brian
>>
>> On Aug 18, 2005, at 11:18 AM, Tim S. Woodall wrote:
>>
>>> Brian,
>>>
>>> Wasn't the introduction of sds part of your changes for redstorm?
>>> Any ideas why it would be NULL here?
>>>
>>> Thanks,
>>> Tim
>>>
>>> Rainer Keller wrote:
>>>
>>>> Hello,
>>>> I see the "same" (well, probably not exactly the same) thing here on an
>>>> Opteron with 64-bit (-g and so on); I get:
>>>>
>>>> #0  0x0000000040085160 in orte_sds_base_contact_universe ()
>>>>     at ../../../../../orte/mca/sds/base/sds_base_interface.c:29
>>>> 29          return orte_sds_base_module->contact_universe();
>>>> (gdb) where
>>>> #0  0x0000000040085160 in orte_sds_base_contact_universe ()
>>>>     at ../../../../../orte/mca/sds/base/sds_base_interface.c:29
>>>> #1  0x0000000040063e95 in orte_init_stage1 ()
>>>>     at ../../../orte/runtime/orte_init_stage1.c:185
>>>> #2  0x0000000040017e7d in orte_system_init ()
>>>>     at ../../../orte/runtime/orte_system_init.c:38
>>>> #3  0x00000000400148f5 in orte_init () at ../../../orte/runtime/orte_init.c:46
>>>> #4  0x000000004000dfc7 in main (argc=4, argv=0x7fbfffe8a8)
>>>>     at ../../../../orte/tools/orterun/orterun.c:291
>>>> #5  0x0000002a95c0c017 in __libc_start_main () from /lib64/libc.so.6
>>>> #6  0x000000004000bf2a in _start ()
>>>> (gdb)
>>>>
>>>> This is within mpirun; orte_sds_base_module here is NULL...
>>>> This is without a persistent orted, just mpirun...
>>>>
>>>> CU,
>>>> ray
>>>>
>>>> On Thursday 18 August 2005 16:57, Nathan DeBardeleben wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> FYI, this only happens when I let OMPI compile 64bit on Linux.
>>>>> When I
>>>>> throw in there CFLAGS=FFLAGS=CXXFLAGS=-m32 orted, my myriad of
>>>>> test
>>>>> codes, mpirun, registry subscription codes, and JNI all work like
>>>>> a champ.
>>>>> Something's wrong with the 64bit it appears to me.
>>>>>
>>>>> -- Nathan
>>>>> Correspondence
>>>>> ---------------------------------------------------------------------
>>>>> Nathan DeBardeleben, Ph.D.
>>>>> Los Alamos National Laboratory
>>>>> Parallel Tools Team
>>>>> High Performance Computing Environments
>>>>> phone: 505-667-3428
>>>>> email: ndebard_at_[hidden]
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> Tim S. Woodall wrote:
>>>>>
>>>>>> Nathan,
>>>>>>
>>>>>> I'll try to reproduce this sometime this week - but I'm pretty swamped.
>>>>>> Is Greg also seeing the same behavior?
>>>>>>
>>>>>> Thanks,
>>>>>> Tim
>>>>>>
>>>>>> Nathan DeBardeleben wrote:
>>>>>>
>>>>>>> To expand on this further, orte_init() seg faults on both bluesteel
>>>>>>> (32-bit Linux) and sparkplug (64-bit Linux) equally. The required
>>>>>>> condition is that orted must be running first (which of course we
>>>>>>> require for our work - a persistent orte daemon and registry).
>>>>>>>
>>>>>>>> [bluesteel]~/ptp > ./dump_info
>>>>>>>> Segmentation fault
>>>>>>>> [bluesteel]~/ptp > gdb dump_info
>>>>>>>> GNU gdb 6.1
>>>>>>>> Copyright 2004 Free Software Foundation, Inc.
>>>>>>>> GDB is free software, covered by the GNU General Public License, and
>>>>>>>> you are welcome to change it and/or distribute copies of it under
>>>>>>>> certain conditions.
>>>>>>>> Type "show copying" to see the conditions.
>>>>>>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>>>>>>> This GDB was configured as "x86_64-suse-linux"...Using host
>>>>>>>> libthread_db library "/lib64/tls/libthread_db.so.1".
>>>>>>>>
>>>>>>>> (gdb) run
>>>>>>>> Starting program: /home/ndebard/ptp/dump_info
>>>>>>>>
>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>> 0x0000000000000000 in ?? ()
>>>>>>>> (gdb) where
>>>>>>>> #0  0x0000000000000000 in ?? ()
>>>>>>>> #1  0x000000000045997d in orte_init_stage1 () at orte_init_stage1.c:419
>>>>>>>> #2  0x00000000004156a7 in orte_system_init () at orte_system_init.c:38
>>>>>>>> #3  0x00000000004151c7 in orte_init () at orte_init.c:46
>>>>>>>> #4  0x0000000000414cbb in main (argc=1, argv=0x7fbffff298) at dump_info.c:185
>>>>>>>> (gdb)
>>>>>>>
>>>>>>> Nathan DeBardeleben wrote:
>>>>>>>
>>>>>>>> Just to clarify:
>>>>>>>> 1: no orted started (meaning the mpirun or registry programs will
>>>>>>>> start one by themselves) causes those programs to lock up.
>>>>>>>> 2: starting orted by hand (trying to get these programs to connect to
>>>>>>>> a centralized one) causes the connecting programs to seg fault.
>>>>>>>>
>>>>>>>> Nathan DeBardeleben wrote:
>>>>>>>>
>>>>>>>>> So I dropped an .ompi_ignore into that directory, reconfigured, and
>>>>>>>>> the compile worked (yay!).
>>>>>>>>> However, not a lot of progress: mpirun locks up, and all my registry
>>>>>>>>> test programs lock up as well. If I start the orted by hand, then any
>>>>>>>>> of my registry-calling programs segfault:
>>>>>>>>>
>>>>>>>>>> [sparkplug]~/ptp > gdb sub_test
>>>>>>>>>> GNU gdb 6.1
>>>>>>>>>> Copyright 2004 Free Software Foundation, Inc.
>>>>>>>>>> GDB is free software, covered by the GNU General Public License, and
>>>>>>>>>> you are welcome to change it and/or distribute copies of it under
>>>>>>>>>> certain conditions.
>>>>>>>>>> Type "show copying" to see the conditions.
>>>>>>>>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>>>>>>>>> This GDB was configured as "x86_64-suse-linux"...Using host
>>>>>>>>>> libthread_db library "/lib64/tls/libthread_db.so.1".
>>>>>>>>>>
>>>>>>>>>> (gdb) run
>>>>>>>>>> Starting program: /home/ndebard/ptp/sub_test
>>>>>>>>>>
>>>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>>>> 0x0000000000000000 in ?? ()
>>>>>>>>>> (gdb) where
>>>>>>>>>> #0  0x0000000000000000 in ?? ()
>>>>>>>>>> #1  0x00000000004598a5 in orte_init_stage1 () at orte_init_stage1.c:419
>>>>>>>>>> #2  0x00000000004155cf in orte_system_init () at orte_system_init.c:38
>>>>>>>>>> #3  0x00000000004150ef in orte_init () at orte_init.c:46
>>>>>>>>>> #4  0x00000000004148a1 in main (argc=1, argv=0x7fbffff178) at sub_test.c:60
>>>>>>>>>> (gdb)
>>>>>>>>>
>>>>>>>>> Yes, I recompiled everything.
>>>>>>>>>
>>>>>>>>> Here's an example of me trying something a little more complicated
>>>>>>>>> (which I believe locks up for the same reason - something borked
>>>>>>>>> with the registry interaction).
>>>>>>>>>
>>>>>>>>>>> [sparkplug]~/ompi-test > bjssub -s 10000 -n 10 -i bash
>>>>>>>>>>> Waiting for interactive job nodes.
>>>>>>>>>>> (nodes 18 16 17 18 19 20 21 22 23 24 25)
>>>>>>>>>>> Starting interactive job.
>>>>>>>>>>> NODES=16,17,18,19,20,21,22,23,24,25
>>>>>>>>>>> JOBID=18
>>>>>>>>>>
>>>>>>>>>> so i got my nodes
>>>>>>>>>>
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> export OMPI_MCA_ptl_base_exclude=sm
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> export OMPI_MCA_pls_bproc_seed_priority=101
>>>>>>>>>>
>>>>>>>>>> and set these envvars like we need to use Greg's bproc; without the
>>>>>>>>>> 2nd export the machine's load maxes out and it locks up.
>>>>>>>>>>
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> bpstat
>>>>>>>>>>> Node(s)    Status   Mode         User      Group
>>>>>>>>>>> 100-128    down     ----------   root      root
>>>>>>>>>>> 0-15       up       ---x------   vchandu   vchandu
>>>>>>>>>>> 16-25      up       ---x------   ndebard   ndebard
>>>>>>>>>>> 26-27      up       ---x------   root      root
>>>>>>>>>>> 28-30      up       ---x--x--x   root      root
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> env | grep NODES
>>>>>>>>>>> NODES=16,17,18,19,20,21,22,23,24,25
>>>>>>>>>>
>>>>>>>>>> yes, i really have the nodes
>>>>>>>>>>
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> mpicc -o test-mpi test-mpi.c
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test>
>>>>>>>>>>
>>>>>>>>>> recompile for good measure
>>>>>>>>>>
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> ls /tmp/openmpi-sessions-ndebard*
>>>>>>>>>>> /bin/ls: /tmp/openmpi-sessions-ndebard*: No such file or directory
>>>>>>>>>>
>>>>>>>>>> proof that there's no left over old directory
>>>>>>>>>>
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test> mpirun -np 1 test-mpi
>>>>>>>>>>
>>>>>>>>>> it never responds at this point - but I can kill it with ^C.
>>>>>>>>>>
>>>>>>>>>>> mpirun: killing job...
>>>>>>>>>>> Killed
>>>>>>>>>>> ndebard_at_sparkplug:~/ompi-test>
>>>>>>>>>
>>>>>>>>> Jeff Squyres wrote:
>>>>>>>>>
>>>>>>>>>> Is this what Tim Prins was working on?
>>>>>>>>>>
>>>>>>>>>> On Aug 16, 2005, at 5:21 PM, Tim S. Woodall wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm not sure why this is even building... Is someone working on
>>>>>>>>>>> this? I thought we had .ompi_ignore files in this directory.
>>>>>>>>>>>
>>>>>>>>>>> Tim
>>>>>>>>>>>
>>>>>>>>>>> Nathan DeBardeleben wrote:
>>>>>>>>>>>
>>>>>>>>>>>> So I'm seeing all these nice emails about people developing on
>>>>>>>>>>>> OMPI today, yet I can't get it to compile. Am I out here in limbo
>>>>>>>>>>>> on this or are others in the same boat? The errors I'm seeing are
>>>>>>>>>>>> about some bproc code calling undefined functions, and they are
>>>>>>>>>>>> linked again below.
>>>>>>>>>>>>
>>>>>>>>>>>> Nathan DeBardeleben wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Back from training and trying to test this, but now OMPI doesn't
>>>>>>>>>>>>> compile at all:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> gcc -DHAVE_CONFIG_H -I. -I. -I../../../../include
>>>>>>>>>>>>>> -I../../../../include -I../../../.. -I../../../..
>>>>>>>>>>>>>> -I../../../../include -I../../../../opal -I../../../../orte
>>>>>>>>>>>>>> -I../../../../ompi -g -Wall -Wundef -Wno-long-long -Wsign-compare
>>>>>>>>>>>>>> -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic
>>>>>>>>>>>>>> -Werror-implicit-function-declaration -fno-strict-aliasing -MT
>>>>>>>>>>>>>> ras_lsf_bproc.lo -MD -MP -MF .deps/ras_lsf_bproc.Tpo -c
>>>>>>>>>>>>>> ras_lsf_bproc.c -o ras_lsf_bproc.o
>>>>>>>>>>>>>> ras_lsf_bproc.c: In function `orte_ras_lsf_bproc_node_insert':
>>>>>>>>>>>>>> ras_lsf_bproc.c:32: error: implicit declaration of function
>>>>>>>>>>>>>> `orte_ras_base_node_insert'
>>>>>>>>>>>>>> ras_lsf_bproc.c: In function `orte_ras_lsf_bproc_node_query':
>>>>>>>>>>>>>> ras_lsf_bproc.c:37: error: implicit declaration of function
>>>>>>>>>>>>>> `orte_ras_base_node_query'
>>>>>>>>>>>>>> make[4]: *** [ras_lsf_bproc.lo] Error 1
>>>>>>>>>>>>>> make[4]: Leaving directory `/home/ndebard/ompi/orte/mca/ras/lsf_bproc'
>>>>>>>>>>>>>> make[3]: *** [all-recursive] Error 1
>>>>>>>>>>>>>> make[3]: Leaving directory `/home/ndebard/ompi/orte/mca/ras'
>>>>>>>>>>>>>> make[2]: *** [all-recursive] Error 1
>>>>>>>>>>>>>> make[2]: Leaving directory `/home/ndebard/ompi/orte/mca'
>>>>>>>>>>>>>> make[1]: *** [all-recursive] Error 1
>>>>>>>>>>>>>> make[1]: Leaving directory `/home/ndebard/ompi/orte'
>>>>>>>>>>>>>> make: *** [all-recursive] Error 1
>>>>>>>>>>>>>> [sparkplug]~/ompi >
>>>>>>>>>>>>>>
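A likely shape of the fix for the implicit-declaration errors above: with -Werror-implicit-function-declaration, any call to a function that has no visible prototype is fatal, so ras_lsf_bproc.c needs to include whatever header declares the orte_ras_base_node_* API before calling it. The header path below is an assumption, not a confirmed file name:

    /* In orte/mca/ras/lsf_bproc/ras_lsf_bproc.c, near the existing includes.
     * The exact header name is assumed; the point is that the prototypes for
     * orte_ras_base_node_insert() and orte_ras_base_node_query() must be in
     * scope before they are called. */
    #include "orte/mca/ras/base/ras_base_node.h"
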
>>>>>>>>>>>>> Clean SVN checkout this morning with configure:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> [sparkplug]~/ompi > ./configure --enable-static --disable-shared
>>>>>>>>>>>>>> --without-threads --prefix=/home/ndebard/local/ompi
>>>>>>>>>>>>>> --with-devel-headers
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Brian Barrett wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is now fixed in SVN. You should no longer need the
>>>>>>>>>>>>>> --build=i586... hack to compile 32 bit code on Opterons.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 12, 2005, at 3:17 PM, Brian Barrett wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Aug 12, 2005, at 3:13 PM, Nathan DeBardeleben wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We've got a 64-bit Linux (SUSE) box here. For a variety of
>>>>>>>>>>>>>>>> reasons (Java, JNI, linking in with OMPI libraries, etc., which
>>>>>>>>>>>>>>>> I won't get into) I need to compile OMPI 32-bit (or get 64-bit
>>>>>>>>>>>>>>>> versions of a lot of other libraries).
>>>>>>>>>>>>>>>> I get various compile errors when I try different things, but
>>>>>>>>>>>>>>>> first let me explain the system we have:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> <snip>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This goes on and on and on actually. And the 'is incompatible
>>>>>>>>>>>>>>>> with i386:x86-64 output' looks to be repeated for every line
>>>>>>>>>>>>>>>> before this error, which actually caused the make to bomb.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any suggestions at all? Surely someone must have tried to force
>>>>>>>>>>>>>>>> OMPI to build in 32-bit mode on a 64-bit machine.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't think anyone has tried to build 32-bit on an Opteron,
>>>>>>>>>>>>>>> which is the cause of the problems...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think I know how to fix this, but it won't happen until later
>>>>>>>>>>>>>>> in the weekend. I can't think of a good workaround until then.
>>>>>>>>>>>>>>> Well, one possibility is to set the target like you were doing
>>>>>>>>>>>>>>> and disable ROMIO. Actually, you'll also need to disable
>>>>>>>>>>>>>>> Fortran 77. So something like:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ./configure [usual options] --build=i586-suse-linux --disable-io-romio --disable-f77
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> might just do the trick.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Brian Barrett
>>>>>>>>>>>>>>> Open MPI developer
>>>>>>>>>>>>>>> http://www.open-mpi.org/
>>>>>>>>>>>>>>>