
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Beowulf cluster and openmpi
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-11-05 08:55:44


Sorry for the delayed response - I had to dig into this a little, since
it has been so long since I wrote the bproc support code.

The problem here is with how you named your nodes. On bproc clusters,
the backend nodes are normally named with just a number. Our system
therefore expects node names such as "0", "1", etc., because when we
tell bproc where to launch, all it will accept is a node number.

So the bproc system has no way to launch on a node name given as an IP
address.

What you need to do is alter your hostfile to provide node numbers
from your cluster. Alternatively, what OMPI is actually looking for is
an envar called "NODES" that contains a comma-separated list of nodes.
So you could just add to your environment something like:

export NODES=1,2,3,4

or whatever syntax is appropriate for your shell. Note that the bproc
master is usually node=0.
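For example, with four backend nodes the two approaches would look like
this (the node numbers below are illustrative only - substitute the
numbers from your own cluster):

```shell
# Hostfile approach: list bproc node numbers, not IP addresses.
# (Illustrative numbers; the bproc master is usually node 0.)
cat > machinefile <<'EOF'
1 slots=2
2 slots=2
3 slots=2
4 slots=2
EOF

# Envar approach: skip the hostfile and hand OMPI the allocation directly.
export NODES=1,2,3,4
```

You would then launch with "mpirun -hostfile machinefile ..." for the
first approach, or plain "mpirun ..." for the second.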

Ralph

On Nov 3, 2008, at 3:50 PM, Rima Chaudhuri wrote:

> I added the option for -hostfile machinefile where the machinefile is
> a file with the IP of the nodes:
> #host names
> 192.168.0.100 slots=2
> 192.168.0.101 slots=2
> 192.168.0.102 slots=2
> 192.168.0.103 slots=2
> 192.168.0.104 slots=2
> 192.168.0.105 slots=2
> 192.168.0.106 slots=2
> 192.168.0.107 slots=2
> 192.168.0.108 slots=2
> 192.168.0.109 slots=2
>
>
> [rchaud_at_helios amber10]$ ./step1
> --------------------------------------------------------------------------
> A daemon (pid 29837) launched by the bproc PLS component on node 192
> died
> unexpectedly so we are aborting.
>
> This may be because the daemon was unable to find all the needed
> shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to
> have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 717
> [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 1164
> [helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file
> rmgr_urm.c at line 462
> [helios.structure.uic.edu:29836] mpirun: spawn failed with errno=-1
>
> I used bpsh to check whether the master and one of the nodes, n8, could
> see $LD_LIBRARY_PATH, and they do:
>
> [rchaud_at_helios amber10]$ echo $LD_LIBRARY_PATH
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
>
> [rchaud_at_helios amber10]$ bpsh n8 echo $LD_LIBRARY_PATH
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
>
> thanks!
>
>
> On Mon, Nov 3, 2008 at 3:14 PM, <users-request_at_[hidden]> wrote:
>> Send users mailing list submissions to
>> users_at_[hidden]
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> or, via email, send a message with subject or body 'help' to
>> users-request_at_[hidden]
>>
>> You can reach the person managing the list at
>> users-owner_at_[hidden]
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of users digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: Problems installing in Cygwin - Problem with GCC 3.4.4
>> (Jeff Squyres)
>> 2. switch from mpich2 to openMPI <newbie question> (PattiMichelle)
>> 3. Re: users Digest, Vol 1055, Issue 2 (Ralph Castain)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 3 Nov 2008 15:52:22 -0500
>> From: Jeff Squyres <jsquyres_at_[hidden]>
>> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem
>> with
>> GCC 3.4.4
>> To: "Gustavo Seabra" <gustavo.seabra_at_[hidden]>
>> Cc: Open MPI Users <users_at_[hidden]>
>> Message-ID: <A016B8C4-510B-4FD2-AD3B-A1B6440508F5_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:
>>
>>>> For your fortran issue, the Fortran 90 interface needs the
>>>> Fortran 77
>>>> interface. So you need to supply an F77 as well (the output from
>>>> configure
>>>> should indicate that the F90 interface was disabled because the F77
>>>> interface was disabled).
>>>
>>> Is that what you mean (see below)?
>>
>> Ah yes -- that's another reason the f90 interface could be disabled:
>> if configure detects that the f77 and f90 compilers are not
>> link-compatible.
>>
>>> I thought the g95 compiler could
>>> deal with F77 as well as F95... If so, could I just pass F77='g95'?
>>
>> That would probably work (F77=g95). I don't know the g95 compiler at
>> all, so I don't know if it also accepts Fortran-77-style codes. But
>> if it does, then you're set. Otherwise, specify a different F77
>> compiler that is link compatible with g95 and you should be good.
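If g95 does accept Fortran 77 sources, the configure line quoted later in
this digest would simply gain an F77 setting, something like this (a
sketch; the flags are copied from Gustavo's message, not independently
verified):

```
./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
            --with-mpi-param_check=always --with-threads=posix \
            --enable-mpi-threads --disable-io-romio \
            --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
            --enable-contrib-no-build=vt \
            FC=g95 F77=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++
```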
>>>>> I looked in some places in the OpenMPI code, but I couldn't find
>>>>> "max" being redefined anywhere, but I may be looking in the wrong
>>>>> places. Anyway, the only way I found of compiling OpenMPI was a
>>>>> very
>>>>> ugly hack: I have to go into those files and remove the "std::"
>>>>> before
>>>>> the "max". With that, it all compiled cleanly.
>>>>
>>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use
>>>> std::max.
>>>> What areas did you find that you needed to change?
>>>
>>> These files are part of the standard C++ headers. In my case, they
>>> sit in:
>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits
>>
>> Ah, I see.
>>
>>> In principle, the problems that come from those files would mean
>>> that
>>> the OpenMPI source has some macro redefining max, but that's what I
>>> could not find :-(
>>
>> Gotcha. I don't think we are defining a "max" macro anywhere in the
>> ompi_info source or related header files. :-(
>>
>>>> No. We don't really maintain the "make check" stuff too well.
>>>
>>> Oh well... What do you use for testing the implementation?
>>
>>
>> We have a whole pile of MPI tests in a private SVN repository. The
>> repository is only private because it contains a lot of other
>> people's
>> [public] MPI test suites and benchmarks, and we never looked into
>> redistribution rights for their software. There's nothing really
>> secret about it -- we just haven't bothered to look into the IP
>> issues. :-)
>>
>> We use the MPI Testing Tool (MTT) for nightly regression across the
>> community:
>>
>> http://www.open-mpi.org/mtt/
>>
>> We have weekday and weekend testing schedules. M-Th we do nightly
>> tests; F-Mon morning, we do a long weekend schedule. This weekend,
>> for example, we ran about 675k regression tests:
>>
>> http://www.open-mpi.org/mtt/index.php?do_redir=875
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 03 Nov 2008 12:59:59 -0800
>> From: PattiMichelle <miche1_at_[hidden]>
>> Subject: [OMPI users] switch from mpich2 to openMPI <newbie question>
>> To: users_at_[hidden], patti.sheaffer_at_[hidden]
>> Message-ID: <490F664F.4000000_at_[hidden]>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> I just found out I need to switch from mpich2 to openMPI for some
>> code
>> I'm running. I noticed that it's available in an openSuSE repo (I'm
>> using openSuSE 11.0 x86_64 on a TYAN 32-processor Opteron 8000
>> system),
>> but when I was using mpich2 I seemed to have better luck compiling it
>> from code. This is the line I used:
>>
>> $ F77=/path/to/g95 F90=/path/to/g95 ./configure
>> --prefix=/some/place/mpich2-install
>>
>> But usually I left the "--prefix=" off and just let it install to
>> its
>> default... which is /usr/local/bin and that's nice because it's
>> already
>> in the PATH and very usable. I guess my question is whether or not
>> the
>> defaults and configuration syntax have stayed the same in openMPI. I
>> also could use a "quickstart" guide for a non-programming user
>> (e.g., I
>> think I have to start a daemon before running parallelized programs).
>>
>> THANKS!!!
>> PattiM.
>> -------------- next part --------------
>> HTML attachment scrubbed and removed
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Mon, 3 Nov 2008 14:14:36 -0700
>> From: Ralph Castain <rhc_at_[hidden]>
>> Subject: Re: [OMPI users] users Digest, Vol 1055, Issue 2
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <2FBDF4DC-B2DF-4486-A644-0F18C96E8EB2_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> The problem is that you didn't specify or allocate any nodes for the
>> job. At the least, you need to tell us what nodes to use via a
>> hostfile.
>>
>> Alternatively, are you using a resource manager to assign the nodes?
>> OMPI didn't see anything from one, but it could be that we just
>> didn't
>> see the right envar.
>>
>> Ralph
>>
>> On Nov 3, 2008, at 1:39 PM, Rima Chaudhuri wrote:
>>
>>> Thanks a lot Ralph!
>>> I corrected the no_local to nolocal and now when I try to execute
>>> the
>>> script step1 (pls find it attached)
>>> [rchaud_at_helios amber10]$ ./step1
>>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Not
>>> available
>>> in file ras_bjs.c at line 247
>>> --------------------------------------------------------------------------
>>> There are no available nodes allocated to this job. This could be
>>> because
>>> no nodes were found or all the available nodes were already used.
>>>
>>> Note that since the -nolocal option was given no processes can be
>>> launched on the local node.
>>> --------------------------------------------------------------------------
>>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
>>> out of resource in file base/rmaps_base_support_fns.c at line 168
>>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
>>> out of resource in file rmaps_rr.c at line 402
>>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
>>> out of resource in file base/rmaps_base_map_job.c at line 210
>>> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily
>>> out of resource in file rmgr_urm.c at line 372
>>> [helios.structure.uic.edu:16335] mpirun: spawn failed with errno=-3
>>>
>>>
>>>
>>> If I use the script without the --nolocal option, I get the
>>> following error:
>>> [helios.structure.uic.edu:20708] [0,0,0] ORTE_ERROR_LOG: Not
>>> available
>>> in file ras_bjs.c at line 247
>>>
>>>
>>> thanks,
>>>
>>>
>>> On Mon, Nov 3, 2008 at 2:04 PM, <users-request_at_[hidden]> wrote:
>>>>
>>>> Today's Topics:
>>>>
>>>> 1. Scyld Beowulf and openmpi (Rima Chaudhuri)
>>>> 2. Re: Scyld Beowulf and openmpi (Ralph Castain)
>>>> 3. Problems installing in Cygwin - Problem with GCC 3.4.4
>>>> (Gustavo Seabra)
>>>> 4. Re: MPI + Mixed language coding(Fortran90 + C++) (Jeff Squyres)
>>>> 5. Re: Problems installing in Cygwin - Problem with GCC 3.4.4
>>>> (Jeff Squyres)
>>>>
>>>>
>>>> ----------------------------------------------------------------------
>>>>
>>>> Message: 1
>>>> Date: Mon, 3 Nov 2008 11:30:01 -0600
>>>> From: "Rima Chaudhuri" <rima.chaudhuri_at_[hidden]>
>>>> Subject: [OMPI users] Scyld Beowulf and openmpi
>>>> To: users_at_[hidden]
>>>> Message-ID:
>>>> <7503b17d0811030930i13acb974kc627983a1d481192_at_[hidden]>
>>>> Content-Type: text/plain; charset=ISO-8859-1
>>>>
>>>> Hello!
>>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
>>>> x86_64 Linux Scyld Beowulf cluster in order to make it run with the
>>>> amber10
>>>> MD simulation package.
>>>>
>>>> The nodes can see the home directory i.e. a bpsh to the nodes works
>>>> fine and lists all the files in the home directory where I have
>>>> both
>>>> openmpi and amber10 installed.
>>>> However if I try to run:
>>>>
>>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/
>>>> sander.MPI ........
>>>>
>>>> I get the following error:
>>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
>>>> --------------------------------------------------------------------------
>>>> Failed to find the following executable:
>>>>
>>>> Host: helios.structure.uic.edu
>>>> Executable: -o
>>>>
>>>> Cannot continue.
>>>> --------------------------------------------------------------------------
>>>> [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not
>>>> found in
>>>> file rmgr_urm.c at line 462
>>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with
>>>> errno=-13
>>>>
>>>> any cues?
>>>>
>>>>
>>>> --
>>>> -Rima
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 2
>>>> Date: Mon, 3 Nov 2008 12:08:36 -0700
>>>> From: Ralph Castain <rhc_at_[hidden]>
>>>> Subject: Re: [OMPI users] Scyld Beowulf and openmpi
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Message-ID: <91044A7E-ADA5-4B94-AA11-B3C1D9843606_at_[hidden]>
>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed;
>>>> delsp=yes
>>>>
>>>> For starters, there is no "-no_local" option to mpirun. You might
>>>> want
>>>> to look at mpirun --help, or man mpirun.
>>>>
>>>> I suspect the option you wanted was --nolocal. Note that --nolocal
>>>> does not take an argument.
>>>>
>>>> Mpirun is confused by the incorrect option and is looking for an
>>>> incorrectly named executable.
>>>> Ralph
>>>>
>>>>
>>>> On Nov 3, 2008, at 10:30 AM, Rima Chaudhuri wrote:
>>>>
>>>>> Hello!
>>>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
>>>>> x86_64 Linux Scyld Beowulf cluster in order to make it run with the
>>>>> amber10
>>>>> MD simulation package.
>>>>>
>>>>> The nodes can see the home directory i.e. a bpsh to the nodes
>>>>> works
>>>>> fine and lists all the files in the home directory where I have
>>>>> both
>>>>> openmpi and amber10 installed.
>>>>> However if I try to run:
>>>>>
>>>>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/
>>>>> sander.MPI ........
>>>>>
>>>>> I get the following error:
>>>>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line
>>>>> 247
>>>>> --------------------------------------------------------------------------
>>>>> Failed to find the following executable:
>>>>>
>>>>> Host: helios.structure.uic.edu
>>>>> Executable: -o
>>>>>
>>>>> Cannot continue.
>>>>> --------------------------------------------------------------------------
>>>>> [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not found
>>>>> in
>>>>> file rmgr_urm.c at line 462
>>>>> [helios.structure.uic.edu:23611] mpirun: spawn failed with
>>>>> errno=-13
>>>>>
>>>>> any cues?
>>>>>
>>>>>
>>>>> --
>>>>> -Rima
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 3
>>>> Date: Mon, 3 Nov 2008 14:53:55 -0500
>>>> From: "Gustavo Seabra" <gustavo.seabra_at_[hidden]>
>>>> Subject: [OMPI users] Problems installing in Cygwin - Problem with
>>>> GCC
>>>> 3.4.4
>>>> To: "Open MPI Users" <users_at_[hidden]>
>>>> Message-ID:
>>>> <f79359b60811031153l5591e0f8j49a7e4d9fb02eea3_at_[hidden]>
>>>> Content-Type: text/plain; charset=ISO-8859-1
>>>>
>>>> Hi everyone,
>>>>
>>>> Here's a "progress report"... more questions in the end :-)
>>>>
>>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the
>>>> following configure command:
>>>>
>>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
>>>> --with-mpi-param_check=always --with-threads=posix \
>>>> --enable-mpi-threads --disable-io-romio \
>>>> --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
>>>> --enable-contrib-no-build=vt \
>>>> FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++
>>>>
>>>> I then had a very weird error during compilation of
>>>> ompi/tools/ompi_info/params.cc. (See below).
>>>>
>>>> The lines causing the compilation errors are:
>>>>
>>>> vector.tcc:307: const size_type __len = __old_size +
>>>> std::max(__old_size, __n);
>>>> vector.tcc:384: const size_type __len = __old_size +
>>>> std::max(__old_size, __n);
>>>> stl_bvector.h:522: const size_type __len = size() +
>>>> std::max(size(), __n);
>>>> stl_bvector.h:823: const size_type __len = size() +
>>>> std::max(size(), __n);
>>>>
>>>> (Notice that those are from the standard gcc libraries.)
>>>>
>>>> After googling for a while, I found that this error is caused
>>>> because, at some point, the source code being compiled redefined the
>>>> "max" function with a macro; g++ then cannot recognize the "std::max"
>>>> that appears in those lines and only "sees" a (...), thus printing
>>>> that cryptic complaint.
>>>>
>>>> I looked in some places in the OpenMPI code, but I couldn't find
>>>> "max" being redefined anywhere, but I may be looking in the wrong
>>>> places. Anyway, the only way I found of compiling OpenMPI was a
>>>> very
>>>> ugly hack: I have to go into those files and remove the "std::"
>>>> before
>>>> the "max". With that, it all compiled cleanly.
>>>>
>>>> I did try running the tests in the 'tests' directory (with 'make
>>>> check'), and I didn't get any alarming message, except that in some
>>>> cases (class, threads, peruse) it printed "All 0 tests passed". I
>>>> got
>>>> and "All (n) tests passed" (n>0) for asm and datatype.
>>>>
>>>> Can anybody comment on the meaning of those test results? Should
>>>> I be
>>>> alarmed with the "All 0 tests passed" messages?
>>>>
>>>> Finally, in the absence of big red flags (that I noticed), I went
>>>> ahead and tried to compile my program. However, as soon as
>>>> compilation
>>>> starts, I get the following:
>>>>
>> /local/openmpi/openmpi-1.3b1/bin/mpif90 -c -O3 -fno-second-underscore
>> -ffree-form -o constants.o _constants.f
>>>> --------------------------------------------------------------------------
>>>> Unfortunately, this installation of Open MPI was not compiled with
>> Fortran 90 support. As such, the mpif90 compiler is
>> non-functional.
>>>> --------------------------------------------------------------------------
>>>> make[1]: *** [constants.o] Error 1
>>>> make[1]: Leaving directory `/home/seabra/local/amber11/src/sander'
>>>> make: *** [parallel] Error 2
>>>>
>>>> Notice that I compiled OpenMPI with g95, so there *should* be
>>>> Fortran95 support... Any ideas on what could be going wrong?
>>>>
>>>> Thank you very much,
>>>> Gustavo.
>>>>
>>>> ======================================
>>>> Error in the compilation of params.cc
>>>> ======================================
>>>> $ g++ -DHAVE_CONFIG_H -I. -I../../../opal/include
>>>> -I../../../orte/include -I../../../ompi/include
>>>> -I../../../opal/mca/paffinity/linux/plpa/src/libplpa
>>>> -DOMPI_CONFIGURE_USER="\"seabra\""
>>>> -DOMPI_CONFIGURE_HOST="\"ACS02\""
>>>> -DOMPI_CONFIGURE_DATE="\"Sat Nov 1 20:44:32 EDT 2008\""
>>>> -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\""
>>>> -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O3 -DNDEBUG
>>>> -finline-functions -fno-strict-aliasing \""
>>>> -DOMPI_BUILD_CPPFLAGS="\"-I../../.. -D_REENTRANT\""
>>>> -DOMPI_BUILD_CXXFLAGS="\"-O3 -DNDEBUG -finline-functions \""
>>>> -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../.. -D_REENTRANT\""
>>>> -DOMPI_BUILD_FFLAGS="\"-O0 -fno-second-underscore\""
>>>> -DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\"-export-dynamic
>>>> \"" -DOMPI_BUILD_LIBS="\"-lutil \""
>>>> -DOMPI_CC_ABSOLUTE="\"/usr/bin/gcc\""
>>>> -DOMPI_CXX_ABSOLUTE="\"/usr/bin/g++\""
>>>> -DOMPI_F77_ABSOLUTE="\"/usr/bin/g77\""
>>>> -DOMPI_F90_ABSOLUTE="\"/usr/local/bin/g95\""
>>>> -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../.. -D_REENTRANT -O3
>>>> -DNDEBUG -finline-functions -MT param.o -MD -MP -MF $depbase.Tpo
>>>> -c
>>>> -o param.o param.cc
>>>> In file included from
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/vector:72,
>>>> from ../../../ompi/tools/ompi_info/ompi_info.h:24,
>>>> from param.cc:43:
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In
>>>> member function `void std::vector<bool,
>>>> _Alloc>::_M_insert_range(std::_Bit_iterator, _ForwardIterator,
>>>> _ForwardIterator, std::forward_iterator_tag)':
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:522:
>>>> error: expected unqualified-id before '(' token
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In
>>>> member function `void std::vector<bool,
>>>> _Alloc>::_M_fill_insert(std::_Bit_iterator, size_t, bool)':
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:823:
>>>> error: expected unqualified-id before '(' token
>>>> In file included from
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/vector:75,
>>>> from ../../../ompi/tools/ompi_info/ompi_info.h:24,
>>>> from param.cc:43:
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In
>>>> member function `void std::vector<_Tp,
>>>> _Alloc>::_M_fill_insert(__gnu_cxx::__normal_iterator<typename
>>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, size_t, const _Tp&)':
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:307:
>>>> error: expected unqualified-id before '(' token
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc: In
>>>> member function `void std::vector<_Tp,
>>>> _Alloc>::_M_range_insert(__gnu_cxx::__normal_iterator<typename
>>>> _Alloc::pointer, std::vector<_Tp, _Alloc> >, _ForwardIterator,
>>>> _ForwardIterator, std::forward_iterator_tag)':
>>>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/vector.tcc:384:
>>>> error: expected unqualified-id before '(' token
>>>>
>>>>
>>>> --
>>>> Gustavo Seabra
>>>> Postdoctoral Associate
>>>> Quantum Theory Project - University of Florida
>>>> Gainesville - Florida - USA
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 4
>>>> Date: Mon, 3 Nov 2008 14:54:25 -0500
>>>> From: Jeff Squyres <jsquyres_at_[hidden]>
>>>> Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Message-ID: <45698801-0857-466F-A19D-C529F72D4A18_at_[hidden]>
>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed;
>>>> delsp=yes
>>>>
>>>> Can you replicate the scenario in smaller / different cases?
>>>>
>>>> - write a sample plugin in C instead of C++
>>>> - write a non-MPI Fortran application that loads your C++
>>>> application
>>>> - ...?
>>>>
>>>> In short, *MPI* shouldn't be interfering with Fortran/C++ common
>>>> blocks. Try taking MPI out of the picture and see if that makes
>>>> the
>>>> problem go away.
>>>>
>>>> Those are pretty much shots in the dark, but I don't know where to
>>>> go,
>>>> either -- try random things until you find what you want.
>>>>
>>>>
>>>> On Nov 3, 2008, at 3:51 AM, Rajesh Ramaya wrote:
>>>>
>>>>> Hello Jeff, Gustavo, Mi,
>>>>> Thanks for the advice. I am familiar with the differences in
>>>>> compiler code generation for C, C++ & FORTRAN. I even tried to look
>>>>> at some of the common block symbols. The name of the symbol remains
>>>>> the same. The only difference that I observe is that the
>>>>> FORTRAN-compiled *.o defines the symbol (0000000000515bc0 B
>>>>> aux7loc_), while the C++-compiled code leaves it undefined (U
>>>>> aux7loc_): the memory is not allocated because it has been declared
>>>>> as extern in C++. When the executable loads the shared library it
>>>>> finds all the undefined symbols. At least, if it did not manage to
>>>>> find a single symbol, it would print an undefined symbol error.
>>>>> I am completely stuck and do not know how to continue further.
>>>>>
>>>>> Thanks,
>>>>> Rajesh
>>>>>
>>>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]
>>>>> ]
>>>>> On Behalf Of Mi Yan
>>>>> Sent: Saturday, 1 November 2008 23:26
>>>>> To: Open MPI Users
>>>>> Cc: 'Open MPI Users'; users-bounces_at_[hidden]
>>>>> Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
>>>>>
>>>>> So your tests show:
>>>>> 1. "Shared library in FORTRAN + MPI executable in FORTRAN" works.
>>>>> 2. "Shared library in C++ + MPI executable in FORTRAN " does not
>>>>> work.
>>>>>
>>>>> It seems to me that the symbols in the C library are not really
>>>>> recognized by the FORTRAN executable as you thought. What compilers
>>>>> did
>>>>> you use to build OpenMPI?
>>>>>
>>>>> Different compilers have different conventions for handling
>>>>> symbols. E.g.,
>>>>> if there is a variable "var_foo" in your FORTRAN code, some FORTRAN
>>>>> compilers will save "var_foo_" in the object file by default; if you
>>>>> want to access "var_foo" in C code, you actually need to refer to
>>>>> "var_foo_" in C code. If you define "var_foo" in a module, some
>>>>> FORTRAN compilers may append the module name to "var_foo".
>>>>> So I suggest checking the symbols in the object files generated by
>>>>> your FORTRAN and C compilers to see the difference.
>>>>>
>>>>> Mi
>>>>> "Rajesh Ramaya" <rajesh.ramaya_at_[hidden]>
>>>>> Sent by: users-bounces_at_[hidden]
>>>>> 10/31/2008 03:07 PM
>>>>>
>>>>> Please respond to: Open MPI Users <users_at_[hidden]>
>>>>> To: "'Open MPI Users'" <users_at_[hidden]>, "'Jeff Squyres'" <jsquyres_at_[hidden]>
>>>>> Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
>>>>>
>>>>> Hello Jeff Squyres,
>>>>> Thank you very much for the immediate reply. I am able to
>>>>> successfully
>>>>> access the data from the common block but the values are zero.
>>>>> In my
>>>>> algorithm I even update a common block but the update made by the
>>>>> shared
>>>>> library is not taken into account by the executable. Can you
>>>>> please be very
>>>>> specific about how to make the parallel algorithm aware of the
>>>>> data? Actually, I am
>>>>> not writing any MPI code myself; it's the executable (third-party
>>>>> software)
>>>>> that does that part. All that I am doing is compiling my code with
>>>>> the MPI C
>>>>> compiler and adding it to the LD_LIBRARY_PATH.
>>>>> In fact, I did a simple test by creating a shared library using a
>>>>> FORTRAN
>>>>> code, and the update made to the common block is taken into account
>>>>> by the
>>>>> executable. Is there any flag or pragma that needs to be activated
>>>>> for mixed-language MPI?
>>>>> Thank you once again for the reply.
>>>>>
>>>>> Rajesh
>>>>>
>>>>> -----Original Message-----
>>>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]
>>>>> ]
>>>>> On
>>>>> Behalf Of Jeff Squyres
>>>>> Sent: Friday, 31 October 2008 18:53
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] MPI + Mixed language coding (Fortran90 + C++)
>>>>>
>>>>> On Oct 31, 2008, at 11:57 AM, Rajesh Ramaya wrote:
>>>>>
>>>>>> I am completely new to MPI. I have a basic question concerning
>>>>>> MPI and mixed language coding. I hope any of you could help me
>>>>>> out.
>>>>>> Is it possible to access FORTRAN common blocks from C++ in an
>>>>>> MPI-compiled code? It works without MPI, but as soon as I switch
>>>>>> to MPI the
>>>>>> access of the common blocks does not work anymore.
>>>>>> I have a Linux MPI executable which loads a shared library at
>>>>>> runtime and resolves all undefined symbols, etc. The shared
>>>>>> library
>>>>>> is written in C++ and the MPI executable in written in FORTRAN.
>>>>>> Some
>>>>>> of the inputs that the shared library is looking for are in the
>>>>>> Fortran
>>>>>> common blocks. As I access those common blocks during runtime the
>>>>>> values are not initialized. I would like to know if what I am
>>>>>> doing is possible? I hope that my problem is clear.
>>>>>
>>>>>
>>>>> Generally, MPI should not get in the way of sharing common blocks
>>>>> between Fortran and C/C++. Indeed, in Open MPI itself, we share a
>>>>> few
>>>>> common blocks between Fortran and the main C Open MPI
>>>>> implementation.
>>>>>
>>>>> What is the exact symptom that you are seeing? Is the application
>>>>> failing to resolve symbols at run-time, possibly indicating that
>>>>> something hasn't instantiated a common block? Or are you able to
>>>>> successfully access the data from the common block, but it doesn't
>>>>> have the values you expect (e.g., perhaps you're seeing all
>>>>> zeros)?
>>>>>
>>>>> If the former, you might want to check your build procedure. You
>>>>> *should* be able to simply replace your C++ / F90 compilers with
>>>>> mpicxx and mpif90, respectively, and be able to build an MPI
>>>>> version
>>>>> of your app. If the latter, you might need to make your parallel
>>>>> algorithm aware of what data is available in which MPI process --
>>>>> perhaps not all the data is filled in on each MPI process...?
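That compiler substitution can be sketched as a build fragment
(hypothetical file names, not taken from the actual build; it assumes the
Open MPI wrapper compilers are on the PATH):

```
# Build the C++ plugin and the Fortran app against the same MPI
# installation by using the wrapper compilers for both.
FC  = mpif90
CXX = mpicxx

libplugin.so: plugin.cc
	$(CXX) -fPIC -shared -o $@ $<

app: main.f90 libplugin.so
	$(FC) -o $@ main.f90 -L. -lplugin
```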
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> Cisco Systems
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 5
>>>> Date: Mon, 3 Nov 2008 15:04:47 -0500
>>>> From: Jeff Squyres <jsquyres_at_[hidden]>
>>>> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem
>>>> with
>>>> GCC 3.4.4
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Message-ID: <8E364B51-6726-4533-ADE2-AEA266380DCC_at_[hidden]>
>>>> Content-Type: text/plain; charset=US-ASCII; format=flowed;
>>>> delsp=yes
>>>>
>>>> On Nov 3, 2008, at 2:53 PM, Gustavo Seabra wrote:
>>>>
>>>>> Finally, I was *almost* able to compile OpenMPI in Cygwin using
>>>>> the
>>>>> following configure command:
>>>>>
>>>>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
>>>>> --with-mpi-param_check=always --with-threads=posix \
>>>>> --enable-mpi-threads --disable-io-romio \
>>>>> --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
>>>>> --enable-contrib-no-build=vt \
>>>>> FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++
>>>>
>>>> For your fortran issue, the Fortran 90 interface needs the
>>>> Fortran 77
>>>> interface. So you need to supply an F77 as well (the output from
>>>> configure should indicate that the F90 interface was disabled
>>>> because
>>>> the F77 interface was disabled).
>>>>
>>>>> I then had a very weird error during compilation of
>>>>> ompi/tools/ompi_info/params.cc. (See below).
>>>>>
>>>>> The lines causing the compilation errors are:
>>>>>
>>>>> vector.tcc:307: const size_type __len = __old_size +
>>>>> std::max(__old_size, __n);
>>>>> vector.tcc:384: const size_type __len = __old_size +
>>>>> std::max(__old_size, __n);
>>>>> stl_bvector.h:522: const size_type __len = size() +
>>>>> std::max(size(), __n);
>>>>> stl_bvector.h:823: const size_type __len = size() +
>>>>> std::max(size(), __n);
>>>>>
>>>>> (Notice that those are from the standard gcc libraries.)
>>>>>
>>>>> After googling for a while, I found that this error is caused
>>>>> because, at some point, the source code being compiled redefined the
>>>>> "max" function with a macro; g++ then cannot recognize the
>>>>> "std::max"
>>>>> that appears in those lines and only "sees" a (...), thus printing
>>>>> that cryptic complaint.
>>>>>
>>>>> I looked in some places in the OpenMPI code, but I couldn't find
>>>>> "max" being redefined anywhere, but I may be looking in the wrong
>>>>> places. Anyway, the only way I found of compiling OpenMPI was a
>>>>> very
>>>>> ugly hack: I have to go into those files and remove the "std::"
>>>>> before
>>>>> the "max". With that, it all compiled cleanly.
>>>>
>>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use
>>>> std::max. What areas did you find that you needed to change?
>>>>
>>>>> I did try running the tests in the 'tests' directory (with 'make
>>>>> check'), and I didn't get any alarming message, except that in
>>>>> some
>>>>> cases (class, threads, peruse) it printed "All 0 tests passed". I
>>>>> got
>>>>> an "All (n) tests passed" (n>0) for asm and datatype.
>>>>>
>>>>> Can anybody comment on the meaning of those test results? Should I
>>>>> be
>>>>> alarmed with the "All 0 tests passed" messages?
>>>>
>>>> No. We don't really maintain the "make check" stuff too well.
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>> End of users Digest, Vol 1055, Issue 2
>>>> **************************************
>>>>
>>>
>>>
>>>
>>> --
>>> -Rima
>>> <step1>_______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> End of users Digest, Vol 1055, Issue 4
>> **************************************
>>
>
>
>
> --
> -Rima
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users