
Open MPI User's Mailing List Archives


From: Eric Thibodeau (kyron_at_[hidden])
Date: 2007-02-19 01:09:00


Hi Jeff,

	I just tried with 1.2b4r13690 and the problem is still present. The only notable difference is that CTRL-C gave me "orterun: killing job..." but then hung there until I hit CTRL-\, if that has any bearing on the issue. Again, the command line was:

orterun -np 11 ./perftest-1.3c/mpptest -max_run_time 1800 -bisect -size 0 4096 1 -gnuplot -fname HyperTransport/Global_bisect_0_4096_1.gpl

(the only difference being that I had 11 procs available instead of 9)
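In case it helps anyone reproduce this without babysitting the run: since CTRL-C isn't enough to kill the hung job, the whole thing could be bounded externally with GNU coreutils' timeout(1). A rough sketch (demonstrated with `sleep` standing in for the real orterun command line, which is shown in the comment):

```shell
# Sketch: bound a run externally so a hung orterun gets SIGTERM (and then
# SIGKILL shortly after) instead of needing a manual CTRL-\. The real
# invocation would look like:
#   timeout --kill-after=60 2100 orterun -np 11 ./perftest-1.3c/mpptest ...
# Demonstrated here with `sleep 10` and a 1-second limit:
timeout --kill-after=2 1 sleep 10
status=$?
echo "exit status: $status"   # 124 indicates the time limit fired
```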
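And for reference, here is the build recipe I've been using for mpptest against this Open MPI install, condensed into one sketch (the per-architecture prefix layout is just my own convention; adjust the paths to taste):

```shell
# My per-architecture Open MPI prefix, e.g. /export/home/eric/openmpi_i686
MPI_PREFIX="$HOME/openmpi_$(uname -m)"

# Without this, configure's test binaries die with
# "liborte.so.0: cannot open shared object file"
export LD_LIBRARY_PATH="$MPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Run from the mpptest source tree:
if [ -x ./configure ]; then
    CC="$MPI_PREFIX/bin/mpicc" ./configure --with-mpi="$MPI_PREFIX"
else
    echo "run this from the mpptest-1.3c source directory"
fi
```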

On Friday 16 February 2007 06:50, Jeff Squyres wrote:
> Could you try one of the later nightly 1.2 tarballs? We just fixed a
> shared memory race condition, for example:
>
> http://www.open-mpi.org/nightly/v1.2/
>
>
> On Feb 16, 2007, at 12:12 AM, Eric Thibodeau wrote:
>
> > Hello devs,
> >
> > Thought I would let you know there seems to be a problem with
> > 1.2b3r13112 when running the "bisection" test on a Tyan VX50
> > machine (the 8 dual-core model with 32 GB of RAM).
> >
> > OpenMPI was compiled with (as seen from config.log):
> > configure:116866: running /bin/sh './configure' CFLAGS="-O3 -
> > DNDEBUG -finline-functions -fno-strict-aliasing -pthread"
> > CPPFLAGS=" " FFLAGS="" LDFLAGS=" " --enable-shared --disable-
> > static --prefix=/export/livia/home/parallel/eric/openmpi_x86_64 --
> > with-mpi=open_mpi --cache-file=/dev/null --srcdir=.
> >
> > MPPTEST (1.3c) was compiled with:
> > ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >
> > ...which, for some reason, works fine on that system, which doesn't
> > have any other MPI implementation installed (i.e., no LAM-MPI, as per
> > this thread).
> >
> > Then I ran a few tests, but this one ran past its allowed time (1800
> > seconds; it had been going for over 50 minutes...) and was using up
> > to 16 GB of RAM:
> >
> > orterun -np 9 ./perftest-1.3c/mpptest -max_run_time 1800 -bisect -
> > size 0 4096 1 -gnuplot -fname HyperTransport/
> > Global_bisect_0_4096_1.gpl
> >
> > I had to CTRL-\ the process, as CTRL-C wasn't sufficient. Two mpptest
> > processes and one orterun process were each using 100% of a CPU out
> > of the 16 cores.
> >
> > If any of this can be indicative of an OpenMPI bug and if I can
> > help in tracking it down, don't hesitate to ask for details.
> >
> > And, finally, Anthony, thanks for the MPICC and --with-mpich
> > pointers, I will try those to simplify the build process!
> >
> > Eric
> >
> > On Thursday 15 February 2007 19:51, Anthony Chan wrote:
> >>
> >> As long as mpicc is working, try configuring mpptest as
> >>
> >> mpptest/configure MPICC=<OpenMPI-install-dir>/bin/mpicc
> >>
> >> or
> >>
> >> mpptest/configure --with-mpich=<OpenMPI-install-dir>
> >>
> >> A.Chan
> >>
> >> On Thu, 15 Feb 2007, Eric Thibodeau wrote:
> >>
> >>> Hi Jeff,
> >>>
> >>> Thanks for your response, I eventually figured it out, here is the
> >>> only way I got mpptest to compile:
> >>>
> >>> export LD_LIBRARY_PATH="$HOME/openmpi_`uname -m`/lib"
> >>> CC="$HOME/openmpi_`uname -m`/bin/mpicc" ./configure --with-
> >>> mpi="$HOME/openmpi_`uname -m`"
> >>>
> >>> And yes, I know I should use the mpicc wrapper and all (I do
> >>> RTFM :P ), but mpptest is less than cooperative and hasn't been
> >>> updated lately, AFAIK.
> >>>
> >>> I'll keep you posted as I get some results out (testing TCP/IP as
> >>> well as HyperTransport on a Tyan Beast). So far, LAM-MPI seems less
> >>> efficient at async communications and shows no improvements with
> >>> persistent communications under TCP/IP. OpenMPI, on the other hand,
> >>> seems more efficient using persistent communications in a
> >>> HyperTransport (shmem) environment... I know I am crossing many
> >>> test boundaries, but I will post some PNGs of my results (as well
> >>> as how I got to them ;)
> >>>
> >>> Eric
> >>>
> >>> On Thu, 15 Feb 2007, Jeff Squyres wrote:
> >>>
> >>>> I think you want to add $HOME/openmpi_`uname -m`/lib to your
> >>>> LD_LIBRARY_PATH. This should allow executables created by mpicc
> >>>> (or
> >>>> any derivation thereof, such as extracting flags via showme) to
> >>>> find
> >>>> the Right shared libraries.
> >>>>
> >>>> Let us know if that works for you.
> >>>>
> >>>> FWIW, we do recommend using the wrapper compilers over
> >>>> extracting the
> >>>> flags via --showme whenever possible (it's just simpler and
> >>>> should do
> >>>> what you need).
> >>>>
> >>>>
> >>>> On Feb 15, 2007, at 3:38 PM, Eric Thibodeau wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>>
> >>>>> I have been attempting to compile mpptest on my nodes in vain.
> >>>>> Here
> >>>>> is my current setup:
> >>>>>
> >>>>>
> >>>>> Openmpi is in "$HOME/openmpi_`uname -m`" which translates to "/
> >>>>> export/home/eric/openmpi_i686/". I tried the following approaches
> >>>>> (you can see some of these were out of desperation):
> >>>>>
> >>>>>
> >>>>> CFLAGS=`mpicc --showme:compile` LDFLAGS=`mpicc --showme:link` ./
> >>>>> configure
> >>>>>
> >>>>>
> >>>>> Configure fails on:
> >>>>>
> >>>>> checking whether the C compiler works... configure: error: cannot
> >>>>> run C compiled programs.
> >>>>>
> >>>>>
> >>>>> The log shows that:
> >>>>>
> >>>>> ./a.out: error while loading shared libraries: liborte.so.0:
> >>>>> cannot
> >>>>> open shared object file: No such file or directory
> >>>>>
> >>>>>
> >>>>>
> >>>>> CC="/export/home/eric/openmpi_i686/bin/mpicc" ./configure --with-
> >>>>> mpi=$HOME/openmpi_`uname -m`
> >>>>>
> >>>>> Same problems as above...
> >>>>>
> >>>>>
> >>>>> LDFLAGS="$HOME/openmpi_`uname -m`/lib" ./configure --with-mpi=
> >>>>> $HOME/
> >>>>> openmpi_`uname -m`
> >>>>>
> >>>>>
> >>>>> Configure fails on:
> >>>>>
> >>>>> checking for C compiler default output file name... configure:
> >>>>> error: C compiler cannot create executables
> >>>>>
> >>>>>
> >>>>> And...finally (not that all of this was done in the presented
> >>>>> order):
> >>>>>
> >>>>> ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >>>>>
> >>>>>
> >>>>> Which ends with:
> >>>>>
> >>>>>
> >>>>> checking for library containing MPI_Init... no
> >>>>>
> >>>>> configure: error: Could not find MPI library
> >>>>>
> >>>>>
> >>>>> Anyone can help me with this one...?
> >>>>>
> >>>>>
> >>>>> Note that LAM-MPI is also installed on these systems...
> >>>>>
> >>>>>
> >>>>> Eric Thibodeau
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> users_at_[hidden]
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >
> > --
> > Eric Thibodeau
> > Neural Bucket Solutions Inc.
> > T. (514) 736-1436
> > C. (514) 710-0517
> >
>
>

-- 
Eric Thibodeau
Neural Bucket Solutions Inc.
T. (514) 736-1436
C. (514) 710-0517