Open MPI User's Mailing List Archives


From: Carsten Kutzner (ckutzne_at_[hidden])
Date: 2006-01-10 10:42:50


Hi Graham,

thanks for fixing it so fast! I have attached a 128-CPU (= 32 nodes * 4
CPUs) slog file that tests the Open MPI tuned all-to-all for a message
size of 4096 floats (16384 bytes), where the execution times vary
between 0.12 and 0.43 seconds.

Summary (25-run average, timer resolution 0.000001):
      4096 floats took 0.205353 (0.090916) seconds. Min: 0.129327 max: 0.430769
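(The summary line above reports the mean, the standard deviation in
parentheses, and the min/max over the 25 runs. A minimal sketch of how
such a line could be produced, with made-up toy timings rather than the
real measurements, and a hypothetical `summarize` helper:)

```python
# Sketch: computing a benchmark summary line like the one above.
# The `summarize` helper and the toy data are illustrative, not the
# actual benchmark code, which is an MPI program.
import statistics

def summarize(times):
    """Return (mean, population stddev, min, max) for a list of run times."""
    mean = statistics.mean(times)
    stdev = statistics.pstdev(times)  # assumption: population std dev
    return mean, stdev, min(times), max(times)

# toy data, not the real 25 measurements
times = [0.13, 0.21, 0.43, 0.18, 0.20]
mean, stdev, tmin, tmax = summarize(times)
print(f"4096 floats took {mean:.6f} ({stdev:.6f}) seconds. "
      f"Min: {tmin:.6f} max: {tmax:.6f}")
```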

Since the all-to-all works well for 256 and 512 floats with the changes in
the decision function, only a minor problem for messages >= 4096
floats remains. One can probably live with that; however, it would be
nice to figure out what exactly causes the delays. Do you see this
behaviour on other clusters as well? I have tested on 3 different clusters
so far, but they all show the same behaviour (however, they actually
are all connected with an HP2848 switch). Could you perhaps get any hint
from the 32-CPU logfile I sent?
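
(For completeness, since the .mca conf file comes up below: forcing a
particular tuned all-to-all algorithm by hand would go through an MCA
parameter file. A sketch only; the parameter names are assumed from the
coll_tuned component and should be verified with `ompi_info --param coll tuned`
for the version in use:)

```
# $HOME/.openmpi/mca-params.conf  (sketch; parameter names assumed)
coll_tuned_use_dynamic_rules = 1
coll_tuned_alltoall_algorithm = 2
```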

Beste Gruesse,
  Carsten

On Sat, 7 Jan 2006, Graham E Fagg wrote:

> Hi Carsten,
> Oops, sorry! There was a memory bug created by me misusing one of my own
> collective topo functions, which I think was corrupting the MPE logging
> buffers (and who knows what else). Anyway, it should be fixed in the next
> nightly build/tarball.
>
> G
> On Fri, 6 Jan 2006, Carsten Kutzner wrote:
>
> > On Fri, 6 Jan 2006, Graham E Fagg wrote:
> >
> >>> Looks like the problem is somewhere in the tuned collectives?
> >>> Unfortunately I need a logfile with exactly those :(
> >>>
> >>> Carsten
> >>
> >> I hope not. Carsten, can you send me your configure line (not the whole
> >> log) and any other things you set in your .mca conf file? Is this with
> >> the changed (custom) decision function or the standard one?
> >
> > I get the problems with the custom decision function as well as without. Today
> > I downloaded a clean tarball 1.1a1r8626 and changed nothing. I simply
> > configure with
> >
> > $ ./configure --prefix=/home/ckutzne/ompi1.1a1r8626-gcc331
> >
> > Then make all install and that's it. I tried both gcc 3.3.1 and gcc 4.0.2.
> >
> > Then I install MPE from mpe2.tar.gz with
> > ./configure MPI_CC=/home/ckutzne/ompi1.1a1r8626-gcc331/bin/mpicc \
> > CC=/usr/bin/gcc \
> > MPI_F77=/home/ckutzne/ompi1.1a1r8626-gcc331/bin/mpif77 \
> > F77=/usr/bin/gcc \
> > --prefix=/home/ckutzne/mpe2-ompi1.1a1r8626-gcc331
> > make
> > make install
> > make installcheck --> ok!
> >
> > I did not set anything in an .mca conf file (do I have to?)
> >
> > Carsten
> >
> >
>
>
> Thanks,
> Graham.
> ----------------------------------------------------------------------
> Dr Graham E. Fagg | Distributed, Parallel and Meta-Computing
> Innovative Computing Lab. PVM3.4, HARNESS, FT-MPI, SNIPE & Open MPI
> Computer Science Dept | Suite 203, 1122 Volunteer Blvd,
> University of Tennessee | Knoxville, Tennessee, USA. TN 37996-3450
> Email: fagg_at_[hidden] | Phone:+1(865)974-5790 | Fax:+1(865)974-8296
> Broken complex systems are always derived from working simple systems
> ----------------------------------------------------------------------
>