If you are running on a single node, then btl=openib,sm,self would be
equivalent to btl=sm,self. OMPI is smart enough to know not to use IB if you
are on a single node, and instead uses the shared memory subsystem.
Are you saying that the inclusion of openib is causing a difference in
behavior, even though all procs are on the same node?
Just want to ensure I understand the problem.
On Fri, May 1, 2009 at 11:16 AM, Gus Correa <gus_at_[hidden]> wrote:
> Hi OpenMPI and HPC experts
> This may or may not be the right forum to post this,
> and I am sorry to bother those that think it is not.
> I am trying to run the HPL benchmark on our cluster,
> compiling it with Gnu and linking to
> GotoBLAS (1.26) and OpenMPI (1.3.1),
> both also Gnu-compiled.
> I have got failures that suggest a memory leak when the
> problem size is large, but still within the memory limits
> recommended by HPL.
> The problem only happens when "openib" is among the OpenMPI
> MCA parameters (and the problem size is large).
> Any help is appreciated.
> Here is a description of what happens.
> For starters I am trying HPL on a single node, to get a feeling for
> the right parameters (N & NB, P & Q, etc.) on a dual-socket quad-core
> AMD Opteron 2376 "Shanghai".
> The HPL recommendation is to use close to 80% of your physical memory,
> to reach top Gigaflop performance.
> Our physical memory on a node is 16GB, and this gives a problem size
> N=40,000 to keep the 80% memory use.
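The arithmetic behind that N can be sketched as follows. This is only an illustration: `hpl_problem_size` and `isqrt_u64` are hypothetical helpers, not part of HPL, and the sketch assumes the N x N matrix of doubles is the only significant consumer of memory and that 16GB means 16e9 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Integer square root by binary search: largest r with r*r <= x. */
static uint64_t isqrt_u64(uint64_t x)
{
    uint64_t lo = 0, hi = 4294967295ULL;   /* 2^32 - 1 bounds any 64-bit root */
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;   /* mid >= 1 here */
        if (mid <= x / mid)
            lo = mid;        /* mid*mid <= x, root is at least mid */
        else
            hi = mid - 1;    /* mid*mid > x, root is below mid */
    }
    return lo;
}

/* Largest N such that an N x N matrix of doubles fits in `percent`
 * percent of `mem_bytes`. Hypothetical helper, not an HPL routine. */
static uint64_t hpl_problem_size(uint64_t mem_bytes, unsigned percent)
{
    assert(percent > 0 && percent <= 100);
    uint64_t budget = mem_bytes / 100 * percent;   /* bytes allowed for A */
    uint64_t elems  = budget / sizeof(double);     /* doubles that fit */
    return isqrt_u64(elems);
}
```

Calling `hpl_problem_size(16000000000ULL, 80)` gives 40000, matching the figure quoted above. If 16GB is read as 16 GiB instead, the same call yields roughly 41,400.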
> I tried several block sizes, somewhat correlated to the size of the
> processor cache: NB=64 80 96 128 ...
> When I run HPL with N=20,000 or smaller all works fine,
> and the HPL run completes, regardless of whether "openib"
> is present or not on my MCA parameters.
> However, when I move to N=40,000, or even N=35,000,
> the run starts OK with NB=64,
> but as NB is switched to larger values
> the total memory use increases in jumps (as shown by Ganglia),
> and becomes uneven across the processors (as shown by "top").
> The problem happens if "openib" is among the MCA parameters,
> but doesn't happen if I remove "openib" from the MCA list and use
> only "sm,self".
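To see which ranks are growing, rather than reading it off "top", each process could report its own peak resident set size. A minimal sketch, assuming Linux (where `ru_maxrss` is reported in kilobytes); `peak_rss_kb` and `kb_to_gib` are hypothetical helper names, and nothing here is part of HPL or Open MPI.

```c
#include <assert.h>
#include <sys/resource.h>

/* Peak resident set size of the calling process.
 * On Linux the unit is kilobytes. Returns -1 if getrusage fails. */
static long peak_rss_kb(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}

/* Convert kilobytes to GiB for easier comparison with node memory. */
static double kb_to_gib(long kb)
{
    assert(kb >= 0);
    return (double)kb / (1024.0 * 1024.0);
}
```

Printing `kb_to_gib(peak_rss_kb())` from each rank after every NB value would show whether the growth is concentrated in a few processes.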
> For N=35,000, when NB reaches 96 memory use is already above the physical
> memory (16GB), having increased from 12.5GB to over 17GB.
> For N=40,000 the problem happens even earlier, with NB=80.
> At this point memory swapping kicks in,
> and eventually the run dies with memory allocation errors:
> T/V                N    NB     P     Q        Time       Gflops
> WR01L2L4       35000   128     8     1      539.66    5.297e+01
> ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0043992 ......
> HPL ERROR from process # 0, on line 172 of function HPL_pdtest:
> >>> [7,0] Memory allocation failed for A, x and b. Skip. <<<
> The code snippet that corresponds to HPL_pdtest.c is this,
> although the leak is probably somewhere else:
> /* Allocate dynamic memory */
> vptr = (void*)malloc( ( (size_t)(ALGO->align) +
>                         (size_t)(mat.ld+1) * (size_t)(mat.nq) ) *
>                       sizeof(double) );
> info[0] = (vptr == NULL); info[1] = myrow; info[2] = mycol;
> (void) HPL_all_reduce( (void *)(info), 3, HPL_INT, HPL_max,
>                        GRID->all_comm );
> if( info[0] != 0 )
>    if( ( myrow == 0 ) && ( mycol == 0 ) )
>       HPL_pwarn( TEST->outfp, __LINE__, "HPL_pdtest",
>                  "[%d,%d] %s", info[1], info[2],
>                  "Memory allocation failed for A, x and b. Skip." );
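For a sense of scale, the size of that malloc request can be estimated per process. The sketch below only mirrors the expression in the quoted snippet; `hpl_local_bytes` is a hypothetical helper, and the values suggested for `align`, `ld` and `nq` are illustrative guesses, not numbers taken from the failing run.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes requested for one process by the quoted malloc:
 * (align + (ld + 1) * nq) doubles. The arguments stand in for
 * ALGO->align, mat.ld and mat.nq. Hypothetical helper. */
static size_t hpl_local_bytes(int align, int ld, int nq)
{
    assert(align >= 0 && ld > 0 && nq > 0);
    /* Cast before multiplying so the product is computed in size_t. */
    return ((size_t)align + (size_t)(ld + 1) * (size_t)nq) * sizeof(double);
}
```

Assuming N=35,000 on a P=8, Q=1 grid with rows split evenly (ld = 4375, nq = 35001) and align = 8, `hpl_local_bytes(8, 4375, 35001)` comes to about 1.2 GB per process, or roughly 9.8 GB over eight processes. Under those assumptions the request itself is well below 16GB, which would be consistent with the extra memory being consumed elsewhere.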
> I found this continued increase in memory use rather strange,
> and suggestive of a memory leak in one of the codes being used.
> Everything (OpenMPI, GotoBLAS, and HPL)
> was compiled using Gnu only (gcc, gfortran, g++).
> I haven't changed anything on the compiler's memory model,
> i.e., I haven't used or changed the "-mcmodel" flag of gcc
> (I don't know if the Makefiles on HPL, GotoBLAS, and OpenMPI use it.)
> No additional load is present on the node,
> other than the OS (Linux CentOS 5.2); HPL is running alone.
> The cluster has Infiniband.
> However, I am running on a single node.
> The surprising thing is that if I run on shared memory only
> (-mca btl sm,self) there is no memory problem,
> the memory use is stable at about 13.9GB,
> and the run completes.
> So, there is a workaround for running on a single node.
> (Actually shared memory is presumably the way to go on a single node.)
> However, if I introduce IB (-mca btl openib,sm,self)
> among the MCA btl parameters, then memory use blows up.
> This is bad news for me, because I want to extend the experiment
> to run HPL also across the whole cluster using IB,
> which is actually the ultimate goal of HPL, of course!
> It also suggests that the problem is somehow related to Infiniband,
> maybe hidden under OpenMPI.
> Here is the mpiexec command I use (with and without openib):
> /path/to/openmpi/bin/mpiexec \
> -prefix /the/run/directory \
> -np 8 \
> -mca btl [openib,]sm,self \
> Any help, insights, suggestions, reports of previous experiences,
> are much appreciated.
> Thank you,
> Gus Correa