On Tue, 2006-11-28 at 10:00 -0700, Li-Ta Lo wrote:
> On Mon, 2006-11-27 at 17:21 -0800, Matt Leininger wrote:
> > On Mon, 2006-11-27 at 16:45 -0800, Matt Leininger wrote:
> > > Has anyone tested OMPI's alltoall at > 2000 MPI tasks? I'm seeing each
> > > MPI task eat up > 1GB of memory (just for OMPI - not the app).
> > I gathered some more data using the alltoall benchmark in mpiBench.
> > mpiBench is pretty smart about how large its buffers are. I set it to
> > use <= 100MB.
> > num nodes  num MPI tasks  system mem   mpibench buffer mem
> > 128        1024           1 GB         65 MB
> > 160        1280           1.2 GB       82 MB
> > 192        1536           1.4 GB       98 MB
> > 224        1792           1.6 GB       57 MB
> > 256        2048           1.6-1.8 GB   < 100 MB
> > The 256-node run was killed by the OOM killer for using too much memory. For
> > all these tests the OMPI alltoall is using 1 GB or more of system
> > memory. I know LANL is looking into optimized alltoall, but is anyone
> > looking into the scalability of the memory footprint?
> I am the one who is looking into those collective communications. Which
> mca/coll are you using for alltoall?
The ompi_info output had some mca/coll information in it. I'm not
sure which mca/coll parameter you are interested in.
> Does the OOM killer kick in when
> calling other collective routines?
I've tested Bcast, Barrier, Allreduce, Gather, Scatter, Reduce,
Allgather, and Alltoall. So far only the Alltoall has this problem.
> If it is a problem caused by SM
> files, all collectives should be affected.