
Open MPI Development Mailing List Archives


From: Josh Aune (luken_at_[hidden])
Date: 2007-08-24 16:18:09

We are using Open MPI on several 1000+ node clusters. We recently
received several new clusters running the Infiniserve 3.X software
stack and are having a number of problems with the vapi btl (yes, I
know, it is very old and shouldn't be used; I couldn't agree more,
but those are my marching orders).

I have a new application that is running into swap for an unknown
reason. If I force it to use the tcp btl instead, it doesn't seem to
run into swap (the job just takes a very, very long time). I have
tried restricting the size of the free lists, forcing send mode, and
using an Open MPI build compiled with no memory manager, but nothing
seems to help. I've profiled with valgrind --tool=massif and the
memtrace capabilities of ptmalloc, but I don't have any smoking guns
yet. It is a Fortran app and I don't know anything about debugging
Fortran memory problems; can someone point me in the proper direction?
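For concreteness, the invocations I've been trying look roughly like
this (a sketch: `./my_app` and the process count are placeholders, and
the mvapi parameter names are my best reading of `ompi_info --param btl
mvapi` on my build, so they may not match yours):

```shell
# Force the tcp btl (avoids swap for me, but is very slow):
mpirun --mca btl tcp,self -np 512 ./my_app

# vapi (mvapi) btl with the free lists capped and RDMA disabled
# (send mode only) -- please correct me if these names are wrong:
mpirun --mca btl mvapi,self \
       --mca btl_mvapi_free_list_max 1024 \
       --mca btl_mvapi_flags 1 \
       -np 512 ./my_app

# Heap profiling with massif on a single rank:
mpirun -np 1 valgrind --tool=massif ./my_app
```

The no-memory-manager build was configured with
`--without-memory-manager`.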