I already asked this on the users list, but was told that it is not the
right place to ask. I came across a "misbehaviour" of Open MPI,
in versions 1.4.5 and 1.6.5 alike.
The mca_pml_ob1_send function keeps allocating memory in the pml free
list, and it does so indefinitely. In my case the list
grew to about 100 GB.
I can control the maximum using the pml_ob1_free_list_max parameter,
but then the application just stops working when this number of entries
in the list is reached.
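For reference, this is the kind of setting meant above, written as it
could appear in the per-user MCA parameter file
($HOME/.openmpi/mca-params.conf). The value 1024 is an arbitrary
placeholder for illustration, not the value from my runs.

```
# $HOME/.openmpi/mca-params.conf
# Cap the number of entries in the ob1 PML free list.
# 1024 is a placeholder value chosen only for this example.
pml_ob1_free_list_max = 1024
```

The same parameter can also be passed on the command line with
"mpirun --mca pml_ob1_free_list_max <value> ...", and
"ompi_info --param pml ob1" lists the ob1 parameters that are available.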
The interesting part is that the growth only happens in a single place
in the code, which is a RECURSIVE SUBROUTINE.
The function called there is MPI_ALLREDUCE(... MPI_SUM).
Apparently it is not easy to create a test program that shows the same
behaviour; recursion alone is not enough.
Is there an MCA parameter that limits the total list size
without making the application stop?
Or is there a way to enforce the lock on the free list entries?
Thanks for all the help