Open MPI Development Mailing List Archives

From: Gleb Natapov (glebn_at_[hidden])
Date: 2005-08-10 05:45:54


Hello Tim,

On Tue, Aug 09, 2005 at 10:22:34AM -0600, Timothy B. Prins wrote:
> If you have any other ideas of how to do it please let us know.
>
>

I have to confess I don't much like the current pindown cache implementation,
or perhaps I don't understand it well enough.

What I managed to understand from the code is this:

There are three functions:
int mca_mpool_base_insert(void * addr, size_t size,
                          mca_mpool_base_module_t* mpool,
                          void* user_data,
                          mca_mpool_base_registration_t* registration);
int mca_mpool_base_remove(void * base);
mca_mpool_base_chunk_t* mca_mpool_base_find(void* base);

When a btl registers memory, it inserts the registration into the global cache
by calling mca_mpool_base_insert(). This insertion may shadow a registration of
the same memory from another module, or even from the same module.

mca_mpool_base_remove() removes an address from the cache, but a module has no
way to guarantee that the deleted registration belongs to the module calling
remove.

mca_mpool_base_find() returns the first registration it encounters in the cache.
That registration may not be the best (biggest) one, or it may belong to the
wrong module (one through which the endpoint is not accessible).
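To illustrate the lookup problem, here is a minimal sketch. It is not the Open
MPI code: reg_t, reg_covers(), find_first() and find_best() are hypothetical
names, and the cache is assumed to be a plain array. It only contrasts a
first-match lookup with one that filters by owning module and prefers the
largest covering registration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified registration record (not the Open MPI type). */
typedef struct {
    uintptr_t base;   /* start of the registered region */
    size_t    size;   /* length of the region in bytes */
    int       owner;  /* id of the module that registered it */
} reg_t;

/* Returns 1 if r covers the whole range [addr, addr + len). */
static int reg_covers(const reg_t *r, uintptr_t addr, size_t len)
{
    return addr >= r->base && len <= r->size &&
           addr - r->base <= r->size - len;
}

/* First-match lookup: returns whichever entry containing addr comes first,
 * regardless of its size or of the module that owns it. */
static const reg_t *find_first(const reg_t *cache, size_t n, uintptr_t addr)
{
    assert(cache != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (reg_covers(&cache[i], addr, 1))
            return &cache[i];
    return NULL;
}

/* Owner-aware lookup: the largest entry registered by `owner` that covers
 * the whole requested range, or NULL if that module has none. */
static const reg_t *find_best(const reg_t *cache, size_t n,
                              uintptr_t addr, size_t len, int owner)
{
    const reg_t *best = NULL;
    assert(cache != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (cache[i].owner == owner && reg_covers(&cache[i], addr, len) &&
            (best == NULL || cache[i].size > best->size))
            best = &cache[i];
    return best;
}
```

With a small registration from module 1 stored ahead of larger ones from
module 2, find_first() hands back the module 1 entry even when the caller needs
a module 2 registration covering a longer range.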

Each btl is expected to maintain its own mru list, yet the code is pretty much
the same in each.

The saddest thing is that you can't override the interface in your module. It is
too tightly coupled with the pml (ob1) and the btls. If you don't like the way
the registration cache works, the only way to fix it is to rewrite pml/btl/mpool.

I have some ideas about the interface I would like to see, but perhaps it will
not play nicely with the way ob1 works now. And remember that my view is IB
centric and may be completely wrong for other interconnects. I will be glad to
hear your comments.

I think the cache should be implemented per mpool rather than as a single
global one.

Three functions will be added to mca_mpool_base_module_t:
mpool_insert(mca_mpool_base_module_t, mca_mpool_base_registration_t)
mca_mpool_base_registration_t mpool_find(mca_mpool_base_module_t, void *addr, size_t size)
mpool_put (mca_mpool_base_module_t, mca_mpool_base_registration_t);

Each mpool can override these functions and provide its own cache
implementation, but the base implementation will provide a default one. The
cache will maintain its own mru list.

mca_mpool_base_find(void *addr, size_t length) will iterate through the mpool
list, call mpool_find() on each mpool, and return a list of registrations to the
pml. The pml should call mpool_put() on each registration it no longer needs
(this is needed for proper reference counting).

A btl will call mpool_insert() after mpool_register(). It is possible to merge
these two functions into one.
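A rough sketch of how such a per-mpool cache could look follows. Everything in
it is an assumption made for illustration: mpool_t, reg_t, the default_*
functions, base_find() and the fixed-size slot array are hypothetical and are
not the real Open MPI structures. The mru list and deregistration are left out
to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8               /* arbitrary capacity for this sketch */

typedef struct mpool mpool_t;

/* Hypothetical registration record with a reference count. */
typedef struct reg {
    uintptr_t base;                 /* start of the registered region */
    size_t    size;                 /* length in bytes */
    int       ref_count;            /* cache reference + outstanding finds */
    mpool_t  *owner;                /* mpool whose cache holds this entry */
} reg_t;

/* Hypothetical mpool module: each one carries its own cache and can
 * replace the three function pointers with its own implementation. */
struct mpool {
    reg_t *slots[CACHE_SLOTS];
    size_t used;
    int    (*insert)(mpool_t *mp, reg_t *r);
    reg_t *(*find)(mpool_t *mp, uintptr_t addr, size_t size);
    void   (*put)(mpool_t *mp, reg_t *r);
};

/* Default insert: store the registration; the cache holds one reference. */
static int default_insert(mpool_t *mp, reg_t *r)
{
    if (mp->used == CACHE_SLOTS)
        return -1;
    r->ref_count = 1;
    r->owner = mp;
    mp->slots[mp->used++] = r;
    return 0;
}

/* Default find: largest registration covering [addr, addr + size);
 * the caller receives its own reference and must release it with put(). */
static reg_t *default_find(mpool_t *mp, uintptr_t addr, size_t size)
{
    reg_t *best = NULL;
    for (size_t i = 0; i < mp->used; i++) {
        reg_t *r = mp->slots[i];
        if (addr >= r->base && size <= r->size &&
            addr - r->base <= r->size - size &&
            (best == NULL || r->size > best->size))
            best = r;
    }
    if (best != NULL)
        best->ref_count++;
    return best;
}

/* Default put: drop the reference obtained from find(). */
static void default_put(mpool_t *mp, reg_t *r)
{
    (void)mp;
    assert(r->ref_count > 0);
    r->ref_count--;
}

/* Install the default cache functions; a module may override them later. */
static void mpool_init(mpool_t *mp)
{
    mp->used = 0;
    mp->insert = default_insert;
    mp->find = default_find;
    mp->put = default_put;
}

/* Base-level find: ask every mpool and collect at most one registration
 * per mpool into out[], which must have room for npools entries. */
static size_t base_find(mpool_t **pools, size_t npools,
                        uintptr_t addr, size_t size, reg_t **out)
{
    size_t found = 0;
    for (size_t i = 0; i < npools; i++) {
        reg_t *r = pools[i]->find(pools[i], addr, size);
        if (r != NULL)
            out[found++] = r;
    }
    return found;
}
```

The caller (the pml in the description above) would walk the returned array,
use the registrations it needs, and call put() through each entry's owner when
it is done with it.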

I have code that manages overlapping registrations and I am porting it to
Open MPI now, but without changing the way mpool works it will not be very
useful.

Thanks,

--
			Gleb.