Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27739 - in trunk: ompi/mca/btl/sm ompi/mca/common/sm ompi/mca/mpool/sm opal/mca/shmem opal/mca/shmem/mmap opal/mca/shmem/posix opal/mca/shmem/sysv opal/mca/shmem/windows
From: Gutierrez, Samuel K (samuel_at_[hidden])
Date: 2013-01-04 17:39:46


Hi George,

Agreed -- I should have referenced the RFC that I sent out last year. Sorry about not reposting/explicitly mentioning the old RFC from about 5 months ago.

I'm willing to sit down with you and others so we can chat further about the change.

Ralph is correct -- the plan is to have only one rank per node send the information required for sm initialization and have the rest consume it.
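
As a rough illustration of that plan (not the code in r27739), the sketch below picks one publishing rank per node. The function name `pick_node_publishers` and the "lowest rank wins" rule are assumptions made for this example; the rank-to-host layout mirrors the 2x2 run on dancer01/dancer02 shown in the log quoted further down.

```python
def pick_node_publishers(rank_to_node):
    """Return {node: rank} with one publishing rank per node.

    Hypothetical rule for illustration: the lowest rank on each node
    publishes the sm setup information; every other rank on that node
    only consumes it.
    """
    publishers = {}
    for rank, node in rank_to_node.items():
        if node not in publishers or rank < publishers[node]:
            publishers[node] = rank
    return publishers


# Layout taken from the log in this thread: ranks 0 and 2 run on
# dancer01, ranks 1 and 3 run on dancer02.
layout = {0: "dancer01", 1: "dancer02", 2: "dancer01", 3: "dancer02"}
publishers = pick_node_publishers(layout)
consumers = sorted(set(layout) - set(publishers.values()))
print(publishers)
print(consumers)
```

With a scheme like this, the number of publishers grows with the node count rather than with the process count.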

If required, I'm willing to back out the commit until a better way is formulated.

Thanks,

Sam

________________________________________
From: devel-bounces_at_[hidden] [devel-bounces_at_[hidden]] on behalf of George Bosilca [bosilca_at_[hidden]]
Sent: Friday, January 04, 2013 1:57 PM
To: devel_at_[hidden]
Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27739 - in trunk: ompi/mca/btl/sm ompi/mca/common/sm ompi/mca/mpool/sm opal/mca/shmem opal/mca/shmem/mmap opal/mca/shmem/posix opal/mca/shmem/sysv opal/mca/shmem/windows

Sam,

This is a major change and would have deserved an RFC, as it imposes a drastic, non-scalable change (up to now the backend file creation was centralized; now, in addition, we exchange the data through the modex). A quick look highlights the fact that quite a lot of new modex entries have appeared after this patch. On a 4-process run (2x2) we got more than 20 entries, each of them up to 32 bytes (the list is attached at the end of this email).

Clearly this new approach is significantly less scalable than the old one. In the past we had issues adding a single integer per process; I fail to understand how our standards changed so much that a few hundred bytes per process have now become acceptable. Moreover, what benefit does this change provide in exchange for this loss of scalability?
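
For a sense of scale, here is a back-of-the-envelope bound, offered as a sketch rather than a measurement. It assumes 5 sm keys per publishing process (the number of distinct `btl.sm.*` keys in the list in the PS) and the 32-byte ceiling mentioned above. It ignores key names and any per-entry metadata, and `modex_bytes_upper_bound` is a name invented for this example.

```python
def modex_bytes_upper_bound(publishing_procs, keys_per_proc=5, max_bytes_per_key=32):
    """Upper bound on sm payload bytes pushed into the modex.

    Assumptions: every publishing process stores `keys_per_proc` entries,
    and no entry exceeds `max_bytes_per_key` bytes.
    """
    return publishing_procs * keys_per_proc * max_bytes_per_key


# Per publishing process: 5 keys * 32 bytes.
print(modex_bytes_upper_bound(1))
# If all 4 processes of the 2x2 run published.
print(modex_bytes_upper_bound(4))
# If only one process per node (2 nodes) published.
print(modex_bytes_upper_bound(2))
```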

  George.

PS: The exhaustive list of new SM-related modex entries:
[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
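
A list like the one above is easier to reason about once it is tallied. The following is a small sketch of a parser for these `db:hash:store` lines; the regular expression is inferred from the lines in this email only, and the function name `tally` and the "local"/"pointer" labels are choices made for this example, not Open MPI terminology.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this email; other Open MPI
# versions or verbosity levels may format these lines differently.
LINE_RE = re.compile(
    r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] "
    r"\[\[\d+,\d+\],(?P<rank>\d+)\] db:hash:store: "
    r"storing (?P<kind>pointer of key|key) "
    r"(?P<key>\S+?)\[(?P<type>\w+)\] "
    r"for proc \[\[\d+,\d+\],(?P<owner>\d+)\]$"
)


def tally(lines):
    """Count local stores vs. pointer stores, and collect keys per owner rank."""
    counts = Counter()
    keys_by_owner = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a db:hash:store line
        kind = "pointer" if m["kind"].startswith("pointer") else "local"
        counts[kind] += 1
        keys_by_owner.setdefault(int(m["owner"]), set()).add(m["key"])
    return counts, keys_by_owner


SAMPLE = [
    "[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]",
    "[dancer01:01049] [[50563,1],0] db:hash:store: storing key btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]",
    "[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]",
]

counts, keys_by_owner = tally(SAMPLE)
print(dict(counts))
print({owner: sorted(keys) for owner, keys in keys_by_owner.items()})
```

Fed the full list above instead of `SAMPLE`, the same function separates the 10 "storing key" lines from the 30 "storing pointer of key" lines and shows which ranks own the stored keys.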

On Jan 3, 2013, at 22:52 , svn-commit-mailer_at_[hidden] wrote:

> Author: samuel (Samuel K. Gutierrez)
> Date: 2013-01-03 16:52:20 EST (Thu, 03 Jan 2013)
> New Revision: 27739
> URL: https://svn.open-mpi.org/trac/ompi/changeset/27739
>
> Log:
> sm BTL initialization via modex, as discussed at last year's meeting.
>
> Text files modified:
> trunk/ompi/mca/btl/sm/btl_sm.c | 337 +++++++++++++++++++++--------
> trunk/ompi/mca/btl/sm/btl_sm.h | 60 +++++
> trunk/ompi/mca/btl/sm/btl_sm_component.c | 444 ++++++++++++++++++++++++++++++++++++++-
> trunk/ompi/mca/btl/sm/help-mpi-btl-sm.txt | 6
> trunk/ompi/mca/common/sm/common_sm.c | 92 +++++--
> trunk/ompi/mca/common/sm/common_sm.h | 45 +++
> trunk/ompi/mca/mpool/sm/mpool_sm.h | 17
> trunk/ompi/mca/mpool/sm/mpool_sm_component.c | 111 ++++-----
> trunk/opal/mca/shmem/mmap/shmem_mmap_module.c | 7
> trunk/opal/mca/shmem/posix/shmem_posix_module.c | 9
> trunk/opal/mca/shmem/shmem_types.h | 36 ++
> trunk/opal/mca/shmem/sysv/shmem_sysv_module.c | 11
> trunk/opal/mca/shmem/windows/shmem_windows_module.c | 7
> 13 files changed, 933 insertions(+), 249 deletions(-)
