
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25627
From: Shiqing Fan (fan_at_[hidden])
Date: 2011-12-14 08:00:11


Hi George,

Right, I was testing RC1, which has this problem. But now it shouldn't
matter.

Thanks,
Shiqing

On 2011-12-14 1:48 PM, George Bosilca wrote:
> Shiqing,
>
> This file seems to be there.
>
> $ pwd
> /home/bosilca/unstable/1.5/ompi
>
> $ svn info opal/mca/shmem/windows/.windows
> Path: opal/mca/shmem/windows/.windows
> Name: .windows
> URL: https://svn.open-mpi.org/svn/ompi/branches/v1.5/opal/mca/shmem/windows/.windows
> Repository Root: https://svn.open-mpi.org/svn/ompi
> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
> Revision: 25637
> Node Kind: file
> Schedule: normal
> Last Changed Author: bosilca
> Last Changed Rev: 25626
> Last Changed Date: 2011-12-13 12:20:25 -0500 (Tue, 13 Dec 2011)
> Text Last Updated: 2011-12-13 12:20:35 -0500 (Tue, 13 Dec 2011)
> Checksum: ebb6f0135ecdcf7f79d1120046dfb3e6
>
> george.
>
> On Dec 14, 2011, at 05:36 , Shiqing Fan wrote:
>
>> Hi George,
>>
>> A .windows file still seems to be missing in opal/mca/shmem/windows/. Could you also svn add it (from the patch in the shmem ticket)?
>>
>> It is not a source file, but rather a configuration file required by CMake. This change probably doesn't need another RC. :-) Thanks a lot.
>>
>>
>> Regards,
>> Shiqing
>>
>> On 2011-12-13 10:30 PM, bosilca_at_[hidden] wrote:
>>> Author: bosilca
>>> Date: 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>> New Revision: 25627
>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25627
>>>
>>> Log:
>>> Add and remove some of the files needed for the shmem patch.
>>>
>>> Added:
>>> branches/v1.5/ompi/mca/common/sm/common_sm.c
>>> branches/v1.5/ompi/mca/common/sm/common_sm.h
>>> branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>> branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>> Removed:
>>> branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>> branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>>
>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.c
>>> ==============================================================================
>>> --- (empty file)
>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.c 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>> @@ -0,0 +1,387 @@
>>> +/*
>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>> + * University Research and Technology
>>> + * Corporation. All rights reserved.
>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>> + * of Tennessee Research Foundation. All rights
>>> + * reserved.
>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>> + * University of Stuttgart. All rights reserved.
>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>> + * All rights reserved.
>>> + * Copyright (c) 2007 Sun Microsystems, Inc. All rights reserved.
>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc. All rights reserved.
>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>> + * All rights reserved.
>>> + * $COPYRIGHT$
>>> + *
>>> + * Additional copyrights may follow
>>> + *
>>> + * $HEADER$
>>> + */
>>> +
>>> +#include "ompi_config.h"
>>> +
>>> +#ifdef HAVE_STRING_H
>>> +#include <string.h>
>>> +#endif
>>> +
>>> +#include "opal/align.h"
>>> +#include "opal/util/argv.h"
>>> +#if OPAL_ENABLE_FT_CR == 1
>>> +#include "opal/runtime/opal_cr.h"
>>> +#endif
>>> +
>>> +#include "orte/util/name_fns.h"
>>> +#include "orte/util/show_help.h"
>>> +#include "orte/runtime/orte_globals.h"
>>> +#include "orte/mca/errmgr/errmgr.h"
>>> +
>>> +#include "ompi/constants.h"
>>> +#include "ompi/mca/dpm/dpm.h"
>>> +#include "ompi/mca/mpool/sm/mpool_sm.h"
>>> +
>>> +#include "common_sm_rml.h"
>>> +
>>> +/* ASSUMING local process homogeneity with respect to all utilized shared memory
>>> + * facilities. that is, if one local process deems a particular shared memory
>>> + * facility acceptable, then ALL local processes should be able to utilize that
>>> + * facility. as it stands, this is an important point because one process
>>> + * dictates to all other local processes which common sm component will be
>>> + * selected based on its own, local run-time test.
>>> + */
>>> +
>>> +OBJ_CLASS_INSTANCE(
>>> + mca_common_sm_module_t,
>>> + opal_object_t,
>>> + NULL,
>>> + NULL
>>> +);
>>> +
>>> +/* list of RML messages that have arrived that have not yet been
>>> + * consumed by the thread that is looking to complete its component
>>> + * initialization based on the contents of the RML message.
>>> + */
>>> +static opal_list_t pending_rml_msgs;
>>> +/* flag indicating whether or not pending_rml_msgs has been initialized */
>>> +static bool pending_rml_msgs_init = false;
>>> +/* lock to protect multiple instances of mca_common_sm_init() from being
>>> + * invoked simultaneously (because of RML usage).
>>> + */
>>> +static opal_mutex_t mutex;
>>> +/* shared memory information used for initialization and setup. */
>>> +static opal_shmem_ds_t shmem_ds;
>>> +/* number of local processes */
>>> +static size_t num_local_procs = 0;
>>> +/* indicates whether or not i'm the lowest named process */
>>> +static bool lowest_local_proc = false;
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +/* static utility functions */
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +static mca_common_sm_module_t *
>>> +attach_and_init(const char *file_name,
>>> + size_t size_ctl_structure,
>>> + size_t data_seg_alignment)
>>> +{
>>> + mca_common_sm_module_t *map = NULL;
>>> + mca_common_sm_seg_header_t *seg = NULL;
>>> + unsigned char *addr = NULL;
>>> +
>>> + /* map the file and initialize segment state */
>>> + if (NULL == (seg = (mca_common_sm_seg_header_t *)
>>> + opal_shmem_segment_attach(&shmem_ds))) {
>>> + return NULL;
>>> + }
>>> + opal_atomic_rmb();
>>> +
>>> + /* set up the map object */
>>> + if (NULL == (map = OBJ_NEW(mca_common_sm_module_t))) {
>>> + ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>> + return NULL;
>>> + }
>>> +
>>> + /* copy information: from ====> to */
>>> + opal_shmem_ds_copy(&shmem_ds, &map->shmem_ds);
>>> +
>>> + /* the first entry in the file is the control structure. the first
>>> + * entry in the control structure is an mca_common_sm_seg_header_t
>>> + * element
>>> + */
>>> + map->module_seg = seg;
>>> +
>>> + addr = ((unsigned char *)seg) + size_ctl_structure;
>>> + /* if we have a data segment (i.e., if 0 != data_seg_alignment),
>>> + * then make it the first aligned address after the control
>>> + * structure. if the aligned address falls past the end of the
>>> + * segment, that is a programming error in Open MPI!
>>> + */
>>> + if (0 != data_seg_alignment) {
>>> + addr = OPAL_ALIGN_PTR(addr, data_seg_alignment, unsigned char *);
>>> + /* is addr past end of the shared memory segment? */
>>> + if ((unsigned char *)seg + shmem_ds.seg_size < addr) {
>>> + orte_show_help("help-mpi-common-sm.txt", "mmap too small", 1,
>>> + orte_process_info.nodename,
>>> + (unsigned long)shmem_ds.seg_size,
>>> + (unsigned long)size_ctl_structure,
>>> + (unsigned long)data_seg_alignment);
>>> + return NULL;
>>> + }
>>> + }
>>> +
>>> + map->module_data_addr = addr;
>>> + map->module_seg_addr = (unsigned char *)seg;
>>> +
>>> + /* map object successfully initialized - we can safely increment
>>> + * seg_num_procs_inited, which is used to decide when it is
>>> + * safe to call opal_shmem_unlink.
>>> + */
>>> + (void)opal_atomic_add_size_t(&map->module_seg->seg_num_procs_inited, 1);
>>> + opal_atomic_wmb();
>>> +
>>> + return map;
>>> +}
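
The data-segment address above is computed by stepping past the control
structure and rounding up to data_seg_alignment. A minimal standalone sketch
of that rounding (hypothetical helper and numbers, not from the commit;
OPAL_ALIGN_PTR does the equivalent for power-of-two alignments):

    #include <stddef.h>
    #include <stdint.h>

    /* round p up to the next multiple of align (align a power of two) */
    static unsigned char *align_up(unsigned char *p, size_t align)
    {
        uintptr_t v = (uintptr_t)p;
        return (unsigned char *)((v + align - 1) & ~(uintptr_t)(align - 1));
    }

    /* e.g. seg == 0x1000, size_ctl_structure == 72, data_seg_alignment == 64:
     * addr starts at 0x1048 and rounds up to 0x1080; attach_and_init() then
     * checks that 0x1080 is not past seg + shmem_ds.seg_size. */
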
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +mca_common_sm_module_t *
>>> +mca_common_sm_init(ompi_proc_t **procs,
>>> + size_t num_procs,
>>> + size_t size,
>>> + char *file_name,
>>> + size_t size_ctl_structure,
>>> + size_t data_seg_alignment)
>>> +{
>>> + mca_common_sm_module_t *map = NULL;
>>> + bool found_lowest = false;
>>> + size_t p;
>>> + size_t mem_offset;
>>> + ompi_proc_t *temp_proc;
>>> +
>>> + num_local_procs = 0;
>>> + lowest_local_proc = false;
>>> +
>>> + /* o reorder procs array to have all the local procs at the beginning.
>>> + * o look for the local proc with the lowest name.
>>> + * o determine the number of local procs.
>>> + * o ensure that procs[0] is the lowest named process.
>>> + */
>>> + for (p = 0; p < num_procs; ++p) {
>>> + if (OPAL_PROC_ON_LOCAL_NODE(procs[p]->proc_flags)) {
>>> + /* if we don't have a lowest, save the first one */
>>> + if (!found_lowest) {
>>> + procs[0] = procs[p];
>>> + found_lowest = true;
>>> + }
>>> + else {
>>> + /* save this proc */
>>> + procs[num_local_procs] = procs[p];
>>> + /* if we have a new lowest, swap it with position 0
>>> + * so that procs[0] is always the lowest named proc
>>> + */
>>> + if (orte_util_compare_name_fields(ORTE_NS_CMP_ALL,
>>> + &(procs[p]->proc_name),
>>> + &(procs[0]->proc_name)) < 0) {
>>> + temp_proc = procs[0];
>>> + procs[0] = procs[p];
>>> + procs[num_local_procs] = temp_proc;
>>> + }
>>> + }
>>> + /* regardless of the comparisons above, we found
>>> + * another proc on the local node, so increment
>>> + */
>>> + ++num_local_procs;
>>> + }
>>> + }
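
As a concrete illustration of the loop above (hypothetical ranks, not from
the commit): if procs arrives as {p3 (remote), p2, p0, p1} with p0, p1, p2
local and p0 the lowest-named, the loop leaves the array starting with
{p0, p2, p1} and sets num_local_procs to 3. Only procs[0] is guaranteed to
be the lowest-named local proc; the order of the remaining local entries is
unspecified.
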
>>> +
>>> + /* if there are fewer than 2 local processes, there's nothing to do. */
>>> + if (num_local_procs < 2) {
>>> + return NULL;
>>> + }
>>> +
>>> + /* determine whether or not i am the lowest local process */
>>> + lowest_local_proc = (0 == orte_util_compare_name_fields(
>>> + ORTE_NS_CMP_ALL,
>>> + ORTE_PROC_MY_NAME,
>>> + &(procs[0]->proc_name)));
>>> +
>>> + /* lock here to prevent multiple threads from invoking this
>>> + * function simultaneously. the critical section we're protecting
>>> + * is usage of the RML in this block.
>>> + */
>>> + opal_mutex_lock(&mutex);
>>> +
>>> + if (!pending_rml_msgs_init) {
>>> + OBJ_CONSTRUCT(&(pending_rml_msgs), opal_list_t);
>>> + pending_rml_msgs_init = true;
>>> + }
>>> + /* figure out if i am the lowest rank in the group.
>>> + * if so, i will create the shared memory backing store
>>> + */
>>> + if (lowest_local_proc) {
>>> + if (OPAL_SUCCESS == opal_shmem_segment_create(&shmem_ds, file_name,
>>> + size)) {
>>> + map = attach_and_init(file_name, size_ctl_structure,
>>> + data_seg_alignment);
>>> + if (NULL != map) {
>>> + mem_offset = map->module_data_addr -
>>> + (unsigned char *)map->module_seg;
>>> + map->module_seg->seg_offset = mem_offset;
>>> + map->module_seg->seg_size = size - mem_offset;
>>> + opal_atomic_init(&map->module_seg->seg_lock,
>>> + OPAL_ATOMIC_UNLOCKED);
>>> + map->module_seg->seg_inited = 0;
>>> + }
>>> + else {
>>> + /* fail!
>>> + * only invalidate the shmem_ds. doing so will let the rest
>>> + * of the local processes know that the lowest local rank
>>> + * failed to properly initialize the shared memory segment, so
>>> + * they should try to carry on without shared memory support
>>> + */
>>> + OPAL_SHMEM_DS_INVALIDATE(&shmem_ds);
>>> + }
>>> + }
>>> + }
>>> +
>>> + /* send shmem info to the rest of the local procs. */
>>> + if (OMPI_SUCCESS != mca_common_sm_rml_info_bcast(
>>> + &shmem_ds, procs, num_local_procs,
>>> + OMPI_RML_TAG_SM_BACK_FILE_CREATED,
>>> + lowest_local_proc, file_name,
>>> + &(pending_rml_msgs))) {
>>> + goto out;
>>> + }
>>> +
>>> + /* are we dealing with a valid shmem_ds? that is, did the lowest
>>> + * process successfully initialize the shared memory segment?
>>> + */
>>> + if (OPAL_SHMEM_DS_IS_VALID(&shmem_ds)) {
>>> + if (!lowest_local_proc) {
>>> + map = attach_and_init(file_name, size_ctl_structure,
>>> + data_seg_alignment);
>>> + }
>>> + else {
>>> + /* wait until every other participating process has attached to the
>>> + * shared memory segment.
>>> + */
>>> + while (num_local_procs > map->module_seg->seg_num_procs_inited) {
>>> + opal_atomic_rmb();
>>> + }
>>> + opal_shmem_unlink(&shmem_ds);
>>> + }
>>> + }
>>> +
>>> +out:
>>> + opal_mutex_unlock(&mutex);
>>> + return map;
>>> +}
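
To summarize the rendezvous implemented above: (1) the lowest-named local
proc creates the backing store and attaches; (2) the resulting shmem_ds (or
an invalidated one, on failure) is broadcast over the RML to the other
local procs; (3) each non-lowest proc attaches, incrementing
seg_num_procs_inited; (4) the lowest proc spins until all num_local_procs
have attached, then unlinks the backing store so it is reclaimed once the
last proc detaches.
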
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +/**
>>> + * this routine is the same as mca_common_sm_init() except that
>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>> + * than an array of procs. unlike mca_common_sm_init(), the
>>> + * group must contain *only* local peers, or this function will return
>>> + * NULL and not create any shared memory segment.
>>> + */
>>> +mca_common_sm_module_t *
>>> +mca_common_sm_init_group(ompi_group_t *group,
>>> + size_t size,
>>> + char *file_name,
>>> + size_t size_ctl_structure,
>>> + size_t data_seg_alignment)
>>> +{
>>> + mca_common_sm_module_t *ret = NULL;
>>> + ompi_proc_t **procs = NULL;
>>> + size_t i;
>>> + size_t group_size;
>>> + ompi_proc_t *proc;
>>> +
>>> + /* if there are fewer than 2 procs, there's nothing to do */
>>> + if ((group_size = ompi_group_size(group)) < 2) {
>>> + goto out;
>>> + }
>>> + else if (NULL == (procs = (ompi_proc_t **)
>>> + malloc(sizeof(ompi_proc_t *) * group_size))) {
>>> + ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>> + goto out;
>>> + }
>>> + /* make sure that all the procs in the group are local */
>>> + for (i = 0; i < group_size; ++i) {
>>> + proc = ompi_group_peer_lookup(group, i);
>>> + if (!OPAL_PROC_ON_LOCAL_NODE(proc->proc_flags)) {
>>> + goto out;
>>> + }
>>> + procs[i] = proc;
>>> + }
>>> + /* let mca_common_sm_init take care of the rest ... */
>>> + ret = mca_common_sm_init(procs, group_size, size, file_name,
>>> + size_ctl_structure, data_seg_alignment);
>>> +out:
>>> + if (NULL != procs) {
>>> + free(procs);
>>> + }
>>> + return ret;
>>> +}
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +/**
>>> + * allocate memory from a previously allocated shared memory
>>> + * block.
>>> + *
>>> + * @param size size of request, in bytes (IN)
>>> + *
>>> + * @retval addr virtual address
>>> + */
>>> +void *
>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>> + size_t *size,
>>> + mca_mpool_base_registration_t **registration)
>>> +{
>>> + mca_mpool_sm_module_t *sm_module = (mca_mpool_sm_module_t *)mpool;
>>> + mca_common_sm_seg_header_t* seg = sm_module->sm_common_module->module_seg;
>>> + void *addr;
>>> +
>>> + opal_atomic_lock(&seg->seg_lock);
>>> + if (seg->seg_offset + *size > seg->seg_size) {
>>> + addr = NULL;
>>> + }
>>> + else {
>>> + size_t fixup;
>>> +
>>> + /* add base address to segment offset */
>>> + addr = sm_module->sm_common_module->module_data_addr + seg->seg_offset;
>>> + seg->seg_offset += *size;
>>> +
>>> + /* fix up seg_offset so the next allocation is aligned on a
>>> + * sizeof(long) boundary. Do it here so that we don't have to
>>> + * adjust before checking the remaining size in the buffer
>>> + */
>>> + if ((fixup = (seg->seg_offset & (sizeof(long) - 1))) > 0) {
>>> + seg->seg_offset += sizeof(long) - fixup;
>>> + }
>>> + }
>>> + if (NULL != registration) {
>>> + *registration = NULL;
>>> + }
>>> + opal_atomic_unlock(&seg->seg_lock);
>>> + return addr;
>>> +}
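
A worked example of the fixup above, assuming sizeof(long) == 8 and a
hypothetical request of *size == 13 arriving at seg_offset == 40:

    addr       = module_data_addr + 40
    seg_offset = 40 + 13 = 53
    fixup      = 53 & 7  = 5    ->    seg_offset = 53 + (8 - 5) = 56

so the returned block starts at offset 40 and the next allocation starts on
an 8-byte (sizeof(long)) boundary at offset 56.
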
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +int
>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module)
>>> +{
>>> + int rc = OMPI_SUCCESS;
>>> +
>>> + if (NULL != mca_common_sm_module->module_seg) {
>>> + if (OPAL_SUCCESS !=
>>> + opal_shmem_segment_detach(&mca_common_sm_module->shmem_ds)) {
>>> + rc = OMPI_ERROR;
>>> + }
>>> + }
>>> + return rc;
>>> +}
>>> +
>>>
>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.h
>>> ==============================================================================
>>> --- (empty file)
>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.h 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>> @@ -0,0 +1,163 @@
>>> +/*
>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>> + * University Research and Technology
>>> + * Corporation. All rights reserved.
>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>> + * of Tennessee Research Foundation. All rights
>>> + * reserved.
>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>> + * University of Stuttgart. All rights reserved.
>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>> + * All rights reserved.
>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc. All rights reserved.
>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>> + * All rights reserved.
>>> + * $COPYRIGHT$
>>> + *
>>> + * Additional copyrights may follow
>>> + *
>>> + * $HEADER$
>>> + */
>>> +
>>> +#ifndef _COMMON_SM_H_
>>> +#define _COMMON_SM_H_
>>> +
>>> +#include "ompi_config.h"
>>> +
>>> +#include "opal/mca/mca.h"
>>> +#include "opal/class/opal_object.h"
>>> +#include "opal/class/opal_list.h"
>>> +#include "opal/sys/atomic.h"
>>> +#include "opal/mca/shmem/shmem.h"
>>> +
>>> +#include "ompi/mca/mpool/mpool.h"
>>> +#include "ompi/proc/proc.h"
>>> +#include "ompi/group/group.h"
>>> +#include "ompi/mca/btl/base/base.h"
>>> +#include "ompi/mca/btl/base/btl_base_error.h"
>>> +
>>> +BEGIN_C_DECLS
>>> +
>>> +struct mca_mpool_base_module_t;
>>> +
>>> +typedef struct mca_common_sm_seg_header_t {
>>> + /* lock to control atomic access */
>>> + opal_atomic_lock_t seg_lock;
>>> + /* indicates whether or not the segment is ready for use */
>>> + volatile int32_t seg_inited;
>>> + /* number of local processes that are attached to the shared memory segment.
>>> + * this is primarily used as a way of determining whether or not it is safe
>>> + * to unlink the shared memory backing store. for example, once
>>> + * seg_num_procs_inited is equal to the number of local processes, then we
>>> + * can safely unlink.
>>> + */
>>> + volatile size_t seg_num_procs_inited;
>>> + /* offset to next available memory location available for allocation */
>>> + size_t seg_offset;
>>> + /* total size of the segment */
>>> + size_t seg_size;
>>> +} mca_common_sm_seg_header_t;
>>> +
>>> +typedef struct mca_common_sm_module_t {
>>> + /* double link list element */
>>> + opal_list_item_t module_item;
>>> + /* pointer to header embedded in the shared memory segment */
>>> + mca_common_sm_seg_header_t *module_seg;
>>> + /* base address of the segment */
>>> + unsigned char *module_seg_addr;
>>> + /* base address of data segment */
>>> + unsigned char *module_data_addr;
>>> + /* shared memory backing facility object that encapsulates shmem info */
>>> + opal_shmem_ds_t shmem_ds;
>>> +} mca_common_sm_module_t;
>>> +
>>> +OBJ_CLASS_DECLARATION(mca_common_sm_module_t);
>>> +
>>> +/**
>>> + * This routine is used to set up a shared memory segment (whether
>>> + * it's an mmaped file or a SYSV IPC segment). It is assumed that
>>> + * the shared memory segment does not exist before any of the current
>>> + * set of processes try and open it.
>>> + *
>>> + * @param procs - array of (ompi_proc_t *)'s to create this shared
>>> + * memory segment for. This array must be writable; it may be edited
>>> + * (in undefined ways) if the array contains procs that are not on
>>> + * this host. It is assumed that the caller will simply free this
>>> + * array upon return. (INOUT)
>>> + *
>>> + * @param num_procs - length of the procs array (IN)
>>> + *
>>> + * @param size - size of the segment, in bytes (IN)
>>> + *
>>> + * @param file_name - unique string identifier of this segment (IN)
>>> + *
>>> + * @param size_ctl_structure size of the control structure at
>>> + * the head of the segment. The control structure
>>> + * is assumed to have mca_common_sm_seg_header_t
>>> + * as its first segment (IN)
>>> + *
>>> + * @param data_seg_alignment alignment of the data segment. this
>>> + * follows the control structure. If this
>>> + * value is 0, then assume that there will
>>> + * be no data segment following the control
>>> + * structure. (IN)
>>> + *
>>> + * @return pointer to control structure at head of shared memory segment.
>>> + */
>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>> +mca_common_sm_init(ompi_proc_t **procs,
>>> + size_t num_procs,
>>> + size_t size,
>>> + char *file_name,
>>> + size_t size_ctl_structure,
>>> + size_t data_seg_alignment);
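
A minimal caller sketch of this interface (hypothetical sizes and file name,
not from the commit; procs/num_procs assumed in scope, and getpagesize()
from unistd.h used as an example alignment): create or attach a 4 MB segment
shared by the local procs in procs, with the control structure at its head
and the data segment aligned to the page size.

    mca_common_sm_module_t *mod =
        mca_common_sm_init(procs, num_procs, 4 * 1024 * 1024,
                           "sm_backing_file",
                           sizeof(mca_common_sm_seg_header_t),
                           getpagesize());
    if (NULL == mod) {
        /* fewer than 2 local procs, or the segment could not be set up:
         * carry on without shared memory support */
    }
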
>>> +
>>> +/**
>>> + * This routine is used to set up a shared memory segment (whether
>>> + * it's an mmaped file or a SYSV IPC segment). It is assumed that
>>> + * the shared memory segment does not exist before any of the current
>>> + * set of processes try and open it.
>>> + *
>>> + * This routine is the same as mca_common_sm_init() except that
>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>> + * than an array of procs. Unlike mca_common_sm_init(), the
>>> + * group must contain *only* local peers, or this function will return
>>> + * NULL and not create any shared memory segment.
>>> + */
>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>> +mca_common_sm_init_group(ompi_group_t *group,
>>> + size_t size,
>>> + char *file_name,
>>> + size_t size_ctl_structure,
>>> + size_t data_seg_alignment);
>>> +
>>> +/**
>>> + * callback from the sm mpool
>>> + */
>>> +OMPI_DECLSPEC extern void *
>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>> + size_t* size,
>>> + mca_mpool_base_registration_t **registration);
>>> +
>>> +/**
>>> + * This function will release all local resources attached to the
>>> + * shared memory segment. We assume that the operating system will
>>> + * release the memory resources when the last process releases it.
>>> + *
>>> + * @param mca_common_sm_module - instance that is shared between
>>> + * components that use shared memory.
>>> + *
>>> + * @return OMPI_SUCCESS if everything was okay, otherwise return OMPI_ERROR.
>>> + */
>>> +
>>> +OMPI_DECLSPEC extern int
>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module);
>>> +
>>> +/**
>>> + * instance that is shared between components that use shared memory.
>>> + */
>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *mca_common_sm_module;
>>> +
>>> +END_C_DECLS
>>> +
>>> +#endif /* _COMMON_SM_H_ */
>>> +
>>>
>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>> ==============================================================================
>>>
>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>> ==============================================================================
>>>
>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>> ==============================================================================
>>> --- (empty file)
>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.c 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>> @@ -0,0 +1,154 @@
>>> +/*
>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>> + * University Research and Technology
>>> + * Corporation. All rights reserved.
>>> + * Copyright (c) 2004-2011 The University of Tennessee and The University
>>> + * of Tennessee Research Foundation. All rights
>>> + * reserved.
>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>> + * University of Stuttgart. All rights reserved.
>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>> + * All rights reserved.
>>> + * Copyright (c) 2007 Sun Microsystems, Inc. All rights reserved.
>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc. All rights reserved.
>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>> + * All rights reserved.
>>> + * $COPYRIGHT$
>>> + *
>>> + * Additional copyrights may follow
>>> + *
>>> + * $HEADER$
>>> + */
>>> +
>>> +#include "ompi_config.h"
>>> +
>>> +#ifdef HAVE_STRING_H
>>> +#include <string.h>
>>> +#endif
>>> +
>>> +#include "orte/mca/rml/rml.h"
>>> +#include "orte/util/name_fns.h"
>>> +#include "orte/util/show_help.h"
>>> +#include "orte/runtime/orte_globals.h"
>>> +#include "orte/mca/errmgr/errmgr.h"
>>> +
>>> +#include "ompi/constants.h"
>>> +#include "ompi/mca/dpm/dpm.h"
>>> +#include "ompi/mca/common/sm/common_sm_rml.h"
>>> +
>>> +OBJ_CLASS_INSTANCE(
>>> + mca_common_sm_rml_pending_rml_msg_types_t,
>>> + opal_object_t,
>>> + NULL,
>>> + NULL
>>> +);
>>> +
>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>> +/**
>>> + * this routine assumes that the procs array is in the following state:
>>> + * o all the local procs at the beginning.
>>> + * o sorted_procs[0] is the lowest named process.
>>> + */
>>> +int
>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>> + ompi_proc_t **procs,
>>> + size_t num_procs,
>>> + int tag,
>>> + bool bcast_root,
>>> + char *msg_id_str,
>>> + opal_list_t *pending_rml_msgs)
>>> +{
>>> + int rc = OMPI_SUCCESS;
>>> + struct iovec iov[MCA_COMMON_SM_RML_MSG_LEN];
>>> + int iovrc;
>>> + size_t p;
>>> + char msg_id_str_to_tx[OPAL_PATH_MAX];
>>> +
>>> + strncpy(msg_id_str_to_tx, msg_id_str, sizeof(msg_id_str_to_tx) - 1);
>>> +
>>> + /* let the first item be the queueing id name */
>>> + iov[0].iov_base = (ompi_iov_base_ptr_t)msg_id_str_to_tx;
>>> + iov[0].iov_len = sizeof(msg_id_str_to_tx);
>>> + iov[1].iov_base = (ompi_iov_base_ptr_t)ds_buf;
>>> + iov[1].iov_len = sizeof(opal_shmem_ds_t);
>>> +
>>> + /* figure out if i am the root proc in the group.
>>> + * if i am, bcast the message to the rest of the local procs.
>>> + */
>>> + if (bcast_root) {
>>> + opal_progress_event_users_increment();
>>> + /* first num_procs items should be local procs */
>>> + for (p = 1; p < num_procs; ++p) {
>>> + iovrc = orte_rml.send(&(procs[p]->proc_name), iov,
>>> + MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>> + if ((ssize_t)(iov[0].iov_len + iov[1].iov_len) > iovrc) {
>>> + ORTE_ERROR_LOG(ORTE_ERR_COMM_FAILURE);
>>> + opal_progress_event_users_decrement();
>>> + rc = OMPI_ERROR;
>>> + goto out;
>>> + }
>>> + }
>>> + opal_progress_event_users_decrement();
>>> + }
>>> + else { /* i am NOT the root ("lowest") proc */
>>> + opal_list_item_t *item;
>>> + mca_common_sm_rml_pending_rml_msg_types_t *rml_msg;
>>> + /* because a component query can be performed simultaneously in multiple
>>> + * threads, the RML messages may arrive in any order. so first check to
>>> + * see if we previously received a message for me.
>>> + */
>>> + for (item = opal_list_get_first(pending_rml_msgs);
>>> + opal_list_get_end(pending_rml_msgs) != item;
>>> + item = opal_list_get_next(item)) {
>>> + rml_msg = (mca_common_sm_rml_pending_rml_msg_types_t *)item;
>>> + /* was the message for me? */
>>> + if (0 == strcmp(rml_msg->msg_id_str, msg_id_str)) {
>>> + opal_list_remove_item(pending_rml_msgs, item);
>>> + /* from ==============> to */
>>> + opal_shmem_ds_copy(&rml_msg->shmem_ds, ds_buf);
>>> + OBJ_RELEASE(item);
>>> + break;
>>> + }
>>> + }
>>> + /* if we didn't find a message already waiting, block on receiving from
>>> + * the RML.
>>> + */
>>> + if (opal_list_get_end(pending_rml_msgs) == item) {
>>> + do {
>>> + /* bump up the libevent polling frequency while we're in this
>>> + * RML recv, just to ensure we're checking libevent frequently.
>>> + */
>>> + opal_progress_event_users_increment();
>>> + iovrc = orte_rml.recv(&(procs[0]->proc_name), iov,
>>> + MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>> + opal_progress_event_users_decrement();
>>> + if (iovrc < 0) {
>>> + ORTE_ERROR_LOG(ORTE_ERR_RECV_LESS_THAN_POSTED);
>>> + rc = OMPI_ERROR;
>>> + goto out;
>>> + }
>>> + /* was the message for me? if so, we're done */
>>> + if (0 == strcmp(msg_id_str_to_tx, msg_id_str)) {
>>> + break;
>>> + }
>>> + /* if not, put it on the pending list and try again */
>>> + if (NULL == (rml_msg =
>>> + OBJ_NEW(mca_common_sm_rml_pending_rml_msg_types_t)))
>>> + {
>>> + ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>> + rc = OMPI_ERROR;
>>> + goto out;
>>> + }
>>> + /* not for me, so place on list */
>>> + /* from ========> to */
>>> + opal_shmem_ds_copy(ds_buf, &rml_msg->shmem_ds);
>>> + memcpy(rml_msg->msg_id_str, msg_id_str_to_tx, OPAL_PATH_MAX);
>>> + opal_list_append(pending_rml_msgs, &(rml_msg->super));
>>> + } while(1);
>>> + }
>>> + }
>>> +
>>> +out:
>>> + return rc;
>>> +}
>>> +
>>>
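
For reference, the caller side of this routine appears in common_sm.c above:
mca_common_sm_init() passes the backing file name as msg_id_str and
OMPI_RML_TAG_SM_BACK_FILE_CREATED as the tag, with lowest_local_proc
selecting the bcast_root branch. Matching on msg_id_str is what lets a proc
whose component queries run in several threads sort out which arriving
shmem_ds belongs to which segment.
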
>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>> ==============================================================================
>>> --- (empty file)
>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.h 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>> @@ -0,0 +1,65 @@
>>> +/*
>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>> + * University Research and Technology
>>> + * Corporation. All rights reserved.
>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>> + * of Tennessee Research Foundation. All rights
>>> + * reserved.
>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>> + * University of Stuttgart. All rights reserved.
>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>> + * All rights reserved.
>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc. All rights reserved.
>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>> + * All rights reserved.
>>> + * $COPYRIGHT$
>>> + *
>>> + * Additional copyrights may follow
>>> + *
>>> + * $HEADER$
>>> + */
>>> +
>>> +#ifndef _COMMON_SM_RML_H_
>>> +#define _COMMON_SM_RML_H_
>>> +
>>> +#include "ompi_config.h"
>>> +
>>> +#include "opal/mca/mca.h"
>>> +#include "opal/class/opal_object.h"
>>> +#include "opal/class/opal_list.h"
>>> +#include "opal/mca/shmem/base/base.h"
>>> +#include "opal/mca/shmem/shmem.h"
>>> +
>>> +#include "ompi/proc/proc.h"
>>> +#include "ompi/mca/common/sm/common_sm.h"
>>> +
>>> +#define MCA_COMMON_SM_RML_MSG_LEN 2
>>> +
>>> +BEGIN_C_DECLS
>>> +
>>> +/**
>>> + * items on the pending_rml_msgs list
>>> + */
>>> +typedef struct mca_common_sm_rml_pending_rml_msg_types_t {
>>> + opal_list_item_t super;
>>> + char msg_id_str[OPAL_PATH_MAX];
>>> + opal_shmem_ds_t shmem_ds;
>>> +} mca_common_sm_rml_pending_rml_msg_types_t;
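
Each pending entry stashes exactly what the two-part RML message
(MCA_COMMON_SM_RML_MSG_LEN == 2) carries: an OPAL_PATH_MAX-byte identifier
string in iov[0] and the opal_shmem_ds_t payload in iov[1], as set up in
common_sm_rml.c above.
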
>>> +
>>> +/**
>>> + * routine used to send common sm initialization information to all local
>>> + * processes in procs.
>>> + */
>>> +OMPI_DECLSPEC extern int
>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>> + ompi_proc_t **procs,
>>> + size_t num_procs,
>>> + int tag,
>>> + bool bcast_root,
>>> + char *msg_id_str,
>>> + opal_list_t *pending_rml_msgs);
>>> +
>>> +END_C_DECLS
>>> +
>>> +#endif /* _COMMON_SM_RML_H_ */
>>> +
>>> _______________________________________________
>>> svn-full mailing list
>>> svn-full_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>>
>>
>> --
>> ---------------------------------------------------------------
>> Shiqing Fan
>> High Performance Computing Center Stuttgart (HLRS)
>> Tel: ++49(0)711-685-87234 Nobelstrasse 19
>> Fax: ++49(0)711-685-65832 70569 Stuttgart
>> http://www.hlrs.de/organization/people/shiqing-fan/
>> email: fan_at_[hidden]
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
---------------------------------------------------------------
Shiqing Fan
High Performance Computing Center Stuttgart (HLRS)
Tel: ++49(0)711-685-87234      Nobelstrasse 19
Fax: ++49(0)711-685-65832      70569 Stuttgart
http://www.hlrs.de/organization/people/shiqing-fan/
email: fan_at_[hidden]