We rely on the behavior defined by POSIX for mmap: the area should be
zero-filled when the file length is extended. What OS are you using
when you see these problems?
Just in case, here is a patch that sets the beginning of the mmapped
region to zero, should this not be done automatically. Since in most
cases this is unnecessary overhead, we should find the cases where
we really need it, and possibly compile it conditionally.
--- ompi/mca/common/sm/common_sm_mmap.c (revision 19377)
+++ ompi/mca/common/sm/common_sm_mmap.c (working copy)
@@ -163,6 +163,7 @@
/* initialize the segment - only the first process
to open the file */
+ bzero( map->data_addr, size );
mem_offset = map->data_addr - (unsigned char *)map-
map->map_seg->seg_offset = mem_offset;
map->map_seg->seg_size = size - mem_offset;
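To check the zero-fill assumption on a given OS, something like the following could be used. This is only a minimal sketch, not the Open MPI code: map_zeroed_segment, count_nonzero and the force_zero flag are hypothetical names, and memset stands in for the bzero call in the patch. It extends a fresh file with ftruncate, maps it shared, and counts any non-zero bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) the backing file at `path`,
 * extend it to `len` bytes and map it shared. If `force_zero` is non-zero,
 * clear the region explicitly, as the patch does with bzero.
 * Returns NULL on failure. */
static unsigned char *map_zeroed_segment(const char *path, size_t len,
                                         int force_zero)
{
    assert(len > 0);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) {
        return NULL;
    }

    /* POSIX ftruncate: the extended area should read back as zeros. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (addr == MAP_FAILED) {
        return NULL;
    }

    if (force_zero) {
        /* Defensive clear for systems where zero-fill is not trusted. */
        memset(addr, 0, len);
    }
    return (unsigned char *)addr;
}

/* Count bytes that are not zero; 0 means the region is fully zero-filled. */
static size_t count_nonzero(const unsigned char *p, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            n++;
        }
    }
    return n;
}
```

Calling map_zeroed_segment(path, len, 0) and then count_nonzero on the result should report 0 on a conforming system. A non-zero count would point to a platform where the explicit clear is needed, which is where the conditional compilation would apply.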
On Aug 21, 2008, at 1:22 PM, Terry Dontje wrote:
> I've been seeing an intermittent (once every 4 hours looping on a
> quick initialization program) segv with the following stack trace.
> => mca_btl_sm_add_procs(btl = 0xfffffd7ffdb67ef0, nprocs = 2U,
> procs = 0x591560, peers = 0x591580, reachability =
> 0xfffffd7fffdff000), line 519 in "btl_sm.c"
>  mca_bml_r2_add_procs(nprocs = 2U, procs = 0x591560,
> bml_endpoints = 0x591500, reachable = 0xfffffd7fffdff000), line 222
> in "bml_r2.c"
>  mca_pml_ob1_add_procs(procs = 0x5914c0, nprocs = 2U), line 248
> in "pml_ob1.c"
>  ompi_mpi_init(argc = 1, argv = 0xfffffd7fffdff318, requested =
> 0, provided = 0xfffffd7fffdff234), line 651 in "ompi_mpi_init.c"
>  PMPI_Init(argc = 0xfffffd7fffdff2ec, argv = 0xfffffd7fffdff2e0),
> line 90 in "pinit.c"
>  main(argc = 1, argv = 0xfffffd7fffdff318), line 82 in "buffer.c"
> I believe the problem is that mca_btl_sm_component.shm_fifo[j]
> contains uninitialized data, which causes the loop on line 504 in
> btl_sm.c to think that a remote rank has set its fifo address.
> Has anyone else seen the above happening?
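The failure mode described in the quoted message can be illustrated with a simplified sketch. This is not the btl_sm.c code: SKETCH_NPROCS and count_published are hypothetical, and the sketch assumes that an all-bits-zero pointer compares equal to NULL, which holds on common platforms. A slot counts as "published" when it is non-NULL, so garbage in the shared area is indistinguishable from a real address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NPROCS 4 /* hypothetical number of local ranks */

/* Count the slots that look as if a peer has already stored its fifo
 * address. The test is only "non-NULL", so any stale bytes in an
 * uninitialized slot are taken for a valid address. */
static int count_published(void *const *slots, int n)
{
    assert(n >= 0);

    int published = 0;
    for (int j = 0; j < n; j++) {
        if (slots[j] != NULL) {
            published++;
        }
    }
    return published;
}
```

With the array filled with an arbitrary byte pattern, every slot is reported as published even though no rank wrote anything. After the array is cleared, the count drops to zero until a rank actually stores an address.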