IIRC, bzero is a legacy BSD-ism rather than ISO C (POSIX marked it obsolete in favor of memset). We should probably use memset instead.
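Something along these lines, as a minimal sketch only. The helper name sm_zero_segment and the SM_ZERO_SEGMENT switch are hypothetical (they are not existing Open MPI symbols or configure options); the only point is that memset(p, 0, n) does the same job as bzero(p, n) and that the call can be compiled out where the OS already zero-fills.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time switch (an assumption, not an Open MPI option):
 * set to 0 on platforms known to zero-fill the mapping themselves. */
#ifndef SM_ZERO_SEGMENT
#define SM_ZERO_SEGMENT 1
#endif

/* Hypothetical helper: zero the first `size` bytes of a mapped segment.
 * memset(p, 0, n) is the ISO C equivalent of bzero(p, n). */
static void sm_zero_segment(unsigned char *data_addr, size_t size)
{
#if SM_ZERO_SEGMENT
    memset(data_addr, 0, size);
#else
    (void)data_addr;   /* nothing to do: the OS is trusted to zero-fill */
    (void)size;
#endif
}
```

In the patch below, the bzero line would then become a call such as sm_zero_segment(map->data_addr, size).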
On Aug 21, 2008, at 5:40 AM, George Bosilca wrote:
> We use the feature defined by POSIX mmap where the area should be
> zero-filled when the file length is extended. What OS are you using
> when you see such problems?
> Just in case, here is a patch that sets the beginning of the mmapped
> region to zero, in case this is not done automatically. As this is
> unnecessary overhead in most cases, we should find the cases
> where we really need it, and possibly compile it conditionally.
> Index: ompi/mca/common/sm/common_sm_mmap.c
> --- ompi/mca/common/sm/common_sm_mmap.c (revision 19377)
> +++ ompi/mca/common/sm/common_sm_mmap.c (working copy)
> @@ -163,6 +163,7 @@
> /* initialize the segment - only the first process
> to open the file */
> + bzero( map->data_addr, size );
> mem_offset = map->data_addr - (unsigned char *)map-
> map->map_seg->seg_offset = mem_offset;
> map->map_seg->seg_size = size - mem_offset;
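To find out which platforms actually need the extra zeroing, a standalone check like the following might help. It is only a sketch under assumptions: the function name, the scratch path under /tmp and the use of ftruncate to extend the file are mine, not taken from common_sm_mmap.c. It extends an empty file, maps it MAP_SHARED, and reports whether every byte reads back as zero.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() and ftruncate() */
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if a freshly extended, freshly mapped file
 * reads back as all zeros, 0 if a non-zero byte is found, and -1 if any
 * system call fails. */
static int mapping_is_zero_filled(size_t size)
{
    char path[] = "/tmp/sm_zero_check_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);   /* the file disappears once the descriptor is closed */

    /* Extend the empty file; POSIX says the new area appears zero-filled. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    close(fd);      /* the mapping stays valid after close() */
    if (p == MAP_FAILED) {
        return -1;
    }

    int zero = 1;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            zero = 0;
            break;
        }
    }
    munmap(p, size);
    return zero;
}
```

A result of 0 on some OS or filesystem would identify a case where the explicit zeroing is really needed. A result of 1 from a single run does not prove much for an intermittent failure, so it would have to be looped.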
> On Aug 21, 2008, at 1:22 PM, Terry Dontje wrote:
>> I've been seeing an intermittent segv (about once every 4 hours when
>> looping on a quick initialization program) with the following stack trace.
>> => mca_btl_sm_add_procs(btl = 0xfffffd7ffdb67ef0, nprocs = 2U,
>> procs = 0x591560, peers = 0x591580, reachability =
>> 0xfffffd7fffdff000), line 519 in "btl_sm.c"
>>  mca_bml_r2_add_procs(nprocs = 2U, procs = 0x591560,
>> bml_endpoints = 0x591500, reachable = 0xfffffd7fffdff000), line 222
>> in "bml_r2.c"
>>  mca_pml_ob1_add_procs(procs = 0x5914c0, nprocs = 2U), line 248
>> in "pml_ob1.c"
>>  ompi_mpi_init(argc = 1, argv = 0xfffffd7fffdff318, requested =
>> 0, provided = 0xfffffd7fffdff234), line 651 in "ompi_mpi_init.c"
>>  PMPI_Init(argc = 0xfffffd7fffdff2ec, argv =
>> 0xfffffd7fffdff2e0), line 90 in "pinit.c"
>>  main(argc = 1, argv = 0xfffffd7fffdff318), line 82 in "buffer.c"
>> I believe the problem is that mca_btl_sm_component.shm_fifo[j]
>> contains uninitialized data, which causes the loop on line 504 in
>> btl_sm.c to think that a remote rank has set its fifo address.
>> Has anyone else seen the above happening?
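To illustrate the failure mode Terry describes, here is a simplified model. It is not the actual btl_sm.c loop: the function name and the slot layout are hypothetical. It assumes a peer is considered ready once its slot in the shared array is non-NULL, so leftover bytes in the segment look exactly like a published fifo address.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the readiness check: count the slots that look
 * "published", i.e. hold a non-NULL fifo address. */
static int count_published(void *const *fifo_slots, size_t nslots)
{
    int published = 0;
    for (size_t i = 0; i < nslots; i++) {
        if (fifo_slots[i] != NULL) {
            published++;   /* garbage bytes are indistinguishable from a real address */
        }
    }
    return published;
}

/* Hypothetical helper: clear the slots before any peer may look at them.
 * Assumes all-bits-zero is the NULL pointer, which holds on mainstream
 * platforms but is not guaranteed by ISO C. */
static void reset_slots(void **fifo_slots, size_t nslots)
{
    memset(fifo_slots, 0, nslots * sizeof *fifo_slots);
}
```

With slots left holding stale bytes, count_published reports peers as ready although none has written anything, and the caller would then dereference a bogus address. After reset_slots it reports zero until a peer really stores its address.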
>> devel mailing list