Open MPI Development Mailing List Archives

From: Timothy S. Woodall (twoodall_at_[hidden])
Date: 2005-11-08 19:06:50


George,

The BLACS test code was actually calling MPI_Pack to pack the data
into a contiguous buffer, and then called MPI_Isend w/ datatype
of MPI_PACKED. So, the convertor used by the PML/BTLs treated this as
contiguous data, and allowed the PML/BTL to split it however they
liked...
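
To illustrate why an arbitrary split matters, here is a minimal sketch
(not Open MPI code) that checks whether the cut points of a fragmented
byte stream fall inside a basic element. The helper names are
hypothetical; the fragment sizes 956 and 3604 and the 16-byte element
size are taken from the trace quoted below.

```python
ELEMENT_SIZE = 16  # long double size on the machine in the quoted trace


def cut_points(fragment_sizes):
    """Return the byte offsets at which a contiguous stream is cut."""
    offsets = []
    position = 0
    # The last fragment ends the stream, so it does not add a cut point.
    for size in fragment_sizes[:-1]:
        position += size
        offsets.append(position)
    return offsets


def misaligned_cuts(fragment_sizes, element_size=ELEMENT_SIZE):
    """Return (offset, bytes_into_element) for each cut inside an element."""
    return [
        (offset, offset % element_size)
        for offset in cut_points(fragment_sizes)
        if offset % element_size != 0
    ]


print(misaligned_cuts([956, 3604]))  # [(956, 12)]
```

A split at 960 bytes instead of 956 would land on an element boundary
and report no misaligned cut.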

Your fix should correct this, as a single convertor is used on each
side for pack/unpack. This will also help w/ the buffered send case,
which essentially did the same.

Thanks!
Tim

> I fixed the problem we had with BLACS. As it looks like everybody
> believes it was a data-type issue, I fixed it in the DDT engine.
> However, as I explained this morning on the phone conference (and
> nobody believed it), the problem was triggered by the way the
> convertor was used. For me it's an easy fix at the DDT layer that
> will allow BTL developers to pay less attention to the way they
> pack/unpack data ... but it is not the way the DDT was designed.
>
> Here is the explanation of what went wrong inside:
> BLACS creates a triangular matrix using an indexed type. The memory
> layout of this data-type is composed of several contiguous buffers
> with some gaps in between. The problem we had was the following:
> 1. on the sender side, pack was called with a buffer large enough to
> hold all the data.
> 2. on the receiver side, unpack was called twice with different
> iovecs. Even though the total length of the 2 iovecs was correct,
> the first one happened to be too short, making the convertor stop
> in the middle of a basic type. And that was not the way the
> convertor was designed to work.
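
The layout described above can be modelled with a short sketch. This
is an illustration only, not the DDT engine: the block lengths (96 to
384 bytes in steps of 16) and the 400-byte stride between blocks are
read off the memcpy arguments in the pack trace, and the function
names are hypothetical.

```python
ELEMENT_SIZE = 16      # long double size in the trace
ROW_STRIDE = 400       # distance between source addresses in the trace
FIRST_ROW_BYTES = 96   # size of the first memcpy in the trace
ROWS = 19              # number of memcpy calls in the pack trace


def indexed_layout():
    """Return (displacement, length) pairs: contiguous blocks with gaps."""
    return [
        (row * ROW_STRIDE, FIRST_ROW_BYTES + row * ELEMENT_SIZE)
        for row in range(ROWS)
    ]


def pack(memory, layout):
    """Gather every block of the layout into one contiguous buffer."""
    packed = bytearray()
    for displacement, length in layout:
        packed += memory[displacement:displacement + length]
    return bytes(packed)


layout = indexed_layout()
extent = layout[-1][0] + layout[-1][1]
memory = bytes(i % 251 for i in range(extent))  # arbitrary test pattern
packed = pack(memory, layout)
print(len(packed))  # 4560
```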
>
> Here is the output of the DDT engine for SM.
>
> First the pack side:
>
> [applebasket.cs.utk.edu:16760] ompi_convertor_generic_simple_pack
> ( 0xbfffc104, {0x2811430, 4560}, 1 )
> [applebasket.cs.utk.edu:16760] unpack start pos_desc 0 count_desc 6
> disp 0
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811430, 0xac650,
> 96 ) => space 4560
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811490, 0xac7e0,
> 112 ) => space 4464
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811500, 0xac970,
> 128 ) => space 4352
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811580, 0xacb00,
> 144 ) => space 4224
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811610, 0xacc90,
> 160 ) => space 4080
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28116b0, 0xace20,
> 176 ) => space 3920
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811760, 0xacfb0,
> 192 ) => space 3744
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811820, 0xad140,
> 208 ) => space 3552
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28118f0, 0xad2d0,
> 224 ) => space 3344
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28119d0, 0xad460,
> 240 ) => space 3120
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811ac0, 0xad5f0,
> 256 ) => space 2880
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811bc0, 0xad780,
> 272 ) => space 2624
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811cd0, 0xad910,
> 288 ) => space 2352
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811df0, 0xadaa0,
> 304 ) => space 2064
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2811f20, 0xadc30,
> 320 ) => space 1760
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812060, 0xaddc0,
> 336 ) => space 1440
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x28121b0, 0xadf50,
> 352 ) => space 1104
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812310, 0xae0e0,
> 368 ) => space 752
> [applebasket.cs.utk.edu:16760] pack 1. memcpy( 0x2812480, 0xae270,
> 384 ) => space 384
> [applebasket.cs.utk.edu:16760] pack end_loop count 1 stack_pos 0
> pos_desc 19 disp 0 space 0
>
> As you can see there is one pack operation with a buffer of 4560
> bytes ... exactly the size of the whole data. Even though pack takes
> care not to cut a basic type in the middle, in this particular
> case it has enough space to do its job correctly.
>
> The receiver side looks a little bit different:
>
> [applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack
> ( 0x280bf04, {0x229e15c, 956}, 1 )
> [applebasket.cs.utk.edu:16758] unpack start pos_desc 0 count_desc 6
> disp 0
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac650, 0x229e15c,
> 96 ) => space 956
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac7e0, 0x229e1bc,
> 112 ) => space 860
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xac970, 0x229e22c,
> 128 ) => space 748
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacb00, 0x229e2ac,
> 144 ) => space 620
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacc90, 0x229e33c,
> 160 ) => space 476
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xace20, 0x229e3dc,
> 176 ) => space 316
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xacfb0, 0x229e48c,
> 128 ) => space 140
> [applebasket.cs.utk.edu:16758] Losing 12 bytes !!!
> [applebasket.cs.utk.edu:16758] unpack save stack stack_pos 1 pos_desc
> 6 count_desc 4 disp 128
> [applebasket.cs.utk.edu:16758] ompi_convertor_generic_simple_unpack
> ( 0x280bf04, {0x229e158, 3604}, 1 )
> [applebasket.cs.utk.edu:16758] unpack start pos_desc 6 count_desc 4
> disp 128
> stack_pos 0 pos_desc -1 count_desc 1 disp 0
> [applebasket.cs.utk.edu:16758] unpack pending from the last unpack 12
> out of 16 bytes
> [applebasket.cs.utk.edu:16758] unpack 1. memcpy( 0xad030, 0x280bf4c,
> 16 ) => space 16
> ... (skipped)
>
> We can see the trace of 2 unpack operations, one with a size of 956
> bytes and the other with 3604. In the middle of the trace above you
> can notice the "Losing 12 bytes !!!" message. The basic type here is
> a long double (16 bytes on this machine), so we definitely stopped in
> the middle of a basic type.
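
The 12-byte figure can be checked by walking the block lengths against
the first 956-byte iovec. The sketch below is only arithmetic over the
sizes printed in the trace; `walk_fragment` is a hypothetical helper,
not a convertor function.

```python
ELEMENT_SIZE = 16  # long double size reported in the message

# Block lengths from the memcpy sizes in the pack trace: 96, 112, ..., 384.
BLOCK_LENGTHS = [96 + 16 * row for row in range(19)]


def walk_fragment(block_lengths, available, element_size=ELEMENT_SIZE):
    """Return (full_blocks, bytes_copied_in_partial_block, stranded_bytes)."""
    full_blocks = 0
    for length in block_lengths:
        if available < length:
            break
        available -= length
        full_blocks += 1
    # Only whole elements can be copied out of what is left.
    copied = (available // element_size) * element_size
    stranded = available - copied
    return full_blocks, copied, stranded


print(walk_fragment(BLOCK_LENGTHS, 956))  # (6, 128, 12)
```

Six full blocks use 816 bytes, leaving 140: eight whole elements (128
bytes) plus 12 stranded bytes. The seventh block holds 12 elements, so
4 remain, which appears to match the "count_desc 4 disp 128" saved in
the trace.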
>
> A correct usage of the convertor could prevent such problems. Anyway,
> the convertor will now remember this kind of error and will
> automatically correct it (the cost is just an if in the critical
> path and some extra memory in the convertor struct).
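
One way to picture the fix is an unpacker that stashes the bytes of an
incomplete element and completes it on the next call. The sketch below
is an assumption about the general technique, not the actual convertor
change: the class, its fields, and the layout constants are all
hypothetical, with sizes borrowed from the trace.

```python
ELEMENT_SIZE = 16

# Hypothetical layout modelled on the trace: 19 blocks, 400 bytes apart.
LAYOUT = [(row * 400, 96 + row * ELEMENT_SIZE) for row in range(19)]


class Unpacker:
    """Scatter a packed stream into a layout, tolerating arbitrary splits."""

    def __init__(self, layout, element_size=ELEMENT_SIZE):
        # Destination offset of every element, in packed order.
        self.targets = [
            disp + k
            for disp, length in layout
            for k in range(0, length, element_size)
        ]
        self.element_size = element_size
        self.next_element = 0
        self.pending = bytearray()  # bytes of an incomplete element

    def unpack(self, memory, fragment):
        """Unpack one fragment; return how many bytes were left pending."""
        data = bytes(self.pending) + bytes(fragment)
        usable = len(data) - len(data) % self.element_size
        for start in range(0, usable, self.element_size):
            dest = self.targets[self.next_element]
            memory[dest:dest + self.element_size] = (
                data[start:start + self.element_size]
            )
            self.next_element += 1
        # Remember the tail instead of losing it.
        self.pending = bytearray(data[usable:])
        return len(self.pending)


extent = LAYOUT[-1][0] + LAYOUT[-1][1]
source = bytes(i % 251 for i in range(extent))
packed = b"".join(source[d:d + n] for d, n in LAYOUT)

received = bytearray(extent)
unpacker = Unpacker(LAYOUT)
first = unpacker.unpack(received, packed[:956])   # same split as the trace
second = unpacker.unpack(received, packed[956:])
print(first, second)  # 12 0
```

The extra state is one small buffer per unpacker, which is in the
spirit of the "some extra memory in the convertor struct" cost
mentioned above.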
>
> george.
>
> "Half of what I say is meaningless; but I say it so that the other
> half may reach you"
> Kahlil Gibran