I stumbled across a serious bug in the tuned component of Open MPI.
It crashes, for example, the well-known HPL benchmark when used in
conjunction with the "native MPI_Bcast() patch".
The problem is in the function ompi_coll_tuned_bcast_intra_chain(),
which essentially does the following:
    ompi_ddt_type_size( datatype, &typelng );
    segcount = segsize / typelng;
    num_segments = count / segcount;
Whenever you have a constructed type with a size larger than 'segsize'
(16384), the integer division 'segsize / typelng' yields a 'segcount'
of zero, and the computation of 'num_segments' then divides by zero.
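
To illustrate the arithmetic, here is a minimal sketch of one possible
guard. It is not the actual Open MPI code or a proposed patch:
safe_num_segments() is a hypothetical helper, and clamping 'segcount' to
1 (one element per segment) is only an assumed policy for types larger
than 'segsize'. The final division is kept exactly as in the excerpt
above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Open MPI. It repeats the computation
 * quoted above, but never lets segcount reach zero. */
static size_t safe_num_segments(size_t count, size_t typelng, size_t segsize)
{
    size_t segcount;

    /* Zero-size datatypes are outside the scope of this sketch. */
    assert(typelng > 0);

    /* Integer division: the result is 0 whenever typelng > segsize. */
    segcount = segsize / typelng;

    /* Assumed policy: send at least one element per segment. */
    if (segcount == 0) {
        segcount = 1;
    }

    /* Same division as in the excerpt, now with segcount >= 1. */
    return count / segcount;
}
```

With count = 10, typelng = 32768 and segsize = 16384, the unguarded code
computes segcount = 0 and then divides by zero, whereas this sketch
returns 10.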