Just to follow up for the web archives -- we discussed this on the
teleconf yesterday and decided that the assert()s were not the way to
go. Brian was going to hack up a quick check at the end of OB1
add_procs that checks each btl's eager_limit, etc. Terry would expand
this to cover dr and csum.
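
For anyone finding this in the archives later, here is a rough sketch of the
kind of check we have in mind, together with the unsigned wrap-around that
Terry describes in the quoted mail below. It is an illustration only, not the
code that will go into OB1: example_hdr_t is a made-up stand-in for
mca_pml_ob1_hdr_t (its 32-byte size is arbitrary), and both function names
are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for mca_pml_ob1_hdr_t. The real header has a
 * different layout and size; 32 bytes is chosen only for illustration. */
typedef struct {
    unsigned char bytes[32];
} example_hdr_t;

/* Mirrors the subtraction described in the report. size_t is unsigned, so
 * when btl_eager_limit is smaller than the header the result wraps around
 * to a very large value instead of going negative. */
static size_t naive_payload_limit(size_t btl_eager_limit)
{
    return btl_eager_limit - sizeof(example_hdr_t);
}

/* Hypothetical check: returns 1 if the limit leaves room for at least one
 * byte of user data after the header, otherwise prints a warning to stderr
 * and returns 0. */
static int eager_limit_is_usable(const char *btl_name, size_t btl_eager_limit)
{
    if (btl_eager_limit <= sizeof(example_hdr_t)) {
        fprintf(stderr,
                "%s: eager limit %zu must be larger than the %zu-byte header\n",
                btl_name, btl_eager_limit, sizeof(example_hdr_t));
        return 0;
    }
    return 1;
}
```

With this stand-in header, naive_payload_limit(16) wraps to a huge value,
while eager_limit_is_usable("openib", 16) reports the problem up front so the
caller can skip or reject the BTL before the subtraction is ever done.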
On Jul 16, 2009, at 10:10 AM, Terry Dontje wrote:
> Another way to do this, which I am not sure makes sense, is to just add
> sizeof(mca_pml_ob1_hdr_t) to the btl_eager_limit passed in by the
> user, thus defining the limit to be specifically for the user and not
> for the internal headers, which the user may not have any inkling
> about. However, that may lead the user to not realize there is a man
> behind the curtain bumping up the limit for the internal headers.
> Terry Dontje wrote:
> > I was playing around with some really silly fragment sizes (sub 72
> > bytes) when I ran into some asserts in the btl_openib_sendi. I found
> > the assert to be caused by mca_pml_ob1_send_request_start_btl()
> > calculating the true eager_limit with the following line:
> > size_t eager_limit = btl->btl_eager_limit -
> > If btl_eager_limit ends up being less than
> > sizeof(mca_pml_ob1_hdr_t), the unsigned subtraction wraps around, so
> > the calculated eager_limit is a very large number, which triggers an
> > assert later on in the stack.
> > It seems to me that it would be nice to insert some checks in
> > mca_btl_base_param_register() to make sure btl_eager_limit is >
> > sizeof(mca_pml_ob1_hdr_t). Am I missing a reason why this was not
> > done in the first place?
> > --td
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel