Hi Open MPI developers,
I found another issue in Open MPI.
In the MCA_PML_OB1_RECV_FRAG_INIT macro in
ompi/mca/pml/ob1/pml_ob1_recvfrag.h, we copy a PML header from an
arrived message to another buffer:
frag->hdr = *(mca_pml_ob1_hdr_t*)hdr;
For this copy, hdr is cast to mca_pml_ob1_hdr_t, which is a union
of all the actual header structs, such as mca_pml_ob1_match_hdr_t.
The assignment therefore copies as many bytes as the largest header,
even if the arrived message is smaller than that. This can cause a
SEGV if the arrived message is small and lies at the end of a page,
because the read then runs past the message into the next page,
which may not be mapped. My tofu BTL, the BTL component of Fujitsu
MPI for the K computer, actually suffered from this.
The attached patch is one possible fix for this issue.
The fix assumes that the arrived header is at least segs.seg_len
bytes long. This always holds in the current Open MPI code because
hdr equals segs.seg_addr.pval. There may be a smarter fix.
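
To illustrate the idea (this is not the attached patch itself), here is
a minimal standalone sketch of a bounded header copy. All demo_* names
and the struct layouts are hypothetical stand-ins, not the real ob1
definitions. The sketch only assumes that the caller knows how many
bytes actually arrived (seg_len) and never reads more than that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the ob1 header types. The real ones are
 * defined in the ob1 sources and differ in detail. */
typedef struct { unsigned char type; unsigned char flags; } demo_common_hdr_t;

typedef struct {
    demo_common_hdr_t common;
    unsigned short    ctx;
    int               src;
    int               tag;
} demo_match_hdr_t;

typedef struct {
    demo_match_hdr_t   match;
    unsigned long long msg_length;
    void              *src_req;
} demo_rndv_hdr_t;

/* Union of all header kinds: its size is that of the largest member. */
typedef union {
    demo_common_hdr_t common;
    demo_match_hdr_t  match;
    demo_rndv_hdr_t   rndv;
} demo_hdr_t;

/* Copy the arrived header into dst without reading more than seg_len
 * bytes from src. A plain "*dst = *(demo_hdr_t *)src" would always read
 * sizeof(demo_hdr_t) bytes, even when a smaller header arrived.
 * Returns the number of bytes copied. */
static size_t demo_copy_hdr(demo_hdr_t *dst, const void *src, size_t seg_len)
{
    size_t n = seg_len < sizeof(*dst) ? seg_len : sizeof(*dst);

    assert(dst != NULL && src != NULL);
    memset(dst, 0, sizeof(*dst));  /* bytes that did not arrive stay zero */
    memcpy(dst, src, n);
    return n;
}
```

With a match-sized message, demo_copy_hdr(&out, buf, sizeof(demo_match_hdr_t))
touches only the bytes of the match header, so a message that ends at a
page boundary is never read past its end.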
MPI development team,