With the mvapi BTL you can now set OMPI_MCA_btl_mvapi_use_srq=1, which
causes mvapi to use a shared receive queue. This allows much better
scaling, because receives are posted per interface port rather than per
queue pair. Note: older versions of Mellanox firmware may see a
substantial performance impact on small-message latency, but the latest
firmware shows only a small cost, on the order of 2/10 of a microsecond.
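
As a minimal sketch of how this setting might be applied, the snippet
below exports the environment variable before launching a job. The
mpirun line is left commented out so the snippet does not require an
MPI installation; the application name ./my_app and the process count
are placeholders, not values from this note. The --mca form relies on
the general Open MPI convention that an OMPI_MCA_<name> environment
variable corresponds to the MCA parameter <name>.

```shell
# Option 1: enable the shared receive queue through the environment.
export OMPI_MCA_btl_mvapi_use_srq=1
echo "btl_mvapi_use_srq=${OMPI_MCA_btl_mvapi_use_srq}"

# Option 2: pass the same MCA parameter on the mpirun command line.
# ./my_app and -np 4 are hypothetical placeholders.
# mpirun -np 4 --mca btl_mvapi_use_srq 1 ./my_app
```

Since older firmware may pay a noticeable small-message latency
penalty, it is probably worth comparing a latency benchmark with the
parameter set to 0 and to 1 before enabling it everywhere.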