Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: [OMPI devel] openib btl code review
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-14 17:39:44


https://svn.open-mpi.org/trac/ompi/ticket/1285 turned out to be more
complicated than expected (of course). The startup in the openib btl
mixes resource discovery and initialization (vs. doing discovery,
deciding which hcas/ports/lids to use, initializing them, and then
assigning resources to them), which made the whole "putting the BSRQ
receive_queues value in the INI file" a bit more difficult than
expected -- the code got a bit hairy. I would appreciate some more
eyes on this code before I commit it; thanks.
---------

The attached patch solves two problems:

- allow receive_queues to be specified in the INI file
- detect when multiple different receive_queues are specified and
gracefully abort

However, accomplishing these goals ran into multiple difficulties. By
putting receive_queues in the INI file:

1. we may not find the value until we've already traversed multiple HCAs
2. we may find multiple different receive_queues values

But since the openib btl initializes as it discovers each HCA/port/LID
(including the BSRQ data), if we find a new receive_queues value late
in the discovery process, then all the BSRQ data that was previously
initialized will likely be invalid. So I had to pull all the BSRQ
initialization out until after the rest of the discovery /
initialization process.

Additionally, note that if the user specifies the MCA parameter
btl_openib_receive_queues, it trumps whatever was in the INI file. So
in this case, there can never be a receive_queues conflict.

The attached patch does the following (Jon wrote part of this, too):

- some random style cleanup
- fix a few minor memory leaks
- adapt _ini.c to accept the "receive_queues" field in the file
- move 90% of _setup_qps() from _ini.c to _component.c
- move what was left of _setup_qps() into the main
_register_mca_params() function
- adapt init_one_hca() to detect conflicting receive_queues values
from the INI file
- after the _component.c loop calling init_one_hca():
   - call setup_qps() to parse the final receive_queues string value
   - traverse all resulting btls and initialize their HCAs (if they
weren't already): setup some lists and call prepare_hca_for_use()

I tested this code on a dual-HCA system where I artificially put in
differing receive_queues values in the INI file for the two different
types of HCAs that I have and it all seemed to work. But I'd
appreciate some more eyes on the code to sanity check.

If I hear nothing back by COB tomorrow, I'll commit. Thanks.

-- 
Jeff Squyres
Cisco Systems