Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] custom btl
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-03-31 11:52:26

On Mar 31, 2009, at 11:15 AM, Roberto Ammendola wrote:

> Hi all, I am developing a btl module for a custom interconnect board
> (we call it apelink; it's an academic project), and I am porting the
> module from 1.2 (where it used to work) to the 1.3 branch. Two issues:
> 1) The use of pls_rsh_agent is said to be deprecated. How do I spawn
> the jobs using rsh, then?

The "pls" framework was replaced by the "plm" framework, so
"plm_rsh_agent" should work. It defaults to "ssh : rsh", meaning that
it will look for ssh in your path and use it if found; if not, it
will look for rsh in your path and use that. If neither is found, the
launch fails. To force rsh, set the parameter explicitly, e.g.
"mpirun --mca plm_rsh_agent rsh ...".
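
To make the fallback order concrete, here is a minimal sketch of how a
colon-separated agent list such as "ssh : rsh" could be resolved against
a search path. This is not Open MPI's actual code: find_agent() and
find_in_path() are hypothetical names, and the sketch assumes each entry
is a bare command name with no arguments.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: look for `name` in the colon-separated directory
 * list `path`. On success, writes the full path to `out` and returns 0.
 * access(X_OK) is a simple check; it does not rule out directories. */
static int find_in_path(const char *name, const char *path,
                        char *out, size_t outlen)
{
    const char *p = path;
    while (*p != '\0') {
        size_t len = strcspn(p, ":");
        if (len > 0) {
            int n = snprintf(out, outlen, "%.*s/%s", (int)len, p, name);
            if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0)
                return 0;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    return -1;
}

/* Hypothetical helper: try each entry of `agents` (e.g. "ssh : rsh") in
 * order and return 0 with the first one found in `path`, or -1 if none
 * of them can be found. */
int find_agent(const char *agents, const char *path,
               char *out, size_t outlen)
{
    const char *p = agents;
    while (*p != '\0') {
        char name[256];
        size_t len = strcspn(p, ":");
        const char *start = p;
        const char *end = p + len;

        /* Trim blanks around the entry, so " rsh" becomes "rsh". */
        while (start < end && (*start == ' ' || *start == '\t'))
            start++;
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;

        if (end > start && (size_t)(end - start) < sizeof name) {
            memcpy(name, start, (size_t)(end - start));
            name[end - start] = '\0';
            if (find_in_path(name, path, out, outlen) == 0)
                return 0;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

Calling find_agent("ssh : rsh", getenv("PATH"), buf, sizeof buf) would
then mirror the behavior described above: ssh wins if present, rsh is
the fallback, and a return of -1 corresponds to the launch failing.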

> 2) Although compilation is fine, I get
> [gozer1:18640] mca: base: component_find: "mca_btl_apelink" does not
> appear to be a valid btl MCA dynamic component (ignored)
> even when just running ompi_info. Probably something changed in the
> 1.3 branch regarding DSOs that I should implement in my btl. Any hint?

This is likely due to dlopen failing with your component -- the most
common reason for this is a missing/unresolvable symbol. There's
unfortunately a bug in libtool that doesn't show you the exact symbol
that is unresolvable -- it instead may give a misleading error such as
"file not found". :-(

The way I have gotten around it before is to edit libltdl and add a
printf. :-( Try this patch -- it compiles for me but I haven't
tested it:

--- opal/libltdl/loaders/dlopen.c.~1~ 2009-03-27 08:06:52.000000000
+++ opal/libltdl/loaders/dlopen.c 2009-03-31 11:50:05.000000000 -0400
@@ -195,6 +195,9 @@

    if (!module)
+ const char *error;
+ LT__GETERROR(error);
+ fprintf(stderr, "Can't dlopen %s: %s\n", filename, error);
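
If you would rather not patch libltdl, the same information can often be
obtained by calling dlopen() on the component directly and printing
dlerror(). The following is a minimal sketch, not part of Open MPI:
check_dlopen() is a hypothetical helper name, and it assumes the
component's own library dependencies can be resolved by the loader.
Symbols that the component expects the MPI process to provide already
will also show up as unresolved here, so read the message with that in
mind.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to dlopen `path` with immediate symbol
 * resolution. Returns 0 on success. On failure, returns -1 and copies
 * the loader's own error message (which names the unresolved symbol,
 * if that is the cause) into `errbuf`. */
int check_dlopen(const char *path, char *errbuf, size_t errlen)
{
    void *handle;
    const char *msg;

    if (errlen > 0)
        errbuf[0] = '\0';

    dlerror();  /* clear any stale error state */

    /* RTLD_NOW forces all function symbols to be resolved up front, so
     * a missing symbol is reported here rather than at first call. */
    handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle != NULL) {
        dlclose(handle);
        return 0;
    }

    msg = dlerror();
    snprintf(errbuf, errlen, "%s",
             msg != NULL ? msg : "unknown dlopen error");
    return -1;
}
```

Call it with the full path to the installed mca_btl_apelink shared
object and print errbuf when it returns -1. On older systems you may
need to link with -ldl.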

Jeff Squyres
Cisco Systems