Open MPI User's Mailing List Archives

From: Patrick Jessee (pj_at_[hidden])
Date: 2006-06-30 09:26:21


Jeff,

Thanks for the reply and your attention to this.

>Can you -- and anyone else in
>similar circumstances -- let me know how common this scenario is?

I think this depends on the environment. For us and many other ISVs, it
is very common. The build host is almost always physically different
from the target systems, and the target systems usually have only a
subset of the network hardware for which the application was originally
configured (and may have drivers installed in different places). The
application is configured for all possible interconnects; on each
individual target system (each with possibly different interconnect
types), the specific interconnect is then selected either by user input
or by auto-detection. Thus, we would build for mx, gm, mvapi, openib,
tcp, sm, ...; however, some target systems may have only mx, only
mvapi, or neither.
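
For example, on a target machine that has only plain Ethernet, the
selection step typically boils down to passing the appropriate btl list
to mpirun (the same syntax as in my original mail below); something
like:

   # include only loopback, shared memory, and TCP
   mpirun --mca btl self,tcp,sm -np 4 ./app

   # or exclude the interconnects that are not present on this machine
   mpirun --mca btl ^mx,gm,mvapi,openib -np 4 ./app

(the executable name and process count here are just placeholders).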

Open MPI's MCA seems very well suited to such a development environment
because individual components can be selectively activated at run time
depending on the system. Your idea of applying the filter earlier and
only opening the desired modules sounds like an excellent approach.
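
Just to make sure I'm reading the proposal correctly, here is a rough,
purely illustrative sketch of the two behaviors (this is not the actual
OMPI source; the names and structure are made up):

   /* Illustrative sketch only -- not the actual OMPI component code.
    * "found" holds the component filenames discovered on disk and
    * "wanted" holds the names requested through the MCA filter
    * (e.g. "--mca btl self,tcp,sm"). */
   #include <dlfcn.h>
   #include <stdio.h>
   #include <string.h>

   static int is_wanted(const char *name, const char **wanted, int nwanted)
   {
       for (int i = 0; i < nwanted; ++i)
           if (strstr(name, wanted[i]) != NULL)
               return 1;
       return 0;
   }

   /* Current behavior: dlopen() everything, then filter.  Every
    * dlopen() of an undesired component can still fail noisily if one
    * of its dependencies (libvapi.so, libmyriexpress.so, ...) is
    * missing on this host. */
   void open_then_filter(const char **found, int nfound,
                         const char **wanted, int nwanted)
   {
       for (int i = 0; i < nfound; ++i) {
           void *h = dlopen(found[i], RTLD_NOW);
           if (h == NULL) {
               fprintf(stderr, "unable to open: %s (ignored)\n", dlerror());
               continue;
           }
           if (!is_wanted(found[i], wanted, nwanted))
               dlclose(h);   /* undesired component: close it again */
       }
   }

   /* Proposed behavior: filter first, so undesired components are
    * never dlopen()'d and their missing dependencies are never
    * touched. */
   void filter_then_open(const char **found, int nfound,
                         const char **wanted, int nwanted)
   {
       for (int i = 0; i < nfound; ++i) {
           if (!is_wanted(found[i], wanted, nwanted))
               continue;
           void *h = dlopen(found[i], RTLD_NOW);
           if (h == NULL)
               fprintf(stderr, "unable to open: %s\n", dlerror());
       }
   }

If I understand you correctly, the second form is what you have in
mind, and it would make the "(ignored)" messages disappear for any
component that was filtered out anyway.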

Thanks for considering the issue. Please let me know if I can provide
any more information.

-Patrick

Jeff Squyres (jsquyres) wrote:

>This is due to the way OMPI finds and loads modules. What actually
>happens is that OMPI looks for *all* modules of a given type and
>dlopen's them. It then applies the filter of which components are
>desired and dlclose's all the undesired ones. It certainly would be
>better to apply the filter earlier and only open the desired modules.
>
>We actually identified this behavior quite a while ago, but never put a
>high priority on fixing it because we didn't think it would be much of
>an issue (since most people build/run in homogeneous environments).
>But pending resource availability, I agree that this behavior is
>sub-optimal and should be fixed. I'll enter this issue on the bug
>tracker so that we don't forget about it. Can you -- and anyone else in
>similar circumstances -- let me know how common this scenario is?
>
>There is one workaround, however. The MCA parameter
>mca_component_show_load_errors defaults to a value of 1. When it's 1,
>all warnings regarding the loading of components are displayed (i.e.,
>the messages you're seeing). Setting this value to 0 will disable the
>messages. However, you won't see *any* messages about components not
>loading. For example, if you have components that you think should be
>loading but are not, you won't be notified.
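
(For anyone else hitting this in the meantime: the parameter can
presumably be set like any other MCA parameter, e.g. something along
the lines of

   mpirun --mca mca_component_show_load_errors 0 --mca btl self,tcp,sm ./app

or, through the environment,

   export OMPI_MCA_mca_component_show_load_errors=0

with the caveat Jeff mentions that this also hides load errors you
might actually want to see; the executable name above is only a
placeholder.)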
>
>That being said, these messages are not usually a concern for
>end-users -- they are typically more useful for the OMPI developers. For
>example, if a developer accidentally does something to make a plugin
>un-loadable (e.g., leaves a symbol out), having these messages displayed
>at mpirun time can be *very* useful. Plugins that are shipped in a
>tarball hopefully do not suffer from such issues :-), and usually have
>rpath information compiled into them, so even LD_LIBRARY_PATH issues
>shouldn't be much of a problem.
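
(One quick way to see on a given target machine which shared libraries
an installed component actually needs is to run ldd on it; for example,
something like

   ldd <prefix>/lib/openmpi/mca_btl_mvapi.so | grep "not found"

where <prefix> is the Open MPI installation directory and the component
filename is only an example; that immediately shows whether libvapi.so
can be resolved through the rpath or LD_LIBRARY_PATH on that machine.)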
>
>
>
>
>>-----Original Message-----
>>From: users-bounces_at_[hidden]
>>[mailto:users-bounces_at_[hidden]] On Behalf Of Patrick Jessee
>>Sent: Wednesday, June 28, 2006 5:18 PM
>>To: Open MPI Users
>>Subject: [OMPI users] error messages for btl components that
>>aren't loaded
>>
>>
>>Hello. I'm getting some odd error messages in certain situations
>>associated with the btl components (this happens with both 1.0.2 and
>>1.1). When certain btl components are NOT loaded, Open MPI issues
>>error messages associated with those very components. For instance,
>>consider an application that is built with an Open MPI installation
>>that was configured with mvapi and mx (in addition to tcp,sm,self).
>>If that application is taken to a system that does not have mvapi and
>>mx interconnects installed and is explicitly started for TCP by using
>>"--mca btl self,tcp,sm", then the following comes from Open MPI:
>>
>>[devi01:01659] mca: base: component_find: unable to open: libvapi.so:
>>cannot open shared object file: No such file or directory (ignored)
>>[devi01:01659] mca: base: component_find: unable to open: libvapi.so:
>>cannot open shared object file: No such file or directory (ignored)
>>[devi01:01659] mca: base: component_find: unable to open:
>>libmyriexpress.so: cannot open shared object file: No such file or
>>directory (ignored)
>>[devi02:31845] mca: base: component_find: unable to open: libvapi.so:
>>cannot open shared object file: No such file or directory (ignored)
>>[devi02:31845] mca: base: component_find: unable to open: libvapi.so:
>>cannot open shared object file: No such file or directory (ignored)
>>[devi02:31845] mca: base: component_find: unable to open:
>>libmyriexpress.so: cannot open shared object file: No such file or
>>directory (ignored)
>>
>>These are not fatal, but they definitely give the wrong impression
>>that something is not right. The "--mca btl self,tcp,sm" option should
>>tell Open MPI to load only the loopback, tcp, and shared memory
>>components (as these are the only btl components that should be
>>operational on the system). The mvapi and mx components (which need
>>libvapi.so and libmyriexpress.so, respectively) should not be loaded,
>>and thus libvapi.so and libmyriexpress.so should not be needed or even
>>searched for. The same thing happens with "--mca btl ^mvapi,mx".
>>Interestingly, even on a system that does have MX, the
>>libmyriexpress.so errors show up if the mx btl component is not loaded.
>>
>>Does anyone know (a) why Open MPI is complaining about a shared
>>library from a component that isn't even loaded, and (b) how to avoid
>>the seemingly superfluous error messages? Any help is greatly
>>appreciated.
>>
>>-Patrick
>>
>
>_______________________________________________
>users mailing list
>users_at_[hidden]
>http://www.open-mpi.org/mailman/listinfo.cgi/users