
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: add support for large counts using derived datatypes
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-07-17 11:02:16


On Jul 17, 2013, at 10:48 AM, Nathan Hjelm <hjelmn_at_[hidden]> wrote:

> I must be missing something here. type_size.c contains MPI_Type_size and MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting .so, .dylib, and .a.

If you have a nathan.c file with:

-----
void MPI_Foo() { ... }
void MPI_Bar() { ... }
-----

This will result in defining both symbols in that nathan.o file, which ends up in libmpi.so.

Then if someone writes code like this:

-----
int main() {
    MPI_Init();
    MPI_Foo();
    MPI_Bar();
    MPI_Finalize();
    return 0;
}
-----

And then they interpose their own version of MPI_Bar() with their libinterposition.so, *it won't work* (meaning their version of MPI_Bar() won't be called).

This happens because the linker sees MPI_Foo() in main first and resolves it. When it resolves the MPI_Foo symbol, it pulls *all* symbols out of the .o that MPI_Foo came from (i.e., nathan.o in libmpi.so), including MPI_Bar.

So by the time MPI_Bar is called, it has *already been resolved* to the one in nathan.o/libmpi.so, not the one from libinterposition.so.

Even worse, if they reversed the order of foo/bar in main, the linker would likely give them a duplicate symbol error: it would first resolve MPI_Bar from libinterposition.so, then later resolve MPI_Foo from libmpi.so, which also pulls in MPI_Bar from libmpi.so -- kaboom.
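
For reference, here is a rough sketch of what the interposed MPI_Bar in a libinterposition.so might look like. MPI_Bar and PMPI_Bar are made-up names that just follow the MPI_/PMPI_ profiling convention, the counters are purely illustrative, and PMPI_Bar is stubbed out here only so the snippet compiles on its own (in a real build it would come from libmpi):

```c
#include <stdio.h>

/* Hypothetical stand-in for the library's real implementation.
 * Assumption: in an actual MPI library the PMPI_ entry point lives in
 * libmpi; it is stubbed here so the sketch is self-contained. */
int pmpi_bar_calls = 0;

int PMPI_Bar(void)
{
    pmpi_bar_calls++;
    return 0;
}

/* What the interposition library would export: the same name as the
 * library's MPI_Bar, so the tool's version is meant to run instead. */
int intercepted_bar_calls = 0;

int MPI_Bar(void)
{
    intercepted_bar_calls++;   /* tool-specific bookkeeping */
    fprintf(stderr, "intercepted MPI_Bar (call %d)\n", intercepted_bar_calls);
    return PMPI_Bar();         /* forward to the real implementation */
}
```

The wrapper only does its job if the application's MPI_Bar reference actually binds to this definition, which is exactly what goes wrong in the scenario above.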

Linkers are insanely complicated.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/