Open MPI Development Mailing List Archives

Subject: [OMPI devel] 1.4.4 .so version numbers
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-04-29 15:56:25


Houston, we have a problem.

libmpi_f90.so had changes for the upcoming 1.4.4 release that require a .so version bump. Specifically, some MPI F90 bindings used to have some parameters of type INTEGER. In 1.4.4, those parameter types were corrected to be INTEGER(KIND=MPI_ADDRESS_KIND).

 * 1.4.3 value: 0:1:0
 * 1.4.4 value: 1:0:0
   --> bumped current & reset revision because param types on some interfaces changed
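
For anyone who doesn't have the rules memorized: these values are libtool-style current:revision:age triples, and the bump above follows libtool's documented update rules. Here is a minimal sketch of those rules. Note that bump_version is a hypothetical helper written just for this illustration; it is not part of libtool or OMPI.

```shell
#!/bin/sh
# Sketch of libtool's current:revision:age update rules.
# bump_version is a hypothetical helper, not a libtool or OMPI command.
# usage: bump_version CURRENT:REVISION:AGE CHANGE
#   source  -> code changed, interfaces untouched
#   added   -> interfaces only added (backward compatible)
#   changed -> interfaces removed or changed (incompatible)
bump_version() {
    version=$1
    change=$2
    # Split "c:r:a" into its three fields with parameter expansion.
    current=${version%%:*}
    rest=${version#*:}
    revision=${rest%%:*}
    age=${rest#*:}
    case $change in
        source)
            revision=$((revision + 1)) ;;
        added)
            current=$((current + 1)); revision=0; age=$((age + 1)) ;;
        changed)
            current=$((current + 1)); revision=0; age=0 ;;
        *)
            echo "unknown change type: $change" >&2
            return 1 ;;
    esac
    echo "$current:$revision:$age"
}

bump_version 0:1:0 changed   # prints 1:0:0 (interface change)
bump_version 0:1:0 source    # prints 0:2:0 (source-only change)
```

The first call is the bump described above; the second is what the value would be if only the source had changed.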

Unfortunately, libmpi_f90.so has already been released in v1.5 with the value 1:0:0. So... what do we do?

Before discussing options, let's review a few things:

1. Remember that two different versions of OMPI cannot be installed into the same tree. .so version numbers *help*, but there are still other support files that OMPI does not version. Hence, if you have 2 versions of OMPI, you *must* install them to different installation trees.

2. If you compile your MPI application with OMPI version A, you can run it with OMPI version B (provided that both A and B are ABI-compatible with each other), usually by updating your LD_LIBRARY_PATH.

3. To be clear, you can do something like this:

$ /ompi-vA-install/bin/mpicc ring.c -o ring
$ export LD_LIBRARY_PATH=/ompi-vB-install/lib
$ /ompi-vB-install/bin/mpirun -np 4 ring

4. However, if A and B are *not* ABI compatible, the .so version numbers are supposed to protect you such that the above example would not work. When you try to mpirun, you would get an error message from the run-time linker that ring is not compatible with B's libmpi.so (for example).

5. The particular F90 changes that were made were only to the "large" F90 module size, which is not the default (you have to specify --with-f90-module=large to OMPI's configure).

6. Versions of OMPI starting with 1.3.2 are supposed to be ABI compatible with all later versions of 1.3.x and all versions of 1.4.x.
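
To make point 4 concrete: the protection comes from the library's SONAME, which the run-time linker matches against what the application was linked with. As a rough sketch, assuming libtool's usual Linux/ELF naming scheme (SONAME major = current - age; other platforms use different schemes), the two values for libmpi_f90 map to different names. linux_names is a hypothetical helper for illustration only.

```shell
#!/bin/sh
# Sketch: map a libtool current:revision:age triple to Linux/ELF names.
# linux_names is a hypothetical helper, not a libtool or OMPI command.
# Assumes the common Linux scheme:
#   SONAME    = LIB.so.(current - age)
#   file name = LIB.so.(current - age).(age).(revision)
linux_names() {
    lib=$1
    version=$2
    current=${version%%:*}
    rest=${version#*:}
    revision=${rest%%:*}
    age=${rest#*:}
    major=$((current - age))
    echo "$lib.so.$major $lib.so.$major.$age.$revision"
}

linux_names libmpi_f90 0:1:0   # prints: libmpi_f90.so.0 libmpi_f90.so.0.0.1
linux_names libmpi_f90 1:0:0   # prints: libmpi_f90.so.1 libmpi_f90.so.1.0.0
```

Under that assumption, an application linked against SONAME libmpi_f90.so.0 is not satisfied by a library whose SONAME is libmpi_f90.so.1, which is the run-time linker error described in point 4.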

-----

So -- with all that in mind -- let's talk about what to do for 1.4.4. I see a few options:

1. Go with 1:0:0 anyway.

   CONSEQUENCE: We have two different versions of libmpi_f90.so out there with 1:0:0 which are not compatible with each other.

   IMPACT: Probably pretty minimal -- not too many people use the "large" F90 bindings. And no one has noticed the wrong bindings that we included in versions <=1.4.3, so it's unlikely that anyone is using these particular interfaces.

2. Go with 0:2:0.

   CONSEQUENCE: This is somewhat of a lie; we're saying we haven't modified the interface. But we did.

   IMPACT: Same as above. A binary using the old/wrong interfaces (e.g., compiled against 1.4.3) could still run-time link against OMPI 1.4.4 and possibly segv because the parameters are different sizes.

3. Not fix the Fortran bindings in 1.4.x -- fix them in 1.5.4.

   CONSEQUENCE: Leave them broken. There's at least one user who would be annoyed by this (i.e., the one who reported the problem to us).

   IMPACT: We can fix this in 1.5.4. We already have many old versions of OMPI that have these broken bindings. What's one more? It might be an easier thing to say "The bindings are fixed in 1.5.4 and higher" rather than "The bindings are fixed in 1.4.x, where x>=4 and 1.5.y, where y>=4".

None of the options are good.

I'm somewhat leaning towards #3.

Opinions?

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/