On 02/01/2011 07:34 PM, Jeff Squyres wrote:
> On Feb 1, 2011, at 5:02 PM, Jeffrey A Cummings wrote:
>
>> I'm getting a lot of push back from the SysAdmin folks claiming that OpenMPI is closely intertwined with the specific version of the operating system and/or other system software (i.e., Rocks on the clusters).
> I wouldn't say that this is true.  We test across a wide variety of OS's and compilers.  I'm sure that there are particular platforms/environments that can trip up some kind of problem (it's happened before), but in general, Open MPI is pretty portable.
>
>> To state my question another way:  Apparently each release of Linux and/or Rocks comes with some version of OpenMPI bundled in.  Is it dangerous in some way to upgrade to a newer version of OpenMPI?
> Not at all.  Others have said it, but I'm one of the developers and I'll reinforce their answers: I regularly have about a dozen different installations of Open MPI on my cluster at any given time (all in different stages of development -- all installed to different prefixes).  I switch between them quite easily by changing my PATH and LD_LIBRARY_PATH (both locally and on remote nodes).
Not to be the lone dissenting opinion, but here is my experience in doing the above.

First, if you always recompile your application against the specific version of OMPI you run it with, then I agree with everything Jeff said above.  That is, you can build many versions of OMPI on many Linux versions and have them run.
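
As an illustration of switching between installations by changing PATH and LD_LIBRARY_PATH, here is a minimal Python sketch.  The prefix /opt/openmpi-1.4.3 and the helper name env_for_prefix are made up for the example, and it assumes the install keeps its libraries under <prefix>/lib (some builds use lib64 instead).

```python
import os


def env_for_prefix(prefix, base_env=None):
    """Return a copy of the environment that selects the Open MPI install at `prefix`.

    Assumes the usual layout: executables in <prefix>/bin, libraries in <prefix>/lib.
    """
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(prefix, "bin")
    lib_dir = os.path.join(prefix, "lib")
    # Prepend so this install wins over any bundled system copy.
    env["PATH"] = os.pathsep.join(p for p in (bin_dir, env.get("PATH", "")) if p)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p for p in (lib_dir, env.get("LD_LIBRARY_PATH", "")) if p
    )
    return env


if __name__ == "__main__":
    # Hypothetical prefix, for illustration only.
    env = env_for_prefix("/opt/openmpi-1.4.3")
    print(env["PATH"])
    print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run when launching a job.  The same settings also have to be in effect on the remote nodes.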

But there are definite pitfalls once you start trying to keep one set of executables and OMPI binaries across different Linux versions.

1.  Executables may not be able to use OMPI libraries from a different release series, that is, libraries that differ in the first dot number (e.g., 1.3 vs. the 1.4 or 1.5 branches).  We in the community try to avoid these incompatibilities as much as possible, but they happen on occasion (I think 1.3 to 1.4 is one such occasion).
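
A conservative way to act on this is to compare the release series an executable was built against with the series of the libraries it will run with, and to recompile or retest when they differ.  The sketch below is only a heuristic under that assumption, not a compatibility guarantee made by Open MPI; the function names are made up for the example.

```python
import re


def release_series(version):
    """Return (major, minor) from a version string such as '1.4.3' or '1.5rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def same_series(built_with, running_with):
    """True when both versions share the same major.minor release series."""
    return release_series(built_with) == release_series(running_with)


if __name__ == "__main__":
    for built, running in [("1.4.1", "1.4.3"), ("1.3.4", "1.4.3")]:
        verdict = "same series" if same_series(built, running) else "recompile or retest"
        print(built, "->", running, ":", verdict)
```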

2.  The system libraries on different Linux versions are not always the same.  At Oracle we build a binary distribution of OMPI that we test on several different versions of Linux.  The key here is to build on a machine that is essentially the lowest common denominator of all the system software on the machines the binaries will run on.  This is essentially why Oracle states a bounded set of OS versions that a distribution runs on.  As an example, one component in OMPI relied on a version of libbfd that changed significantly between Linux versions.  Once we got rid of the usage of that library we were OK.  There are not "a lot" of these instances, but the number is not zero.
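
To make the "lowest common denominator" idea concrete, here is a small Python sketch that picks a build host from a set of target machines.  It uses the glibc version as a stand-in for "all the system software", which is a simplifying assumption on my part; the host names and version numbers are invented for the example.

```python
def parse_version(text):
    """Turn '2.11' into (2, 11) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def pick_build_host(glibc_by_host):
    """Return the host with the oldest glibc among the target machines."""
    return min(glibc_by_host, key=lambda host: parse_version(glibc_by_host[host]))


if __name__ == "__main__":
    # Hypothetical inventory: host name -> glibc version.
    targets = {"node-a": "2.12", "node-b": "2.5", "node-c": "2.11"}
    print("build on:", pick_build_host(targets))
```

Note that a plain string comparison would rank "2.11" below "2.5", which is why the versions are parsed into tuples first.  A real inventory would need to cover more than one library, as the libbfd case shows.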

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje@oracle.com