On 02/01/2011 07:34 PM, Jeff Squyres wrote:
Not to be a lone dissenting opinion, but here is my experience in doing this.
On Feb 1, 2011, at 5:02 PM, Jeffrey A Cummings wrote:
I'm getting a lot of push back from the SysAdmin folks claiming that OpenMPI is closely intertwined with the specific version of the operating system and/or other system software (i.e., Rocks on the clusters).
I wouldn't say that this is true. We test across a wide variety of OS's and compilers. I'm sure that there are particular platforms/environments that can trip up some kind of problem (it's happened before), but in general, Open MPI is pretty portable.
To state my question another way: Apparently each release of Linux and/or Rocks comes with some version of OpenMPI bundled in. Is it dangerous in some way to upgrade to a newer version of OpenMPI?
Not at all. Others have said it, but I'm one of the developers and I'll reinforce their answers: I regularly have about a dozen different installations of Open MPI on my cluster at any given time (all in different stages of development -- all installed to different prefixes). I switch between them quite easily by changing my PATH and LD_LIBRARY_PATH (both locally and on remote nodes).
First, if you always recompile your application against a specific
version of OMPI, then I agree with everything Jeff said above.
That is, you can build many versions of OMPI on many Linux versions
and have them run.
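
To make the prefix switching Jeff describes concrete, here is a minimal sketch in Python. It assumes each OMPI build lives under its own install prefix with the usual bin/ and lib/ subdirectories; the helper name env_for_prefix and the example path are made up for illustration and are not part of Open MPI. It only covers the local side, since the remote nodes need the same settings by whatever means your site uses.

```python
import os


def env_for_prefix(prefix, base_env=None):
    """Return a copy of the environment with one OMPI install put first.

    `prefix` is the install prefix of the build to select (hypothetical,
    e.g. "/opt/openmpi-1.4.3"). The original environment is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        entry = os.path.join(prefix, subdir)
        old = env.get(var, "")
        # Prepend so this install wins over any system-bundled copy.
        env[var] = entry + os.pathsep + old if old else entry
    return env
```

The returned dictionary can be passed as env= to subprocess.run() together with the full path to that install's mpirun, so one script can drive several installations without touching the login environment.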
But there are definite pitfalls once you start trying to keep one
set of executables and OMPI binaries across different Linux
versions:
1. Executables may not be able to use OMPI libraries that differ
in the first dot number of the release (e.g., the 1.3 vs. 1.4 or
1.5 branches). We, the community, try to avoid these
incompatibilities as much as possible, but they happen on occasion
(I think 1.3 to 1.4 is one such occasion).
2. The system libraries on different Linux versions are not always
the same. At Oracle we build a binary distribution of OMPI that we
test on several different versions of Linux. The key here is
building on a machine that is essentially the lowest common
denominator of all the system software on the machines one will be
running on. This is essentially why Oracle states a bounded set of
OS versions that a distribution runs on. An example: a component in
OMPI relied on a version of libbfd that changed significantly
between Linux versions. Once we got rid of the usage of that
library we were OK. There are not "a lot" of these instances, but
the number is not zero.
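
For point 1, a rough pre-flight check is to compare the release series (the first two numbers) of the OMPI an executable was built against with the one installed on the target machine. The sketch below is only an illustration: the function names are hypothetical, and the version strings are assumed to come from wherever you record them. A mismatch is a warning sign, not proof of breakage, and a match does not guarantee compatibility.

```python
import re


def release_series(version):
    """Return (major, minor) from a version string such as '1.4.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def same_series(built_with, installed):
    """True when both versions share the first dot number (e.g. 1.4.x)."""
    return release_series(built_with) == release_series(installed)
```

For example, same_series("1.4.1", "1.4.3") is true, while same_series("1.3.4", "1.4.3") is false, which is the case where recompiling is the safer route.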
Terry D. Dontje | Principal Software Engineer
Engineering | +1.781.442.2631
Oracle - Performance
95 Network Drive,
Burlington, MA 01803