On Oct 28, 2010, at 2:18 PM, Ray Muno wrote:
> On 10/22/2010 07:36 AM, Scott Atchley wrote:
>> Looking back at your original message, you say that it works if you use the Myricom supplied mpirun from the Myrinet roll. I wonder if this is a mismatch between libraries on the compute nodes.
>> What do you get if you use your OMPI's mpirun with:
>> $ mpirun -n 1 -H <remote_host> ldd $PWD/<your_binary>
>> I am wondering if ldd finds the libraries from your compile or from the Myrinet roll.
> OK, a bit of a hiatus while trying to get this resolved; I had to tend
> to other matters.
> I do think I had an issue of mixed environments. It is a Rocks 5.3
> test cluster and it had an old version of OpenMPI installed as part of
> the Rocks 5.3 HPC roll. I have now removed the HPC roll and all nodes were
> reinstalled.
> In the previous setup, we could actually run OpenMPI jobs over MX.
> With all other spurious versions of OpenMPI (and MPICH for that matter)
> removed, I have rebuilt and re-installed, from a fresh source tree,
> OpenMPI 1.4.3. It was built with PGI 10.8 compilers.
> Now, we cannot run with MX at all, even though the install was built with MX:
> $ ompi_info | grep mx
> MCA btl: mx (MCA v2.0, API v2.0, Component v1.4.3)
> MCA mtl: mx (MCA v2.0, API v2.0, Component v1.4.3)
> I can run with TCP, but now I get
> [compute-0-1.local:24863] mca: base: component_find: unable to open
> /share/apps/opt/OpenMPI/1.4.3/PGI/10.8/lib/openmpi/mca_mtl_mx: perhaps a
> missing symbol, or compiled for a different version of Open MPI? (ignored)
> $ ls -l /share/apps/opt/OpenMPI/1.4.3/PGI/10.8/lib/openmpi/mca_mtl_mx*
> -rwxr-xr-x 1 muno muno 1070 Oct 28 12:49
> -rwxr-xr-x 1 muno muno 32044 Oct 28 12:49
> mpirun -v -nolocal -np 96 -x MX_RCACHE=2 -hostfile machines --mca mtl
> mx --mca pml cm cpi.pgi
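That "unable to open ... perhaps a missing symbol" message usually means the plugin's own shared-library dependencies don't resolve on the node where it is loaded. One thing worth checking is running ldd against the plugin itself. A sketch (check_deps is just a hypothetical helper for this email, not part of Open MPI; substitute your actual plugin path):

```shell
# Sketch of a quick dependency check (check_deps is a hypothetical
# helper, not part of Open MPI).
check_deps() {
    # Flag any shared libraries the runtime loader cannot resolve.
    ldd "$1" | grep 'not found' || echo "all deps resolved for $1"
}
# Against the MX plugin it would be something like:
#   check_deps /share/apps/opt/OpenMPI/1.4.3/PGI/10.8/lib/openmpi/mca_mtl_mx.so
check_deps /bin/ls   # a known-good binary as a sanity check
```

Any "not found" line, or a dependency resolving to a path outside your intended OMPI/MX prefixes, would point at the mismatch. Run it on a compute node, not just the head node, since the environments can differ.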
Does your environment have LD_LIBRARY_PATH set to point to $OMPI/lib and $MX/lib? Does it get set on login? Is $OMPI/bin in your PATH?
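A minimal sketch of what the shell startup could set (the prefixes here are examples; substitute your actual install locations):

```shell
# Example prefixes -- substitute your actual install locations.
OMPI=/share/apps/opt/OpenMPI/1.4.3/PGI/10.8
MX=/opt/mx

# Put the intended mpirun first on PATH and the matching libraries
# first on LD_LIBRARY_PATH so no stale install shadows them.
export PATH="$OMPI/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI/lib:$MX/lib:$LD_LIBRARY_PATH"
```

Remember that mpirun starts non-interactive shells on the compute nodes, so these need to take effect for non-login shells too (e.g. in ~/.bashrc rather than only ~/.bash_profile). You can verify what the remote side actually sees with: mpirun -n 1 -H <remote_host> printenv LD_LIBRARY_PATH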