Hi again, Jeff!
>> (update) it works with "true" OpenMPI, but it does *not* work with Sun
>> Cluster Tools 8.0 (which is also an OpenMPI). So it seems to be a Sun
>> problem and not a general problem of OpenMPI. Sorry for wrongly
>> attributing the problem.
>
> Ah, gotcha. I guess my Sun colleagues on this list will need to address
> that. ;-)
I hope so!
>> The only trouble we have now is error messages like
>>
>> --------------------------------------------------------------------------
>>
>> Sorry! You were supposed to get help about:
>> no hca params found
>> from the file:
>> help-mpi-btl-openib.txt
>> But I couldn't find any file matching that name. Sorry!
>> --------------------------------------------------------------------------
>>
>>
>> (the job still runs without problems! :o)
>>
>> when running OpenMPI from the new location after the old location has
>> been removed. (If the old location still exists, there is no
>> error, so it seems to be an attempt to access a file under the old path.)
>
> Doh; that's weird.
>
>> Maybe we have to explicitly pass the OPAL_PREFIX environment variable
>> to all processes?
>
> Hmm. I don't need to do this in my 1.2.7 installation. I do something
> like this (I assume you're using rsh/ssh as a launcher?):
We use zsh as the login shell, ssh as the communication protocol, and a
wrapper around mpiexec that produces a command roughly like
/opt/MPI/openmpi-1.2.7/linux64/intel/bin/mpiexec -x LD_LIBRARY_PATH -x
PATH -x MPI_NAME --hostfile /tmp/pk224850/26654_at_linuxhtc01/hostfile3564
-n 2 MPI_FastTest.exe
(The hostfiles are generated temporarily by our wrapper for load
balancing, and /opt/MPI/openmpi-1.2.7/linux64/intel/ is the path to our
local installation of OpenMPI.)
As you can see, we also explicitly tell OpenMPI to export the environment
variables PATH and LD_LIBRARY_PATH.
If we add a " -x OPAL_PREFIX " flag, and thereby explicitly force
forwarding of this environment variable, the error does not occur. So
we think this variable needs to be exported to *all* systems in the
cluster.
It seems that the variable OPAL_PREFIX is *NOT* automatically exported
to new processes on the local and remote nodes.
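A minimal sketch of that workaround, assuming the relocated installation
sits under the path shown above (adjust OPAL_PREFIX to the real new
prefix). The variable name CMD is only for illustration, and the hostfile
and MPI_NAME arguments from the command above are left out. It only
prints the command, it does not launch anything:

```shell
#!/bin/sh
# Assumption: the relocated installation lives here (path taken from the
# command above); adjust to the real new prefix.
OPAL_PREFIX=/opt/MPI/openmpi-1.2.7/linux64/intel
export OPAL_PREFIX

# -x NAME asks mpiexec to forward the variable NAME to the launched processes.
# The hostfile and MPI_NAME arguments of the command above are omitted here.
CMD="$OPAL_PREFIX/bin/mpiexec -x OPAL_PREFIX -x LD_LIBRARY_PATH -x PATH -n 2 MPI_FastTest.exe"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

The point is only that OPAL_PREFIX is set and exported in the launching
shell before mpiexec is called, so that " -x OPAL_PREFIX " has a value to
forward.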
Maybe the FAQ at
http://www.open-mpi.org/faq/?category=building#installdirs should be
extended to mention this?
>> Do you (or anyone reading this message) have any contact with Sun
>> developers to point out this issue? *Why* do they use hard-coded
>> paths? :o)
>
> I don't know -- this sounds like an issue with the Sun CT 8 build
> process. It could also be a by-product of using the combined 32/64
> feature...? I haven't used that in forever and I don't remember the
> restrictions. Terry/Rolf -- can you comment?
I will write a separate email to ct-feedback_at_[hidden]
Best regards,
Paul Kapinos