Setting OMPI_MCA_mtl=^psm does work. Thanks very much
for your help.
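For reference, a minimal sketch of what now works here (the R invocation is just an illustration; any program that calls MPI_Init without mpirun behaves the same once the variable is set):

   % export OMPI_MCA_mtl=^psm
   % R --vanilla     # Rmpi loads and MPI_Init succeeds without mpirun

The leading ^ tells Open MPI to exclude the psm MTL component, so the QLogic transport key is no longer required.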
From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] on behalf of Ralph Castain [rhc_at_[hidden]]
Sent: Monday, October 31, 2011 12:33 PM
To: Open MPI Users
Subject: Re: [OMPI users] Error when calling MPI_Init
On Oct 31, 2011, at 10:27 AM, Weston, Stephen wrote:
> I'm just running it directly from a shell. That may sound crazy,
> but my original problem was trying to install the Rmpi package,
> which is the R interface to MPI. The Rmpi package calls
> MPI_Init when it is loaded, and the package is loaded when it
> is installed, so the installation failed until I installed the package
> using the mpirun command.
> But even after installing Rmpi, it is common for R users to run
> Rmpi programs from an interactive R session using spawned
> workers. And in that case, they aren't using mpirun.
It's okay so long as only one process is being run. Directly launching the MPI procs via rsh/ssh won't work, however, if there is more than one proc in the job.
> A colleague who reads this list pointed out to me that the
> problem is probably because the cluster that I'm using has
> QLogic infiniband cards that apparently require
> OMPI_MCA_orte_precondition_transports to be set. That
> may be the answer to my question.
That was my next question :-)
Your colleague is correct. Alternatively, you can tell OMPI to ignore the psm interface to those cards, either by configuring it out (--without-psm) or by setting the environment variable OMPI_MCA_mtl=^psm at run time.
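For completeness, the configure-time alternative looks roughly like this (the install prefix is just a placeholder):

   % ./configure --prefix=/opt/openmpi-1.4.3 --without-psm
   % make all install

With psm excluded at build time, the orte_precondition_transports key is not needed at all.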
> - Steve
> From: users-bounces_at_[hidden] [users-bounces_at_[hidden]] on behalf of Ralph Castain [rhc_at_[hidden]]
> Sent: Monday, October 31, 2011 12:02 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Error when calling MPI_Init
> How are you running the job without mpirun? Is this under slurm or some other RM?
> On Oct 31, 2011, at 9:46 AM, Weston, Stephen wrote:
>> I'm seeing an error on one of our clusters when executing the
>> MPI_Init function in a program that is _not_ invoked using the
>> mpirun command. The error is:
>> Error obtaining unique transport key from ORTE
>> (orte_precondition_transports not present in the environment).
>> followed by "It looks like MPI_INIT failed for some reason; your
>> parallel process is likely to abort.", etc. Since mpirun sets
>> this environment variable, it's not surprising that it isn't
>> set, but in our other Open MPI installations it doesn't seem
>> necessary for this environment variable to be set.
>> I can work around the problem by setting the
>> "OMPI_MCA_orte_precondition_transports" environment variable
>> before running the program using the command:
>> % eval "export `mpirun env | grep OMPI_MCA_orte_precondition_transports`"
>> But I'm very curious what is causing this error, since it only
>> happens on one of our clusters. Could this indicate a problem
>> with the way we configured Open MPI when we installed it?
>> Any pointers on how to further investigate this issue would be appreciated.
>> - Steve Weston
>> P.S. I'm using Open MPI 1.4.3 on a Linux cluster using CentOS
>> release 5.5. It happens in any MPI program that I execute
>> without mpirun.