On Tue, Jul 27, 2010 at 7:29 PM, Gus Correa <gus_at_[hidden]> wrote:
> Hi Cristobal
> Does it run only on the head node alone?
> (Fuego? Agua? Acatenango?)
> Try to put only the head node on the hostfile and execute with mpiexec.
--> I will try with only the head node and post the results back.
> This may help sort out what is going on.
> Hopefully it will run on the head node.
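A minimal sketch of the head-node-only test suggested above. The hostfile name is a placeholder, not something from this thread, and `uname -n` is used only because the head node's name is not known here. It launches the plain `hostname` program rather than an MPI application, so a failure points at the Open MPI installation itself.

```shell
#!/bin/sh
# Hypothetical hostfile location; any writable path works.
HOSTFILE="${HOSTFILE:-${TMPDIR:-/tmp}/hostfile.headonly}"

# The hostfile lists exactly one machine: the one this script runs on.
uname -n > "$HOSTFILE"

# Run a trivial, non-MPI program through mpiexec on that single host.
if command -v mpiexec >/dev/null 2>&1; then
    mpiexec --hostfile "$HOSTFILE" -np 1 hostname
else
    echo "mpiexec not found on PATH" >&2
fi
```

If this works, the next step would be to add the compute nodes back to the hostfile one at a time.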
> Also, do you have InfiniBand connecting the nodes?
> The error messages refer to the openib btl (i.e. InfiniBand),
> and complain of
--> No, we are just using a normal 100 Mbit/s network, since I am still only testing.
> "perhaps a missing symbol, or compiled for a different
> version of Open MPI?".
> It sounds like a mix-up of versions/builds.
--> I agree, somewhere there must be remains of the older version.
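One way to look for such leftovers, sketched under the assumption that the stale copy is reachable through PATH: list every copy of the Open MPI launchers along PATH, in search order. The helper name `find_all` is made up for this example.

```shell
#!/bin/sh
# Print every executable called $1 found along PATH, in search order.
find_all() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# More than one directory in this output suggests two installs are mixed.
for tool in mpiexec mpirun ompi_info; do
    find_all "$tool"
done

# Show the version reported by the first ompi_info on PATH, if there is one.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep "Open MPI:"
fi
```

Running the same check on a compute node (for example through ssh) shows whether the nodes pick up the same build as the head node. Plugin files left over from an earlier build in the lib/openmpi directory under the install prefix are another possible source of "missing symbol" messages.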
> Did you configure/build OpenMPI from source, or did you install
> it with apt-get?
> It may be easier/less confusing to install from source.
> If you did, what configure options did you use?
--> I installed from source:
./configure --prefix=/opt/openmpi-1.4.2 --with-sge --without-xgid
> Also, as for the OpenMPI runtime environment,
> it is not enough to set it on
> the command line, because it will be effective only on the head node.
> You need to either add them to the PATH and LD_LIBRARY_PATH
> on your .bashrc/.cshrc files (assuming these files and your home directory
> are *also* shared with the nodes via NFS),
> or use the --prefix option of mpiexec to point to the OpenMPI main
--> Yes, all nodes have their PATH and LD_LIBRARY_PATH set up properly inside
the login scripts (.bashrc in my case).
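For reference, the .bashrc lines in question would look roughly like this, assuming the install prefix is the one passed to configure above (/opt/openmpi-1.4.2). The variable name `OMPI_PREFIX` is only a convenience for this sketch.

```shell
# Install prefix, taken from the --prefix given to ./configure.
OMPI_PREFIX=/opt/openmpi-1.4.2

# Put this build's binaries first so they win over any other copy.
export PATH="$OMPI_PREFIX/bin:$PATH"

# Prepend the library directory; avoid a trailing ':' when the variable was empty.
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

On some distributions the stock .bashrc returns early for non-interactive shells, so these lines may need to sit above that check to take effect on remote launches. The alternative Gus mentions is to pass the prefix at launch time, e.g. `mpiexec --prefix /opt/openmpi-1.4.2 ...`.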
> Needless to say, you need to check and ensure that the OpenMPI directory
> (and maybe your home directory, and your work directory) is (are)
> really mounted on the nodes.
--> Yes, double-checked that they are.
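A rough sketch of how that check could be scripted. The node list is empty by default and the example node names are placeholders; only the install path comes from this thread. It assumes passwordless ssh to the nodes.

```shell
#!/bin/sh
# Directory that must be visible everywhere; the default is the configure prefix.
OMPI_DIR="${OMPI_DIR:-/opt/openmpi-1.4.2}"
# Space-separated node names, e.g. NODES="node1 node2" (placeholder names).
NODES="${NODES:-}"

# Print "ok <dir>" if the directory exists and is not empty, else "MISSING <dir>".
# An NFS export that failed to mount often shows up as an empty mount point.
check_dir() {
    if [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]; then
        echo "ok $1"
    else
        echo "MISSING $1"
    fi
}

# Local check on the head node.
check_dir "$OMPI_DIR"

# Existence check on each listed node over ssh.
for node in $NODES; do
    if ssh "$node" "test -d '$OMPI_DIR'"; then
        echo "$node: ok"
    else
        echo "$node: MISSING"
    fi
done
```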
> I hope this helps,
--> Thanks, really!
> Gus Correa
> Update: I just reinstalled Open MPI with the same parameters, and the
> problem seems to be gone. I couldn't test it fully, but I'll confirm when I
> get back to the lab.