Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Using OpenMPI on a network
From: Damien (damien_at_[hidden])
Date: 2012-06-19 13:20:14


There's something else wrong, if that's the Supercomputing Blog tutorial
1 you're running. It works happily without a hostfile. I think you
have some borked paths there.

I don't know why a Windows version is looking for an etc directory for a
hostfile, unless some of your previous Cygwin builds are still lying
around. The etc directory is a *nix thing. Please make sure you've
completely deleted all your old failed OpenMPI builds: code, binaries,
everything. Uninstall any other MPI versions you have tried, whether
OpenMPI, MPICH or anything else. You need to make absolutely sure you
only have one version. After doing all that, check the paths in your
environment and remove any remaining references to other MPI versions.
You shouldn't be getting that network error either. If you're running
locally, it won't matter whether you have a network cable plugged in or
not. That has to be fixed before you can do anything on a cluster.
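
One way to do the path check is a small script that lists every MPI launcher reachable through PATH. This is only a sketch: the function name `find_on_path` and the list of launcher file names are assumptions made for illustration, not anything shipped with OpenMPI, and it only looks at PATH, not at include or library paths set in Visual Studio.

```python
import os


def find_on_path(names, path=None):
    """Return the full path of every file in `names` found in a PATH directory.

    `path` defaults to the PATH environment variable; passing it explicitly
    makes the function easy to try out on a made-up search path.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')  # Windows PATH entries may be quoted
        if not directory:
            continue
        # Skip directories that are listed more than once.
        key = os.path.normcase(os.path.normpath(directory))
        if key in seen:
            continue
        seen.add(key)
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Launcher names are an assumption; adjust them for what you installed.
    found = find_on_path(["mpiexec.exe", "mpiexec", "mpirun.exe", "mpirun"])
    for launcher in found:
        print(launcher)
    if len({os.path.dirname(launcher) for launcher in found}) > 1:
        print("More than one directory on PATH contains an MPI launcher.")
```

If this prints launchers from more than one directory, the first one listed is the one a bare `mpiexec` command will normally pick up, and the others are candidates for removal from PATH.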

Damien

On 19/06/2012 10:53 AM, VimalMathew_at_[hidden] wrote:
>
> Damien, Shiqing, Jeff?
>
> --
>
> Vimal
>
> *From:* users-bounces_at_[hidden]
> *On Behalf Of* VimalMathew_at_[hidden]
> *Sent:* Monday, June 18, 2012 3:32 PM
> *To:* users_at_[hidden]
> *Subject:* [OMPI users] Using OpenMPI on a network
>
> So I configured and compiled a simple MPI program.
>
> Now the issue is when I try to do the same thing on my computer on a
> corporate network, I get this error:
>
> C:\OpenMPI\openmpi-1.6\installed\bin>mpiexec MPI_Tutorial_1.exe
>
> --------------------------------------------------------------------------
> Open RTE was unable to open the hostfile:
> C:\OpenMPI\openmpi-1.6\installed\bin/../etc/openmpi-default-hostfile
> Check to make sure the path and filename are correct.
> --------------------------------------------------------------------------
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\ras\base\ras_base_allocate.c at line 200
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\base\plm_base_launch_support.c at line 99
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\process\plm_process_module.c at line 996
>
> What network settings should I be using? I'm sure this is because of
> the network because when I unplug the network cable, I get the error
> message I got below.
>
> Thanks,
>
> Vimal
>
> *From:* users-bounces_at_[hidden] *On Behalf Of* Damien
> *Sent:* Friday, June 15, 2012 3:15 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] Building MPI on Windows
>
> OK, that's what "orte_rml_base_select failed" means: no TCP connection.
> But you should be able to make OpenMPI & mpiexec work without a
> network if you're just running in local memory. There's probably a
> runtime parameter to set, but I don't know what it is. Maybe Jeff or
> Shiqing can weigh in with what that is.
>
> Damien
>
> On 15/06/2012 1:10 PM, VimalMathew_at_[hidden] wrote:
>
> Just figured it out.
>
> The only thing different from when it ran yesterday to today was I was
> connected to a network. So I connected my laptop to a network and it
> worked again.
>
> Thanks for all your help, Damien!
>
> I'm sure I'm gonna get stuck more along the way so hoping you can help.
>
> --
>
> Vimal
>
> *From:* users-bounces_at_[hidden] *On Behalf Of* Damien
> *Sent:* Friday, June 15, 2012 2:57 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] Building MPI on Windows
>
> Hmmm. Two things. Can you run helloworldMPI.exe on its own? It
> should output "Number of threads = 1, My rank = 0".
>
> Also, can you post the output of ompi_info ? I think you might still
> have some path mixups. A successful OpenMPI build with this simple
> program should just work.
>
> If you still have the other OpenMPIs installed from the binaries, you
> might want to try uninstalling all of them and rebooting. Also if you
> rebuilt OpenMPI and helloworldMPI with VS 2010, make sure that
> helloworldMPI is actually linked to those VS2010 OpenMPI libs by
> setting the right lib path in the Linker options. Linking to VS2008
> libs and trying to run with VS2010 dlls/exes could cause problems too.
>
> Damien
>
> On 15/06/2012 11:44 AM, VimalMathew_at_[hidden] wrote:
>
> Hi Damien,
>
> I installed MS Visual Studio 2010 and tried the whole procedure again
> and it worked!
>
> That's the great news.
>
> Now the bad news is that I'm trying to run the program again using
> mpiexec and it won't!
>
> I get these error messages:
>
> orte_rml_base_select failed
>
> orte_ess_set_name failed, with a bunch of text saying it could be due
> to configuration or environment problems and will make sense only to
> an OpenMPI developer.
>
> Help!
>
> --
>
> Vimal
>
> *From:* users-bounces_at_[hidden] *On Behalf Of* Damien
> *Sent:* Thursday, June 14, 2012 4:55 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] Building MPI on Windows
>
> You did build the project, right? The helloworldMPI.exe is in the
> Debug directory?
>
> On 14/06/2012 1:49 PM, VimalMathew_at_[hidden] wrote:
>
> No luck.
>
> Output:
>
> Microsoft Windows [Version 6.1.7601]
> Copyright (c) 2009 Microsoft Corporation. All rights reserved.
>
> C:\Users\...>cd "C:\Users\C9995799\Downloads\helloworldMPI\Debug"
>
> C:\Users\...\Downloads\helloworldMPI\Debug>mpiexec -n 2 helloworldMPI.exe
> --------------------------------------------------------------------------
> mpiexec was unable to launch the specified application as it could not find an executable:
>
> Executable: helloworldMPI.exe
> Node: SOUMIWHP5003567
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
> 2 total processes failed to start
>
> C:\Users\...\Downloads\helloworldMPI\Debug>
>
> --
>
> Vimal
>
> *From:* users-bounces_at_[hidden] *On Behalf Of* Damien
> *Sent:* Thursday, June 14, 2012 3:38 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] Building MPI on Windows
>
> Here's a MPI Hello World project based on your code. It runs fine on
> my machine. You'll need to change the include and lib paths as we
> discussed before to match your paths, and copy those bin files over to
> the Debug directory.
>
> Run it with this to start: "mpiexec -n 1 helloworldMPI.exe". Then
> change the -n 1 to -n x where x is the number of cores you have. Say
> yes to allowing mpiexec firewall access if that comes up.
>
> If this bombs out, there's something wrong on your machine.
>
> Damien
>
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users