Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Using OpenMPI on a network
From: Shiqing Fan (fan_at_[hidden])
Date: 2012-06-22 07:37:20


Hi,

I also noticed this problem (see my reply in the original thread). It
seems that since 1.6, etc/openmpi-default-hostfile is checked by ORTE
before running the application; previous versions on Windows did not
have this problem. In any case, copying the file over or creating an
empty one will solve the problem. The fix has already been moved into
the 1.6 branch.
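
For anyone who would rather script the workaround than create the file by hand, here is a minimal Python sketch. The helper name `ensure_default_hostfile` is hypothetical, and the assumption that the file lives at `etc/openmpi-default-hostfile` under the install prefix is taken from the error message quoted below; none of this is part of Open MPI itself.

```python
from pathlib import Path


def ensure_default_hostfile(install_prefix):
    """Create an empty etc/openmpi-default-hostfile under install_prefix.

    Hypothetical helper: the relative location is assumed from the error
    message in this thread. An existing hostfile keeps its contents.
    """
    hostfile = Path(install_prefix) / "etc" / "openmpi-default-hostfile"
    # Create the etc directory (and any missing parents) if it is absent.
    hostfile.parent.mkdir(parents=True, exist_ok=True)
    # touch() creates an empty file, or leaves existing contents alone.
    hostfile.touch(exist_ok=True)
    return hostfile
```

With the install location from the quoted error, the call would be `ensure_default_hostfile(r"C:\OpenMPI\openmpi-1.6\installed")`; adjust the prefix to match your own installation.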

Shiqing

On 2012-06-19 9:17 PM, VimalMathew_at_[hidden] wrote:
>
> Just finished doing that.
>
> Still getting the same error. How do I make sure there are no old
> builds/files left?
>
> I uninstalled everything to do with MPI, Cygwin, cleared environment
> variables, did the whole Windows build again and then did the
> supercomputing tutorial.
>
> --
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of Damien
> Sent: Tuesday, June 19, 2012 1:20 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Using OpenMPI on a network
>
> There's something else wrong, if that's the Supercomputing Blog
> tutorial 1 you're running. It works happily without a hostfile. I
> think you have some borked paths there.
>
> I don't know why a Windows version is looking for an etc directory for
> a hostfile, unless some of your previous Cygwin builds are still lying
> around. The etc directory is a *nix thing. Please make sure you've
> completely deleted all your old failed OpenMPI builds, code, binaries,
> everything. Uninstall any other MPI versions you have tried, OpenMPI,
> MPICH, whatever. You need to make absolutely sure you only have one
> version. Check the paths in your environment after doing all that
> and remove any remaining path references to other MPI versions. You
> shouldn't be getting that network error either: if you're running
> locally, it won't matter whether you have a network cable or not. That
> has to be fixed before you can do anything on a cluster.
>
> Damien
>
> On 19/06/2012 10:53 AM, VimalMathew_at_[hidden] wrote:
>
> Damien, Shiqing, Jeff?
>
> --
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of VimalMathew_at_[hidden]
> Sent: Monday, June 18, 2012 3:32 PM
> To: users_at_[hidden]
> Subject: [OMPI users] Using OpenMPI on a network
>
> So I configured and compiled a simple MPI program.
>
> Now the issue is when I try to do the same thing on my computer on
> a corporate network, I get this error:
>
> C:\OpenMPI\openmpi-1.6\installed\bin>mpiexec MPI_Tutorial_1.exe
>
> --------------------------------------------------------------------------
> Open RTE was unable to open the hostfile:
> C:\OpenMPI\openmpi-1.6\installed\bin/../etc/openmpi-default-hostfile
> Check to make sure the path and filename are correct.
> --------------------------------------------------------------------------
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\ras\base\ras_base_allocate.c at line 200
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\base\plm_base_launch_support.c at line 99
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\process\plm_process_module.c at line 996
>
> What network settings should I be using? I'm sure this is because
> of the network because when I unplug the network cable, I get the
> error message I got below.
>
> Thanks,
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of Damien
> Sent: Friday, June 15, 2012 3:15 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Building MPI on Windows
>
> OK, that's what orte_rml_base_select failed means, no TCP
> connection. But you should be able to make OpenMPI & mpiexec work
> without a network if you're just running in local memory. There's
> probably a runtime parameter to set but I don't know what it is.
> Maybe Jeff or Shiqing can weigh in with what that is.
>
> Damien
>
> On 15/06/2012 1:10 PM, VimalMathew_at_[hidden] wrote:
>
> Just figured it out.
>
> The only difference between yesterday, when it ran, and today is that
> yesterday I was connected to a network. So I connected my laptop to a
> network and it worked again.
>
> Thanks for all your help, Damien!
>
> I'm sure I'm gonna get stuck more along the way so hoping you can
> help.
>
> --
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of Damien
> Sent: Friday, June 15, 2012 2:57 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Building MPI on Windows
>
> Hmmm. Two things. Can you run helloworldMPI.exe on its own? It
> should output "Number of threads = 1, My rank = 0"
>
> Also, can you post the output of ompi_info ? I think you might
> still have some path mixups. A successful OpenMPI build with this
> simple program should just work.
>
> If you still have the other OpenMPIs installed from the binaries,
> you might want to try uninstalling all of them and rebooting.
> Also if you rebuilt OpenMPI and helloworldMPI with VS 2010, make
> sure that helloworldMPI is actually linked to those VS2010 OpenMPI
> libs by setting the right lib path in the Linker options. Linking
> to VS2008 libs and trying to run with VS2010 dlls/exes could cause
> problems too.
>
> Damien
>
> On 15/06/2012 11:44 AM, VimalMathew_at_[hidden] wrote:
>
> Hi Damien,
>
> I installed MS Visual Studio 2010 and tried the whole procedure
> again and it worked!
>
> That's the good news.
>
> Now the bad news is that I'm trying to run the program again using
> mpiexec and it won't!
>
> I get these error messages:
>
> orte_rml_base_select failed
>
> orte_ess_set_name failed, with a bunch of text saying it could be
> due to configuration or environment problems and will make sense
> only to an OpenMPI developer.
>
> Help!
>
> --
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of Damien
> Sent: Thursday, June 14, 2012 4:55 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Building MPI on Windows
>
> You did build the project, right? The helloworldMPI.exe is in the
> Debug directory?
>
> On 14/06/2012 1:49 PM, VimalMathew_at_[hidden] wrote:
>
> No luck.
>
> Output:
>
> Microsoft Windows [Version 6.1.7601]
> Copyright (c) 2009 Microsoft Corporation. All rights reserved.
>
> C:\Users\...>cd "C:\Users\C9995799\Downloads\helloworldMPI\Debug"
>
> C:\Users\...\Downloads\helloworldMPI\Debug>mpiexec -n 2 helloworldMPI.exe
> --------------------------------------------------------------------------
> mpiexec was unable to launch the specified application as it could not find an executable:
>
> Executable: helloworldMPI.exe
> Node: SOUMIWHP5003567
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
> 2 total processes failed to start
>
> C:\Users\...\Downloads\helloworldMPI\Debug>
>
> --
>
> Vimal
>
> From: users-bounces_at_[hidden] On Behalf Of Damien
> Sent: Thursday, June 14, 2012 3:38 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Building MPI on Windows
>
> Here's a MPI Hello World project based on your code. It runs fine
> on my machine. You'll need to change the include and lib paths as
> we discussed before to match your paths, and copy those bin files
> over to the Debug directory.
>
> Run it with this to start: "mpiexec -n 1 helloworldMPI.exe".
> Then change the -n 1 to -n x where x is the number of cores you
> have. Say yes to allowing mpiexec firewall access if that comes up.
>
> If this bombs out, there's something wrong on your machine.
>
> Damien
>
> _______________________________________________
>
> users mailing list
>
> users_at_[hidden] <mailto:users_at_[hidden]>
>
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
---------------------------------------------------------------
Shiqing Fan
High Performance Computing Center Stuttgart (HLRS)
Tel: ++49(0)711-685-87234      Nobelstrasse 19
Fax: ++49(0)711-685-65832      70569 Stuttgart
http://www.hlrs.de/organization/people/shiqing-fan/
email: fan_at_[hidden]