Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] tcp connectivity OS X and 1.3.3
From: Gus Correa (gus_at_[hidden])
Date: 2009-08-12 17:47:45

Hi Jody

Jody Klymak wrote:
> On Aug 11, 2009, at 18:55 PM, Gus Correa wrote:
>> Did you wipe off the old directories before reinstalling?
> Check.
>> I prefer to install on a NFS mounted directory,
> Check
>> Have you tried to ssh from node to node on all possible pairs?
> check - fixed this today, works fine with the spawning user...
>> How could you roll back to 1.1.5,
>> now that you overwrote the directories?
> Oh, I still have it on another machine off the cluster in
> /usr/local/openmpi. Will take just 5 minutes to reinstall.
>> Launching jobs with Torque is much better than
>> using barebones mpirun.
>> And you don't want to stay behind with the OpenMPI versions
>> and improvements either.
> Sure, but I'd like the jobs to be able to run at all..
> Is there any sense in rolling back to 1.2.3 since that is known to
> work with OS X (it's the one that comes with 10.5)? My only guess at
> this point is other OS X users are using non-tcpip communication, and
> the tcp stuff just doesn't work in 1.3.3.

Our production jobs are running with OpenMPI 1.3.2 on InfiniBand.
We have Linux clusters, not Mac OS X.
However, I ran OpenMPI 1.3.2 over TCP/IP on Gigabit Ethernet,
with HPL and other codes with no problem.
A lot of people use TCP/IP and GigE on Linux.
If anything, the problem would be specific to TCP/IP on Mac OS X.

Have you checked the system logs
(/var/log/messages or the Mac OS X equivalent)
on the nodes where the jobs fail?
Maybe they show some clue about what is going on.
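
One way to go through the logs on each node is a small filter like the sketch below. It is only an illustration: the keyword list and the helper names (find_suspect_lines, scan_log) are assumptions made up for this example, and the log path has to be supplied by you (/var/log/messages on many Linux systems, /var/log/system.log on Mac OS X).

```python
# Assumed keywords; adjust them to whatever the failing job is likely to log.
KEYWORDS = ("orted", "mpirun", "sshd", "refused")


def find_suspect_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path, keywords=KEYWORDS):
    """Open a log file and return its suspect lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as log:
        return find_suspect_lines(log, keywords)
```

Running scan_log("/var/log/system.log") on a failing node, around the time the job was launched, should narrow down what to read by hand. Reading the log usually requires administrator privileges.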

In case you need to roll back to OpenMPI 1.2.X,
you may still get Torque support.
The oldest OpenMPI version I installed was 1.2.7,
and it had Torque support.
I found references to Torque support in OpenMPI as far back as
1.0.3, hence 1.2.3 should have it.
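
Whether a given build really has Torque support can be checked with ompi_info, which lists the components compiled into the installation; Torque support shows up as components named "tm". The sketch below is a hedged example, not an official tool: the function names are invented here, and the regular expression assumes the usual "MCA <framework>: <component> (...)" layout of ompi_info output.

```python
import re
import subprocess


def torque_frameworks(ompi_info_text):
    """Return the MCA frameworks that list a 'tm' (Torque) component.

    Assumes lines of the form 'MCA <framework>: tm (...)'.
    """
    return re.findall(r"MCA\s+(\w+):\s+tm\b", ompi_info_text)


def installed_torque_frameworks(ompi_info_cmd="ompi_info"):
    """Run ompi_info from the PATH and parse its output (hypothetical helper)."""
    result = subprocess.run(
        [ompi_info_cmd], capture_output=True, text=True, check=True
    )
    return torque_frameworks(result.stdout)
```

An empty list from installed_torque_frameworks() would suggest the build was configured without Torque. Make sure the ompi_info found on the PATH belongs to the installation being tested, especially when several OpenMPI versions coexist.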

Gus Correa

> Thanks, Jody
> --
> Jody Klymak
> _______________________________________________
> users mailing list
> users_at_[hidden]