I'm having trouble getting OpenMPI to set the working directory properly when running jobs on a Linux cluster.  I made a test program (at end of post) that recreates the problem pretty well by just printing out the results of getcwd().  Here's output both with and without using -wdir:

(merle):~$ cd test
(merle):test$ mpirun -np 2 test
before MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
before MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
after MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
after MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
(merle):test$ mpirun -np 2 -wdir /home/tgamblin/test test
before MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
before MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
after MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin
after MPI_Init:
PWD: /home/tgamblin
getcwd: /home/tgamblin

Shouldn't these print out /home/tgamblin/test?  Also, this is even stranger:

(merle):test$ mpirun -np 2 pwd
/home/tgamblin/test
/home/tgamblin/test

I feel like my program should output the same thing as pwd.

I'm using OpenMPI 1.2.6, and the cluster has 8 nodes, with 2-by dual-core woodcrests each (total 32 cores).  There are 2 tcp networks on this cluster, one that the head node uses to talk to the compute nodes and one (Gigabit) network that the compute nodes can reach each other (but not the head node) on.  I have "btl_tcp_if_include = eth2" in my mca params file to keep the compute nodes using the fast interconnect to talk to each other, and I've pasted ifconfig output for the head node and for one compute node below.  Also, if it helps, the home directories on this machine are mounted via autofs.

This is causing problems b/c I'm using apps that look for the config file in the working directory. Please let me know if you guys have any idea what's going on.

Thanks!
-Todd


TEST PROGRAM:
#include "mpi.h"
#include <cstdlib>
#include <iostream>
#include <sstream>
using namespace std;

void testdir(const char*where) {
  char buf[1024];
  getcwd(buf, 1024);

  ostringstream tmp;
  tmp << where << ":" << endl
      << "\tPWD:\t"<< getenv("PWD") << endl
      << "\tgetcwd:\t"<< getenv("PWD") << endl;
  cout << tmp.str();
}

int main(int argc, char **argv) {
  testdir("before MPI_Init");
  MPI_Init(&argc, &argv);
  testdir("after MPI_Init");
  MPI_Finalize();
}

HEAD NODE IFCONFIG:
eth0      Link encap:Ethernet  HWaddr 00:18:8B:2F:3D:90  
          inet addr:10.6.1.1  Bcast:10.6.1.255  Mask:255.255.255.0
          inet6 addr: fe80::218:8bff:fe2f:3d90/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1579250319 errors:0 dropped:0 overruns:0 frame:0
          TX packets:874273636 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2361367146846 (2.1 TiB)  TX bytes:85373933521 (79.5 GiB)
          Interrupt:169 Memory:f4000000-f4011100 

eth0:1    Link encap:Ethernet  HWaddr 00:18:8B:2F:3D:90  
          inet addr:10.6.2.1  Bcast:10.6.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:169 Memory:f4000000-f4011100 

eth1      Link encap:Ethernet  HWaddr 00:18:8B:2F:3D:8E  
          inet addr:152.54.1.21  Bcast:152.54.3.255  Mask:255.255.252.0
          inet6 addr: fe80::218:8bff:fe2f:3d8e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14436523 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7357596 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2354451258 (2.1 GiB)  TX bytes:2218390772 (2.0 GiB)
          Interrupt:169 Memory:f8000000-f8011100 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:540889623 errors:0 dropped:0 overruns:0 frame:0
          TX packets:540889623 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:63787539844 (59.4 GiB)  TX bytes:63787539844 (59.4 GiB)


COMPUTE NODE IFCONFIG:
(compute-0-0):~$ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:13:72:FA:42:ED  
          inet addr:10.6.1.254  Bcast:10.6.1.255  Mask:255.255.255.0
          inet6 addr: fe80::213:72ff:fefa:42ed/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:200637 errors:0 dropped:0 overruns:0 frame:0
          TX packets:165336 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:187105568 (178.4 MiB)  TX bytes:26263945 (25.0 MiB)
          Interrupt:169 Memory:f8000000-f8011100 

eth2      Link encap:Ethernet  HWaddr 00:15:17:0E:9E:68  
          inet addr:10.6.2.254  Bcast:10.6.2.255  Mask:255.255.255.0
          inet6 addr: fe80::215:17ff:fe0e:9e68/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1280 (1.2 KiB)  TX bytes:590 (590.0 b)
          Base address:0xdce0 Memory:fc3e0000-fc400000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:65 errors:0 dropped:0 overruns:0 frame:0
          TX packets:65 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4376 (4.2 KiB)  TX bytes:4376 (4.2 KiB)