
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] "MPI can not open file?"
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-04-07 09:03:06


OMPI doesn't do anything with your file itself, so it can only be a
question of (a) whether the file is on the remote machine, and (b)
what directory it is in relative to where your process starts.

Try just running pwd with mpirun and see what directory you are in.
Then you can ssh to that node, do an "ls", and see if the file is
there.
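
For example, a quick check along those lines (the machinefile path is
taken from your commands below; quoVadis03 is one of the hostnames in
your output, so adjust it to match your machinefile):

  [bknapp_at_quoVadis04 testSet]$ mpirun -np 8 \
      -machinefile /home/bknapp/scripts/machinefile.txt pwd
  [bknapp_at_quoVadis04 testSet]$ ssh quoVadis03 \
      ls -l gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr

If pwd prints different directories on different nodes, or the remote
ls fails, that is where the problem lies.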

On Apr 7, 2009, at 6:21 AM, Bernhard Knapp wrote:

> Dear Ralph and other users
>
> I tried both versions, with the relative path and with the -wdir
> option, but in both cases the error is still the same. Additionally,
> I tried simply starting the job in my home directory, but that does
> not help either ... any other ideas?
>
> thx
> Bernhard
>
>
> [bknapp_at_quoVadis04 testSet]$ mpirun -np 8 \
>     -machinefile /home/bknapp/scripts/machinefile.txt \
>     mdrun -np 8 -nice 0 \
>     -s gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr \
>     -o gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr \
>     -c gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb \
>     -g gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log \
>     -e gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr \
>     -v
>
> [bknapp_at_quoVadis04 testSet]$ mpirun -np 8 \
>     -machinefile /home/bknapp/scripts/machinefile.txt \
>     mdrun -np 8 -nice 0 \
>     -s 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr \
>     -o 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr \
>     -c 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb \
>     -g 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log \
>     -e 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr \
>     -v -wdir /home/bknapp/gromacsRuns/testSet/
>
>
>
> Back Off! I just backed up 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log
> to ./#1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log.15#
> Getting Loaded...
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode -1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> -------------------------------------------------------
> Program mdrun, VERSION 4.0.3
> Source code file: gmxfio.c, line: 736
>
> Can not open file:
> 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
> -------------------------------------------------------
>
> "My Brothers are Protons (Protons!), My Sisters are Neurons
> (Neurons)" (Gogol Bordello)
>
> Error on node 0, will try to stop all the nodes
> Halting parallel program mdrun on CPU 0 out of 8
>
> gcq#318: "My Brothers are Protons (Protons!), My Sisters are Neurons
> (Neurons)" (Gogol Bordello)
>
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 4313 on
> node 192.168.0.103 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
> Ralph wrote:
>
> I assume you are running in a non-managed environment and so are using
> ssh for your launcher? Could you tell us what version of OMPI you are
> using?
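>
> For example, a quick way to check (ompi_info ships with Open MPI and
> prints the version near the top of its output):
>
>   ompi_info | grep "Open MPI:"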
>
> The problem is that ssh drops you in your home directory, not your
> current working directory. Thus, the path to any file you specify
> must be relative to your home directory. Alternatively, you can
> specify the desired working directory on the mpirun command line
> with the -wdir option; do a "man mpirun" for the details.
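>
> As a sketch, using the machinefile from your message (the absolute
> run directory here is assumed from your prompt; note that -wdir is
> an option to mpirun itself, so it must appear before the application
> name):
>
>   mpirun -np 8 -machinefile /home/bknapp/scripts/machinefile.txt \
>       -wdir /home/bknapp/gromacsRuns/testSet \
>       mdrun -np 8 -nice 0 \
>       -s 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr \
>       -o 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr \
>       -c 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb \
>       -g 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log \
>       -e 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr \
>       -v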
>
> I'd have to check, but we may have corrected this in recent versions
> (or a soon-to-be-released one) so that we automatically move you to
> the cwd after the daemon is started. However, I know that we didn't do
> that in some earlier versions - perhaps in the 1.2.x series as well.
>
> Ralph
>
> On Apr 7, 2009, at 5:05 AM, Bernhard Knapp wrote:
>
>> Hi
>>
>> I am trying to get a parallel job of the gromacs software started.
>> MPI seems to boot fine, but unfortunately it is unable to open a
>> specified file, although the file is definitely in the directory
>> where the job is started. I also changed the file permissions to
>> 777, but that does not affect the result. Any suggestions?
>>
>> cheers
>> Bernhard
>>
>>
>> [bknapp_at_quoVadis04 testSet]$ mpirun -np 8 \
>>     -machinefile /home/bknapp/scripts/machinefile.txt \
>>     mdrun -np 8 -nice 0 \
>>     -s 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr \
>>     -o 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr \
>>     -c 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb \
>>     -g 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log \
>>     -e 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr \
>>     -v
>> bknapp_at_192.168.0.103's password:
>> NNODES=8, MYRANK=1, HOSTNAME=quoVadis04
>> NNODES=8, MYRANK=3, HOSTNAME=quoVadis04
>> NNODES=8, MYRANK=7, HOSTNAME=quoVadis04
>> NNODES=8, MYRANK=0, HOSTNAME=quoVadis03
>> NNODES=8, MYRANK=4, HOSTNAME=quoVadis03
>> NNODES=8, MYRANK=6, HOSTNAME=quoVadis03
>> NODEID=4 argc=16
>> NNODES=8, MYRANK=2, HOSTNAME=quoVadis03
>> NODEID=1 argc=16
>> NODEID=3 argc=16
>> NODEID=7 argc=16
>> NODEID=2 argc=16
>> NODEID=6 argc=16
>> NODEID=0 argc=16
>>
>> --------------------------------------------------------------------------
>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>> with errorcode -1.
>>
>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> You may or may not see output from other processes, depending on
>> exactly when Open MPI kills them.
>> --------------------------------------------------------------------------
>>
>> -------------------------------------------------------
>> Program mdrun, VERSION 4.0.3
>> Source code file: gmxfio.c, line: 736
>>
>> Can not open file:
>> 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
>> -------------------------------------------------------
>>
>> "I Need a Little Poison" (Throwing Muses)
>>
>> Error on node 0, will try to stop all the nodes
>> Halting parallel program mdrun on CPU 0 out of 8
>>
>> gcq#108: "I Need a Little Poison" (Throwing Muses)
>>
>> --------------------------------------------------------------------------
>> mpirun has exited due to process rank 0 with PID 3777 on
>> node 192.168.0.103 exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpirun (as reported here).
>> --------------------------------------------------------------------------
>>
>> [bknapp_at_quoVadis04 testSet]$ ll 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
>> -rwxrwxrwx 1 bknapp bknapp 6118424 2009-03-13 09:44 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr