Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 1.3 and --preload-files and --preload-binary
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-01-23 10:34:51


The --preload-binary problem had to do with how we were resolving
relative path names before moving files. While fixing these bugs, I
also cleaned up some of the error-reporting mechanisms.
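
Until you can pick up the fix, a workaround is to expand any relative
name to an absolute path before handing it to mpirun. A minimal
sketch, assuming a Bourne-style shell and the hello binary used later
in this thread:

   shell$ cd /tmp
   shell$ mpirun -np 2 --preload-binary "$PWD/hello"

Since $PWD expands to the absolute current directory, mpirun never has
to resolve a relative name itself.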

I believe that I have fixed both the --preload-binary and
--preload-files options in the trunk (r20331). If you want to test the
patch before the release, I have attached it to the ticket. The patch
should apply cleanly to the v1.3 release and SVN branch.
   https://svn.open-mpi.org/trac/ompi/ticket/1770
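
Applying the attached patch should look roughly like this (a sketch
only: preload.patch stands in for whatever name you save the
attachment under, and the -p level depends on how the paths in the
patch are rooted):

   shell$ cd openmpi-1.3
   shell$ patch -p0 < /path/to/preload.patch

then rebuild and reinstall as usual.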

Let me know if you run into any more problems with this
functionality. There are a few places that still need to be cleaned
up, but I think this should work much better for you now. I'll file a
request to have this moved into the v1.3.1 release.

Thanks,
Josh

On Jan 22, 2009, at 1:49 PM, Doug Reeder wrote:

> Josh,
>
> It sounds like "." is not in your PATH. That would prevent mpirun
> from seeing the binary in the current directory.
>
> Doug Reeder
> On Jan 22, 2009, at 10:48 AM, Josh Hursey wrote:
>
>> As a followup.
>>
>> I can confirm that --preload-files is not working as it should.
>>
>> I was able to use --preload-binary with a full path to the binary
>> without a problem though. The following commands worked fine
>> (where /tmp is not mounted on all machines):
>> shell$ mpirun -np 2 --preload-binary /tmp/hello
>> shell$ mpirun -np 2 -s /tmp/hello
>>
>> However if I referred directly to the binary in the current
>> directory I saw the same failure:
>> shell$ cd /tmp
>> shell$ mpirun -np 2 -s hello
>> --------------------------------------------------------------------------
>> mpirun was unable to launch the specified application as it could
>> not find an executable:
>>
>> Executable: hello
>> Node: odin101
>>
>> while attempting to start process rank 0.
>> --------------------------------------------------------------------------
>>
>>
>> I'll keep digging into this and will let you know when I have a
>> fix. I filed a ticket (below) that you can use to track progress on
>> this bug.
>> https://svn.open-mpi.org/trac/ompi/ticket/1770
>>
>> Thanks again for the bug report, I'll try to resolve this soon.
>>
>> Josh
>>
>> On Jan 22, 2009, at 10:49 AM, Josh Hursey wrote:
>>
>>> The warning is to be expected if the file already exists on the
>>> remote side: Open MPI's policy is not to overwrite a file that is
>>> already there.
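>>>
>>> If you want a fresh copy to be staged, remove the stale file on
>>> the node named in the warning first. A sketch, with the host and
>>> path taken from your log (the ssh invocation itself is just an
>>> illustration):
>>>
>>> shell$ ssh compil03 rm /tmp/hello.c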
>>>
>>> The segv is concerning. :/
>>>
>>> I will take a look and see if I can diagnose what is going on
>>> here. Probably in the next day or two.
>>>
>>> Thanks for the bug report,
>>> Josh
>>>
>>> On Jan 22, 2009, at 10:11 AM, Geoffroy Pignot wrote:
>>>
>>>> Hello,
>>>>
>>>> As you can see, I am trying out the work done for this new
>>>> release. The preload-files and preload-binary options are very
>>>> interesting to me because I work on a cluster without any shared
>>>> space between nodes.
>>>> I tried them in the most basic way, but with no success; you will
>>>> find the error messages below.
>>>> If I did something wrong, would it be possible to get simple
>>>> examples showing how these options work?
>>>>
>>>> Thanks
>>>>
>>>> Geoffroy
>>>>
>>>> /tmp/openmpi-1.3/bin/mpirun --preload-files hello.c --hostfile /tmp/hostlist -np 2 hostname
>>>> --------------------------------------------------------------------------
>>>> WARNING: Could not preload specified file: File already exists.
>>>>
>>>> Fileset: /tmp/hello.c
>>>> Host: compil03
>>>>
>>>> Will continue attempting to launch the process.
>>>>
>>>> --------------------------------------------------------------------------
>>>> [compil03:26657] filem:rsh: get(): Failed to preare the request structure (-1)
>>>> --------------------------------------------------------------------------
>>>> WARNING: Could not preload the requested files and directories.
>>>>
>>>> Fileset:
>>>> Fileset: hello.c
>>>>
>>>> Will continue attempting to launch the process.
>>>>
>>>> --------------------------------------------------------------------------
>>>> [compil03:26657] [[13938,0],0] ORTE_ERROR_LOG: Error in file base/odls_base_state.c at line 127
>>>> [compil03:26657] [[13938,0],0] ORTE_ERROR_LOG: Error in file base/odls_base_default_fns.c at line 831
>>>> [compil03:26657] *** Process received signal ***
>>>> [compil03:26657] Signal: Segmentation fault (11)
>>>> [compil03:26657] Signal code: Address not mapped (1)
>>>> [compil03:26657] Failing at address: 0x395eb15000
>>>> [compil03:26657] [ 0] /lib64/tls/libpthread.so.0 [0x395f80c420]
>>>> [compil03:26657] [ 1] /lib64/tls/libc.so.6(memcpy+0x3f) [0x395ed718df]
>>>> [compil03:26657] [ 2] /tmp/openmpi-1.3/lib64/libopen-pal.so.0 [0x2a956b0a10]
>>>> [compil03:26657] [ 3] /tmp/openmpi-1.3/lib64/libopen-rte.so.0(orte_odls_base_default_launch_local+0x55c) [0x2a955809cc]
>>>> [compil03:26657] [ 4] /tmp/openmpi-1.3/lib64/openmpi/mca_odls_default.so [0x2a963655f2]
>>>> [compil03:26657] [ 5] /tmp/openmpi-1.3/lib64/libopen-rte.so.0(orte_daemon_cmd_processor+0x57d) [0x2a9557812d]
>>>> [compil03:26657] [ 6] /tmp/openmpi-1.3/lib64/libopen-pal.so.0 [0x2a956b9828]
>>>> [compil03:26657] [ 7] /tmp/openmpi-1.3/lib64/libopen-pal.so.0(opal_progress+0xb0) [0x2a956ae820]
>>>> [compil03:26657] [ 8] /tmp/openmpi-1.3/lib64/libopen-rte.so.0(orte_plm_base_launch_apps+0x1ed) [0x2a95584e7d]
>>>> [compil03:26657] [ 9] /tmp/openmpi-1.3/lib64/openmpi/mca_plm_rsh.so [0x2a95c3ed98]
>>>> [compil03:26657] [10] /tmp/openmpi-1.3/bin/mpirun [0x403330]
>>>> [compil03:26657] [11] /tmp/openmpi-1.3/bin/mpirun [0x402ad3]
>>>> [compil03:26657] [12] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x395ed1c4bb]
>>>> [compil03:26657] [13] /tmp/openmpi-1.3/bin/mpirun [0x402a2a]
>>>> [compil03:26657] *** End of error message ***
>>>> Segmentation fault
>>>>
>>>> And it's no better with --preload-binary (used below via its -s
>>>> shorthand) on a.out_32:
>>>>
>>>> compil03% /tmp/openmpi-1.3/bin/mpirun -s --hostfile /tmp/hostlist -wdir /tmp -np 2 a.out_32
>>>> --------------------------------------------------------------------------
>>>> mpirun was unable to launch the specified application as it
>>>> could not find an executable:
>>>>
>>>> Executable: a.out_32
>>>> Node: compil02
>>>>
>>>> while attempting to start process rank 1.
>>>>