Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] About openmpi-mpirun
From: Min Zhu (min.zhu_at_[hidden])
Date: 2009-12-17 11:56:23


Hi, Jeroen,

Thanks a lot. Unfortunately, I don't think we have mpirun.lsf. Dell only asked us to use openmpi-mpirun as a wrapper script.

Cheers,

Min Zhu
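
For readers who, like Min, only have the openmpi-mpirun wrapper available, the approach reported to work earlier in this thread is to put the commands in a small script and hand that script to the wrapper. A minimal sketch follows; the file name run_wrf.sh and the bsub line shown in the comment are assumptions for illustration, not something quoted from the thread.

```shell
#!/bin/sh
# Write a small launcher script (hypothetical name: run_wrf.sh) containing
# the three lines suggested in the thread.
cat > run_wrf.sh <<'EOF'
#!/bin/sh
ulimit -s unlimited
./wrf.exe
EOF

# The wrapper passes the script to mpirun, so it must be executable.
chmod +x run_wrf.sh

# Submission is not run here because it needs LSF/Lava. An assumed form:
#   bsub -e ERR -o OUT -n 16 "openmpi-mpirun ./run_wrf.sh"
```

Because the job command is then a single word with no spaces or quotes, it should not matter how the wrapper re-splits its arguments.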

-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeroen Kleijer
Sent: 17 December 2009 16:49
To: Open MPI Users
Subject: Re: [OMPI users] About openmpi-mpirun

Hi Min,

Sorry for the mixup but it's been a while since I've actually used LSF.
I've had a look at my notes and to use mpirun with LSF you should submit
something like this:

bsub -a openmpi -n 16 "mpirun.lsf -x PATH -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE \" ulimit -s unlimited ; ./wrf.exe \" "

(at least I think you should :) )

mpirun.lsf is a wrapper provided by LSF, and the -a openmpi option tells
it to set the necessary Open MPI environment variables etc.

Kind regards,

Jeroen Kleijer
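
The symptoms reported below are consistent with the way the quoted openmpi-mpirun wrapper stores its arguments (MYARGS=$*) and later expands them unquoted: the original argument boundaries are lost and the string is split again on whitespace. The sketch below reproduces that effect in isolation. The function names are hypothetical, and it does not call LSF or mpirun.

```shell
#!/bin/sh
# Print how many arguments were received.
count_args() { echo "$#"; }

# Mimics the wrapper: MYARGS=$* joins everything into one string,
# and the unquoted expansion splits it again on whitespace.
flattened() {
    MYARGS=$*
    count_args ${MYARGS}
}

# Alternative (an assumption, not the vendor's code): "$@" keeps each
# original argument intact.
preserved() {
    count_args "$@"
}

flattened /bin/sh -c 'ulimit -s unlimited; ./wrf.exe '   # prints 6
preserved /bin/sh -c 'ulimit -s unlimited; ./wrf.exe '   # prints 3

# With the split form, sh -c sees only the first word as its command
# string; the remaining words become $0, $1, ...
/bin/sh -c 'echo "zeroth=$0 first=$1"' -s 'unlimited;'   # prints zeroth=-s first=unlimited;
```

This would explain why one attempt ran a bare ulimit on each of the 16 ranks and why the error lines in the other attempt are prefixed with "-s:", although that reading is inferred from the logs and is not confirmed anywhere in the thread.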

On Thu, Dec 17, 2009 at 5:32 PM, Min Zhu <min.zhu_at_[hidden]> wrote:
> Hi,
>
> This time the OUT file is
> --------------------------------
> Sender: LSF System <lavaadmin_at_compute-01>
> Subject: Job 667: <openmpi-mpirun "/bin/sh -c 'ulimit -s unlimited ; ./wrf.exe ' " > Exited
>
> Job <openmpi-mpirun "/bin/sh -c 'ulimit -s unlimited ; ./wrf.exe ' " > was submitted from host <seamus> by user <mzh>.
> Job was executed on host(s) <8*compute-01>, in queue <normal>, as user <mzh>.
>                            <8*compute-12>
> </home/mzh> was used as the home directory.
> </home/mzh/wrf-intel/test/em_real> was used as the working directory.
> Started at Thu Dec 17 18:27:58 2009
> Results reported at Thu Dec 17 18:28:04 2009
>
> Your job looked like:
>
> ------------------------------------------------------------
> # LSBATCH: User input
> openmpi-mpirun "/bin/sh -c 'ulimit -s unlimited ; ./wrf.exe ' "
> ------------------------------------------------------------
>
> Exited with exit code 2.
>
> Resource usage summary:
>
>    CPU time   :      0.07 sec.
>
> The output (if any) follows:
>
>
>
> PS:
>
> Read file <ERR> for stderr output of this job.
> -------------------------------------
>
> ERR file is,
> --------------------------------
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> -s: -c: line 0: unexpected EOF while looking for matching `''
> -s: -c: line 1: syntax error: unexpected end of file
> ------------------------------
>
> Cheers,
>
> Min Zhu
>
> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeroen Kleijer
> Sent: 17 December 2009 16:24
> To: Open MPI Users
> Subject: Re: [OMPI users] About openmpi-mpirun
>
> Hi Min,
>
> It seems the ulimit command was executed 16 times, but everything after
> it (the "; ./wrf.exe") was ignored.
> Could you give the following a try:
>
> bsub -e ERR -o OUT -n 16 "openmpi-mpirun \"/bin/sh -c 'ulimit -s unlimited ; ./wrf.exe ' \" "
>
> Somewhere, quoting goes wrong and I'm trying to figure out where....
>
> Kind regards,
>
> Jeroen Kleijer
>
> On Thu, Dec 17, 2009 at 5:09 PM, Min Zhu <min.zhu_at_[hidden]> wrote:
>> Hi, Jeroen,
>>
>> Here is the OUT file, ERR file is empty.
>>
>> --------------------------------------------------------------------------------------
>> Sender: LSF System <lavaadmin_at_compute-10>
>> Subject: Job 662: <openmpi-mpirun /bin/sh -c 'ulimit -s unlimited; ./wrf.exe ' > Done
>>
>> Job <openmpi-mpirun /bin/sh -c 'ulimit -s unlimited; ./wrf.exe ' > was submitted from host <seamus> by user <mzh>.
>> Job was executed on host(s) <8*compute-10>, in queue <normal>, as user <mzh>.
>>                            <8*compute-11>
>> </home/mzh> was used as the home directory.
>> </home/mzh/wrf-intel/test/run1> was used as the working directory.
>> Started at Thu Dec 17 17:37:09 2009
>> Results reported at Thu Dec 17 17:37:15 2009
>>
>> Your job looked like:
>>
>> ------------------------------------------------------------
>> # LSBATCH: User input
>> openmpi-mpirun /bin/sh -c 'ulimit -s unlimited; ./wrf.exe '
>> ------------------------------------------------------------
>>
>> Successfully completed.
>>
>> Resource usage summary:
>>
>>    CPU time   :      0.05 sec.
>>
>> The output (if any) follows:
>>
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>> unlimited
>>
>>
>> PS:
>>
>> Read file <ERR> for stderr output of this job.
>>
>> --------------------------------------------------
>>
>> Thanks,
>>
>> Min Zhu
>>
>>
>> -----Original Message-----
>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeroen Kleijer
>> Sent: 17 December 2009 16:05
>> To: Open MPI Users
>> Subject: Re: [OMPI users] About openmpi-mpirun
>>
>> Hi Min
>>
>> Did you get any type of error message in the ERR or OUT files?
>> I don't have mpirun installed in the environment at the moment, but
>> running the following:
>>
>> bsub -q interq -I "ssh <hostname> /bin/sh -c 'ulimit -s unlimited ; /bin/hostname ' "
>>
>> seems to work for me, so I'm kind of curious what error message
>> you're seeing.
>>
>> Kind regards,
>>
>> Jeroen Kleijer
>>
>> On Thu, Dec 17, 2009 at 4:41 PM, Min Zhu <min.zhu_at_[hidden]> wrote:
>>> Hi, Jeroen,
>>>
>>> Thanks for your reply. I tried the command bsub -e ERR -o OUT -n 16 "openmpi-mpirun /bin/sh -c 'ulimit -s unlimited; ./wrf.exe ' " and wrf.exe was not executed.
>>>
>>> Cheers,
>>>
>>> Min Zhu
>>>
>>> -----Original Message-----
>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeroen Kleijer
>>> Sent: 17 December 2009 15:34
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] About openmpi-mpirun
>>>
>>> It's just that the quotes on the command line get parsed by LSF / bash
>>> (or whatever shell you use).
>>>
>>> If you wish to use it without the script you can give this a try:
>>> bsub -e ERR -o OUT -n 16 "openmpi-mpirun /bin/sh -c 'ulimit -s unlimited; ./wrf.exe ' "
>>>
>>> This causes the whole string "openmpi-mpirun ...." to be passed as a
>>> single string / command to LSF.
>>> The part between the single quotes is then passed as a single
>>> argument to /bin/sh, which is run by openmpi-mpirun.
>>>
>>> Kind regards,
>>>
>>> Jeroen Kleijer
>>>
>>> On Thu, Dec 17, 2009 at 4:03 PM, Min Zhu <min.zhu_at_[hidden]> wrote:
>>>> Hi, Jeff,
>>>>
>>>> Your script method works for me. Thank you very much,
>>>>
>>>> Cheers,
>>>>
>>>> Min Zhu
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
>>>> Sent: 17 December 2009 14:56
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] About openmpi-mpirun
>>>>
>>>> This might be something you need to talk to Platform about...?
>>>>
>>>> Another option would be to openmpi-mpirun a script that is just a few
>>>> lines long:
>>>>
>>>> #!/bin/sh
>>>> ulimit -s unlimited
>>>> ./wrf.exe
>>>>
>>>>
>>>>
>>>> On Dec 17, 2009, at 9:40 AM, Min Zhu wrote:
>>>>
>>>>> Hi, Jeff,
>>>>>
>>>>> Thanks. I tried bsub -e ERR -o OUT -n 16 openmpi-mpirun /bin/sh -c "ulimit -s unlimited; ./wrf.exe", but wrf.exe was not executed.
>>>>>
>>>>> Here is the content of the openmpi-mpirun file, so maybe something needs to be changed?
>>>>>
>>>>> ----------------------------------------------
>>>>> #!/bin/sh
>>>>> #
>>>>> #  Copyright (c) 2007 Platform Computing
>>>>> #
>>>>> # This script is a wrapper for openmpi mpirun
>>>>> # it generates the machine file based on the hosts
>>>>> # given to it by Lava.
>>>>> #
>>>>>
>>>>> usage() {
>>>>>         cat <<USEEOF
>>>>> USAGE:  $0
>>>>>         This command is a wrapper for mpirun (openmpi).  It can
>>>>>         only be run within Lava using bsub e.g.
>>>>>                 bsub -n # "$0 -np # {my mpi command and args}"
>>>>>
>>>>>         The wrapper will automatically generate the
>>>>>         machinefile used by mpirun.
>>>>>
>>>>>         NOTE:  The list of hosts cannot exceed 4KBytes.
>>>>> USEEOF
>>>>> }
>>>>>
>>>>> if [ x"${LSB_JOBFILENAME}" = x -o x"${LSB_HOSTS}" = x ]; then
>>>>>     usage
>>>>>     exit -1
>>>>> fi
>>>>>
>>>>> MYARGS=$*
>>>>> WORKDIR=`dirname ${LSB_JOBFILENAME}`
>>>>> MACHFILE=${WORKDIR}/mpi_machines
>>>>> ARGLIST=${WORKDIR}/mpi_args
>>>>>
>>>>> # Check if mpirun is in the PATH
>>>>> T=`which mpirun`
>>>>> if [ $? -ne 0 ]; then
>>>>>     echo "Error:  mpirun is not in your PATH."
>>>>>     exit -2
>>>>> fi
>>>>>
>>>>> echo "${MYARGS}" > ${ARGLIST}
>>>>> T=`grep -- -machinefile ${ARGLIST} |wc -l`
>>>>> if [ $T -gt 0 ]; then
>>>>>     echo "Error:  Do not provide the machinefile for mpirun."
>>>>>     echo "        It is generated automatically for you."
>>>>>     exit -3
>>>>> fi
>>>>>
>>>>> # Make the open-mpi machine file
>>>>> echo "${LSB_HOSTS}" > ${MACHFILE}.lst
>>>>> tr '\/ ' '\r\n' < ${MACHFILE}.lst > ${MACHFILE}
>>>>>
>>>>> MPIRUN=`which --skip-alias mpirun`
>>>>> ${MPIRUN} -x LD_LIBRARY_PATH -machinefile ${MACHFILE} ${MYARGS}
>>>>>
>>>>> exit $?
>>>>>
>>>>> ----------------------------------------------
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Min Zhu
>>>>>
>>>>> -----Original Message-----
>>>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
>>>>> Sent: 17 December 2009 14:29
>>>>> To: Open MPI Users
>>>>> Subject: Re: [OMPI users] About openmpi-mpirun
>>>>>
>>>>> On Dec 17, 2009, at 9:15 AM, Min Zhu wrote:
>>>>>
>>>>> > Thanks for your reply. Yes, your mpirun command works for me. But I need to use bsub job scheduler. I wonder why
>>>>> > bsub -e ERR -o OUT -n 16 openmpi-mpirun "/bin/sh -c ulimit -s unlimited; ./wrf.exe" doesn't work.
>>>>>
>>>>> Try with different quoting...?  I don't know the details of the
>>>>> openmpi-mpirun script, but perhaps it's trying to exec the whole quoted
>>>>> string as a single executable (which doesn't exist).  Perhaps:
>>>>>
>>>>> bsub -e ERR -o OUT -n 16 openmpi-mpirun /bin/sh -c "ulimit -s unlimited; ./wrf.exe"
>>>>>
>>>>> That's a (somewhat educated) guess...
>>>>>
>>>>> --
>>>>>
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>> CONFIDENTIALITY NOTICE: This e-mail, including any attachments, contains information that may be confidential, and is protected by copyright. It is directed to the intended recipient(s) only. If you have received this e-mail in error please e-mail the sender by replying to this message, and then delete the e-mail. Unauthorised disclosure, publication, copying or use of this e-mail is prohibited. Any communication of a personal nature in this e-mail is not made by or on behalf of any RES group company. E-mails sent or received may be monitored to ensure compliance with the law, regulation and/or our policies.
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Jeff Squyres
>>>> jsquyres_at_[hidden]
>>>>
>>>>
>>>
>>
>
