Jeff,
An update on what I did. It turns out that one of my lab mates had manually installed another version of OpenMPI, and it clashed with the OpenMPI I installed from the Ubuntu repository. I manually identified the files from that install and deleted them. After I then installed OpenMPI from the Ubuntu repository, my "mpirun.openmpi" works!
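For anyone hitting the same kind of clash, here is a rough sketch of how duplicate copies of the launcher might be located. It is not the exact procedure used above; the helper name find_on_path is made up for illustration, and it only looks at directories listed in $PATH:

```shell
#!/bin/sh
# find_on_path (hypothetical helper): print every executable file named "$1"
# found in the directories of $PATH, in lookup order. More than one hit for
# mpirun or orted suggests that two MPI installations are competing.
find_on_path() (
    # The function body runs in a subshell, so IFS and set -f stay local.
    set -f
    IFS=:
    for dir in $PATH; do
        candidate=${dir:-.}/$1
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
        fi
    done
)

# Example usage (commented out so that sourcing this file has no side effects):
#   find_on_path mpirun
#   find_on_path orted
#   ldd "$(command -v mpirun.openmpi)"   # which shared libraries get loaded
#   dpkg -S /path/to/suspect/file        # does any Debian/Ubuntu package own it?
```

Files that dpkg -S cannot attribute to any package are candidates for having come from a manual install, although that alone does not prove it.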
Jeff,
Thanks for the suggestion. I have been looking into it. I did install the same OpenMPI version on all machines, but another piece of software (Discovery Studio) was also installed on birg-desktop-10, and it seems to have broken mpirun (Discovery Studio also installs some kind of mpirun/mpiexec). When I type "mpirun.openmpi --version" on birg-desktop-10, the output is:
####################################################
birg@birg-desktop-10:~$ mpirun.openmpi --version
mpirun.openmpi: symbol lookup error: mpirun.openmpi: undefined symbol: orted_cmd_line
####################################################
and when I type the same command on another machine, the output is:
####################################################
birg@birg-frontnode:~/Desktop/nfs_shared$ mpirun.openmpi --version
mpirun.openmpi (OpenRTE) 1.4.1
Report bugs to http://www.open-mpi.org/community/help/
####################################################
I am now uninstalling Discovery Studio to see whether that fixes it.
Thanks again.
On Thu, Jul 15, 2010 at 7:15 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
This usually means that you have mismatched versions of Open MPI across your machines. Double check that you have the same version of Open MPI installed on all the machines that you'll be running on (e.g., perhaps birg-desktop-10 has a different version?).
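One way to carry out this check is to ask every host in the hostfile for its version and compare the answers. A minimal sketch, assuming passwordless ssh to each node and a hostfile with one hostname at the start of each line; the function names ompi_version and check_hosts are made up for illustration:

```shell
#!/bin/sh
# ompi_version (hypothetical helper): read `mpirun --version` output on stdin
# and print only the version number, e.g.
#   "mpirun.openmpi (OpenRTE) 1.4.1"  ->  "1.4.1"
# Lines without "(OpenRTE)" (such as a symbol lookup error) print nothing.
ompi_version() {
    sed -n 's/.*(OpenRTE) *//p'
}

# check_hosts (hypothetical helper): for each host named in hostfile "$1",
# print "<host><TAB><version>", or ERROR if no version could be parsed.
check_hosts() {
    while read -r host _; do
        # Skip blank lines and comment lines.
        case "$host" in
            ''|\#*) continue ;;
        esac
        # ssh -n keeps ssh from swallowing the rest of the hostfile on stdin.
        ver=$(ssh -n "$host" mpirun.openmpi --version 2>&1 | ompi_version) || true
        printf '%s\t%s\n' "$host" "${ver:-ERROR}"
    done < "$1"
}

# Example usage (commented out so that sourcing this file does not run ssh):
#   check_hosts ~/Desktop/mpi_settings/hostfile
```

A host that reports a different version from the rest, or ERROR, is the first place to look. Note that this only checks the mpirun.openmpi found on each node's default path for a non-interactive ssh session.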
On Jul 15, 2010, at 5:18 AM, TH Chew wrote:
> Hi all,
>
> I am setting up a 7+1 node cluster for MD simulation, specifically using GROMACS. I am using Ubuntu Lucid 64-bit on all machines. I installed gromacs, gromacs-openmpi, and gromacs-mpich from the repository. The MPICH version of gromacs runs fine without any error. However, when I run the OpenMPI version of gromacs with
>
> ###########################################################################
> mpirun.openmpi -np 8 -wdir /home/birg/Desktop/nfs/ -hostfile ~/Desktop/mpi_settings/hostfile mdrun_mpi.openmpi -v
> ###########################################################################
>
> an error occurs, something like this:
>
> ###########################################################################
> [birg-desktop-10:02101] Error: unknown option "--daemonize"
> Usage: orted [OPTION]...
> Start an Open RTE Daemon
>
> --bootproxy <arg0> Run as boot proxy for <job-id>
> -d|--debug Debug the OpenRTE
> -d|--spin Have the orted spin until we can connect a debugger
> to it
> --debug-daemons Enable debugging of OpenRTE daemons
> --debug-daemons-file Enable debugging of OpenRTE daemons, storing output
> in files
> --gprreplica <arg0> Registry contact information.
> -h|--help This help message
> --mpi-call-yield <arg0>
> Have MPI (or similar) applications call yield when
> idle
> --name <arg0> Set the orte process name
> --no-daemonize Don't daemonize into the background
> --nodename <arg0> Node name as specified by host/resource
> description.
> --ns-nds <arg0> set sds/nds component to use for daemon (normally
> not needed)
> --nsreplica <arg0> Name service contact information.
> --num_procs <arg0> Set the number of process in this job
> --persistent Remain alive after the application process
> completes
> --report-uri <arg0> Report this process' uri on indicated pipe
> --scope <arg0> Set restrictions on who can connect to this
> universe
> --seed Host replicas for the core universe services
> --set-sid Direct the orted to separate from the current
> session
> --tmpdir <arg0> Set the root for the session directory tree
> --universe <arg0> Set the universe name as
> username@hostname:universe_name for this
> application
> --vpid_start <arg0> Set the starting vpid for this job
> --------------------------------------------------------------------------
> A daemon (pid 5598) died unexpectedly with status 251 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun.openmpi noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun.openmpi was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --------------------------------------------------------------------------
> birg-desktop-04 - daemon did not report back when launched
> birg-desktop-07 - daemon did not report back when launched
> birg-desktop-10 - daemon did not report back when launched
> ###########################################################################
>
> It is strange that it only happens on one of the compute nodes (birg-desktop-10). If I remove birg-desktop-10 from the hostfile, I can run OpenMPI gromacs successfully. Any ideas?
>
> Thanks.
>
> --
> Regards,
> THChew
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
jsquyres@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Regards,
THChew