
Open MPI User's Mailing List Archives


From: Adams Samuel D Contr AFRL/HEDR (Samuel.Adams.ctr_at_[hidden])
Date: 2006-04-12 13:14:02


I got it to work, but it didn't have anything to do with the environment
variables in the shell.

I am running CentOS 4.2; the system has everything compiled with GCC 3.4,
and it also has GCC 4.0 installed. I was building ompi with GCC 4.0, and I
think it was having trouble loading dynamic libraries since the rest of the
system was built with 3.4. I am not sure if that is exactly the case, but I
recompiled ompi with GCC 3.4 and only used gfortran for FC. After that
things seemed to work properly.
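For reference, the rebuild was roughly along these lines (the flags are from
memory, so take this as a sketch rather than the literal command line; gcc
and g++ here are the system 3.4 compilers and gfortran is the 4.0 one):

$ ./configure CC=gcc CXX=g++ FC=gfortran --prefix=/usr/local
$ make all install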

Was my guess correct, or do you know the real reason why this is?

Also, does ompi have something similar to "lamboot" and "recon", or is the
only option adding --hostfile or --host a,b to the mpirun command?
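For example, is something like this now the only way to run across nodes
(hostnames made up, just to illustrate what I mean)?

$ cat my_hosts
node01 slots=2
node02 slots=2
$ mpirun --hostfile my_hosts -np 4 f_5x5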

Sam Adams
General Dynamics - Network Systems
Phone: 210.536.5945

-----Original Message-----
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
Behalf Of Michael Kluskens
Sent: Monday, April 10, 2006 1:03 PM
To: Open MPI Users
Subject: Re: [OMPI users] job running question

You need to confirm that /etc/bashrc is actually being read in that
environment; bash differs in which files get read depending on whether you
log in interactively or not.

Also, I don't think ~/.bashrc is read on a non-interactive login.
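
A quick way to check (assuming the nodes are reached over ssh) is to see
what a non-interactive shell on a node actually reports, e.g.

$ ssh somenode 'ulimit -c'

(substitute one of your real hostnames for somenode). If that prints 0
instead of unlimited, the setting is not reaching the shells that mpirun
starts.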

Michael

On Apr 10, 2006, at 1:06 PM, Adams Samuel D Contr AFRL/HEDR wrote:

> I put it in /etc/bashrc and opened a new shell, but I still am not
> seeing any core files.
>
> Sam Adams
> General Dynamics - Network Systems
> Phone: 210.536.5945
>
> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_open-
> mpi.org] On
> Behalf Of Pavel Shamis (Pasha)
> Sent: Monday, April 10, 2006 8:56 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] job running question
>
> Mpirun opens a separate shell on each machine/node, so the "ulimit" setting
> will not be available in the new shell. I think if you add "ulimit -c
> unlimited" to your default shell configuration file (~/.bashrc in the BASH
> case and ~/.tcshrc in the TCSH/CSH case) you will find your core files :)
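> (The tcsh equivalent of that bash line, if I remember the syntax right, is
> "limit coredumpsize unlimited".)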
>
> Regards,
> Pavel Shamis (Pasha)
>
> Adams Samuel D Contr AFRL/HEDR wrote:
>> I set bash to have unlimited size core files like this:
>>
>> $ ulimit -c unlimited
>>
>> But it was not dropping core files for some reason when I was running
>> with mpirun. Just to make sure it would do what I expected, I wrote a
>> little C program that was kind of like this:
>>
>> int ptr = 4;
>> fprintf(stderr,"bad! %s\n", (char*)ptr);
>>
>> That would give a segmentation fault. It dropped a core file like you
>> would expect. Am I missing something?
>>
>> Sam Adams
>> General Dynamics - Network Systems
>> Phone: 210.536.5945
>>
>> -----Original Message-----
>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_open-
>> mpi.org] On
>> Behalf Of Jeff Squyres (jsquyres)
>> Sent: Saturday, April 08, 2006 6:25 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] job running question
>>
>> Some process is exiting on a segv -- are you getting any corefiles?
>>
>> If not, can you increase your coredumpsize to unlimited? This should
>> let you get a corefile; can you send the backtrace from that
>> corefile?
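>>
>> If you do get a corefile, something along these lines should show the
>> backtrace (the actual corefile name on your system may differ):
>>
>> $ gdb ./f_5x5 core
>> (gdb) bt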
>>
>>
>>> -----Original Message-----
>>> From: users-bounces_at_[hidden]
>>> [mailto:users-bounces_at_[hidden]] On Behalf Of Adams Samuel
>>> D Contr AFRL/HEDR
>>> Sent: Friday, April 07, 2006 11:53 AM
>>> To: 'users_at_[hidden]'
>>> Subject: [OMPI users] job running question
>>>
>>> We are trying to build a new cluster running Open MPI. We were
>>> previously running LAM-MPI. To run jobs we would do the following:
>>>
>>> $ lamboot lam-host-file
>>> $ mpirun C program
>>>
>>> I am not sure if this works more or less the same way with ompi. We
>>> were trying to run it like this:
>>>
>>> $ [james.parker_at_Cent01 FORTRAN]$ mpirun --np 2 f_5x5 localhost
>>> mpirun noticed that job rank 1 with PID 0 on node "localhost"
>>> exited on
>>> signal 11.
>>> [Cent01.brooks.afmc.ds.af.mil:16124] ERROR: A daemon on node
>>> localhost
>>> failed to start as expected.
>>> [Cent01.brooks.afmc.ds.af.mil:16124] ERROR: There may be more
>>> information
>>> available from
>>> [Cent01.brooks.afmc.ds.af.mil:16124] ERROR: the remote shell
>>> (see above).
>>> [Cent01.brooks.afmc.ds.af.mil:16124] The daemon received a signal
>>> 11.
>>> 1 additional process aborted (not shown)
>>> [james.parker_at_Cent01 FORTRAN]$
>>>
>>> We have ompi installed to /usr/local, and these are our environment
>>> variables:
>>>
>>> [james.parker_at_Cent01 FORTRAN]$ export
>>> declare -x COLORTERM="gnome-terminal"
>>> declare -x
>>> DBUS_SESSION_BUS_ADDRESS="unix:abstract=/tmp/dbus-sfzFctmRFS"
>>> declare -x DESKTOP_SESSION="default"
>>> declare -x DISPLAY=":0.0"
>>> declare -x GDMSESSION="default"
>>> declare -x GNOME_DESKTOP_SESSION_ID="Default"
>>> declare -x GNOME_KEYRING_SOCKET="/tmp/keyring-x8WQ1E/socket"
>>> declare -x
>>> GTK_RC_FILES="/etc/gtk/gtkrc:/home/BROOKS-2K/james.parker/.gtk
>>> rc-1.2-gnome2"
>>> declare -x G_BROKEN_FILENAMES="1"
>>> declare -x HISTSIZE="1000"
>>> declare -x HOME="/home/BROOKS-2K/james.parker"
>>> declare -x HOSTNAME="Cent01"
>>> declare -x INPUTRC="/etc/inputrc"
>>> declare -x KDEDIR="/usr"
>>> declare -x LANG="en_US.UTF-8"
>>> declare -x LD_LIBRARY_PATH="/usr/local/lib:/usr/local/lib/openmpi"
>>> declare -x LESSOPEN="|/usr/bin/lesspipe.sh %s"
>>> declare -x LOGNAME="james.parker"
>>> declare -x
>>> LS_COLORS="no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=
>>> 40;33;01:cd=40
>>> ;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.ex
>>> e=00;32:*.com=
>>> 00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;
>>> 31:*.tgz=00;31
>>> :*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z
>>> =00;31:*.gz=00
>>> ;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31
>>> :*.jpg=00;35:*
>>> .gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.t
>>> if=00;35:"
>>> declare -x MAIL="/var/spool/mail/james.parker"
>>> declare -x
>>> OLDPWD="/home/BROOKS-2K/james.parker/build/SuperLU_DIST_2.0"
>>> declare -x
>>> PATH="/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R
>>> 6/bin:/home/BR
>>> OOKS-2K/james.parker/bin:/usr/local/bin"
>>> declare -x
>>> PERL5LIB="/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-mul
>>> ti:/usr/lib/pe
>>> rl5/site_perl/5.8.5"
>>> declare -x
>>> PWD="/home/BROOKS-2K/james.parker/build/SuperLU_DIST_2.0/FORTRAN"
>>> declare -x
>>> SESSION_MANAGER="local/Cent01.brooks.afmc.ds.af.mil:/tmp/.ICE-
>>> unix/14516"
>>> declare -x SHELL="/bin/bash"
>>> declare -x SHLVL="2"
>>> declare -x SSH_AGENT_PID="14541"
>>> declare -x SSH_ASKPASS="/usr/libexec/openssh/gnome-ssh-askpass"
>>> declare -x SSH_AUTH_SOCK="/tmp/ssh-JUIxl14540/agent.14540"
>>> declare -x TERM="xterm"
>>> declare -x USER="james.parker"
>>> declare -x WINDOWID="35651663"
>>> declare -x XAUTHORITY="/home/BROOKS-2K/james.parker/.Xauthority"
>>> [james.parker_at_Cent01 FORTRAN]$
>>>
>
